Data warehouses are at the heart of an organization’s decision-making process, which is why many businesses are moving away from the siloed approach of traditional data warehouses to a modern data warehouse that provides advanced capabilities to meet changing requirements. At Google Cloud, we often work with customers on data warehouse migration projects, including helping HSBC migrate to BigQuery, reducing more than 600 reports and several related applications and data pipelines. We’ve even assembled a migration framework that highlights how to prepare for each phase of migration to reduce risk and define a clear business case up front to get support from internal stakeholders.
While we offer a data management maturity model, we still receive questions, especially around how to prepare for migration. In this post, we’ll explore a few important questions that come up during the initial preparation and discovery phases, including the real-life impact of modernizing a data warehouse and how you can better prepare for and plan your migration to a modern data warehouse.
Tackling the preparation phase
An enterprise data warehouse has many stakeholders with a wide range of use cases, so it’s important to identify and involve the key stakeholders early in the process to make sure they’re aligned with the strategic goals. They can also help identify gaps and provide insight on potential use cases and requirements, which can help prioritize the highest-impact use cases and identify associated risks. These decisions can then be approved and aligned with business metrics, which usually revolve around three main components:
People. To make sure you’re getting input and buy-in for your migration, start by aligning leadership and business owners. Then, explore the skills of the project team and end users. You might identify and interview each functional group within the team by conducting workshops, hackathons, and brainstorming sessions. While discussing issues, remember to consider how to secure owner sign-off by setting success criteria and KPIs, such as:
Technology. By understanding the current technical landscape and classifying existing solutions to identify independent workloads, you can more easily separate upstream and downstream applications to further drill down into their dependency on specific use cases. For example, you can cluster and isolate different ETL applications and pipelines based on the use cases or source systems being migrated, which reduces both the scope and the underlying risks. Similarly, you can couple them with upstream applications and make a migration plan that moves dependent applications and related data pipelines together.
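One way to picture this clustering is as a graph problem: treat each application or pipeline as a node, each upstream/downstream relationship as an edge, and look for connected groups. The sketch below is only an illustration under that assumption, not part of the migration framework. The function name `independent_workloads` and all system names are hypothetical placeholders.

```python
from collections import defaultdict


def independent_workloads(dependencies):
    """Group systems into clusters that share no dependencies with each other.

    `dependencies` is an iterable of (upstream, downstream) name pairs.
    Systems with no recorded dependency do not appear in the result.
    """
    # Build an undirected graph: for grouping, direction does not matter.
    graph = defaultdict(set)
    for upstream, downstream in dependencies:
        graph[upstream].add(downstream)
        graph[downstream].add(upstream)

    seen = set()
    clusters = []
    for node in sorted(graph):
        if node in seen:
            continue
        # Depth-first walk collects everything reachable from this node.
        stack = [node]
        seen.add(node)
        cluster = []
        while stack:
            current = stack.pop()
            cluster.append(current)
            for neighbor in graph[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(cluster))
    return clusters


# Hypothetical inventory of upstream -> downstream relationships.
DEPENDENCIES = [
    ("crm_db", "sales_etl"),
    ("pos_feed", "sales_etl"),
    ("sales_etl", "daily_sales_report"),
    ("hr_system", "payroll_etl"),
    ("payroll_etl", "headcount_dashboard"),
]

clusters = independent_workloads(DEPENDENCIES)
for number, cluster in enumerate(clusters, start=1):
    print(f"Candidate migration group {number}: {', '.join(cluster)}")
```

Each resulting group is a candidate to migrate together, since nothing in it depends on a system outside the group. In practice the dependency list would come from your own discovery work rather than a hard-coded table.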
In addition to understanding current migration technologies, it’s key that you are clear on what you’re migrating. This includes identifying appropriate data sources with an understanding of your data velocity, data regionality, and licensing, as well as identifying business intelligence (BI) systems with current reporting requirements and desired modernizations during the migration. For example, you might want to move that daily sales report to a real-time dashboard. You might also want to decide whether any upstream or downstream applications should be replaced by a cloud-native application, a decision that could be driven by the KPIs below:
TCO of new solution vs. functionality gains
Performance improvements and scalability
Risk of lock-in vs. using open source
Process. By discussing your process options, you can uncover dependencies between existing components and data access and governance requirements, as well as the ability to split migration components. For example, you should evaluate license expiration dependencies before defining any migration deadlines. Processes should be established to make effective decisions during migration and keep progress on track, using KPIs such as:
Risk of data leakage and misuse
Revenue growth per channel
New services launched vs. cost of launching them
Adoption of ML-driven analytics
A strong understanding of the processes you intend to put in place can open up new opportunities for growth. For example, a well-known ecommerce retailer wanted to drive product and services personalization. Their existing data warehouse environment did not provide predictive analytics capabilities and required investments in new technology. BigQuery ML allowed them to be agile and apply predictive analytics, unlocking increased lifetime value, optimized marketing investment, improved customer satisfaction, and increased market share.
Entering the discovery phase
The discovery process is mainly concerned with two areas: business requirements and technical information.
1. Understanding business requirements
The discovery process of a data warehouse migration starts with understanding business requirements and usually has a number of business drivers. Replacing legacy systems has implications on many fronts, ranging from new team skill set requirements to managing ongoing license and operational costs. For example, upgrading your current system might require all of your company’s data analysts to be retrained, as well as additional licenses to be purchased. Quantifying these requirements, and associating them with costs, will allow you to make a pragmatic, fair assessment of the migration process.
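To make the idea of quantifying requirements and attaching costs to them concrete, here is a minimal sketch that compares two options over a fixed planning horizon. Everything in it is an assumption for illustration: the `CostItem` structure, the split into one-time and annual costs, and especially the figures, which are placeholders and not real pricing or customer data.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    name: str
    one_time: int  # paid once, e.g. retraining or a migration project
    annual: int    # paid every year, e.g. licenses or operations


def total_cost(items, years):
    """Sum one-time costs plus recurring costs over the planning horizon."""
    return sum(item.one_time + item.annual * years for item in items)


# Placeholder figures only; substitute your own estimates.
UPGRADE_IN_PLACE = [
    CostItem("analyst retraining", one_time=100_000, annual=0),
    CostItem("additional licenses", one_time=0, annual=120_000),
    CostItem("operations", one_time=0, annual=80_000),
]
MIGRATE = [
    CostItem("migration project", one_time=300_000, annual=0),
    CostItem("analyst training", one_time=60_000, annual=0),
    CostItem("cloud usage", one_time=0, annual=90_000),
]

HORIZON_YEARS = 3
for label, items in (("Upgrade in place", UPGRADE_IN_PLACE), ("Migrate", MIGRATE)):
    print(f"{label}: {total_cost(items, HORIZON_YEARS):,} over {HORIZON_YEARS} years")
```

Separating one-time from recurring costs matters because the comparison can flip depending on the horizon you choose, so it is worth running the same numbers for several horizons.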
On the other hand, proposing and validating potential improvement gains by identifying gaps in the current solution will add value. This can be done by defining an approach to enhance and augment the existing tools with new solutions. For example, for a retailer, the ability to deliver new real-time reporting can increase revenue, since it provides significant improvements in forecasting and reduces shelf-outs.
This retailer realized that shelf-outs were costing them millions in lost sales. They wanted to find an effective solution to predict inventory needs accurately. Their legacy data warehouse environment had reached its performance peak, so they wanted a cloud offering like BigQuery to help them analyze massive data workloads quickly. As a result of migrating, they were able to stream terabytes of data in real time and quickly optimize shelf availability to save on costs and get other benefits like:
Business challenges that were previously perceived as too difficult to solve can be identified as new opportunities by re-examining them using new technologies. For example, the ability to store and process more granular data can help organizations create more targeted solutions. A retailer may look into seasonality and gauge customer behavior if Christmas Day falls on a Monday versus another day of the week. This can only be achieved with the ability to store and analyze larger amounts of data spanning many years.
Last but not least: Educating your users is key to any technology modernization project. In addition to the learning paths outlined above, this can be done by defining eLearning plans for self-study. Staff should also have time to be hands-on and start using the new system to learn by doing. You can also identify external specialized partners and internal champions early on to help bridge that gap.
2. Technical information gathering
In order to identify the execution strategy, you’ll need to answer the following question: Will your migration process address a single solution layer or take an end-to-end lift-and-shift approach? Going through some of the points below can make this decision simpler:
Identify data sources for upstream and downstream applications
Identify datasets, tables and schemas relevant for use cases
Define ETL/ELT tools and frameworks
Outline data quality and data governance solutions
Identify Identity and Access Management (IAM) solutions
Define BI and reporting tools
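One lightweight way to track the points above is a per-use-case inventory record that shows which items are still undocumented. This is a hypothetical sketch rather than a prescribed template: the `UseCaseInventory` fields simply mirror the list above, and the example values are invented.

```python
from dataclasses import dataclass, field, fields


@dataclass
class UseCaseInventory:
    """Discovery checklist for one use case; field names are illustrative."""
    use_case: str
    data_sources: list[str] = field(default_factory=list)
    datasets_and_schemas: list[str] = field(default_factory=list)
    etl_tools: list[str] = field(default_factory=list)
    governance_solutions: list[str] = field(default_factory=list)
    iam_solutions: list[str] = field(default_factory=list)
    bi_tools: list[str] = field(default_factory=list)


def missing_items(inventory):
    """Return the checklist fields that have not been filled in yet."""
    return [
        f.name
        for f in fields(inventory)
        if f.name != "use_case" and not getattr(inventory, f.name)
    ]


# Hypothetical, partially completed record.
inventory = UseCaseInventory(
    use_case="daily_sales_reporting",
    data_sources=["crm_db", "pos_feed"],
    etl_tools=["nightly_batch_job"],
)

print(f"Still to document for {inventory.use_case}: {missing_items(inventory)}")
```

Keeping one record per use case makes it easier to see whether a given workload is understood well enough to migrate on its own or needs more discovery first.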
Further, it is important to identify some of the functional requirements before making a buy-or-build decision. Are there any out-of-the-box solutions available on the market that meet the requirements, or will you need a custom-built solution to meet the challenges you’ve identified? Make sure you know whether this project is core to your business, and would add value, before deciding on the approach.
Once you’ve concluded the preparation and discovery phases, you’ll have some solid guidance on which components you’ll be replacing or refactoring with a move to a cloud data warehouse.
Visit our website to learn more about BigQuery.
Thanks to Ksenia Nekrasova for contributions to this post.