The Hitchhiker’s Guide to Data Modernization

By: Nisfan Nawaz, CTO/CIO
Grace Federal Solutions, LLC 


Data is the lifeblood of an organization. Accurate, timely and relevant data leads to better decisions, giving organizations the competitive edge that is needed more than ever in today’s rapidly changing business climate. Organizations across vertical and horizontal industries are often challenged with legacy architectures and siloed designs that have crept into being due to rapid expansion, unintentional neglect, or an awareness crisis: the “We’ll get to it in the next round of planning” syndrome. An intentional Data Strategy is put on the back burner as other priorities take center stage. The lasting result: executives are plagued with a trust issue. Can I trust my data?

Data Analytics is only as good as the underlying data. Engaging Data Visualizations are only as good as the Data Analytics driving them, which, in turn, is only as good as the foundational data.

Enter Data Modernization. The need to modernize is typically triggered by business imperatives, often tied in with Business Transformation initiatives. Some examples of business drivers making the case to modernize an organization’s Data Architecture are: data center lease expiration, the need to rapidly integrate acquisitions, hardware and software upgrades/support end of life, and cybersecurity considerations.

If done right, Data Modernization can capitalize on Artificial Intelligence, enhance customer experience, drive revenue and substantially reduce costs. A modern data architecture streamlines and facilitates access to data, offers all the benefits that come with cloud-based solutions and meets data privacy and security regulations.

Data Modernization – The Journey

It’s a journey, so appreciate the scenery. Put another way, it’s a marathon, not a sprint. Also, we don’t want to modernize our data in a vacuum: this is a business-outcome-driven initiative with a direct line to the organization’s technology strategy, which is itself informed by the business strategy.

  1. The journey begins with a clearly articulated Data Strategy that is defined as part of the overarching business strategy and goals. It’s likely driven through a Business Transformation initiative
  2. Continue with Data Architecture and Data Management considerations
  3. Plan and execute your migration to the new architecture. Keep it dynamic with a feedback loop
  4. Let the fun and games begin: Data Visualization and Advanced Analytics

The Guide

Data Strategy and Other Things

Recall that Data Strategy is informed by Business Strategy. Identify the people, processes and technology needed to achieve organizational goals for data and analytics. Determine and redesign processes to ensure organizational data is of the highest quality. Assess the technology to store, share and analyze data.  Identify the data needed to achieve organization goals and outcomes. Where will the data be sourced? What’s needed to ensure data integrity and data quality?

A key item in the Data Strategy toolbox is Data Governance, broadly defined as establishing and governing the processes, roles and responsibilities for data quality, security and compliance. Common Data Governance practices are: defining standards and processes, establishing data categories or taxonomies, appointing data stewards, designing and implementing data quality rules, remediating data quality issues, and implementing security and privacy standards.
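As a minimal sketch of what "designing and implementing data quality rules" could look like in practice, the Python below pairs each rule with a data steward and reports which records fail which rule. The rule names, the steward label, and the sample records are all hypothetical and exist only for illustration; real rules would come out of the governance process itself.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict[str, object]


@dataclass(frozen=True)
class QualityRule:
    name: str
    steward: str                     # hypothetical owner who remediates failures
    check: Callable[[Record], bool]  # returns True when the record passes


def find_violations(records: list[Record], rules: list[QualityRule]) -> list[tuple[int, str, str]]:
    """Return (record index, rule name, steward) for every failed check."""
    violations = []
    for index, record in enumerate(records):
        for rule in rules:
            if not rule.check(record):
                violations.append((index, rule.name, rule.steward))
    return violations


# Illustrative rules only; these are assumptions, not rules from the article.
RULES = [
    QualityRule("customer_id is present", "customer-data-steward",
                lambda r: bool(r.get("customer_id"))),
    QualityRule("email contains @", "customer-data-steward",
                lambda r: "@" in str(r.get("email", ""))),
]

# Made-up sample records.
SAMPLE_RECORDS = [
    {"customer_id": "C-1", "email": "a@example.com"},
    {"customer_id": "", "email": "not-an-email"},
]

for index, rule_name, steward in find_violations(SAMPLE_RECORDS, RULES):
    print(f"record {index}: failed '{rule_name}' -> route to {steward}")
```

Keeping the steward on the rule itself is one possible design choice: every failed check then already names who is responsible for remediation.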

Work the Data Strategy in the context of achieving business outcomes, such as improving data-driven decision-making, delighting customers with a frictionless experience and reducing Total Cost of Ownership (TCO) and IT debt. Moving away from expensive proprietary database solutions to open source options can dramatically cut overall IT expenditure. For example, database software licenses can be a large piece of overall IT costs, which makes them a prime candidate for cost reduction if that is an overall business driver.

Progressively elaborate to a deeper understanding of the business challenges and key requirements. For example, is there an optimal understanding of customer retention rates and buying behavior? What AI capabilities will deliver the desired experience? Are current data assets (databases and data architecture) highly proliferated, heterogeneous and expensive? Are there inconsistencies when retrieving the “same” data from different applications? Is there a data redundancy crisis?

Data Architecture and Data Management

Armed with a solid Data Strategy, continue the Data Modernization journey by designing a flexible and scalable Enterprise Data Architecture. The architecture should support the business outcomes and needs discovered during the strategy phase. Design a resilient and scalable architecture that anticipates future business models. Based on the need, give consideration to both structured (relational databases such as PostgreSQL) and unstructured (data lakes) data needs. A key Data Modernization business driver is the need to store unstructured data, such as images, videos, voice recordings, social media posts, and handwritten clinical notes (healthcare use cases). Often there is a need for real-time analytics of unstructured data.

Cost, security, performance and growth factors will play into the choice and mix of cloud-based, on-premises, and hybrid solutions. Cloud data migration essentially follows a re-hosting, re-platforming, then re-factoring happy path.

  1. Re-hosting – AKA lift and shift, refers to moving databases to the cloud as-is, keeping the source and target database engines the same. Think IaaS
  2. Re-platforming – minimally alter to take advantage of the cloud. Offers greater cloud efficiencies in terms of resources, speed, and cost. Allows for code portability and faster, shorter updates. Think PaaS
  3. Re-factoring – materially alter applications to services with new code that’s cloud-native. Allows for mixed technology stacks, rapid adoption of new cloud capabilities, and application agility and scaling. Think Containers and Microservices

Add flavor, and in the process save costs, with Open Source Software solutions and optimization feedback loops to maintain operational excellence. The modernization aspiration is to be agile, cloud-native, serverless and open source. Modernization moves the technology footprint away from on-site servers hosting database and application solutions that are often proprietary, and toward next-generation services.

Next-gen managed services, such as Azure Database for PostgreSQL and MySQL, Google Cloud SQL, or Amazon Aurora, run or are compatible with the open-source PostgreSQL and MySQL database engines. Cloud-native technologies are optimized for availability, disaster recovery, security, scalability, and performance. Open source solutions have the added benefit of database and application portability across multiple clouds with minimal changes. Consider current and future use cases for multi-cloud strategies in your Data Architecture.

Master Data Management solutions, though expensive in some cases, can be a panacea for chronic and pervasive data integrity and data quality issues. Weigh the costs and benefits when introducing MDM as a component of the architecture. Large organizations with voluminous transactions can reap massive rewards by effectively designing and implementing Customer, Product, and Company Data Models with a well-orchestrated MDM repository. 

A Data Warehouse (and her domain-centered little sister, the Data Mart) is an instrumental component of the data architecture for any organization seeking to garner strategic insights from historical and real-time transactions. Unstructured data feeding into the Data Warehouse from a Data Lake, combined with the MDM solution mentioned earlier and glued together with a solid integration technology (Dell Boomi makes a good argument – for those remnants of the architecture left behind on-prem), makes for a winning team when it comes to data quality, integrity, relevance, and timeliness.

A stellar data architecture that serves both operational workloads (ODS/OLTP) and analytical workloads (OLAP) supports reliable data analytics, business intelligence, and enterprise reporting, as well as Big Data insights that optimize organizational data science efforts.

Plan and Execute your Migration to the New Architecture

It’s time now to kick off the planning phase. Data Modernization is about more than just data. There is a nontrivial coupling between the infrastructure (servers and networks), the databases running on the infrastructure, and the applications that depend on the databases. Data traverses this universe – applications to databases supported by the infrastructure. Planning the migration to a modernized architecture is a holistic exercise.

Broadly, it starts with a critical Assessment Phase, followed by the Migration Approach and Tactics. Throughout the planning and execution cycle, keep Optimization Considerations front and center.

During the Assessment Phase, cast a wide net to discover and take inventory of application stacks, databases, operating systems, servers, data warehouses, data lakes, and the like. 

  • Identify key dependencies between applications and databases. See if there are non-utilized databases that can be decommissioned
  • Grade the complexity and measure the utilization and workload of legacy applications and databases
  • Note also any hardware, software, and version compatibility requirements
  • Capture logs that could potentially be played back in the new architecture
  • This is the ideal time to optimize software and hardware licensing costs informed by utilization and workload metrics
  • Identify low complexity applications and databases that are great candidates to migrate first
  • Run a feature utilization analysis of Enterprise Edition features. The results of this analysis may reveal that Standard Edition features are adequate – Enterprise Editions of databases can be significantly more expensive than the Standard Edition
  • Identify end of life (EOL) databases and operating system versions
  • Look for ways to reduce current database storage and size. Can older datasets be archived? Can less frequently accessed data be compressed?
  • Understand utilization patterns to optimize compute consumption. The goal is to optimize cloud computing costs by tuning the most expensive queries and optimally scheduling database workloads by time frame prior to migration. Take advantage of the pay-for-use flexibility offered by cloud solutions
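Several of the assessment checks above boil down to simple queries over an inventory. The Python sketch below assumes a hypothetical inventory record (`DatabaseAsset`) and made-up database names and version labels. It flags databases with no dependent applications, databases on end-of-life versions, and databases that use no Enterprise Edition features. How the inventory is actually collected, and which versions count as end of life, depends on your tooling and vendors.

```python
from dataclasses import dataclass, field


@dataclass
class DatabaseAsset:
    """One row of a hypothetical assessment inventory."""
    name: str
    engine_version: str
    dependent_apps: list[str] = field(default_factory=list)
    enterprise_features_used: list[str] = field(default_factory=list)


def decommission_candidates(assets: list[DatabaseAsset]) -> list[str]:
    """Databases that no application depends on."""
    return [a.name for a in assets if not a.dependent_apps]


def end_of_life(assets: list[DatabaseAsset], eol_versions: set[str]) -> list[str]:
    """Databases running a version listed as end of life."""
    return [a.name for a in assets if a.engine_version in eol_versions]


def standard_edition_candidates(assets: list[DatabaseAsset]) -> list[str]:
    """In-use databases that rely on no Enterprise Edition features."""
    return [a.name for a in assets
            if a.dependent_apps and not a.enterprise_features_used]


# Made-up inventory; names and version labels are placeholders.
INVENTORY = [
    DatabaseAsset("billing_db", "engine-12", ["billing-app"], ["partitioning"]),
    DatabaseAsset("legacy_reports_db", "engine-9"),
    DatabaseAsset("hr_db", "engine-9", ["hr-portal"]),
]
EOL_VERSIONS = {"engine-9"}  # assumption: supplied from vendor lifecycle notices

print("decommission:", decommission_candidates(INVENTORY))
print("end of life:", end_of_life(INVENTORY, EOL_VERSIONS))
print("standard edition may suffice:", standard_edition_candidates(INVENTORY))
```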

The Migration Approach should aim to manage risk by organizing the transition into progressively complex phases. Start with a Proof of Concept (POC) for a low-complexity scope item. The findings from the POC can feed the planning processes for the remainder of the migration. Typically, migration phases will follow a low, medium, and high complexity methodology.
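One way to picture the low, medium, and high phasing is to grade each workload and group the results into ordered waves. In the sketch below, the grading inputs (number of dependent applications and size in GB), the thresholds, and the workload names are all illustrative assumptions. A real grading would draw on the complexity, utilization, and workload measurements gathered during the Assessment Phase.

```python
def complexity_tier(dependent_apps: int, size_gb: float) -> str:
    """Grade a workload; thresholds are placeholders, not recommendations."""
    if dependent_apps <= 1 and size_gb < 100:
        return "low"
    if dependent_apps <= 5 and size_gb < 1000:
        return "medium"
    return "high"


def plan_waves(workloads: dict[str, tuple[int, float]]) -> list[tuple[str, list[str]]]:
    """Group workloads into migration waves ordered low -> medium -> high."""
    tiers: dict[str, list[str]] = {"low": [], "medium": [], "high": []}
    for name, (dependent_apps, size_gb) in workloads.items():
        tiers[complexity_tier(dependent_apps, size_gb)].append(name)
    # Skip empty tiers; dict insertion order keeps the low-to-high sequence.
    return [(tier, sorted(names)) for tier, names in tiers.items() if names]


# Made-up workloads: name -> (dependent applications, size in GB).
WORKLOADS = {
    "wiki_db": (1, 20.0),
    "orders_db": (4, 350.0),
    "erp_db": (12, 2400.0),
    "intranet_db": (0, 5.0),
}

waves = plan_waves(WORKLOADS)
for tier, names in waves:
    print(f"{tier}: {names}")

# The first workload of the first wave is a natural POC candidate.
print("POC candidate:", waves[0][1][0])
```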

Capitalize on what the cloud offers and the pay-as-you-go model by adopting open source software solutions, containerizing code, and consolidating databases. Explore opportunities to proactively optimize the new architecture through policies, continuous monitoring, and security and vulnerability assessments.

Finally, execute the plan to migrate to the new architecture. The high-level steps are:

  1. As needed, convert the source schema to work in the target environment
  2. Migrate the source schema, then migrate the source data to the target
  3. For minimal downtime migrations, sync target schema and data with source. Validate target transactions and activity against the legacy system
  4. Cut over from the source to the new environment
  5. Iteratively remediate applications as necessary, optimizing via a continuous feedback loop of testing and fixing
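The validation in step 3 can be sketched as a per-table comparison of row counts and content checksums between source and target. The example below uses Python's built-in `sqlite3` module with in-memory databases as stand-ins for both environments. The `customers` table and its rows are made up. A migration between different engines would also need to normalize data types before hashing, and very large tables would need to be fingerprinted in chunks.

```python
import hashlib
import sqlite3


def table_fingerprint(conn: sqlite3.Connection, table: str, key: str) -> tuple[int, str]:
    """Return (row count, SHA-256 over all rows ordered by key)."""
    # Identifiers cannot be bound as SQL parameters, so table and key are
    # assumed to come from trusted migration configuration, not user input.
    rows = conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key}"').fetchall()
    digest = hashlib.sha256()
    for row in rows:
        digest.update(repr(row).encode("utf-8"))
    return len(rows), digest.hexdigest()


def validate_table(source: sqlite3.Connection, target: sqlite3.Connection,
                   table: str, key: str) -> bool:
    """True when source and target hold the same rows for the table."""
    return table_fingerprint(source, table, key) == table_fingerprint(target, table, key)


def make_db(rows: list[tuple[int, str]]) -> sqlite3.Connection:
    """Build an in-memory stand-in database with a made-up customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()
    return conn


source = make_db([(1, "Ada"), (2, "Grace")])
target_in_sync = make_db([(2, "Grace"), (1, "Ada")])  # same rows, loaded in another order
target_missing_row = make_db([(1, "Ada")])

print(validate_table(source, target_in_sync, "customers", "id"))
print(validate_table(source, target_missing_row, "customers", "id"))
```

Ordering by the key before hashing means load order does not affect the result, so only real differences in content cause a mismatch.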

Data Visualization and Advanced Analytics

Though not technically an aspect of Data Modernization, Data Visualization, Advanced Analytics, and Enterprise Reporting functions can significantly reap the rewards of a modern data architecture. The old adage, garbage in equals garbage out, has some relevance here. It’s about trust: having confidence in the data. Data quality, integrity, and reliability are positive side-effects of modernization efforts, achieved through thoughtful data architecture, design patterns, and migration best practices.

Modern cloud-based architectures come with analytics and Business Intelligence platforms that adapt and scale with the needs of the business. Growing data demands are satisfied seamlessly by harnessing new technology. Organizations can maximize the potential of their data and monetize data and other assets with AI and Machine Learning algorithms, setting the stage for a successful Data Science strategy.

In Conclusion

I. 42

“The answer to life, the universe, and everything” may be 42 – this was a hitchhiker’s guide after all. However, Data Modernization is somewhat nuanced. Data Modernization does not happen in a vacuum. It is guided by Business Strategy and Technology Strategy.

II. Business Benefits

Data Modernization can capitalize on Artificial Intelligence, enhance customer experience, drive revenue and substantially reduce costs.

III. It’s a Journey

Spend ample time on the architecture and design. Adopt an iterative approach to planning and execution. Mitigate risk by commissioning a POC and migrating low-complexity data sources first.

IV. Consolidate and Optimize

Take this opportunity to consolidate data sources where appropriate and optimize workload and utilization prior to migrating to the new architecture. Doing so will contribute positively to the Total Cost of Ownership by ensuring compute, memory and storage are commensurate with the demand cycles on the cloud-based architecture. Keep in mind that optimization also applies to process improvements and to deploying the right human talent to work on the right things at the right time.

V. Garbage In, Garbage Out

The process of modernizing data architecture cleans house and leads to improved data quality. This is an important outcome because it creates the foundation for trust and confidence in the data – supporting visualization, advanced analytics, enterprise reporting, machine learning, and AI. Machine learning and AI are only as good as the data feeding them. So make the data nutrient-rich for AI that one day may arrive at “the answer to life, the universe, and everything.” Hopefully, it’s something other than 42!
