Data Mash-ups: the Reality of a Digital Transformation

While seldom ideal, digital transformations are typically delivered on top of or next to existing systems, many of which are legacy systems developed with traditional software and engineering practices. Before beginning a digital transformation, it is therefore critical to understand the roles those systems play and the fundamental differences between them from a data perspective.

The “systems of record” are those that serve as the foundation for the business. Examples are HR, Payroll, and Supply Chain. Those systems are typically composed of discrete, proprietary technologies and expensive enterprise software, and they support data developed before the current digital era. They are tightly controlled, integrate with each other through tightly coupled APIs, and implement strict consistency models (e.g., if any step in the end-to-end string of APIs fails, the whole transaction fails). That is the very essence of strict consistency: all or nothing.
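The all-or-nothing behavior described above can be illustrated with a minimal sketch. It uses a single in-memory SQLite transaction as a stand-in for a chain of coupled APIs, which is a simplification; the table names, the `post_payroll` function, and the budget rule are hypothetical and chosen only for illustration.

```python
import sqlite3


def post_payroll(conn: sqlite3.Connection, employee_id: int, amount: int) -> bool:
    """Record a payment and debit the budget as one all-or-nothing unit."""
    try:
        # "with conn" commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "INSERT INTO payments (employee_id, amount) VALUES (?, ?)",
                (employee_id, amount),
            )
            cur = conn.execute(
                "UPDATE budget SET balance = balance - ? WHERE balance >= ?",
                (amount, amount),
            )
            if cur.rowcount == 0:
                # Second step failed, so the first step must not survive either.
                raise ValueError("insufficient budget")
        return True
    except ValueError:
        return False


# Hypothetical setup: one budget row and an empty payments table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (employee_id INTEGER, amount INTEGER)")
conn.execute("CREATE TABLE budget (balance INTEGER)")
conn.execute("INSERT INTO budget VALUES (1000)")
conn.commit()
```

When the second step fails, the payment row inserted by the first step is rolled back as well, so the data never reflects a half-completed transaction.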

On the other hand are the newer “digital systems”, built using current engineering practices and embracing the principles of openness, API-first design, always-on operation, and infinite elasticity. These newer systems borrow fundamentals from the Internet world and are designed to address not just scale but also data dimensionality.

In the past, applications were developed with a rigid methodology that forced highly structured data collection and storage models. Today’s digital-era applications are data-driven: because the data of interest exists in a wide variety of locations, forms, and volumes, it is the data that drives application development. While some of the data may appear to be structured, the vast majority will be unstructured (e.g., free text, voice, video) and will need fluid, scalable, and adaptable data stores.

A challenge of a digital transformation is embracing both the legacy structured data and the newer unstructured data. Adding to the effort is the dispersion of data, in different formats, across the silos of an organization. Common to both types of systems is the unprecedented rate of growth of all data.
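As a rough illustration of what such a mash-up can look like in practice, the sketch below attaches loosely structured documents to fixed-schema legacy rows through a shared key. The `mash_up` function, the `customer_id` key, and the record shapes are assumptions made for this example, not part of any particular product.

```python
from collections import defaultdict
from typing import Any, Dict, List


def mash_up(
    structured_rows: List[Dict[str, Any]],
    documents: List[Dict[str, Any]],
    key: str = "customer_id",
) -> List[Dict[str, Any]]:
    """Attach free-form documents to fixed-schema rows that share a key."""
    by_key: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for doc in documents:
        # Documents have no fixed schema, so the key may be missing entirely.
        if key in doc:
            by_key[doc[key]].append(doc)
    # Each legacy row keeps its own fields and gains a list of related documents.
    return [{**row, "documents": by_key.get(row[key], [])} for row in structured_rows]


# Hypothetical data: rows from a system of record, documents from a digital system.
rows = [{"customer_id": 1, "name": "Acme"}, {"customer_id": 2, "name": "Globex"}]
docs = [
    {"customer_id": 1, "type": "chat", "text": "Where is my order?"},
    {"customer_id": 1, "type": "voice", "duration_s": 42},
    {"type": "note", "text": "no customer attached"},
]
combined = mash_up(rows, docs)
```

The documents carry different fields from one another, and one has no key at all; the sketch tolerates both cases rather than forcing a single schema onto the unstructured side.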

One approach for addressing the massive amounts of unstructured data is a NoSQL data store. NoSQL, a term first used in 1998 and now generally taken to mean “not only SQL”, has been gaining popularity as companies such as Facebook and Google generate and manage data at very large scale. While NoSQL data stores allow for some simple structured queries, they excel at handling unstructured data and, for those workloads, can provide better performance than a traditional relational database.

When implementing a NoSQL approach, it is important that the advantages of availability, partition tolerance, and speed do not overshadow the critical role of consistency. Without consistency in the overall data capture and storage efforts, the growth of disparate data will severely hamper analytics efforts.

The reality is that legacy systems and digital systems must coexist. Digital systems are an absolute requirement for enhancing the customer experience where response times are key. Legacy systems will continue as the “systems of record” and must be embraced. With smart engineering, the two environments can be coupled such that the digital systems operate above the legacy systems, getting data from them only when necessary.
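One possible way to have a digital system sit above a legacy system and fetch data only when necessary is a read-through cache with a time-to-live. The sketch below is a minimal version under that assumption; the `DigitalLayer` class, the `fetch_from_legacy` callable, and the TTL value are hypothetical names introduced here for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple


class DigitalLayer:
    """Serves reads from a local cache and calls the legacy system only on a miss."""

    def __init__(
        self,
        fetch_from_legacy: Callable[[str], Any],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_legacy   # stand-in for a call to the system of record
        self._ttl = ttl_seconds           # how long a cached value is trusted
        self._clock = clock               # injectable so expiry can be tested
        self._cache: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]                 # fresh enough: no legacy call
        value = self._fetch(key)          # missing or stale: go to the legacy system
        self._cache[key] = (now, value)
        return value
```

The trade-off is the one raised earlier about consistency: a longer TTL means faster responses and less load on the system of record, but a larger window in which the digital layer may serve stale data.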