Abstract
Data integration brings data together from different systems to increase its value to a business. Without data integration, there is no way to access the data gathered in one system from another, or to combine data sets so they can be processed, analysed and acted on. Data is also the foundational enabler of end-to-end automation, a hallmark of all digital companies and a huge boost to customer experience, faster time to market, working within an ecosystem and operational efficiency, among other things.
Importantly, data integration helps to clean and validate the data that businesses rely on. Companies need their data to be robust and free of errors, duplication and inconsistencies, which is a big ask when a company is running perhaps hundreds of databases, with the data often held in silos and incompatible formats.
It becomes even more of a challenge as the extended enterprise becomes increasingly common – that is, companies using a platform-based ecosystem to provide products and solutions with partners in B2B and B2B2X business models. In this operational model, companies also need to offer their data to third parties in a secure, controlled and timely way.
Companies need a proper integration strategy to make their data more usable and relevant, avoiding pitfalls such as heavily customized integrations that fit today’s needs but ossify rapidly and become obstacles when flexibility and fluidity in data’s use are what’s required.
This white paper looks at what data integration is and the various tools and approaches that are available for future-proof data integration – such as APIs, data virtualization, containerization and microservices, cloudification, analytics and business intelligence, and working with a data integration specialist firm such as Torry Harris Integration Solutions (THIS). It concludes with critical success factors.
Introduction
One of the defining characteristics of a digital company is that it bases decisions on information and intelligence derived from data. Yet this apparently straightforward principle of using data to inform business and operational decisions is anything but simple to follow. This is because of the volume, variety, velocity and veracity aspects shown below. Also, the greatest value is derived from combining data sets to extract value and insights – hence data integration is a multi-faceted and complex discipline.
Volume
Companies generate immense amounts of data from sources including their networks, services and customers, as well as gathering it from other sources such as social media.
Variety
The data comes in many formats: it is often incomplete, siloed, incompatible, structured and unstructured.
Veracity
Data should be a ‘single source of truth’ – the cleansing needed to get it to that point is not necessarily simple.
Velocity
The speed at which data is produced rises all the time. The question is how fast a firm can act on it, in real time or close to it.
Value is derived from integrating data to gain insights and improve customer experience, efficiency, profitability and time to market.
Data integration is becoming more complex as the extended enterprise becomes increasingly common – that is, companies using a platform-based ecosystem to provide products and solutions with partners in B2B and B2B2X business models. They need to expose their own data and consume data from others to interoperate successfully.
It’s worth noting that when data integrations fail, it is rarely due to technical issues so much as to poor planning, strategy and execution – and sometimes tools can seem more of a hindrance than a help. The starting point always has to be for enterprises to have a clear vision of what they want to achieve through data integration, and to resist the temptation to let a shiny new piece of technology provide the reason to kick off a project.
After the vision, comes the strategy to make it happen, and other preparatory steps. For instance, the quality of data is critical – garbage in, garbage out applies here just like everywhere else. Companies need to think about data cleaning – removing records that are incomplete, duplicated, out of date and so on – before they start integration.
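The kind of pre-integration cleaning described above can be sketched in a few lines. This is a minimal, illustrative example only – the record layout, field names and cut-off date are all hypothetical assumptions, not part of any particular product:

```python
# Minimal sketch of pre-integration data cleaning: drop records that
# are incomplete, duplicated or out of date. All field names and the
# cut-off date are hypothetical.
from datetime import date

records = [
    {"id": 1, "email": "ann@example.com", "updated": date(2023, 5, 1)},
    {"id": 1, "email": "ann@example.com", "updated": date(2023, 5, 1)},  # duplicate
    {"id": 2, "email": None,              "updated": date(2023, 4, 2)},  # incomplete
    {"id": 3, "email": "bob@example.com", "updated": date(2019, 1, 9)},  # out of date
]

CUTOFF = date(2022, 1, 1)  # records older than this are considered stale

def clean(rows):
    seen, out = set(), []
    for r in rows:
        if r["email"] is None:        # drop incomplete records
            continue
        if r["updated"] < CUTOFF:     # drop out-of-date records
            continue
        key = (r["id"], r["email"])
        if key in seen:               # drop duplicates
            continue
        seen.add(key)
        out.append(r)
    return out

print(len(clean(records)))  # only the first record survives → 1
```

In practice this logic would run inside a data-quality tool or pipeline stage rather than ad-hoc code, but the rules – completeness, freshness, de-duplication – are the same.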
The data lifecycle

It is also crucial to see data integration as part of the wider transformation canvas; it cannot be viewed in isolation. That custom integration of two monolithic systems might work really well now, but what happens when business needs change and the data must be exposed and combined with other systems’ or parties’ data sets? That customization could become an inflexible barrier.
Never lose sight of the fact that data is the key enabler of automation – another hallmark of a digital company – and that goal is end-to-end automation, not isolated automated islands, to deliver operational efficiency, the best customer experience, shorter time to market, and increased profitability.
What is data integration?
Google describes data integration as the process of pulling data together from different sources to gain a unified and more valuable view of it, so that businesses can make better decisions faster. Data integration can consolidate all kinds of data – structured, unstructured, batch and streaming – for everything from basic querying of inventory databases to complex predictive analytics.
According to Gartner Research, data integration involves a whole series of practices, architectures, techniques and tools. In the first instance, this is to achieve consistent access to an enterprise’s many sources of data and then for the delivery of data to meet all the data consumption requirements of applications and processes. Naturally these applications and processes are prone to change over time as business and operational needs change.
Even accessing the data can be difficult because it is typically in many, often incompatible, formats and stored in silos, many of which were not designed with sharing and combining data in mind.
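To make the incompatible-formats problem concrete, here is a small sketch of mapping two silos – one exposing CSV, one JSON, with different field names – into a single common schema. The silo names, fields and values are invented for illustration:

```python
# Illustrative sketch: two silos hold customer data in incompatible
# formats (CSV vs JSON) with different field names. We map both into
# one common schema keyed by customer id. All names are hypothetical.
import csv
import io
import json

crm_csv = "customer_id,full_name\n42,Ada Lovelace\n"
billing_json = '[{"custId": "42", "balance": 19.99}]'

def from_csv(text):
    for row in csv.DictReader(io.StringIO(text)):
        yield {"id": row["customer_id"], "name": row["full_name"]}

def from_json(text):
    for row in json.loads(text):
        yield {"id": str(row["custId"]), "balance": row["balance"]}

# Merge records from both silos into a unified view per customer
unified = {}
for rec in list(from_csv(crm_csv)) + list(from_json(billing_json)):
    unified.setdefault(rec["id"], {}).update(rec)

print(unified["42"])
# {'id': '42', 'name': 'Ada Lovelace', 'balance': 19.99}
```

The hard part in real systems is not the parsing but agreeing on the common schema and the matching key – which is exactly why silos built without sharing in mind resist integration.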

Also, for a long time, vendors developed data integration tools for specific sectors rather than generic ones. In the recent past, most effort has been put into extract, transform, load (ETL) tools. Other categories include tools for data replication and enterprise information integration (EII), with vendors optimizing tools for particular approaches to data integration. There are also tools for data quality, data modelling and adapters that can be applied to data integration.
The specific-sector approach resulted in a highly fragmented tool market for data integration which added to the complexity of integrating data in large organizations because they were forced to buy portfolios of tools from many vendors to assemble all the capabilities they required. Also, different teams use different tools, with little consistency but a lot of overlap and redundancy, and no common management of metadata.