Organizations should be able to freely choose where they store their data and run their applications, argues this data virtualization proponent.
Most organizations probably began experimenting with cloud computing by running part of their applications in the cloud and the rest on-premises. In other words, they took a hybrid cloud approach, operating environments that combine on-premises infrastructure, private clouds, and third-party public clouds.
Multi-cloud is simply an extension of that approach, whereby organizations run some of their applications on-premises and others in different cloud environments, tapping services from multiple public cloud providers.
Naturally, organizations select cloud providers based on the applications they already have in place. Organizations very likely chose Azure because they were already using Microsoft Dynamics, Office 365, or SQL Server. Others may have opted for Google Cloud Platform because they run data science applications, or for AWS because they already use S3 storage.
A multi-cloud strategy enables organizations to customize their technology infrastructures to suit their own unique business processes. It lets them benefit from each platform's advantages while minimizing the impact of its shortcomings.
Multi-cloud challenges
The most critical challenge in a multi-cloud environment is that data stored with one cloud service provider is siloed from data stored with another. This means that when organizations initially set up a multi-cloud environment, they cannot gain a holistic view of their data.
The traditional data-integration strategy, in which data from the multiple cloud systems is transferred to a single repository, such as a data warehouse, using extract, transform, and load (ETL) processes, is complex and time-consuming. And because ETL jobs are batch-oriented, they cannot integrate the data in real time.
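To make that batch limitation concrete, here is a minimal Python sketch of the kind of scheduled ETL job the traditional strategy relies on; the connection strings and table names are hypothetical. Between runs, every report reads a stale copy.

```python
import pandas as pd
from sqlalchemy import create_engine

# Hypothetical sources in two different clouds, plus a warehouse.
crm_cloud = create_engine("postgresql://user:pw@crm.cloud-a.example/crm")
erp_cloud = create_engine("postgresql://user:pw@erp.cloud-b.example/erp")
warehouse = create_engine("postgresql://user:pw@dw.example/warehouse")

def nightly_etl():
    # Extract: pull the tables out of each cloud silo.
    customers = pd.read_sql("SELECT * FROM customers", crm_cloud)
    orders = pd.read_sql("SELECT * FROM orders", erp_cloud)
    # Transform: join the silos into a single reporting table.
    report = orders.merge(customers, on="customer_id")
    # Load: overwrite the warehouse copy. Until the next scheduled run,
    # every report reads this stale snapshot, not the live sources.
    report.to_sql("customer_orders", warehouse, if_exists="replace",
                  index=False)
```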
Additionally, although cloud service providers offer strong security within their own environments, security can become an issue when organizations have to access disparate data across the different clouds. There is always the risk of exposing the data to unintended users.
Embracing data heterogeneity
To overcome the first challenge, organizations need to deploy a robust cloud data integration solution. For years, data integration tools have reliably delivered integrated data from systems such as accounting, payroll, customer relationship management (CRM), and enterprise resource planning (ERP) to specific destinations such as a data warehouse.
With the growth of the cloud, data integration has also evolved to integrate data from on-premises and cloud systems. Modern data integration and data management solutions, such as data virtualization, can efficiently integrate data spread across public and private clouds, and deliver the integrated data to systems that reside either on-premises or within the cloud environment itself.
Unlike traditional data integration solutions, which rely on data replication to move data from disparate systems into a consolidated repository, data virtualization provides real-time views of the data in its existing locations.
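By way of contrast, here is a minimal sketch of the virtualization idea, again with assumed connection details and table names: the same join, but resolved on demand against the live sources, with nothing persisted. A real data virtualization platform would also optimize the query and push processing down to each source.

```python
import pandas as pd
from sqlalchemy import create_engine, text

crm_cloud = create_engine("postgresql://user:pw@crm.cloud-a.example/crm")
erp_cloud = create_engine("postgresql://user:pw@erp.cloud-b.example/erp")

def customer_orders(customer_id: int) -> pd.DataFrame:
    """A 'virtual view': resolved at query time from the live sources."""
    customers = pd.read_sql(
        text("SELECT * FROM customers WHERE customer_id = :id"),
        crm_cloud, params={"id": customer_id})
    orders = pd.read_sql(
        text("SELECT * FROM orders WHERE customer_id = :id"),
        erp_cloud, params={"id": customer_id})
    # The join happens in the access layer, on demand; nothing is
    # replicated, and the result reflects the sources as of right now.
    return orders.merge(customers, on="customer_id")
```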
Data virtualization also overcomes the second challenge, security. Because data virtualization is implemented as an enterprise data-access layer, it provides a natural gateway through which organizations can manage security protocols across multiple cloud systems from a single point of control. With data virtualization, access to data is limited to the intended end users, in line with their individual access privileges.
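As a rough illustration of that single point of control, the sketch below uses hypothetical roles and source names to show how one gateway check in the access layer can stand in front of every cloud, rather than one policy per provider.

```python
# Hypothetical role-to-source grants, held in one place.
ROLE_GRANTS = {
    "finance_analyst": {"erp_cloud"},
    "sales_manager":   {"crm_cloud"},
    "data_officer":    {"erp_cloud", "crm_cloud"},
}

def authorized(role: str, source: str) -> bool:
    """One policy check covers every cloud behind the layer."""
    return source in ROLE_GRANTS.get(role, set())

def run_query(role: str, source: str, sql: str):
    if not authorized(role, source):
        raise PermissionError(f"role {role!r} may not query {source!r}")
    # In a real deployment this would delegate to the source engine;
    # the point is that the gate sits in front of every source.
    print(f"delegating to {source}: {sql}")

run_query("sales_manager", "crm_cloud", "SELECT * FROM customers")  # allowed
# run_query("sales_manager", "erp_cloud", "SELECT * FROM orders")   # raises
```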
Modern organizations should be able to freely choose where they store their data and run their applications, be it on-premises, in the cloud, in a hybrid mode across both on-premises and cloud, or even across multiple cloud environments.
Especially in the current pandemic-hit business landscape, data virtualization gives organizations the agility and flexibility to take the data management approach that best suits their business requirements, rather than one dictated by security or integration concerns.
Data virtualization in action
To get a sense of how data virtualization works in the real world, consider the case of a leading global resources company. Over time, this company has spread its enterprise data across multiple continents, in disparate on-premises systems and multiple cloud sources.
The company’s data teams have been running daily, weekly, and monthly reports for a wide variety of divisions, such as mining, oil and gas, human resources, finance, and health and safety. But when analysts need to access data from multiple sources, the teams must first copy the data into a new physical location, which costs the company crucial reporting time and resources. The company also runs advanced analytics projects in AWS, but for many use cases it needs to merge this data with data held in several on-premises sources.
Now, let us say that the company implements an enterprise-wide data virtualization layer across its four major facilities around the globe, encompassing all of its data sources. This layer would seamlessly integrate the data between the multiple cloud systems and the on-premises systems.
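As a rough sketch of what that scenario could look like through such a layer, the report below joins analytics output in AWS with an on-premises finance system at query time, with no intermediate copy step; the connection details, table names, and column names are all assumed for illustration.

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Hypothetical sources: analytics output in AWS, finance on-premises.
aws_analytics = create_engine(
    "postgresql://user:pw@analytics.aws.example/results")
onprem_finance = create_engine(
    "postgresql://user:pw@finance.internal.example/ledger")

def division_report(division: str) -> pd.DataFrame:
    forecasts = pd.read_sql(
        text("SELECT site_id, forecast FROM model_output "
             "WHERE division = :d"),
        aws_analytics, params={"d": division})
    costs = pd.read_sql(
        text("SELECT site_id, cost FROM cost_ledger WHERE division = :d"),
        onprem_finance, params={"d": division})
    # Merged on demand: analysts no longer wait for data to be copied
    # into a new physical location before a report can be produced.
    return forecasts.merge(costs, on="site_id")

mining_report = division_report("mining")
```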
Data virtualization would dramatically accelerate access to dashboards and enable genuine self-service access for the firm’s business users. With the data virtualization layer established as the universal access point for all enterprise data, a much wider variety of enterprise data becomes available for future business reporting and advanced analytics.
To get the best of these multiple worlds of data, many companies are beginning to leverage data virtualization, a technology that has been maturing for many years, to seamlessly integrate data across multiple clouds.