Data mesh is a conceptual approach to data architecture and management, much as Agile is an approach to project management and software development.
It is designed to address the complexity and scale of the data landscape in modern organizations. Data mesh decentralizes data ownership and responsibility, spreading it across domain-oriented, cross-functional teams rather than having a centralized data team.
However, there are a number of reasons why starting with technology products and solutions could lead to failures when implementing data mesh:
- Misalignment between product and concept: If the products or solutions being used do not align well with the principles of data mesh, they can create obstacles instead of facilitating the implementation. Many traditional data tools are designed with centralized models in mind, which may not suit the decentralized nature of data mesh.
- Vendor bias: Vendors may have their own interpretations of what data mesh means, shaped by the capabilities of their own products. They may promote a version of data mesh that fits their product offering, but that does not necessarily align with your organization's understanding or requirements. This could lead to a misaligned implementation that does not deliver the expected benefits.
- Overemphasis on technology: Technology is a tool that facilitates implementation, but it is not the entirety of the solution. Focusing too much on technology can lead to neglecting other important aspects of data mesh, such as organizational change, people and culture, and processes. Implementing a successful data mesh requires changes to how teams operate, how data is governed, and how data responsibility is managed. These changes are often harder than the technical work and need sustained attention.
- Lack of readiness: Jumping straight into technology might mean that the organization is not yet mature enough to adopt data mesh. Readiness includes understanding the principles of data mesh, having the necessary data governance framework in place, and getting buy-in across the organization.
- Inflexibility: Vendor-specific solutions often come with a specific way of doing things, which may not always be adaptable to a company's unique requirements. Data mesh should be flexible and adaptable to the specific needs of the organization, and starting with a rigid technology solution may hinder that.
Instead, organizations should start by understanding the principles and philosophy of data mesh, getting the necessary buy-in, and adapting their processes and culture.
Only then should they consider which technology solutions can best support their implementation of data mesh.
Data Mesh and Central Data Store
The data mesh concept deviates from the traditional centralized data lake or data warehouse model. It was designed to solve the problems arising from the massive scale and complexity of managing data in large organizations. Data mesh fundamentally rethinks the data architecture, promoting a decentralized, domain-oriented design where data is treated as a product, owned and handled by cross-functional teams.
In a data mesh, data ownership is distributed to individual business domains, and those domains are responsible for the quality, security, and usability of the data they produce. It's about providing teams with the autonomy to use and manage their own data, rather than relying on a central team to provide data as a service. In this context, data is considered a product, and a product owner within each domain is accountable for the data product's quality, usability, and fitness for use.
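As a rough illustration of "data as a product" with an accountable owner, the idea can be sketched as a small descriptor. This is a minimal sketch, not a standard: the `DataProduct` class, its fields, the check names, and the example address are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical descriptor for a domain-owned data product."""
    name: str
    domain: str   # business domain that owns the data
    owner: str    # product owner accountable for quality and usability
    # Maps a quality check name to whether it currently passes (assumed shape).
    quality_checks: dict = field(default_factory=dict)

    def is_fit_for_use(self) -> bool:
        # Fit only if at least one check is recorded and every check passes.
        return bool(self.quality_checks) and all(self.quality_checks.values())


# The sales domain publishes and vouches for its own product.
orders = DataProduct(
    name="orders_daily",
    domain="sales",
    owner="sales-data-owner@example.com",
    quality_checks={"no_null_order_ids": True, "freshness_under_24h": True},
)
```

The point of the sketch is that ownership and quality status travel with the product and are maintained by the domain itself, rather than being tracked by a central data team.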
Now, vendors may advertise their products as "data mesh compatible" or promote the idea of a "central data lake" within the context of data mesh. It is crucial to understand that these solutions could be helpful in some aspects of implementing data mesh, but they do not constitute the data mesh concept in its entirety.
A centralized data lake could still exist in a data mesh environment, but its role shifts toward that of a federated data catalog: it tracks metadata about where different data products live, their schemas, their quality metrics, and so on. It does not necessarily store all of the data itself, which may instead live in a variety of locations controlled by the individual domains.
So, when vendors promote their solutions, they may focus on the parts of the data mesh where their tools can add value, such as data governance, data cataloging, data quality management, and data security. Their tools could help in implementing a data mesh, but they are not "the" data mesh. The data mesh is a conceptual approach to data architecture and management, not a specific technology or tool.
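To make the catalog role concrete, here is a minimal sketch of a central component that holds only metadata while the data stays with the domains. The `CatalogEntry` and `FederatedCatalog` names, their fields, and the example locations are assumptions made for illustration, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Metadata about one data product; the data itself is stored elsewhere."""
    product: str
    domain: str
    location: str         # where the owning domain keeps the data
    schema: tuple         # column names, kept simple for the sketch
    quality_score: float  # hypothetical 0.0-1.0 metric reported by the domain


class FederatedCatalog:
    """Central index of metadata only; it never stores the datasets."""

    def __init__(self):
        self._entries = {}

    def register(self, entry: CatalogEntry) -> None:
        # Domains register (or update) their own products.
        self._entries[(entry.domain, entry.product)] = entry

    def locate(self, domain: str, product: str) -> str:
        # Consumers ask the catalog where a product lives, then read it there.
        return self._entries[(domain, product)].location

    def products_in(self, domain: str) -> list:
        return sorted(e.product for e in self._entries.values() if e.domain == domain)


catalog = FederatedCatalog()
catalog.register(CatalogEntry("orders_daily", "sales",
                              "s3://sales-domain/orders_daily/",
                              ("order_id", "amount"), 0.98))
catalog.register(CatalogEntry("shipments", "logistics",
                              "postgres://logistics-db/shipments",
                              ("shipment_id", "order_id"), 0.91))
```

A consumer would call `catalog.locate("sales", "orders_daily")` to find the product and then read it from the sales domain's own storage, so the central piece stays a directory rather than a store.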
When considering vendor solutions, it's important to make sure that they support the principles of data mesh, like decentralized data ownership and governance, data as a product, and a domain-oriented approach. And remember, the successful implementation of a data mesh requires not just the right tools, but also significant changes in organizational structures, culture, and processes.