A data mesh is a set of principles for modern data architecture. It splits an enterprise's data out of a central cluster architecture into separate domains. These domains may hold duplicate data, but each domain transforms it into a shape suited to its own needs. This methodology employs a "serving and pull" model rather than a centralized cluster's "push and ingest" model.
The data mesh is a network that allows data to be exchanged across the enterprise, which gives its users visibility into what is happening with that data. It also lets us scale data architectures in two main ways.
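The idea that domains share source data but each serves it in its own shape can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Domain` class, the `serve` method, the two transform functions, and the sample records are all hypothetical names and data invented for this example.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical raw records that two domains both rely on (illustrative data only).
RAW_ORDERS = [
    {"order_id": 1, "customer": "ana", "amount": 40.0, "region": "EU"},
    {"order_id": 2, "customer": "bo", "amount": 60.0, "region": "US"},
    {"order_id": 3, "customer": "ana", "amount": 25.0, "region": "EU"},
]


@dataclass
class Domain:
    """A domain that owns a transform and serves its result on request."""
    name: str
    transform: Callable[[list], dict]

    def serve(self) -> dict:
        # Consumers pull from the domain; nothing is pushed into a central store.
        return self.transform(RAW_ORDERS)


def revenue_by_region(rows: list) -> dict:
    """Shape the data for a finance-style view: total amount per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals


def orders_per_customer(rows: list) -> dict:
    """Shape the same data for a marketing-style view: order count per customer."""
    counts = {}
    for row in rows:
        counts[row["customer"]] = counts.get(row["customer"], 0) + 1
    return counts


finance = Domain("finance", revenue_by_region)
marketing = Domain("marketing", orders_per_customer)

print(finance.serve())    # {'EU': 65.0, 'US': 60.0}
print(marketing.serve())  # {'ana': 2, 'bo': 1}
```

Both domains start from the same records, yet each exposes only the shape its consumers need, and a consumer obtains that shape by calling `serve()` rather than waiting for a central pipeline to ingest it.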
Technical capacity: It promotes data democratization, which makes it easier for people on other teams to access and use the data as needed. Moreover, it uses a system of domains rather than a monolithic data system, which can make queries faster and avoids a single point of failure. Finally, its data observability mechanisms give other teams a better view of the data being passed around, which makes the system more transparent and lets those teams help optimize it.
Organizational capacity: With the data mesh method, each team manages the data products within its own domain, which gives it more autonomy over them. This increases freedom and creativity while lowering reliance on a central team. When needed, data mesh still allows cross-functional collaboration between teams and domains. Building on that, enterprises can scale their data capabilities by drawing on experts from multiple domains, which supports continuous growth and learning.
The data mesh methodology rests on four main principles.
Domain ownership: This mandates that data is decentralized to the domains and away from a central data team. It encourages teams to take responsibility for their data through a domain-driven distributed architecture in which each domain owns both its analytical and its operational data.
Data as a product: Each team treats the data it owns as a product. This means the team is responsible for its quality, cohesiveness, discoverability, trustworthiness, and security. This approach assumes that there are consumers for the data beyond the domain, so developers have to manage their data properly.
Self-serve data platform: This provides data infrastructure as a platform. The platform offers domain-agnostic functionality, which improves usability by making it seamless to create and consume data products.
Federated governance: This standardizes interoperability across all data products and ensures that organizational and industry rules and regulations are adhered to.
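To make the four principles more concrete, here is a small Python sketch of how they might fit together. Everything in it is an assumption made for illustration: the `DataProduct` and `Catalog` classes, the `governance_violations` function, and the two rule sets are hypothetical and do not correspond to any particular data mesh tool.

```python
from dataclasses import dataclass, field

# Hypothetical organization-wide rules, standing in for federated governance.
REQUIRED_METADATA = {"owner", "description", "schema"}
FORBIDDEN_FIELDS = {"ssn"}  # illustrative field that no product may expose


@dataclass
class DataProduct:
    """Data as a product: owned by one domain and described by metadata."""
    name: str
    domain: str
    metadata: dict
    rows: list = field(default_factory=list)


def governance_violations(product: DataProduct) -> list:
    """Return readable rule violations; an empty list means compliant."""
    problems = []
    missing = REQUIRED_METADATA - set(product.metadata)
    if missing:
        problems.append(f"missing metadata: {sorted(missing)}")
    exposed = FORBIDDEN_FIELDS & set(product.metadata.get("schema", {}))
    if exposed:
        problems.append(f"forbidden fields: {sorted(exposed)}")
    return problems


class Catalog:
    """Domain-agnostic registry, standing in for the self-serve platform."""

    def __init__(self):
        self._products = {}

    def publish(self, product: DataProduct) -> None:
        # The same checks apply to every domain before a product is listed.
        problems = governance_violations(product)
        if problems:
            raise ValueError("; ".join(problems))
        self._products[(product.domain, product.name)] = product

    def discover(self, domain=None) -> list:
        # List product names, optionally filtered to a single domain.
        return sorted(
            f"{d}.{n}" for (d, n) in self._products if domain in (None, d)
        )


catalog = Catalog()
catalog.publish(DataProduct(
    name="orders",
    domain="sales",
    metadata={
        "owner": "sales-team",
        "description": "Confirmed orders",
        "schema": {"order_id": "int", "amount": "float"},
    },
))
print(catalog.discover())  # ['sales.orders']
```

In this sketch the sales team owns and describes its product (domain ownership, data as a product), any team can publish or discover products through the same `Catalog` (self-serve platform), and one shared set of rules is enforced for every domain at publish time (federated governance).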
The data mesh architecture introduces a new way to store and engage with data, much as microservices did for code. Microservices were created to break down monolithic code into smaller, self-contained modules. In the same way, data mesh breaks down the central data storage system into smaller, self-contained, domain-driven systems, with the benefits explained above. Data mesh is opening a new way for us to view our data; it could become a new gold standard and make our processes more efficient and adaptable.
Note: To learn more, visit the official website.
What problem does data mesh aim to address?
Slow data processing
Centralized data governance
Data visualization challenges
Lack of data storage options