Topology of a Data Product Team. The success of your data mesh journey… | by Eric Broda | Oct, 2022

By Jessie Hobb On Oct 5, 2022

The success of your data mesh journey will be dictated by your data product team structure rather than your technology choices. Here is what you need to know to setup your data product teams.

I have found that the organizational changes are the hardest part of an enterprise data mesh journey. And unless you plan for them and respect the inherent challenges in changing people’s behaviour, your journey will be long and gruelling.

In this article I discuss the teams involved in a data mesh and define the structure and roles for a data product team.

Zhamak Dehghani, data mesh’s creator, highlights in her outstanding book, Data Mesh: Delivering Data Driven Value at Scale, that data mesh is a “sociotechnical” approach to sharing and access data. This means that while the technical aspects get much of the attention (in fact, most of my articles address the practical technical aspects of data mesh) there are also significant organizational (the “socio” part of “sociotechnical”) considerations.

Think about the changes that data mesh suggests: Data ownership transitions from centralized to decentralized, architecture migrates from monolithic to distributed, and accountabilities change as organizations move from a top-down to a federated governance model.

Simply put, the organizational aspects of data mesh are significant. And these changes point to a simple proposition: your enterprise data mesh journey will change the way data teams are organized and the way they work.

Matthew Skelton’s and Manual Pais’ outstanding book, Team Topologies, describes the ecosystem of teams required to deliver software. Dehghani also uses the Team Topologies approach in identifying teams.

So, first, let’s introduce Skelton and Pais’ teams:

Stream Aligned Team is the primary delivery team which has end-to-end accountability for delivery of a software product or service. Each stream-aligned team interacts with other types of teams (platform, enabling, and complicated subsystem teams, below).
Platform Teams, that provide the tools, utilities, and technical services that makes it easier for Stream Aligned Teams to do their job, typically delivering “X-as-a-Service” capabilities.
Enabling Teams, that act as “consultants” that help Stream aligned Teams to overcome obstacles, typically interacting collaboratively in short bursts.
Complicated Subsystem Teams, that off-load from Stream Aligned Teams delivery of work that requires unique skills or has significant complexity. I find that it many cases it makes sense to provide a “wrapper” around these subsystems to make it easier for data product teams to consume.

Now that we have laid the basic team structure foundation, let’s apply these concepts to data products and the broader data mesh.

A Data Product Team is a Stream Aligned Team. It is responsible for the end-to-end delivery of services (ingestion, consumption, discovery, observability, etc.) required by the data product. Like a Stream Aligned Team, the data product team has a clear scope and boundaries (management for an identified database, set of tables, of files etc.), an accountable owner, and a skilled team to deliver necessary capabilities.

A data product team interacts with several teams:

Producer teams, that manage the source of data that is ingested by the data product Team. This team can be structured as a traditional source system team or, in an enterprise data mesh as a Stream Aligned data product team.
Consumer teams, that access and use data offered by the data product team. Like producer teams, this team can be structured as a traditional data analytics team or, in an enterprise data mesh as a Stream Aligned data product team.
Platform Teams provide “X-as-a-Service” capabilities to make easier for the data product team to ingest, consume, and share data. In my experience, it is quite common in large enterprises to see data product teams supported by several types of Platform Teams, which in many large organizations may include cloud, API, security, and network teams.
Enabling teams, which help the data product team overcome short-term obstacles or address specific needs; In many large organizations this enabling teams may include steering groups, enterprise governance and architecture teams, and training groups.
Complicated Subsystem teams which address problem domains that require deep knowledge that could not be built nor supported by an individual data product team; for example, large enterprises may include teams addressing master data management and mainframe technologies.

But what does a data product team look like? Of course, there is no “standard” data product team. Each data product may require skills specific to the technology stack they are using, or by the type of interoperable interfaces that consumers demand, or by the pipeline techniques used to ingest data. Still, the size of data products teams typically varies between 8–15 (Skelton and Pais recommend a maximum of 15 in any Stream Aligned team such as a data product team).

The data product team is led by a “data product owner”. Dehghani states that this role is “accountable for the success of a domain’s data products in delivering value, satisfying and growing the data users, and maintaining the life cycle of the data products”. They are responsible for establishing the direction and roadmap for the data product, acquiring the funding for the data product, and identifying and liaising with the various stakeholders and other teams that interact with the data product team. Simply put, the data product owner is the face of, and spokesperson for, the data product team.

There are usually several roles that report to the data product owner:

Metadata Management, that define, govern, and manages metadata for the data product. This team/role would have expertise with metadata repositories, and would maintain the integrity of data product semantics, glossary, and knowledge graph.
Data Management and Security, which manages, secures, and governs data in the data product; This team/role has expert knowledge about the business domain, metadata, and security needs of the data managed by the data product.
Consumption Services, that builds, supports, and evolves the services needed to consume data managed by the data product. This group usually has skills in developing the interoperable interfaces offered by the data product.
Ingestion Services, that builds, supports, and evolves the services needed to ingest data managed by the data product. This team/role has the pipeline, SQL, and data engineering skills used to get data into the data product.
Release Management, that manages, communicates, and socializes the overall roadmap and evolution of the data product. This team/role has multiple skills: DevSecOps skills to integrate with the enterprise release process, as well as the communications skills required to broadcast changes in the data product.

Structuring your team, and organization, for success is crucial to accelerating your data mesh journey. Hopefully this article provides you with the organizational building blocks to build your data product teams.

***

This article assumes that you have a very high-level understanding of data mesh. Additional information is available here (data products), here (data mesh patterns), here (data mesh architecture), here (data mesh principles) and here (lessons learned). For interested readers, a full set of my data mesh articles are available here.

***

All images in this document except where otherwise noted have been created by Eric Broda (the author of this article). All icons used in the images are stock PowerPoint icons and/or are free from copyrights.

The opinions expressed in this article are mine alone and do not necessarily reflect the views of my clients.