Data-Driven Management: The Why, Who, What and How?
While many organizations aspire to become data-driven, a significant portion of them tends to focus narrowly on the technical aspects of data, treating it primarily as a technical asset. Consequently, their investments and initiatives often center around technology-driven efforts. However, it’s imperative to recognize that technology serves as a means to an end.
I believe that it’s crucial to address four fundamental questions before plunging into the detailed technical implementation plan:
- Why should we harness data within our organization? In other words, what benefits can data bring to our organization?
- Who are the key stakeholders involved in working with data? This group spans a wide spectrum, including business users, executive management, customers, as well as technical roles like data engineers and data scientists.
- What systematic approach can be taken to transform raw data into tangible value? This process involves a step-by-step blueprint for deriving value from data.
- How can we effectively achieve our data-driven aspirations? This involves the integration of technology and a robust data management practice, aligning the data perspective with the overarching business and organizational objectives.
In this article, I will delve deeper into these pivotal questions and explore how a well-defined data strategy can play a vital role in ensuring that the answers to these questions align seamlessly with the strategic goals and projects of your company.
Why?
I always advocate for uncovering the underlying motivations behind your data-driven management ambitions. What drives the value of working with data? Addressing the “Why” question aids in gauging the potential return on your forthcoming data projects. Clarifying your data ambitions will provide valuable guidance for numerous subsequent decisions.
A commonly employed model for illustrating value drivers is Bain & Company’s B2B and B2C value pyramids. From this framework, several value drivers that I frequently leverage in data projects include:
- Transparency — Utilizing dashboards to offer a transparent and objective view of company metrics. For instance, a well-crafted marketing report can render critical campaign metrics transparent to a wide audience of marketers and campaign stakeholders.
- Cost Reduction — Reporting tools empower in-depth analysis of process performance, facilitating the identification of areas where cost reduction is achievable.
- Integration — Data projects enable the integration of data from diverse sources, generating new perspectives and insights. This integration can help detect anomalies and provide end-to-end process insights.
- Availability — Prompt-driven AI tools (or chatbot interfaces) are virtually always accessible, allowing you to request information around the clock.
- Time Savings — AI-driven applications often automate repetitive tasks that occur at high volumes. For example, they can automatically assign service desk tickets to the appropriate employees or summarize social media sentiment in a concise dashboard.
- Sustainability — Data-driven tools can elucidate ESG-related metrics based on both internal and external data sources. This assists organizations in advancing their sustainability objectives.
I like to answer the “Why” question using concrete use case scenarios. Describe the scenario in which you want to use data and the value you expect it to deliver. This will later help identify “Who” will be using the data, and in which process.
Who?
Answering the “Who” question helps assess the cultural context of your current and envisioned organization. Data alone cannot create value; you require individuals to lead your organization toward data insights and take actions based on data-driven results. Understanding your prospective data users is essential for making informed choices regarding tools, infrastructure, and building a data-driven culture.
Here are some example cultural characteristics that will influence the answers to the later “What” and “How” questions:
- Software Crafting DNA — Does your company have a tradition of building and maintaining its own software? Do you plan to continue investing in development? In this case, collaborating with an internal data engineering team could be a prudent choice.
- IT Outsourcing Culture — Is your company accustomed to outsourcing IT solutions? If so, I recommend considering a ‘buy’ approach instead of developing in-house solutions.
- Room for Initiative — Does your organization promote bottom-up employee initiative? If the answer is yes, fostering a self-service data culture, where employees can generate their own insights, would be beneficial.
- Specialized Domain Knowledge — Does your organization possess unique domain knowledge that is difficult to find elsewhere? In such a scenario, it would be worthwhile to identify those domains and to invest in specialized, tailor-made data products.
- Specialized Technical Knowledge — Is specialized technical knowledge around data-related topics such as cloud, databases or data pipelines already available in the organization? The organization’s maturity in such technical domains will be crucial in selecting how to roll out the right data platform.
- Analytical Basis — Is spreadsheet analysis already common practice for decision-making in your organization? Then it would be a good idea to take those users one step further and provide them with more mature tooling for better and deeper analysis.
What?
In broad terms, the “What” questions can be addressed by following the fundamental steps outlined in the DIKW Pyramid, which involves the process of refining raw data into wisdom, passing through stages of information and knowledge.
I prefer to break down these generic stages into three slightly distinct components:
- Data Products: For the “What” phase, it’s essential to identify the specific data products required to facilitate the use cases outlined in the “Why” stage. These data products can be thought of as reusable datasets, such as those containing information about orders, invoices, or customers.
- Data Tools: Determining which tools will be employed by the designated data individuals (as identified in the “Who” stage) is a critical aspect of the “What” phase. Assess whether your use case scenarios necessitate specialized statistical tools or if you can seamlessly integrate data-driven algorithms into existing digital applications. The concept of data usability closely ties into the selection of data tools.
- Raw Data: It’s also important to pinpoint the raw data required to construct the identified data products. The identification of raw data and data products often involves an iterative process, where the availability of raw data inspires the creation of new data products and vice versa.
Now that you have a comprehensive understanding of both the inputs (raw data) and outputs (data products), you can proceed to identify the necessary data transformations, technically referred to as ‘pipelines,’ needed to convert raw data into these products. During this phase, it becomes particularly pertinent to capture the functional requirements associated with these pipelines. Example functional requirements are:
- Data Freshness: Determine the desired freshness level for your data products. Do you require real-time information, or is an hourly update sufficient?
- Data Volume and Velocity: Evaluate the data volumes that need to be transformed and integrated, along with the rate at which this data volume is expanding.
- Data Variety: Identify the diverse range of raw data sources that need transformation. Consider whether these sources consist mainly of structured data within relational database systems (RDBMS) or if they also encompass unstructured data types like documents, videos, and images.
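To make these requirements easier to compare across data products, they can be written down in a small structured record. The following is a minimal sketch, not a prescribed format: the class name `PipelineRequirements`, its fields, the one-minute real-time threshold, and the example figures for an "orders" data product are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta
from enum import Enum
from typing import FrozenSet


class SourceKind(Enum):
    STRUCTURED = "structured"      # e.g. tables in an RDBMS
    UNSTRUCTURED = "unstructured"  # e.g. documents, videos, images


@dataclass(frozen=True)
class PipelineRequirements:
    """Functional requirements for one data product (illustrative fields)."""
    data_product: str
    max_staleness: timedelta             # data freshness
    daily_volume_gb: float               # data volume
    yearly_growth_rate: float            # growth of that volume, 0.5 = +50% per year
    source_kinds: FrozenSet[SourceKind]  # data variety

    def needs_real_time(self, threshold: timedelta = timedelta(minutes=1)) -> bool:
        # Assumption: anything fresher than `threshold` counts as real-time.
        return self.max_staleness <= threshold

    def projected_daily_volume_gb(self, years: int) -> float:
        # Compound growth of the daily volume over the given number of years.
        return self.daily_volume_gb * (1 + self.yearly_growth_rate) ** years


# Hypothetical example: an hourly refreshed "orders" data product.
orders = PipelineRequirements(
    data_product="orders",
    max_staleness=timedelta(hours=1),
    daily_volume_gb=2.0,
    yearly_growth_rate=0.5,
    source_kinds=frozenset({SourceKind.STRUCTURED}),
)

print(orders.needs_real_time())
print(orders.projected_daily_volume_gb(years=2))
```

Keeping one such record per data product gives the later technology discussion something concrete to refer to.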
How?
Now that you have clarity on the answers to the “Why,” “What,” and “Who” questions, the next step involves addressing the “How.” In this phase, you’ll determine the optimal technical architecture and develop strategies for effectively managing your organization in light of these newfound data-driven capabilities.
Technology
The objective is to discover technical products that align with your goals and harmonize with the “Why,” “What,” and “Who” aspects of your organization. For instance, it would be impractical to opt for custom development-centric data approaches if you lack a strategy to internalize development capabilities within your organization.
When establishing a data platform, a range of technological choices must be carefully considered. While this list is not exhaustive, it highlights some key trade-offs:
- Buy vs. Build: It is essential to clearly delineate which components of your data platform should be built in-house and which ones are better suited for purchase or rental. Re-inventing solutions for already solved problems often yields little value, whereas creating a unique selling proposition (USP) within your platform can be highly beneficial. Of course, the cost and performance implications of services and tools play a significant role in these decisions.
- Type of Data Transformations: Consider who needs the capability to integrate and transform data. If your target users are data engineers dealing with large data volumes, using technologies like Spark with Python or Scala for pipelines is a sensible choice. For less complex scenarios, low-code solutions or SQL may be more suitable. The required data freshness also factors into this decision, as certain pipeline technologies are well-suited for real-time scenarios, while others are not.
- Computing Engine: The volume and nature of your data, along with the freshness requirement, dictate the choice of computing engine. Using extensive big data computing power for small datasets often proves inefficient in terms of cost and performance. Additionally, transforming unstructured data before loading it into a relational database management system (RDBMS) is typically considered an anti-pattern. In such cases, a computing architecture supporting a lakehouse pattern may be more appropriate.
- Cloud or On-Premises: Similar to the buy vs. build decision, it’s crucial to assess which parts of your data platform are well-suited for the cloud and which are not. Factors to consider include your organization’s cloud vision and maturity. Additionally, analyze the future operational expenditure (opex) associated with cloud services versus the traditional capital expenditure (capex) of on-premises solutions.
- Integrations: Facilitating the seamless flow of data to the right place at the right time within your organization is vital for generating value. Therefore, defining data integration patterns is a critical step. Common options include data access via APIs, database connections, or the creation of data topics within streaming pipelines.
Technical decisions are put into action through the realization of a data platform, which extracts value from data. This data platform comprises a blend of technical products: (cloud) services, off-the-shelf software products, and tailor-made technical solutions. The resulting software stack should meet the criteria necessary to fulfill the objectives outlined in response to the “Why” question.
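The transformation and computing-engine trade-offs described above can be turned into a rough rule of thumb. The sketch below is only one possible reading of those trade-offs: the function name `suggest_pipeline_style`, its parameters, and especially the 100 GB per day cut-off for "large" volumes are assumptions made for illustration, not established thresholds.

```python
def suggest_pipeline_style(
    built_by_engineers: bool,
    daily_volume_gb: float,
    needs_real_time: bool,
    large_volume_gb: float = 100.0,  # hypothetical cut-off, tune per organization
) -> str:
    """Return a rough starting point for the discussion, not a verdict."""
    if needs_real_time:
        # Freshness dominates: not every pipeline technology suits real-time use.
        return "streaming pipeline"
    if built_by_engineers and daily_volume_gb >= large_volume_gb:
        # Data engineers handling large volumes: a distributed engine fits.
        return "distributed engine (e.g. Spark with Python or Scala)"
    # Small or simple scenarios: heavy big data tooling is usually wasteful.
    return "SQL or low-code transformations"


print(suggest_pipeline_style(built_by_engineers=True, daily_volume_gb=500.0, needs_real_time=False))
print(suggest_pipeline_style(built_by_engineers=False, daily_volume_gb=1.0, needs_real_time=False))
```

In practice such a rule would sit next to the buy vs. build and cloud vs. on-premises considerations rather than replace them.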
Data Management
Establishing a robust technological foundation is a critical step in launching your organization’s data initiatives. Effective data management is the linchpin that ensures the sustained success of your comprehensive data practices. Within the framework of an ambitious data-driven organization, several key data management factors come into play:
- Roles and Responsibilities: A pivotal aspect of data management involves creating a clear model outlining roles and responsibilities for data handling. This framework answers the fundamental question: Who is accountable for specific data? Once defined, this structure identifies the go-to individuals for addressing inquiries about particular data sets or resolving data quality issues. Data owners, however, do not operate in isolation. A RACI model, for example, can make explicit who is responsible, accountable, consulted, and informed for each data set.
- Data Quality: The effectiveness of any data-driven product or subsequent technology utilization hinges on the underlying quality of the data. An essential data management practice revolves around constant monitoring of data quality. Data owners (as mentioned in the previous point) are encouraged to initiate measures aimed at enhancing data quality.
- Data Discovery: Whether data management is distributed or centralized, every data management team should possess a comprehensive overview of the diverse data elements available within the organization. Initiatives are put in place to ensure that individuals across the organization know where to locate specific data sets.
- Supporting Data Processes: Working extensively with data often involves labor-intensive processes. These processes include data access management (acquiring consent from data owners to access specific datasets), data sharing (sharing data products with others), and documenting data security levels. A well-structured data management practice devises strategies to streamline and optimize these supporting data processes.
While the primary goal of data management is not technology-driven, tools can significantly expedite the maturation of data management practices. These tools encompass a wide spectrum, from data catalogs for visualizing data models to data quality tools for monitoring data quality status, as well as specific tools designed to facilitate various data processes. Although different vendors may position these tools differently, their underlying concept is largely rooted in metadata management.
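As an illustration of what monitoring data quality can mean in practice, here is a minimal sketch of two common checks: completeness of a column and duplicate keys. It uses plain Python on a small in-memory sample; the function names, the `customers` rows, and the column names are hypothetical, and a real setup would more likely rely on a dedicated data quality tool.

```python
from typing import Any, Dict, List, Set

Row = Dict[str, Any]


def completeness(rows: List[Row], column: str) -> float:
    """Share of rows in which `column` is present and not None."""
    if not rows:
        return 1.0  # assumption: an empty data set is treated as complete
    filled = sum(1 for row in rows if row.get(column) is not None)
    return filled / len(rows)


def duplicate_keys(rows: List[Row], key: str) -> Set[Any]:
    """Key values that occur more than once."""
    seen: Set[Any] = set()
    duplicates: Set[Any] = set()
    for row in rows:
        value = row.get(key)
        if value in seen:
            duplicates.add(value)
        seen.add(value)
    return duplicates


# Hypothetical sample of a "customers" data product.
customers = [
    {"customer_id": 1, "email": "a@example.com"},
    {"customer_id": 2, "email": None},
    {"customer_id": 2, "email": "b@example.com"},
    {"customer_id": 3, "email": "c@example.com"},
]

print(completeness(customers, "email"))
print(duplicate_keys(customers, "customer_id"))
```

The results of such checks are themselves metadata, which a data owner can follow over time to decide where improvement measures are needed.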
Data Strategy
I recommend adopting an agile data strategy, as it’s unrealistic to expect that the answers to the “Why,” “Who,” “What,” and “How” questions will remain static for extended periods; they often change within weeks or months. While embracing change is encouraged, it can be challenging to maintain a comprehensive view across all the dynamic elements. This is why I advise seeking a methodology that makes the connections between “Why,” “Who,” “What,” and “How” tangible.
The strategy I prefer centers on utilizing a straightforward but incredibly powerful “digital planning board” featuring three vertical lanes:
- How: This lane focuses on technical products such as pipelines, computing resources, databases, and data management initiatives that facilitate the subsequent “What” and “Why.”
- What: This lane pertains to data products containing the necessary data to actualize use case scenarios.
- Why: In this lane, you identify the value drivers to which the use cases contribute.
Additionally, I recommend labeling the items on the far right side of the planning board with the associated personas (the “Who”). This practice ensures that the delivered value aligns with your organization’s culture and DNA.
By drawing connections from left to right, you can quickly grasp the relationships between value drivers and enabling initiatives. This strategic approach facilitates gaining support for transformative “How” projects that will ultimately deliver significant value to the “Why” aspect.
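The planning board can be thought of as a small directed graph: items live in one of the three lanes, and connections only run from left to right. The sketch below is a hypothetical model of that idea, not a description of any specific tool; the class names `Item` and `PlanningBoard`, the method names, and the example items are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

LANES = ["How", "What", "Why"]  # left-to-right order of the lanes


@dataclass
class Item:
    name: str
    lane: str                                          # one of LANES
    personas: List[str] = field(default_factory=list)  # the "Who" labels


class PlanningBoard:
    def __init__(self) -> None:
        self.items: Dict[str, Item] = {}
        self.links: Dict[str, Set[str]] = {}

    def add(self, item: Item) -> None:
        if item.lane not in LANES:
            raise ValueError(f"unknown lane: {item.lane}")
        self.items[item.name] = item

    def connect(self, source: str, target: str) -> None:
        # Connections may only point to a lane further to the right.
        src, dst = self.items[source], self.items[target]
        if LANES.index(src.lane) >= LANES.index(dst.lane):
            raise ValueError("connections must run from left to right")
        self.links.setdefault(source, set()).add(target)

    def value_drivers_for(self, name: str) -> Set[str]:
        """Follow connections to the right and collect the 'Why' items reached."""
        found: Set[str] = set()
        visited: Set[str] = set()
        stack = [name]
        while stack:
            current = stack.pop()
            if current in visited:
                continue
            visited.add(current)
            if self.items[current].lane == "Why":
                found.add(current)
            stack.extend(self.links.get(current, ()))
        return found


# Hypothetical board with one item per lane.
board = PlanningBoard()
board.add(Item("ingestion pipeline", "How"))
board.add(Item("orders dataset", "What"))
board.add(Item("cost reduction", "Why", personas=["process analyst"]))
board.connect("ingestion pipeline", "orders dataset")
board.connect("orders dataset", "cost reduction")

print(board.value_drivers_for("ingestion pipeline"))
```

Tracing a “How” item to the value drivers it ultimately supports is the kind of question the board is meant to answer at a glance.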
Conclusions
In this article, I have demonstrated that data-driven management extends beyond the purely technical implementation of a data platform. By addressing four core questions (the “Why,” “Who,” “What,” and “How”), we can discern how technical implementations and data management initiatives align with the broader value drivers of the organization. A well-defined data strategy plays a crucial role in preserving this alignment and in keeping technical implementations in harmony with the organization’s culture and DNA.
Questions? Feedback? Connect with me on LinkedIn or contact me directly at Jan@Sievax.be!
This article is proudly brought to you by Sievax, the consulting firm dedicated to guiding you towards data excellence. Interested in learning more? Visit our website! We offer a Data Strategy Masterclass that provides a deeper understanding of the world of data strategy.