Fabric Adoption Readiness

The happy path to implementation.

Carl Follows
Version 1
12 min read · Oct 9, 2023


Revolutionise your data capabilities and unlock a new era of data-driven success with Microsoft Fabric and Version 1 — contact us today.

You’ve read all the blog posts about Fabric, Microsoft’s latest data platform offering, and are buzzing with the potential. Having experienced previous promises of nirvana, you realise that expectations are likely inflated above reality, so you want to ensure your organisation is truly ready to adopt Fabric. But how do you ensure a smooth uptake and maximise value?

Fabric is not a departmental solution, but a cross-organisation capability. Being Software as a Service (SaaS), it can be quick to provision; the key is ensuring your organisation and employees are ready to adopt.

A lot will depend on your existing cloud readiness; for instance, integration with Entra ID (formerly Azure Active Directory). For this blog, I’ll assume you’re already managing a tenant in Azure, have a landing zone and are familiar with some of the other services available. Otherwise, I recommend starting with the Microsoft Cloud Adoption Framework for Azure.

Photo by Scott Van Hoy on Unsplash

Strategy

Data Culture

The benefits any organisation derives from a data analytics solution depend on how data-savvy the individuals who use it are.

  • Is there an understanding of data ownership and willingness to share?
  • Are users trusted to interpret complex data or is analysis predefined?
  • Do they have the skills and a desire for better tools to manipulate data?
  • Will they embrace the opportunity or expect others to deliver?

Fundamentally, as you give people the data and tools to analyse it, ensure everyone feels empowered to answer questions and make decisions. Enhance your culture through communication, training and fostering collaboration opportunities between data stewards.

Business Alignment

Whilst a data platform is a necessary foundational service for organisations wishing to become data-driven, it typically requires multi-year investments before becoming embedded enough to transform the way you work.

To ensure the implementation stays on track and keeps building momentum, understand and document your motivations and vision. They may seem obvious now, but ensuring there is an agreed consensus will help ongoing sponsorship and support from the stakeholders during the journey.

Define some of the early business outcomes that will cement buy-in. For example: optimisation of a troublesome process, automation of a specific decision or building a data science capability.

Operating Model

How much to centralise or decentralise your analytics capability is the first decision that demands consideration.

Historically, data was the preserve of specialist roles within IT departments due to the complexity of gathering and manipulating it. Microsoft Fabric is the latest step in the democratisation of data analytics and the rise of the citizen engineer. This technological advance promotes a paradigm shift in the responsibility for gathering, transforming and analysing data.

Depending on your organisation’s size and structure there are several ways to define an operating model for analytics:

  • Centralised: With the data and visualisation owned by a single team.
  • Data Hub: Centrally managed open data storage with exploration and governance capabilities to enable self-service BI.
  • Data Fabric: Robust data integration platform with automated metadata discovery, maximising the accessibility of existing data stores into a consistent experience.
  • Data Mesh: Domain-orientated ownership, with self-serve infrastructure and federated governance to facilitate data as a product.

Whilst all of these can be delivered through an implementation of Fabric, it was designed to promote decentralised domain ownership that scales well with larger organisations. If you are planning on maintaining a centralised team of technical specialists to run your data platform then the Platform as a Service (PaaS) offerings like Synapse Analytics Workspace may be more appropriate. Whichever ownership you select, document your operating model and ensure your employees are given the skills to succeed.

Technology Alignment

There are always multiple options to fulfil business needs, and not all of them demand a new technology. Before you decide to purchase a COTS (Commercial Off The Shelf) product, you must evaluate it to ensure its functionality fulfils your needs whilst remaining compliant with the non-functional requirements of IT governance. Consider business continuity, regional hosting and your strategic direction for cloud platforms. Since vendors charge for data egress, colocating data and compute can offer cost savings.

Once you have defined what you hope to achieve and the operating model you will use, consider the criteria by which you will evaluate Fabric. Who will perform the assessment and are they representative of the ultimate user base?

Photo by Marius Masalar on Unsplash

No matter which operating model or SaaS vs. PaaS decision you choose, the Lakehouse is the storage model that supports the most flexible data platform. Centralising your data in a Lakehouse, in an open Delta Lake (Parquet) format supported by a variety of compute engines, brings the agility to adapt rapidly. The separation of storage from compute cost, and the ability to start small and scale up, lowers the threshold of investment for proofs of value; embrace this to increase the cadence of innovation and reduce time to value. If you’re still investing in Synapse or Databricks platforms, you don’t need to abandon everything and start again; the Lakehouse provides a natural future migration route across to Fabric.

Plan

Existing Estate

Any organisation is likely to have a myriad of systems and data platforms already implemented, each of which will need an impact assessment of how a move to Fabric would change current practices. Whilst ultimately having all data integrated will be beneficial, the cost and benefit for each item will vary; build a roadmap of data sources with dependencies. Understand the existing region and platform each source is hosted on along with any complexity due to data sensitivity or network isolation.

Success in implementing a new technology depends on the capability and willingness of the people. Identify which data stewards are early adopters that will benefit the most, ensure they are aligned to the vision and have the skills and opportunity to make it work.

Note that OneLake enables you to quickly create shortcuts out to your existing data stores so you can expose them to the Fabric experiences without having to refactor existing data pipelines. This supports a faster migration to Fabric but could result in excessive data movement charges when used long-term. Develop recommendations for their appropriate usage, and ensure you can monitor the cost.
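Shortcuts are created through the Fabric REST API. The sketch below only builds a request body for an ADLS Gen2 shortcut; the endpoint path, field names and payload shape shown in the comments are assumptions about the public API and should be checked against the current documentation before use.

```python
# Sketch: build the request body for creating an ADLS Gen2 shortcut in OneLake.
# Assumed endpoint (verify in the Fabric REST API docs):
#   POST /v1/workspaces/{workspaceId}/items/{itemId}/shortcuts
# All names and GUIDs below are hypothetical examples.

def adls_shortcut_payload(name: str, path: str, location: str,
                          subpath: str, connection_id: str) -> dict:
    """Request body for a shortcut pointing at an existing ADLS Gen2 store."""
    return {
        "name": name,            # shortcut name as it appears in the lakehouse
        "path": path,            # folder within the item (e.g. "Tables")
        "target": {
            "adlsGen2": {
                "location": location,           # storage account dfs endpoint
                "subpath": subpath,             # container/folder to expose
                "connectionId": connection_id,  # pre-created cloud connection
            }
        },
    }

payload = adls_shortcut_payload(
    name="SalesHistory",
    path="Tables",
    location="https://contosolake.dfs.core.windows.net",
    subpath="/sales/history",
    connection_id="00000000-0000-0000-0000-000000000000",
)
print(payload["target"]["adlsGen2"]["subpath"])  # /sales/history
```

Building the payload separately from sending it makes the shortcut definitions easy to review and version-control alongside your usage recommendations.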

Even with a widespread desire to adopt Fabric, there could be contractual SLAs or external compliance regimes that must be adhered to. These could affect the speed with which Fabric can be rolled out. Identify and engage existing data stewards early in the process to catalogue and mitigate these pitfalls.

Data Governance

One of Fabric’s greatest strengths is bringing data together into OneLake, segregated into Business Domains and Functional Workspaces. This will be a boon for the Chief Data Officer (CDO) and Information Security teams providing them with greater visibility and control over data assets. Datasets can be certified, promoted and marked with sensitivity labels that are inherited by any items that consume the dataset, and the lineage of data can be visually tracked. This allows the CDO to focus on managing data rather than discovery and integration.

Getting the most benefit from the Purview catalogue and lineage tools provided in Fabric requires a defined data governance model with data stewards identified in each domain. Existing policies and procedures will likely need to be assessed and revised to maximise the benefit of OneLake. The workspace is the unit of access governance in Fabric; understand how you will identify the types of data available to each workspace and how data stewards will ensure content and user access rules are adhered to.

OneLake unified management and governance

Financial Governance

Fabric simplifies the cost control of data analytics by introducing the concept of capacity: a common pool of compute used for all workloads. Capacity administrators are responsible for scaling, allocating this capacity to the workspaces that consume it, and scheduling pauses for when it is not in use.

Capacities are provisioned within the Azure Portal, allowing each to be recharged back to the owning business unit (use tags to identify ownership); each capacity can have a different administrator who controls which workspaces can consume its capacity units. Define budget controls and escalation routes, and understand how costs will be recharged back to functional domains.

The Microsoft Fabric Capacity Metrics app provides a breakdown of capacity units by workspace name, item type and item name, so you can monitor and understand where the capacity is being consumed.
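The same kind of breakdown can drive internal recharging. A minimal sketch of the aggregation, using invented sample records rather than real Fabric telemetry:

```python
# Sketch: aggregate capacity-unit consumption by workspace, mirroring the
# breakdown the Capacity Metrics app provides, then derive each workspace's
# share for recharging. The records below are invented sample data.
from collections import defaultdict

records = [
    {"workspace": "Finance", "item_type": "Lakehouse", "item": "Ledger",   "cu_seconds": 1200},
    {"workspace": "Finance", "item_type": "Notebook",  "item": "MonthEnd", "cu_seconds": 300},
    {"workspace": "Sales",   "item_type": "Pipeline",  "item": "CrmLoad",  "cu_seconds": 500},
]

# Sum consumption per workspace.
by_workspace: dict[str, int] = defaultdict(int)
for r in records:
    by_workspace[r["workspace"]] += r["cu_seconds"]

# Each workspace's share of the total is its recharge proportion.
total = sum(by_workspace.values())
for ws, cu in sorted(by_workspace.items()):
    print(f"{ws}: {cu} CU-seconds ({cu / total:.0%} of capacity to recharge)")
```

Running this prints Finance at 1500 CU-seconds (75%) and Sales at 500 CU-seconds (25%), the proportions you would recharge to each owning business unit.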

Alongside the capacity charge is a charge for OneLake storage, which is priced comparably to ADLS Gen2 storage. This will be small compared with the capacity charge and should be considered a central cost, as both producers and consumers of data benefit.

Power BI

Whilst the Power BI service has been rebranded as Fabric, the type of capacity licence you already have provisioned and the tenant settings in the admin portal will determine whether your users can create any of the new Fabric items within existing Power BI workspaces.

The additional experiences Fabric introduces warrant a review of existing workspaces: how they fit into your target operating model and whether they align with your federated governance.

Skills

Fabric provides tailored experiences for different personas: Data Engineering, Data Science, Power BI and others. Much of the functionality is no-code / low-code, with Copilot (Microsoft’s generative AI) also coming to Fabric soon to help users build faster. Even with this support, users will need data literacy and an understanding of Fabric’s capabilities, and some experience with the ubiquitous data languages of SQL and Python will benefit the more technically minded.

Understand the skills, experience and aspirations of your existing workforce, how they align to the Fabric personas, and identify the gaps against your operating model.

Consider setting up a centre of excellence community that can help build awareness, provide architectural and data modelling patterns, or offer advice on writing Python, SQL or DAX for complex transformations. Successfully implementing Fabric is more about fostering communities of citizen engineers than about a technical change.

Dozens of people raising a barn together
Photo by Randy Fath on Unsplash

Portability

Because Fabric uses the open standard Delta Lake (Parquet) format, the technical cost of extracting yourself is lower if you choose to move away from Fabric. Most of the running cost is in the capacities, which can be paused. The data is stored in an ADLS Gen2 storage account which can be accessed from many other tools, reducing the migration effort.

Ready

Evaluation

Your existing Power BI licences and capacity will affect your access to Fabric. Review how to start a Fabric trial to get 60 days to evaluate the software and your usage of it.

At this point consider using shortcuts to allow you to expose any existing data lakes into OneLake without significant engineering commitment.

Landing Zone Network Security

As with all cloud-hosted applications, you will need to consider how to ensure that all traffic between Fabric and your network is secure. Being a data platform, Fabric requires you to secure both ingress traffic from your existing data producers and egress out to data consumers.

Whilst Power BI supports Private Endpoints, these are not yet available in Fabric. Similarly, the self-hosted integration runtime used by Azure Data Factory to action copy activities within a private network is not yet available in the Fabric implementation of Data Factory, so pipelines can’t access on-premises data. The Power BI on-premises data gateway is implemented for Dataflow Gen2, which means you can use that to get data into the lake and then orchestrate onward with pipelines.

For many corporate organisations, these will be non-negotiable prerequisites before entering live service. However, don’t assume you have to wait for these limitations to be resolved before beginning your adoption process, as there is a good chance they will be implemented before you are ready (check the roadmap), and much can be evaluated with non-sensitive data.

DevOps

With Fabric being a SaaS offering, there is currently no need to deploy Infrastructure as Code (IaC). This may change when private integration with an existing landing zone arrives; it is not yet available in the product.

Each workspace can and should be connected to a Git repo to allow all content to be committed into source control to track changes and enable engineers to revert items to a previous version. As with Power BI, there is the opportunity for the promotion of workspace content from development workspaces into production workloads through deployment pipelines. Define governance standards for the workspaces which need these deployment pipelines, and what data may reside in each workload environment.

Manage

The implementation of your operating model begins with decisions on the domains, workspaces and capacities that require provisioning at initial deployment. Whilst these will likely change rapidly, consider the naming conventions and allocate individuals accountable for the administration of each aspect.
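Naming conventions are easier to enforce when they can be checked mechanically. A minimal sketch, assuming a hypothetical `<domain>-<environment>` convention; substitute your organisation’s own pattern:

```python
# Sketch: validate workspace names against a naming convention. The pattern
# below (lowercase domain, hyphen, environment suffix) is a hypothetical
# example convention, not a Fabric requirement.
import re

NAME_PATTERN = re.compile(r"^[a-z0-9]+-(dev|test|prod)$")

def valid_workspace_name(name: str) -> bool:
    """True when the workspace name matches the agreed convention."""
    return bool(NAME_PATTERN.match(name))

print(valid_workspace_name("finance-dev"))   # True
print(valid_workspace_name("Finance Dev"))   # False
```

A check like this can run in whatever process provisions workspaces, so non-conforming names are caught before they appear in the tenant.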

As data sharing becomes more ubiquitous, analyses will be developed across datasets hosted in workspaces owned by other teams. Ensure there are rules requiring datasets to be certified before they are consumed into production analyses, and that the freshness of these datasets is monitored.
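Such a rule can be expressed as a simple gate. A minimal sketch, where the metadata fields (`certified`, `last_refresh`) are hypothetical; in practice they would come from your catalogue or scanner tooling:

```python
# Sketch: gate production consumption of a shared dataset on certification
# status and freshness. The metadata dict shape is a hypothetical example.
from datetime import datetime, timedelta, timezone

def fit_for_production(meta: dict, max_age: timedelta) -> bool:
    """A dataset qualifies if it is certified and was refreshed recently enough."""
    if not meta.get("certified", False):
        return False
    age = datetime.now(timezone.utc) - meta["last_refresh"]
    return age <= max_age

meta = {
    "certified": True,
    "last_refresh": datetime.now(timezone.utc) - timedelta(hours=6),
}
print(fit_for_production(meta, max_age=timedelta(hours=24)))  # True
```

The same check, run on a schedule, doubles as the freshness monitor: a dataset that was fit yesterday but stale today surfaces before downstream analyses silently go out of date.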

Ensure your monitoring team is expecting and ready to monitor the usage of Fabric, and understands who is accountable for controlling costs and data. Use this monitoring to track adoption rates across the business.

Adopt

Provision

Fabric is enabled in the Microsoft 365 admin centre for either the entire tenant or specific security groups.

Capacity is then purchased and scaled appropriately within the Azure Portal, where an administrator is also assigned to each capacity. You will need to set a default capacity for workspaces that have no specific capacity defined.

Capacities are provisioned in specific regions, and your data in OneLake will reside in the region of the capacity it is created from. This matters both for proximity to your users, which affects performance, and for compliance with any data hosting regulations. If you allow data sharing from your Lakehouse into workspaces that might run on different compute in other regions, realise that you are allowing the data to cross regional boundaries. Ensure you have policies in place to control the regionality of capacities and their data.
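Such a regionality policy can be captured as data and checked at provisioning time. A minimal sketch, where the region names and data classifications are illustrative examples, not a prescribed scheme:

```python
# Sketch: check that a capacity's region is allowed for the classification of
# the data it will host. Regions and classifications are illustrative; a real
# policy would reflect your own compliance obligations.
ALLOWED_REGIONS = {
    "personal-data": {"uksouth", "ukwest"},                              # must stay in-country
    "public":        {"uksouth", "ukwest", "westeurope", "northeurope"}, # broader placement
}

def capacity_region_allowed(region: str, classification: str) -> bool:
    """True when data of this classification may reside in the given region."""
    return region in ALLOWED_REGIONS.get(classification, set())

print(capacity_region_allowed("westeurope", "personal-data"))  # False
print(capacity_region_allowed("westeurope", "public"))         # True
```

Keeping the policy as a lookup table makes it easy for the federated governance group to review and amend as regulations or regional availability change.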

Centre of Excellence

Constitute a community of Fabric users to facilitate knowledge-sharing and consolidate best practices. Include technical experts in the different experiences within Fabric who can provide guidance and business experts who can help engagement and facilitate wider adoption. This will support the late majority who may be more sceptical about the benefits of a data-driven organisation or cannot see how to naturally pivot towards one.

Fabric introduces significant capabilities that people will use in different ways. Whilst a decentralised operating model enables each department to take responsibility for control of its data, there will be a thirst for patterns that ensure everyone can make effective use of Fabric to elevate their data analytics maturity; monitor the patterns adopted to guard against bad practice.

Use your federated governance structures and centre of excellence to ensure collective decisions on approved architectural patterns.

Summary

Fabric brings some powerful data engineering capabilities into a user-friendly paradigm, lowering the technical threshold for who can deliver data analytics. By centralising data into OneLake and simplifying compute power in shared capacity it enables organisations to maintain a strategic picture of how they can derive value from their data.

The key to success is implementing a federated governance that defines departmental accountability for data management through workspace access and capacity cost.

Revolutionise your data capabilities and unlock a new era of data-driven success with Microsoft Fabric and Version 1 — contact us today.

Further Reading

Whilst there are as yet few Fabric implementations to learn from, the lessons in this story are built from over 20 years of experience building data platforms, informed by the following resources:

About the Author:
Carl Follows is a Data Analytics Solution Architect at Version 1.
