Data Intelligence Platform

From DWH to Data Platform to Data Intelligence Platform

Artsiom Yudovin
Analytics Vidhya
6 min read · Feb 19, 2024


DWH


Data is the heartbeat of today’s products, and a Data Warehouse (DWH) is what keeps that data usable. A DWH is a centralized hub where data from various sources comes together, organized and ready for analysis. It is not just an acronym. It’s a cornerstone of a product’s life, enabling you to collect, organize, and extract meaningful insights from the data that fuels innovation. A DWH brings several concrete benefits.

For instance:

  • You have all your data in one place.
  • You can clean and structure your data.
  • You can analyze historical data.
  • You can run complex queries.

As a company grows, it becomes apparent that the DWH itself has to be transformed to support the growing requirements.

Data Platform


In the realm of evolving data management strategies, the next crucial step involves the establishment of a robust data platform. But what exactly is a data platform, and how does it differ from the traditional Data Warehouse (DWH)?

To embark on this transformative journey, we must first consider the emerging requirements that prompt the shift from a conventional DWH to a more advanced data platform.

The data landscape has expanded significantly, encompassing a broader variety of information. Data sources, once relatively uniform, have now become diverse and abundant. Simultaneously, the intricacies of data processing and curation have surged, presenting new challenges in managing the vast influx of data.

One notable shift is the increased demand for data mart preparation, indicating a growing need for specialized datasets tailored to specific business units or analytical purposes. This trend highlights the necessity for a more flexible and adaptive data infrastructure.

Furthermore, the business environment calls for near real-time data processing capabilities. Organizations increasingly recognize the importance of timely insights, necessitating a departure from traditional batch-oriented processing towards more instantaneous data handling.

In this dynamic landscape, it is crucial to democratize data access. The data platform should empower the organization’s broader spectrum of individuals to work with data efficiently. This democratization of data access ensures that insights are not confined to a select few but are accessible to a broader audience, fostering a data-driven culture across the organization.

In summary, the transition from a conventional DWH to a comprehensive data platform is prompted by

  • the expanding variety of data
  • the proliferation of dissimilar data sources
  • the complexity of data processing
  • the rising demand for specialized data marts
  • the imperative for near real-time processing
  • the need for increased accessibility to data

Once you have run into the points from the list above, you can start thinking about moving from a DWH to a data platform.

A data platform is a complete solution for ingesting, processing, analyzing, and presenting the data generated by the systems, processes, and infrastructure of a modern digital organization.

I propose the following main components for building a data platform. From my point of view, these are the minimum you should have.

  • Data Acquisition is responsible for gathering and collecting data from different sources.
  • Data Lake is responsible for storing and retaining data.
  • Data Catalog helps us with search, discovery, and metadata management.
  • Data Pipeline is the brain of our data platform. It lets us build any analytics pipeline or automation that increases the stability of the data platform.
  • Data Presentation is responsible for querying our data through a SQL engine or an integration with a BI tool.

Each component has its own development lifecycle and its own responsibilities.
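To make the five components above more concrete, here is a minimal sketch of how they could fit together. It is only an illustration under strong assumptions: the sources are hard-coded lists, the "lake" is a Python dict, and the SQL engine is an in-memory SQLite database. All names (`acquire`, `run_pipeline`, `revenue_by_customer`, the `orders` table) are hypothetical and not part of any real platform.

```python
import sqlite3

# Data Acquisition: gather raw records from different sources
# (hard-coded here; in practice these would be APIs, files, or databases).
def acquire():
    crm = [{"customer": "acme", "amount": "120.50"},
           {"customer": "globex", "amount": "80"}]
    shop = [{"customer": "acme", "amount": "30"}]
    return {"crm_orders": crm, "shop_orders": shop}

# Data Lake: keep the raw data as-is, keyed by source name.
lake = acquire()

# Data Catalog: metadata that makes the stored data discoverable.
catalog = {
    name: {"columns": sorted(rows[0].keys()), "row_count": len(rows)}
    for name, rows in lake.items()
}

# Data Pipeline: clean the raw data (cast amounts to numbers) and
# combine all sources into one curated table.
def run_pipeline(lake, conn):
    conn.execute("CREATE TABLE orders (source TEXT, customer TEXT, amount REAL)")
    for source, rows in lake.items():
        conn.executemany(
            "INSERT INTO orders VALUES (?, ?, ?)",
            [(source, r["customer"], float(r["amount"])) for r in rows],
        )

# Data Presentation: expose the curated data through a SQL engine.
def revenue_by_customer(conn):
    return conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()

conn = sqlite3.connect(":memory:")
run_pipeline(lake, conn)
result = revenue_by_customer(conn)
print(catalog)
print(result)
```

In a real platform each of these steps would be a separate service with its own lifecycle. The sketch only shows the direction in which data flows between them.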

Data Intelligence Platform


Artificial intelligence (AI) is discussed everywhere today, and its influence extends to every aspect of our lives, including data platforms. Gone are the days when one could think about these platforms in isolation: AI now plays a pivotal role in how they operate.

We already have the list of components that make up the data platform. Which of them is affected by AI the most?

The component most affected by AI is data presentation. Recall its responsibility: it provides access to our data and lets us query it through a SQL engine or an integration with a BI tool. Here, AI can be extremely helpful. One of the most common pitfalls of working with data is that business users cannot work with it themselves because they lack experience with SQL or BI tools. AI should remove this obstacle, because you can now work with data in natural language through an AI model. You don’t need to know SQL. You describe your question to a generative SQL assistant, and it produces the SQL query that answers it.

The same applies to BI tools. Most BI tools have started offering AI for dashboard generation: you describe in natural language what kind of dashboard you would like, and the BI tool suggests one.

For instance, we can see how AWS is developing services such as the query editor for Amazon Redshift and Amazon QuickSight. Both have already been integrated with Amazon Q.

Amazon Q is a generative AI–powered assistant. Through its integration with data presentation services such as Redshift and QuickSight, you can start working with data in natural language.

I’m sure that if we look at competitors, they either already have the same direction of development or will have it soon.
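The natural-language-to-SQL flow described above can be sketched roughly as follows. This is not how Amazon Q or any other product works internally. `fake_generate_sql` is a hypothetical stand-in that returns canned SQL where a real system would call a generative model, and the `orders` table is made up for the example. The one design choice worth noting is the guard that refuses to run anything other than a single SELECT statement, since generated SQL should not be trusted blindly.

```python
import sqlite3

def fake_generate_sql(question: str) -> str:
    """Hypothetical stand-in for a generative SQL model (returns canned SQL)."""
    canned = {
        "how many orders did each customer place?":
            "SELECT customer, COUNT(*) FROM orders "
            "GROUP BY customer ORDER BY customer",
    }
    return canned[question.strip().lower()]

def is_read_only(sql: str) -> bool:
    """Accept only a single SELECT statement."""
    stripped = sql.strip().rstrip(";")
    return stripped.upper().startswith("SELECT") and ";" not in stripped

def ask(conn, question: str):
    """Turn a natural-language question into SQL, check it, and run it."""
    sql = fake_generate_sql(question)
    if not is_read_only(sql):
        raise ValueError(f"refusing to run non-SELECT statement: {sql}")
    return conn.execute(sql).fetchall()

# Toy data so the example is runnable end to end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", 120.5), ("acme", 30.0), ("globex", 80.0)],
)

answer = ask(conn, "How many orders did each customer place?")
print(answer)
```

The business user only ever sees the question and the answer. The SQL in between is generated, checked, and executed for them.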

I want to point out a critical problem here: the data catalog. The data catalog will be a crucial component of the data intelligence engine because it provides all the metadata for this engine. The better the data catalog, the better the intelligence engine works. The data catalog has the same effect on the intelligence engine as it has on a human user of the data platform. Recall the responsibility of the data catalog: it helps us with search, discovery, and metadata management. When you want to find something in the data, you go to the data catalog to look it up and get helpful metadata that brings you closer to your answer. The same is true for the intelligence engine: the more information it gets from the data catalog, the more accurate it becomes. So start improving your data catalog now if you don’t want to hit this wall later.
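One way to picture why the catalog matters: the intelligence engine can only describe your data to a model as well as the catalog describes it. The sketch below is an assumption about how such an engine might assemble its context, not a description of any real product. `TableMeta`, `ColumnMeta`, and `build_prompt` are hypothetical names, and the table and column names are invented.

```python
from dataclasses import dataclass, field

@dataclass
class ColumnMeta:
    name: str
    dtype: str
    description: str = ""  # empty when the catalog has no documentation

@dataclass
class TableMeta:
    name: str
    description: str = ""
    columns: list[ColumnMeta] = field(default_factory=list)

def build_prompt(question: str, tables: list[TableMeta]) -> str:
    """Render catalog metadata as context for a generative SQL model."""
    lines = ["You translate questions into SQL. Available tables:"]
    for table in tables:
        header = f"- {table.name}"
        if table.description:
            header += f": {table.description}"
        lines.append(header)
        for col in table.columns:
            entry = f"    {col.name} ({col.dtype})"
            if col.description:
                entry += f": {col.description}"
            lines.append(entry)
    lines.append(f"Question: {question}")
    return "\n".join(lines)

question = "What was the total revenue per customer?"

# A sparse catalog: cryptic names, no descriptions.
sparse = [TableMeta("ord_v2", columns=[ColumnMeta("cust", "TEXT"),
                                       ColumnMeta("amt", "REAL")])]

# A well-maintained catalog: same table, documented.
rich = [TableMeta(
    "ord_v2",
    description="one row per completed customer order",
    columns=[
        ColumnMeta("cust", "TEXT", "customer identifier"),
        ColumnMeta("amt", "REAL", "order total in USD, tax included"),
    ],
)]

sparse_prompt = build_prompt(question, sparse)
rich_prompt = build_prompt(question, rich)
print(rich_prompt)
```

With the sparse catalog, the model has to guess that `amt` means revenue. With the documented one, the mapping from "revenue" to `amt` is stated in the context, which is the kind of accuracy gain the paragraph above refers to.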

Conclusion

In conclusion, the evolution from DWH to Data Platform to Data Intelligence Platform signifies a strategic response to the changing data landscape. The infusion of AI into data presentation transforms how individuals interact with and derive insights from data. Moreover, the Data Catalog emerges as a linchpin in the success of the Data Intelligence Engine, underscoring the need for early attention and improvement to this crucial component. As organizations navigate this transformative journey, a holistic approach to data management and strategic investments in AI and metadata management will be vital to unlocking the full potential of a Data Intelligence Platform. Of course, AI will affect other data platform components, but the most obvious and valuable will be the data presentation.

