The evolution of our data discoverability solution Dtechtive through the CivTech programme of the Scottish Government.

Why Dtime is taking a step back to discover datasets hidden deep in the web

Gautham Krishnadas
DTECHTIVE
5 min read · Dec 11, 2022


Brief history

We as a team have many years of combined experience with data in key industries, academia and non-profits. What drives us is our shared passion to build a better future for all, using data.

In its early stage (2020–2021), which overlapped with the Covid-19 pandemic, Dtime was focussed on intelligent data analytics and software development in the Renewable Energy and Agriculture domains. We used techniques such as machine learning and optimisation to develop data-driven software applications, both in-house and for other startups. In short, we were a Data User working downstream of the data pipelines, where available data are analysed and insights are derived. During this phase, we were often bogged down by the difficulty of finding relevant, good-quality data.

From downstream towards upstream of the data pipelines

Come Autumn 2021, the CivTech programme of the Scottish Government announced a challenge to help make public and third sector datasets more discoverable through innovative approaches. What motivated the challenge sponsors (Digital Directorate, Data & Intelligence Network and NatureScot) to look for innovators was the simple fact that existing search engines could not find the datasets already published on the web. As a result, people remained unaware of what datasets exist and where they are published, which slows down data-driven innovation across all sectors. Better data discoverability was needed in light of critical problems such as climate change and pandemics, which can be addressed through data-informed decisions.

Ms. Shona Nicol, Head of Data Standards at the Scottish Government, explaining the challenge.

Given our past struggles in finding relevant and good data, it was natural for us to think of ways to address this challenge. Following the selection round, we entered the two-week Exploration phase, where the team had the opportunity to understand the challenge up close. The challenge sponsors identified us as the right team of innovators and invited us to the four-month Accelerator phase. After many user interviews and workshops facilitated by the CivTech team and the challenge sponsors, and countless hours of internal brainstorming and development, we created a minimum viable product (MVP) version of our solution, Dtechtive. For an intelligent search tool with its roots in Scotland, the birthplace of the creator of Sherlock Holmes, we couldn’t think of a better name!

Datasets (metadata only) from around 8 prominent open data portals in Scotland were onboarded onto the MVP version. During this period, we were able to gauge people’s interest in our solution and validate how it made lives easier for Data Users. Dtechtive helped Dtime transition from being a downstream Data User to becoming an upstream Data Aggregator.

An early hand sketch of Dtechtive by one of our creative team members, Inès Bussat
The Minimum Viable Product (MVP) version of Dtechtive
Presenting the MVP at the CivTech demo day event in Edinburgh, Feb 2022

MVP to Alpha to Beta

Following the MVP demo day event, Dtime entered the Pre-Commercialisation phase of the CivTech programme with additional funding from the challenge sponsors.

During this phase, more streamlined user interviews and structured workshops were conducted to understand the user needs better. The Alpha version of Dtechtive was released in October 2022 with many Data User centric features.

If you are a Data User, we strongly encourage you to create a free account with Dtechtive, try out these features and give us feedback (you may use this form to report bugs/issues).

Dtechtive Alpha version released in October 2022

While the initial focus for product development was on Data Users, the needs of Data Providers increasingly gained our attention. Data Providers (open and commercial) whose datasets are hidden deep in the web are unaware of who might be interested in their datasets and what they might be searching for. Even if Data Users were to find those datasets, there are not many options for Data Providers to gather insights on data usage (download metrics, search terms used, etc.), use cases and quality. Hence, the ongoing product development is largely Data Provider centric.

As of today, Dtechtive has onboarded datasets from around 15 major open data portals in Scotland and is in the process of aggregating commercial datasets as well. The Dtechtive API (application programming interface) is being integrated into open data provider websites to make data discoverability ubiquitous. We can see that Dtechtive is evolving into a unique marketplace-like ecosystem that links Data Users with Data Providers. This will be more evident in the Beta release planned for January 2023. And hey, we have a new “investigative” logo!

Indicative design for the Beta version planned to be released in January 2023.

Beyond Beta

We can’t disagree with Sherlock Holmes, who once said, “It is a capital mistake to theorise before one has data.” Hence, using scalable and intelligent techniques, we are onboarding as many open and commercial datasets as possible to make Dtechtive the most comprehensive Data Search Engine out there. That said, Dtechtive is more than just a Data Search Engine.

While it makes sense to “search” for information that is usually absorbed as is (e.g. the news and blogs that Google brings to you), “search” alone is not sufficient for data. This is because data usually need to be processed and analysed before value can be derived from them. Arguably, “search + insights” makes the most sense in the case of data. Dtechtive is poised to become the best Dataset Search & Insights Engine in the world. It will also cover the whole spectrum of datasets, from internal (within organisations/enterprises) to open.

Through Dtechtive, Dtime is solving one of the biggest challenges in the data pipelines that exist today: finding relevant datasets easily and quickly to accelerate data-driven innovation across all sectors, industries, disciplines and domains.

If you would like to receive updates on Dtechtive, you may register here. You may also create a free account using this link to unlock all the available Alpha features.
