DKAN v2 is here!

Kim Davidson
Published in DKAN Blog
May 8, 2020

The DKAN team loves our work because we believe in the power of open data to support transparency and accountability in our institutions, to drive policy-making that represents our citizenry and truly works, and to tell the stories of our communities.

When the world first got excited about the potential of open data, we gathered data everywhere we could and created portals to bring resources together for journalists, policymakers, and community organizations to access and solve problems. In the decade that followed, our relationship with data has matured, and we’ve learned so much more about how data becomes insight. We’ve passed laws mandating data sharing, we’ve created data and metadata standards that help us synthesize across sources, and we’ve developed new ways to analyze, visualize, and tell stories with data.

The DKAN team has been hard at work building a next-generation open data portal that starts from scratch while drawing on everything we’ve learned in nearly a decade as part of the open data community. We’ve engineered the latest version of DKAN from the ground up to be intuitive, trustworthy, reliable, and extensible.

Our new system is built on modern, best-fit technology (including an upgrade to Drupal 8) and is easy to adopt, simple to use, and built for interoperability with other tools in the open data ecosystem.

The Features

The new version of DKAN may be a work in progress, but it already comes with everything you need to start sharing data right away.

Upload data and metadata: Use APIs or our metadata entry user interface to add datasets or resources to your platform. Our entry options not only make it easy to add datasets to the platform, they also validate your submissions against the Project Open Data metadata schema, ensuring standards-compliance from the very start. Changing or replacing the schema is already an option for developers who want to get their hands dirty. We plan to support other common standards out-of-the-box in the near future, and eventually, any standard you’d like.
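To make the validation step concrete, here is a minimal sketch of what checking a dataset record against the Project Open Data schema can look like. It is not DKAN’s validator: it covers only the fields that Project Open Data v1.1 requires for every dataset, and the function name `validate_dataset` is a hypothetical one chosen for illustration.

```python
# Illustrative only: a simplified subset of the Project Open Data v1.1 rules.
# The fields below are the ones the schema always requires for a dataset.
REQUIRED_FIELDS = (
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
)
ACCESS_LEVELS = {"public", "restricted public", "non-public"}


def validate_dataset(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if not record.get(name)
    ]
    level = record.get("accessLevel")
    if level and level not in ACCESS_LEVELS:
        problems.append(f"invalid accessLevel: {level!r}")
    keywords = record.get("keyword")
    if keywords and not isinstance(keywords, list):
        problems.append("keyword must be a list of strings")
    return problems
```

A record that comes back with an empty list is ready to publish; anything else tells the submitter exactly what to fix before the dataset goes live.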

Harvesting: Bring data from other online sources to appear seamlessly in your own portal, and keep it in sync with the source material through nightly updates. The new system can currently harvest any source that offers a data.json endpoint, with more kinds of harvesting coming soon. We’ve also engineered our harvest feature to make it easy to migrate your existing data into the platform at the very start.
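For readers curious about what a nightly harvest has to decide, the sketch below compares a source data.json catalog with what a portal already holds. It is an illustration under stated assumptions rather than DKAN’s harvest code: `plan_harvest` is a hypothetical name, change detection is reduced to comparing each dataset’s `modified` value, and whether datasets that disappear from the source should be deleted locally is a policy choice we simply assume here.

```python
import json


def plan_harvest(source_json: str, local: dict[str, str]) -> dict[str, list[str]]:
    """Work out what a harvest run would need to do.

    `source_json` is the text of a data.json catalog.
    `local` maps dataset identifier -> the `modified` value seen at last harvest.
    """
    catalog = json.loads(source_json)
    remote = {
        item["identifier"]: item.get("modified", "")
        for item in catalog.get("dataset", [])
    }
    return {
        # In the source but not yet in the portal.
        "create": sorted(i for i in remote if i not in local),
        # In both places, but the source copy has changed.
        "update": sorted(i for i in remote if i in local and remote[i] != local[i]),
        # Assumption: datasets removed from the source are removed locally too.
        "delete": sorted(i for i in local if i not in remote),
    }


source = json.dumps({"dataset": [
    {"identifier": "a", "modified": "2020-05-01"},
    {"identifier": "b", "modified": "2020-05-08"},
]})
plan = plan_harvest(source, {"a": "2020-05-01", "c": "2020-01-01"})
```

Keying everything on the dataset `identifier` is what lets a harvest run repeatedly without creating duplicates.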

Search: Users can search by keyword and topic (with more options coming soon) using a beautiful, intuitive interface, and they can preview that data to make sure it’s exactly what they need to change the world.

Guiding Principles

To guide our work, we’ve adhered to a few key principles:

Default to open source

We believe strongly in open source for open data for many reasons. Transparency of data requires transparency of the code that houses it. Open source means that we can all build on each other’s work, and everyone benefits, reducing costs for our institutions and continually expanding what’s possible for all of us. And code is stronger and more secure when it’s accessible and community-driven.

That means that in everything we build, we use open source solutions, ensuring our work is as secure, transparent, and accessible as possible.

Maximize interoperability

There are so many great tools for working with data out there, and we want to make sure that our systems work with as many of them as possible. That’s why we’ve focused on ensuring that what we build is interoperable with the rest of the open data world. Our schema-focused architecture ensures the data we share is compatible with any other data using that schema, and our API-first focus makes it easy for other services and tools to connect to DKAN.

Decoupled/microservice-friendly

The new DKAN is a future-facing effort, focused on sustainability, maintainability, and extensibility. By building a base that takes its cues from microservice architecture, rather than a single complex codebase, we’ve separated functionality within the architecture as much as possible. This means we can build each piece with the best technology available, and we’ve made it easier to upgrade or even entirely replace one element at a time without affecting the whole system. We’ve also made it possible to grow individual pieces of functionality into more robust services in the future by spinning them off into their own microservices without disrupting the system. Even our front end and back end are decoupled, giving you the option to use any front end you’d like.

Engage the open data community

DKAN is a community-driven application, based on the needs of data collectors, consumers, and managers all over the world, and we want to make sure it represents your needs. We meet regularly with open data stakeholders at all levels of government, in research and data stewardship, and among those using data for good, to ensure we’re building something useful, intuitive, and trustworthy. You’ll find us at open data events, in DKAN Slack, and in our public repositories, and we’re always taking your questions via email, phone, or video conference.

Transparency and observability of processes

A big part of feeling great about your tools is understanding what they’re up to. Being able to follow along with your site’s processes makes it easier for program staff to know they’re doing the right thing and for developers to troubleshoot any issues. Harvests, imports and user actions all leave behind a clear trail through a combination of Drupal-native logging and DKAN-specific tooling.

Approach all processes as ETL processes

Everything in a data portal is about extracting data from somewhere, transforming it into the format you need, then loading it somewhere else. By thinking of everything DKAN needs to do as falling somewhere in that process, we can make sure each engineering decision matches the right tool to the right problem.
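As a small illustration of the pattern, the sketch below splits a CSV import into the three stages, using only Python’s standard library. The function names, the `permits` table, and the sample rows are all made up for this example; it shows the shape of an ETL process, not how DKAN implements its imports.

```python
import csv
import io
import sqlite3


def extract(csv_text: str) -> list[dict]:
    """Extract: read rows from a CSV source (here, an in-memory string)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows: list[dict]) -> list[tuple[str, int]]:
    """Transform: trim stray whitespace and cast the count column to an integer."""
    return [(row["city"].strip(), int(row["permits"])) for row in rows]


def load(records: list[tuple[str, int]], conn: sqlite3.Connection) -> int:
    """Load: write the cleaned records to a queryable table; return the row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS permits (city TEXT, permits INTEGER)")
    conn.executemany("INSERT INTO permits VALUES (?, ?)", records)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM permits").fetchone()[0]


conn = sqlite3.connect(":memory:")
raw = "city,permits\n Springfield ,12\nShelbyville,7\n"
loaded = load(transform(extract(raw)), conn)
```

Because each stage only depends on the output of the one before it, any single stage can be swapped out (a different source, a different cleanup rule, a different datastore) without touching the other two.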

What’s next?

We’re looking forward to sharing updates with you as we continue to develop the new DKAN. Keep an eye on this blog to learn more about everything from which feature is coming next to how we make the complex technical decisions behind those features. DKAN belongs to all of you, so we can’t wait to hear what you think.
