Daten.Cafe — No Code for Data Literacy

Andreas Kugel
5 min read · May 12, 2023


Logo Daten.Cafe

Coming from a computer engineering background, I have dealt with data for many years, building data acquisition systems for experimental physics. That work was mostly about structure, formats, volumes, encoding and so on, to make data transmission as fast as possible. For the past few years I have taken a different view, one more related to content, use cases, education and civic tech. I am a member of the OK Lab Karlsruhe, which is part of the Code for Germany network initiated by the Open Knowledge Foundation. At the lab in Karlsruhe we try to inform the general public on a variety of topics related to data and digital technology. We give presentations during the Open Data Day event series, as well as talks and workshops on environmental sensors and on how to use public open data. Occasionally we also support the local administration, for example by building software prototypes (e.g. CO²Down) or consulting during software projects. Besides the OK Lab activity I run seminars on data literacy at the KIT and workshops on coding and art at the ZKM (I’m even a bit into artworks).

One of the important questions I try to address is the following:

What is Open Data, how can I use it and what do I gain?

This is not a burning question for many people… But it should be. And data literacy skills are important for finding an individual answer to it. To help people far from coding and data science educate themselves on this topic, I’ve used a couple of tools in workshops and seminars. One of my favorites is Orange. It is graphical programming (no-code) and open source, so two essential requirements are satisfied. However, people still need to install unfamiliar stuff (like Python) on their machines, and it doesn’t run on mobile devices. I also tried other tools like Knime and PowerBI; all of them have one disadvantage or another. I want people to come to a workshop, create their data story, take it home and be able to continue working on it on their own with the same tool. Which finally brings me to the core of the story (apologies for the lengthy introduction): I had to write a purely browser-based no-code data literacy tool myself: Daten.Cafe

Wait!

Don’t click that link. Not yet. Thing is, it’s not finished. At least not all the texts and instructions. A lot of the core functionality is there and can be used and tested, though.

You’re still reading? OK, let me tell you the idea behind it. It is a web application, so it runs in your browser on any reasonably modern device. It’s open source; you can find the GitHub link on the first page. The three pillars of the app are:

  • Introduction to fundamental data and statistical features
  • No-code workspace to develop custom operations on data
  • Data story-telling by creating and sharing self-contained projects

You may ask: aren’t there many such apps out there already? Why reinvent the wheel? The fact is, I didn’t find many browser-based apps for these tasks. And I suspect the reason is CORS. While it’s easy in principle to fetch data from external sources on any platform, there is a fundamental issue when you try to fetch data from JavaScript code inside a browser: most servers don’t send the cross-origin headers the browser requires, so the transfer fails. Thus, browser-based seems not to be a good choice for a serious general-purpose data science application. There are work-arounds, however, which introduce some nuisances but are acceptable in the scope of this project. (Read more on the tutorial page, if you like.)
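To make the work-around concrete, here is a minimal sketch of the proxy idea in browser JavaScript. The proxy endpoint is hypothetical, not Daten.Cafe’s actual one: the point is only that routing the request through a server that adds the required CORS headers lets the browser fetch data the origin server would otherwise block.

```javascript
// Hypothetical proxy endpoint -- NOT Daten.Cafe's real one.
// The proxy forwards the request server-side and adds the
// Access-Control-Allow-Origin header the browser insists on.
const PROXY = "https://example.org/proxy?url=";

// Wrap a target URL so the request goes through the proxy.
function viaProxy(targetUrl) {
  return PROXY + encodeURIComponent(targetUrl);
}

// A direct fetch(url) from browser code fails when the remote
// server sends no CORS headers; the proxied fetch succeeds.
async function fetchCsv(url) {
  const response = await fetch(viaProxy(url));
  if (!response.ok) throw new Error(`HTTP ${response.status}`);
  return response.text();
}
```

Note that the proxy now sees every URL users request, which is one of the “nuisances” mentioned above; that is also why Daten.Cafe restricts its proxy to registered users.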

So how does it work? Let me first answer the question from a developer perspective. The web framework is Ionic/Vue, which allows targeting web, Android and iOS. Native apps are possible but run against the idea of not having to install anything. The flow editor is Cytoscape, and data handling is done by Danfojs (basically a JavaScript camouflage for Pandas). And yes, Danfojs has company: Plotly for data visualizations and Tensorflow for things far beyond basic data literacy. I have some initial Tensorflow demonstrations ready, but they will go into a future version… On top of that sits a lot of custom spaghetti code… I’m just wondering why so much code is required to achieve no-code… As for the CORS issue I mentioned: either have the browser download the file, or use a proxy (there’s one in place for registered users).
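To give a flavour of what “a JavaScript camouflage for Pandas” means: a dataframe is essentially named columns plus operations over them. The sketch below is plain JavaScript for illustration only, not the actual Danfojs API (Danfojs provides these as real methods on its DataFrame class, e.g. selecting columns and computing means).

```javascript
// Plain-JS illustration of dataframe-style operations.
// Danfojs wraps the same ideas in a proper DataFrame class.

// Extract one named column from row-oriented records.
function column(rows, name) {
  return rows.map(row => row[name]);
}

// Arithmetic mean of a numeric column.
function mean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

const rows = [
  { city: "Karlsruhe", temp: 21 },
  { city: "Mannheim",  temp: 25 },
  { city: "Stuttgart", temp: 23 },
];

const avgTemp = mean(column(rows, "temp"));        // 23
const warm = rows.filter(r => r.temp > avgTemp);   // Mannheim only
```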

Now, this is what you can actually do today. The menu has seven sections; let’s stick to 3 (stories), 4 (tutorials) and 5 (workspace) for the moment. The tutorials section gives more details on what I describe below, so maybe it’s best to read that one first. The story page will hold interesting examples, including user-contributed stories, which can be copied to the workspace and used immediately. There are only a few demo stories for the moment (more real-world examples to come). I’d appreciate any story contributions from your side; you can already create one and send it by e-mail. Click Start Story to copy the first story to the workspace.

Sample story executing. Flow to the left, charts to the right

You’ll find instructions on how to use the workspace on the tutorial page. There is no detailed information on the individual building blocks yet.

The workspace is where you assemble your flow (or data pipeline) and see the visualizations. On larger screens the two panes sit next to each other; on small screens a yellow button lets you switch from one view to the other. Click the question mark to see tooltips for the toolbar. All new blocks (nodes, in the following) are available through the magic-wand button. There are three groups of nodes: sources, processing and visualizations. By default you see only a subset, the simple ones; tick show all options in the settings for the full set. You can click and move each node on the flow canvas, and use the mouse wheel to zoom in and out. A long (!) left-click on a node or edge opens a node-specific context menu. For example, on the dice you can set the frequency and the number of generated rows and columns, and you can start and stop the generator. For the line and bar charts you can select the X-axis. The add-columns node is a bit more complex: if you feed it data from two sides with identical column names, you can choose to apply simple math operations on the columns. You can also use just one of the inputs, or ignore the columns.
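Conceptually, the add-columns node behaves roughly like the sketch below, which combines two tables with identical column names by applying an element-wise operation. This is plain JavaScript written for illustration, assuming same-length inputs with matching columns; it is not Daten.Cafe’s actual source code.

```javascript
// Conceptual sketch of the add-columns node: combine two
// row-oriented tables column by column with an operation.
// Assumes both tables have the same length and column names.
function combineColumns(left, right, op) {
  return left.map((row, i) =>
    Object.fromEntries(
      Object.keys(row).map(key => [key, op(row[key], right[i][key])])
    )
  );
}

const a = [{ x: 1, y: 10 }, { x: 2, y: 20 }];
const b = [{ x: 3, y:  5 }, { x: 4, y: 15 }];

// Element-wise sum of the two inputs:
const total = combineColumns(a, b, (l, r) => l + r);
// total: [{ x: 4, y: 15 }, { x: 6, y: 35 }]
```

“Use just one of the inputs” then simply corresponds to `op` returning `l` or `r` unchanged.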

This should allow you to get going and try a few things. Once you enable show all options in the settings, you will find more input nodes, more processing nodes and more visualization nodes. A few of them are a bit special, like the map or the realtime input; I will discuss those in the next part of the story.

In the meantime I’d be happy to collect any feedback on the story or the tool. Remember, it’s a free educational tool. Pull requests welcome.

--


Andreas Kugel

Into computers since 1976. Computer engineering, hardware, software, embedded-systems, FPGAs. Scientist, lecturer. Artist. Civic-tech. Open Data. Open Source.