Data Exploration using Vizician.com

A no-code approach to exploratory data analysis

Jasper Soetendal
6 min read · Sep 7, 2022

What’s the first thing you do when you receive new data? You probably want to get an understanding, a sense of the most relevant information that’s hidden in the data table. What does each column mean? How are they related? What do rows represent and how do they compare?

Well, that’s data exploration, and staring at a screen chock-full of number-filled cells isn’t the most efficient way of doing this. Let’s use a tool for that! In this blog, I’d like to show how you can use Vizician.com to drag & drop your data and get instant visual data exploration.

Importing your data

You can either copy & paste your data (for example from Excel, Google Sheets, a website, a document, or any other table) or drag & drop an Excel, JSON, or Numbers file.

Copy & Paste your data from Excel

In this blog, I’ll use an example dataset, ‘Countries of the World’, composed of data from UNdata (https://data.un.org). You can view this dataset and explore it yourself at this URL: https://vizician.com/viz/QsTwgF/explore/search

Overview of the data

Within seconds this data is processed and you’ll get an overview of the dataset:

In the navigation on the left, you can browse through all columns, rows, and categories. Vizician will autodetect the type of data in each column (numeric, categories, dates, etc.) and which column(s) are identifiers. (If no unique combination of columns is found, an identifier will be generated.)
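If you're curious what this kind of autodetection could look like in code, here is a minimal Python sketch. It is not Vizician's actual implementation, which isn't described here: the function names `detect_type` and `find_identifier`, the ISO-only date format, and the toy rows are all assumptions made for illustration.

```python
from datetime import datetime
from itertools import combinations


def detect_type(values):
    """Guess a column type from its string values: numeric, date, or category."""
    def is_number(v):
        try:
            float(v)
            return True
        except ValueError:
            return False

    def is_date(v):
        try:
            # Assumption: only ISO dates (YYYY-MM-DD) are recognised in this sketch.
            datetime.strptime(v, "%Y-%m-%d")
            return True
        except ValueError:
            return False

    non_empty = [v for v in values if v != ""]
    if non_empty and all(is_number(v) for v in non_empty):
        return "numeric"
    if non_empty and all(is_date(v) for v in non_empty):
        return "date"
    return "category"


def find_identifier(rows, columns):
    """Return the smallest combination of columns that is unique per row, or None."""
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            keys = [tuple(row[c] for c in combo) for row in rows]
            if len(set(keys)) == len(rows):
                return combo
    return None  # the caller would generate an identifier in this case


# Made-up toy rows, not taken from the example dataset.
rows = [
    {"name": "A", "region": "North", "measured": "2020-01-01", "value": "10.5"},
    {"name": "B", "region": "North", "measured": "2020-01-01", "value": "7"},
    {"name": "C", "region": "South", "measured": "2021-01-01", "value": "7"},
]
columns = list(rows[0])
types = {c: detect_type([r[c] for r in rows]) for c in columns}
identifier = find_identifier(rows, columns)
```

In this toy table, `name` is unique per row, so it is picked as the identifier. When `find_identifier` returns `None`, a generated row number could serve as the fallback.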

On the right, a small histogram is shown for each column to get a quick view of how the data is distributed. Let’s click on a column to explore the column in detail:

Exploring columns

Chart type ‘Highest Scores’, showing the rows with the highest values, in this case the countries with the highest population. It can also be aggregated by region to see which regions are the most populated.

For each column different charts can be generated with a single click, and for each type of chart the most relevant options can be changed:

  • All rows to see the values for all rows, in data order or ordered by value.
  • Filtered rows to filter on the value of another, categorical, column
  • Aggregated to aggregate on another, categorical, column
  • Map to show the data on a map if geographical data is included (upload a geoJSON to add geographic information)
  • Highest and Lowest scores to see the Top 10 highest and lowest scores, either for individual rows or aggregated on another, categorical, column
  • Compare to compare the values of this column with the values of any other column
  • Scatter plot to see the correlation with any other column in a scatter plot
  • Box plot to see the variation within a selected category
  • And finally, an advanced mode to further tweak any of the charts above

Chart type ‘Aggregated’, showing population aggregated per region and split up by subregion
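For comparison with the coded approach, the ‘Aggregated’ and ‘Highest and Lowest scores’ options from the list above roughly correspond to a group-by followed by a sort. The Python sketch below assumes summing as the aggregation and uses made-up rows; the helper names `aggregate_sum` and `top_n` are hypothetical.

```python
from collections import defaultdict


def aggregate_sum(rows, value_col, category_col):
    """Sum value_col per distinct value of category_col."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[category_col]] += row[value_col]
    return dict(totals)


def top_n(scores, n=10, lowest=False):
    """Return the n highest (or lowest) entries as (name, value) pairs."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=not lowest)[:n]


# Made-up toy rows, not real population figures.
rows = [
    {"name": "A", "region": "North", "population": 5.0},
    {"name": "B", "region": "North", "population": 3.0},
    {"name": "C", "region": "South", "population": 4.0},
    {"name": "D", "region": "East", "population": 1.0},
]
by_region = aggregate_sum(rows, "population", "region")
highest = top_n(by_region, n=2)
lowest = top_n(by_region, n=2, lowest=True)
```

The same `top_n` helper works on individual rows as well: pass it a mapping from row name to value instead of the aggregated totals.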

Below the charts some additional information is shown, like a distribution chart, the highest and lowest scores, and the top correlation coefficients with other columns:

More information on the column

Exploring rows

Likewise, the rows can be explored, although there’s only one type of chart showing the values of selected columns for this row. One or more other rows can be added to the chart for comparison.

Below the chart the values for each column are shown, with a visual indication of how much this value is above or below average:

At the bottom of the page, you’ll see a list of this row’s highest and lowest values: first in absolute terms, then in relative terms, meaning the columns on which this row scores high or low compared to the other rows. This example shows that Australia is a relatively big country (surface area) and scores low on population aged 0 to 14 years. It also shows similar rows: based on this dataset (which doesn’t include anything about climate :D), Canada and Finland are most similar to Australia.

Highest & lowest values and similar rows
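One common way to compute such relative scores and similar rows is to standardise every column and then compare rows by distance. The Python sketch below does exactly that, under loud assumptions: how Vizician measures "relative" and "similar" isn't stated in this post, so the z-scores and the Euclidean distance are my stand-ins, and the rows are made up.

```python
from math import sqrt
from statistics import mean, pstdev


def z_scores(rows, columns):
    """Standardise each column: (value - column mean) / column standard deviation."""
    stats = {}
    for c in columns:
        values = [r[c] for r in rows.values()]
        stats[c] = (mean(values), pstdev(values))
    return {
        name: {
            # A constant column (standard deviation 0) gets a z-score of 0.
            c: (r[c] - stats[c][0]) / stats[c][1] if stats[c][1] else 0.0
            for c in columns
        }
        for name, r in rows.items()
    }


def most_similar(z, target, n=2):
    """Names of the n rows closest to target (Euclidean distance on z-scores)."""
    def distance(a, b):
        return sqrt(sum((a[c] - b[c]) ** 2 for c in a))

    others = [(name, distance(z[target], row)) for name, row in z.items() if name != target]
    return [name for name, _ in sorted(others, key=lambda item: item[1])[:n]]


# Made-up toy rows: a surface area and a share of young population per row.
rows = {
    "A": {"area": 100.0, "young": 10.0},
    "B": {"area": 90.0, "young": 12.0},
    "C": {"area": 10.0, "young": 40.0},
    "D": {"area": 20.0, "young": 38.0},
}
z = z_scores(rows, ["area", "young"])
closest_to_a = most_similar(z, "A", n=1)
```

Here row A ends up with a positive z-score for `area` and a negative one for `young`, which is the same kind of "relatively big, relatively few young people" reading as in the Australia example.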

Correlation

Now that you’ve explored all columns and rows, you might be interested in the correlations between the columns. In the ‘Analysis’ section of the Data Explorer, there is an option ‘Correlation Matrix’, which shows the correlation coefficients for all pairs of columns:

Clicking on one of the cells will show you a scatter plot of those two columns:

Scatter plot to explore the correlation between two columns
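For reference, a correlation matrix like this can be computed by hand in a few lines. The sketch below uses the Pearson coefficient, which is an assumption on my side: this post doesn't state which coefficient Vizician uses. The column names and values are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    # Note: a constant column makes a spread zero and this division fail.
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Coefficient for every pair of columns, as a nested dict."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up toy columns with obvious relationships.
columns = {
    "x": [1.0, 2.0, 3.0, 4.0],
    "double_x": [2.0, 4.0, 6.0, 8.0],
    "minus_x": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(columns)
```

A coefficient close to 1 (as for `x` and `double_x`) means the two columns rise together, while a coefficient close to -1 (as for `x` and `minus_x`) means one falls as the other rises. The scatter plot is the visual check of that number.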

Grid & Timeline

In a dataset with multiple dimensions (like time/years) or categories, you might want to split the charts. In ‘advanced mode’, all generated charts can be split up using a ‘grid’ or a ‘timeline’.

A grid shows multiple charts at once. In the example below ‘region’ is used to split the charts:

Example of a grid: scatter plots per region. Now countries are colored by ‘subregion’

In a timeline, the charts are not shown together but are separated on an interactive ‘timeline’ that can be used to slide through the split charts. While this is great to slide through time (see for example this chart on life expectancy over time), it can also be used for other dimensions or categories, like the example below:

The same charts as in the grid, but now on an interactive ‘timeline’

Slides

While browsing through all these charts, every now and then you might want to save an insightful chart. Every chart has a ‘+ add to slides’ button that does exactly that: it saves this specific chart as a slide.

The saved slides are displayed on the ‘Slides’ tab:

Slides for the example dataset “Countries of the World”, providing an interactive way to browse through the data from multiple angles.

Publish

Up until now, all data has stayed local in your browser; nothing has been uploaded or published to the web. So if you want to keep the data and the slides to yourself, you’re safe!

If you’d like to publish the data and the slides, you can do so via the ‘Settings & Publish’ tab. Here you can publish the data, choosing from open to anyone, hidden (only viewable and editable via a shareable link), or private (for your eyes only).

Interactive slides

Once you publish your data, any visitor can explore it with all generated charts. You can use the slides to provide a more guided experience, including some text and explanation. For each slide, you can set which axes and/or dimensions the visitor may change, for interaction.

See, for example, the slides of the example dataset: nine interactive slides, each providing another angle to look at the data.

Conclusion

Pros:

  • Very easy, instant data exploration
  • An amazing number of insightful charts to explore
  • No coding, libraries, or pivot tables required
  • Publish datasets as an interactive data playground, guided by interactive slides or unguided in the Data Explorer.
  • Free

Cons:

  • Doesn’t work for large files (5 MB+)
  • Ads (Premium Ads-free membership expected in 2023)

About Vizician

Vizician is an online tool for instant visual data exploration, for data enthusiasts and data publishers. Visit https://vizician.com/about for more information.

Disclosure: I’m the founder of Vizician

Jasper Soetendal

Founder @ Braxwell.com, Copaan.nl & Vizician.com — Both strategic advisor and full stack developer, covering a broad spectrum from Business to IT.