From Start to Viz with Conveyor

How to Install, Import, and Visualize your Data with Conveyor

Caleb Keller
Smart Platform Group
15 min read · Apr 2, 2018

Smart Platform Group recently released a product called Conveyor, which can be thought of as an easy way to get data into the Elastic Stack. With Conveyor handling data imports, the Elastic Stack becomes a straightforward solution for business intelligence.

In this article we are going to walk through all the steps needed to install Conveyor and integrate it with the Elastic Stack to create a complete business intelligence dashboard (Image 1) using Conveyor and Kibana.

Image 1 — What we’re going to create

Prerequisites and Installation

Before we get started you’re going to need to make sure you have two tools installed:

  • Docker — To test type docker -v in a terminal or command prompt. Your response should be something similar to Docker version 18.03.0-ce, build 0520e24

Docker RAM Consideration: If you are running this on Linux you shouldn’t have a problem, but if you are running on Mac or Windows you will need to increase the memory allocated to Docker to at least 4 GB.

  • Docker-Compose — To test type docker-compose -v in a terminal or command prompt. You should get a response similar to this one: docker-compose version 1.20.1, build 5d8c71b

☞ Visit our Releases Page to Download the Latest .zip

You’re looking for a file that looks like Conveyor-vX.X.X.zip. Once you’ve found it click on it to download.

☞ Unzip and Open that Folder in a Terminal or Command Prompt

This step is a little bit different for everyone. On a Mac the file is often unzipped for you. But on Windows or Linux you may need to extract it.

☞ Execute the Below Command

You should see Docker start downloading and installing images. If you don’t, make sure the directory you are in is the folder you just downloaded and unzipped. If you’re on Linux you may need sudo. If you have trouble, join us on our Gitter for help.

Kibana can take quite a while to start up. While you’re waiting for Kibana you will probably see the below message repeated over and over. This is just part of Conveyor waiting for Kibana to be ready.

☞ In your Browser go to http://localhost:5601

This is the default address for Kibana. In the release that I am running for this tutorial here is what it looks like.

Image 2 — Kibana home screen showing Conveyor

There’s already a lot of information on how to use the various pieces of Elastic and Kibana, so we’re going to go straight into Conveyor. Find it on the left-hand navigation bar.

☞ Click Conveyor on the Side Navigation Panel

See the arrow in the image above pointing to it. (Image 2)

When you click on it you’ll be greeted with the different data sources Conveyor can pull from. This list is expanding constantly so keep checking for the latest releases.

Image 3 — Primary Conveyor screen showing available data sources.

We’re going to upload a small dataset that contains the information for hundreds of thousands of Kickstarter campaigns. The data is available on Kaggle.

☞ Download the Data Set Here

Specifically, we’re using the ks-projects-201801.csv file that you can download from the link above. You’ll need a Kaggle account to get to it, but you can sign in using a previously created social media account if that is quicker for you. Note: there are two files available, one for 2016 and one for 2018. We’re using the 2018 one.
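Before uploading, it can help to peek at the file locally. The sketch below is not part of Conveyor; it uses only Python’s standard library to list the column names and count the values in one column. The helper name summarize_csv is made up for illustration, and the encoding default is an assumption that may need adjusting for your download.

```python
import csv
from collections import Counter


def summarize_csv(path, column, encoding="utf-8"):
    """Return (header, row_count, value_counts) for one column of a CSV file."""
    counts = Counter()
    rows = 0
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.DictReader(handle)
        header = reader.fieldnames or []
        for row in reader:
            rows += 1
            # Missing values are grouped under the empty string.
            counts[row.get(column) or ""] += 1
    return header, rows, counts
```

For example, summarize_csv("ks-projects-201801.csv", "state") would return the header row, the number of campaigns, and how many campaigns fall into each state, assuming the file sits in your current directory.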

Now that we have our data and Conveyor running, we need to create a channel inside of Conveyor. Since this is a CSV file we’re going to use the Text File Data source.

Importing the Data with Conveyor

☞ Click the +Create Button on the Text File Data Source

On the next screen you’ll have to provide some information about the file. For now just ignore everything except name, description, index, and file to upload. See the options I’ve chosen in the image below (Image 4).

Image 4 — Creating a Text File channel with Conveyor

☞ Scroll to the Bottom of the page and click Finish

Once you have all of the parameters matching the above screen shot, scroll to the bottom of the page and create the channel.

You’ll be redirected to a channel list, which should look like the below (Image 5).

Image 5 — Channel list page in Conveyor

The three dots to the right of each row (Image 5) give you some quick actions: upload more data, delete the channel, or jump to the Discover page. If you’re not familiar with Kibana, the Discover page is a great place to get an idea of what your data looks like. From there you can search your data and save those searches to return to later.

☞ Hover over the three dots beside your channel and click Discover

This will take you to the Discover page shown below. See Image 5 for the three dots we’re talking about.

Image 6 — Discover page in Kibana showing our Kickstarter data.

Let’s break this page down a bit: The numbers below coincide with the red numbers shown in Image 6.

  1. In the top left hand corner you’ll see the word “Kickstarter”. This is the channel of data we just created. This drop down shows all of the index patterns in Kibana. Conveyor created this for you when you created the channel.
  2. Below the index pattern is a list of available fields and the fields you’ve selected to make a table with. I don’t have any fields selected.
  3. The main body of the page shows the raw data. Clicking on one of the drop down arrows expands that row of data. You can use the magnifying plus and minus to include or exclude those values from your search.

Feel free to play around with this page more, but once you’re ready we’re going to start creating some visualizations.

What’s the big deal? We think there is an amazing amount of untapped power in the Elastic Stack, both because it’s hard to get at (something we are working to solve with Conveyor) and because everyone just knows Elastic as a search engine.

Now that we’ve easily imported our data with Conveyor, let’s create a dashboard with Kibana.

Creating the Visualization

☞ Click on Visualize in the side navigation panel

This will take you to a page that lists all of the visualizations you’ve created. Since you haven’t created any, click on Create Visualization and you’ll be taken to a page showing all of the visualizations available in Kibana (Image 6, below).

We’re going to end up creating 3 different Metric visualizations, 2 Horizontal bar charts, and a line chart. Let’s go ahead and create our first metric.

☞ Click on the Metric visualization type

See Image 6 below if you can’t spot it. You may have to choose the Kickstarter index pattern after clicking on the Metric visualization button. This tells Kibana which channel of data we’re creating the visualization for.

Image 6 — Creating a visualization in Kibana

Not much is needed to create our first metric visualization. Our goal is to show the total number of Kickstarter projects in the dataset. Kibana defaults to a count so we don’t have to change that.

☞ Change the Custom Label of the visualization

We don’t have to change the aggregation, but you should give it a better name. Click the arrow next to Metric to expand the options for that metric. Specify a Custom label of Total Projects (Image 7, step 1).

☞ Click the play button to make your changes show up

The play button is located in the upper right hand corner of the Metrics panel. (Image 7, step 2).

☞ Click save in the upper right hand corner

See Image 7, step 3 for help finding the save button. Once you click it, a menu will drop down allowing you to name your visualization. I named mine Total Kickstarters. Once you’re done with that click save, but stay on this same page (Image 7, step 4).

Image 7 — Getting ready to save the first metric visualization.

Now we’re going to create two more metric visualizations showing the total failed projects and the successful projects. Since we’re just doing variants of the first one we can re-use it, change its filter, and save it as a new visualization.

☞ Click the “Add Filter” button directly below the search bar

A menu will pop up allowing you to specify the filter. We want to filter state.keyword to where it is successful. Once you’ve done this click the save button on the filter pop up (Image 8).

Image 8 — Adding a filter to create a new metric visualization.

The number in the metric should’ve changed when you clicked save. Before you save the visualization you’ll want to change the label again.

Now you are ready to save this as a new visualization.

☞ Click save in the upper right hand corner

This time I named mine Successful Kickstarters. Make sure you select the Save as a new visualization option before clicking the save button (Image 9).

Image 9 — Creating the second metric, Save as a new visualization!

Now for some self practice. We need to create the last metric visualization. This time it will be for Failed projects. See if you can create it on your own. If you need some help the basic steps are below:

  • Change the filter to be where state.keyword is failed.
  • Change the label
  • Save the Visualization, but make sure to Save as a new visualization.
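For readers curious about what these three metrics amount to in Elasticsearch terms, here is a rough sketch of equivalent request bodies for the _count API. This is not necessarily what Kibana sends behind the scenes; the helper name count_body is made up, and the labels for the successful and failed metrics are placeholders since you pick your own.

```python
import json


def count_body(state=None):
    """Build a request body for Elasticsearch's _count API.

    With no state, every document is counted (the Total Projects metric).
    With a state, a term filter on state.keyword narrows the count.
    """
    if state is None:
        return {"query": {"match_all": {}}}
    return {"query": {"bool": {"filter": [{"term": {"state.keyword": state}}]}}}


# One body per metric visualization. Only "Total Projects" is a label
# from the walkthrough; the other two labels are hypothetical.
METRICS = {
    "Total Projects": count_body(),
    "Successful Projects": count_body("successful"),
    "Failed Projects": count_body("failed"),
}

for label, body in METRICS.items():
    # Each body could be sent as: POST /<your-index>/_count
    print(label, json.dumps(body))
```

The only difference between the three metrics is the filter, which is why re-using the first visualization and saving it under a new name works so well.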

Now if you go back to the main visualization navigation page you should have a list that shows the three metric visualizations you created. If you don’t, that probably means you forgot to click the Save as a new visualization option on one of them. Make sure you sort that out before continuing.

Here’s what my visualization list looks like at this stage (Image 10).

Image 10 — Visualization list after creating 3 Metric Visualizations.

Now we’re going to create two bar charts. The first shows us where the majority of the dollars are being pledged and another one to show us if certain categories seem to be more successful.

☞ From the visualization list page click the small + button

This will take you to the screen listing all of the visualization types. This time we’re going to create a Horizontal bar chart. Click on that chart type and then choose the Kickstarter index.

The chart you’re greeted with isn’t very useful, but the interface is very similar to the other chart type, and things will take shape quickly. We need to change at least two things: the Y-axis aggregation and the X-axis buckets.

☞ In the Metrics section expand the Y-Axis option

Expand the Y-Axis metric and change the Aggregation to sum. When you do that, a new input will appear asking you to select the field. Select usd pledged. While you’re here give this a sensible label. I chose Total Pledged. (Image 11, Step 1)

Once you do that you can click the Play button just above the Metrics fields. However, not much will have changed.

☞ In the Buckets section click the X-Axis option

In the aggregation pull down choose terms and the field we are using for our terms is going to be main_category.keyword. We also want to expand the size (number of terms shown) to be 15. Give this axis a label, I chose Main Category. Then hit the play button again. (Image 11, Step 2)

It looks like Games is the category getting the most pledged, but Design and Technology aren’t far behind.
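The chart settings above map onto a terms aggregation with a sum sub-aggregation. The sketch below builds a search body along those lines; it is an illustration of the idea rather than the exact request Kibana generates, and the function and aggregation names (pledged_by_terms, by_term, total_pledged) are made up.

```python
def pledged_by_terms(field="main_category.keyword", size=15):
    """Search body: sum of 'usd pledged' per term, largest totals first."""
    return {
        "size": 0,  # return aggregation results only, no raw documents
        "aggs": {
            "by_term": {
                "terms": {
                    "field": field,
                    "size": size,  # number of terms (bars) to return
                    "order": {"total_pledged": "desc"},
                },
                "aggs": {
                    # The Y-axis metric: total dollars pledged per term.
                    "total_pledged": {"sum": {"field": "usd pledged"}}
                },
            }
        },
    }
```

Calling pledged_by_terms("category.keyword") gives the same body for the sub-category field, which mirrors how little changes between the two bar charts.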

☞ Click save in the upper right hand corner

Name your visualization: I named mine Pledges by Category. Then click save. (Image 11, Step 3)

Image 11 — Creating the first horizontal bar chart for Main Category.

The chart we just created shows the total pledged by main category, but let’s create the same chart for the category.keyword field, which is a sub-category. Try this one on your own. The main steps are listed below:

  • Change the field being used for the Bucket on the X-axis.
  • Change the Custom label.
  • Click the save button, make sure to check Save as a new visualization.

The last chart that we’re going to make shows the state of projects over time. We are trying to see if the number of successful projects is going up or down, etc.

☞ From the visualization list page click the small + button

This will take you to the screen listing all of the visualization types. This time we’re going to create a Line chart. Click on that chart type and then choose the Kickstarter index.

We’re going to keep the y-axis as a count, but we need to change the x-axis to show values over time, also called a date histogram. Then we need to split the series by state.

☞ In the Buckets section click the X-Axis option

Choose a date histogram aggregation and the field we are using for our date is going to be launched. We’ll set the interval to be Weekly. Finally, give this axis a label. I chose ‘Date of Launch’. Then hit the play button again. (Image 12)

The first thing you’ll notice is that all of the data is on the right hand side of the chart, but Kibana makes it easy to change the date range. Simply highlight on the chart the area where the data is. Whenever you do this a filter is created.

Image 12 — The line chart after configuration and filtering.

Now let’s split the lines by project state.

☞ In the Buckets section click Add Sub-Buckets

Then click on Split Series. We want a Terms aggregation on the state.keyword field. Then hit the play button. This will create a new line for each state.
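In Elasticsearch terms, this chart is a date histogram with a terms aggregation nested inside each time bucket. The sketch below builds such a search body. The function and aggregation names are made up, and the interval key is an assumption that fits the Elasticsearch 6.x releases current when this was written; later versions use calendar_interval instead.

```python
def state_over_time(interval="week"):
    """Search body: weekly buckets on 'launched', split by project state."""
    return {
        "size": 0,  # aggregation results only
        "aggs": {
            "launch_date": {
                # X-axis: one bucket per interval of the launched date.
                "date_histogram": {"field": "launched", "interval": interval},
                "aggs": {
                    # Split Series: one count per state inside each bucket.
                    "by_state": {"terms": {"field": "state.keyword"}}
                },
            }
        },
    }
```

Each time bucket in the response would carry a document count per state, which is what Kibana draws as one line per state.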

The spike in the failed projects around the 7th of July 2014 is what jumps out at me from this chart. Some of the details get lost with overlapping lines and dots, so we’re going to change the chart type to be a vertical bar chart.

☞ Below the Kickstarter label choose the Metrics & Axes tab

Change Chart type to be bar, and the mode to be stacked. A little bit further down the page, expand the LeftAxis-1 drop down and change the mode to be percentage. Now click play.
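Percentage mode rescales each stacked bar so its segments add up to 100. A minimal sketch of that arithmetic for a single time bucket is below; the function name is made up for illustration.

```python
def to_percentages(counts):
    """Convert {state: count} for one time bucket into {state: percent}."""
    total = sum(counts.values())
    if total == 0:
        # An empty bucket has nothing to divide; report zeros.
        return {state: 0.0 for state in counts}
    return {state: 100.0 * n / total for state, n in counts.items()}
```

With made-up counts of 30 successful, 60 failed, and 10 canceled projects in one week, the bar would be split 30, 60, and 10 percent, regardless of how many projects launched that week. That is why the percentage view makes success rates comparable across busy and quiet weeks.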

Image 13 — The completed line chart (converted to a percentage bar chart)

If all went well you should see a chart similar to the one above (Image 13). There are a lot of things that jump out in this chart:

  • What is the chunk of projects labeled undefined between April 2014 and April 2016?
  • Before June 2014, the success rate seems to be in the upper 30s or lower 40s, but after the spike in failures the week of July 7th 2014, it seems to have reset to the mid 20s. Why?
  • Is the overall success rate increasing again?

Without letting any of those questions distract us, let’s put the entire dashboard together. Before we move on, make sure to save this visualization.

☞ Click save in the upper right hand corner

Name your visualization. I named mine Projects by State Over Time. Once you’re done with that click save.

☞ Click on Dashboard in the side navigation panel

This will take you to a page that lists all of the dashboards you have already created. Click Create a Dashboard to get started.

I like to add all of the charts to the dashboard that I am going to use and then re-arrange them. Let’s do that.

☞ Click add in the menu bar on top of the screen

A menu will pop up showing all of the visualizations you’ve created. Assuming you didn’t create any extras go ahead and click on each one.

Kibana will split the page accordingly and try to arrange the charts intelligently. But since we have different chart types it makes sense for us to customize the arrangement further. You can drag the charts by clicking on their title bars. You can also resize them by clicking and dragging the small arrow in the bottom right-hand corner of each panel. Here’s what my dashboard looks like before and after some rearrangement.

Image 14 — Before and after arranging all of the charts when creating our dashboard.

☞ Click save in the upper right hand corner

I named mine Kickstarter Dashboard. Once you’re done with that click save.

One of the nice features that dashboards offer is the ability to apply filters by clicking on the charts. For example, let’s click on the Games category bar in the bar chart and see how it filters the sub-category chart and the line chart. It looks like the Games category has been more stable than the overall population and maybe even shows a slight trend towards increasing success rates.

Image 15 — Our almost completed dashboard, but we should clean up some of the Axis and Labels.

Now that we’re almost finished it is time for some review. I’ve made a few mistakes along the way (hopefully you didn’t) and there are some opportunities for improvement.

  • I forgot to change the labels on my metric visualizations.
  • The legends and the x-axis labels are taking up way too much space on the horizontal bar charts.
  • The time series chart labels could be improved.

You already know how to fix labels, so do that on anything that you missed. Then let’s remove the legends from the two horizontal bar charts.

☞ Open each of the horizontal bar chart visualizations

Do this by going to the Visualize tab in the side navigation panel. Click on the name of one of the charts to open it back up.

☞ Click the collapse arrow by the legend.

To the right side of the chart is the default legend position. Next to the legend is a small grey circle with an arrow in it. Click it to hide the legend. Then save your changes. This will hide the legend by default. Repeat this for the other horizontal bar chart.

For both of the horizontal bar charts, notice how the usd pledged value is shown as a large, unabbreviated number. We can change the default display for this field so that it is abbreviated. Let’s do that.

☞ Click Management in the side navigation panel

This opens up a screen for managing lots of things about Kibana and Elasticsearch. But we want to click on Index Patterns under Kibana. Then choose the Kickstarter index pattern (Image 16).

Image 16 — Kibana Management screen

☞ Click the small pencil button beside the usd pledged field

I had to scroll down before I could see this field.

If the Format is already of type Number then you just need to change the format pattern.

☞ Change the numeral.js format pattern

Make sure the Format drop down has Number selected. Then change the Numeral.js format pattern to be “0a”. See image below for what my screen looks like after all of the changes. There’s a link in the interface if you want to know more about the formatting options. (Image 17)
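To get a feel for what the “0a” pattern does, here is a rough Python approximation: round to a whole number and append k, m, b, or t for thousands, millions, billions, or trillions. This is a sketch for intuition only, not Numeral.js itself, and edge cases such as rounding near a suffix boundary may differ from what Kibana displays.

```python
def abbreviate(value):
    """Approximate the Numeral.js '0a' pattern: whole number plus suffix."""
    sign = "-" if value < 0 else ""
    magnitude = abs(value)
    for threshold, suffix in ((1e12, "t"), (1e9, "b"), (1e6, "m"), (1e3, "k")):
        if magnitude >= threshold:
            # Round half up to a whole number of thousands, millions, etc.
            return f"{sign}{int(magnitude / threshold + 0.5)}{suffix}"
    return f"{sign}{int(magnitude + 0.5)}"
```

Under this approximation, a pledged total of 1230974 would display as 1m and 104000 as 104k, which is much easier to read on a chart axis.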

Image 17 — Changing the number format for the usd pledged field.

Now with all of that done we’ve created a very good looking dashboard for this Kickstarter dataset. And if in the future they release another data dump we can easily use Conveyor to upload it and our dashboard will be updated with that new data.

If we did our job right, you should now see the synergy that Conveyor can bring to Kibana for tackling BI with the Elastic Stack. We are going to continue to build Conveyor out for our own BI needs and we would love to help (or at minimum celebrate) any BI or data analysis work that you would like to do in this system.

If you have any trouble with the tutorial, feel free to reach out at social@spg.ai or join us on our Gitter channel. Please check out the Conveyor repo on GitHub and the docs for more information. If you find a bug, feel free to open an issue.
