Data visualization with Plotly

Valentina Alto
Aug 4 · 4 min read

Data visualization plays a central role whenever you want to extract information and value from your data. Python offers many libraries for that purpose; here I’m going to talk about Plotly.

Plotly is a technical computing company that develops online data analytics and visualization tools. Its open-source Python library can be installed with pip install plotly and then imported into your Python notebook.

To show you some of its capabilities, I’m going to use a dataset containing information about volcanos around the world.

Let’s have a look at our dataset:

import pandas as pd

df = pd.read_csv(
    "https://raw.githubusercontent.com/plotly/datasets/master/volcano_db.csv",
    encoding="iso-8859-1",
)
df.head()

A first thing we might want to look at is the number of volcanos in each country. To do so, let’s start by plotting a histogram that counts the occurrences of each country in our dataset:

import plotly.express as px

# df is the DataFrame loaded above; each bar counts the rows (volcanos) of one country
fig = px.histogram(df, x="Country")
fig.show()

For a cleaner visualization, we can sort countries in descending order like so:

fig = px.histogram(df, x="Country").update_xaxes(categoryorder="total descending")
fig.show()
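
If you also want the numbers behind the bars, the same counts can be computed directly in pandas. Here is a minimal sketch, assuming a tiny made-up table in place of the real dataset (the rows below are illustrative only, not taken from volcano_db.csv):

```python
import pandas as pd

# Made-up rows standing in for the real volcano table (illustrative only).
sample = pd.DataFrame(
    {"Country": ["Japan", "Chile", "Japan", "Italy", "Japan", "Chile"]}
)

# value_counts() counts the rows per country and sorts them from most to
# least frequent, the same order that categoryorder="total descending"
# gives the histogram bars.
country_counts = sample["Country"].value_counts()
print(country_counts)
```

On the real data you would replace sample with df and, for example, call country_counts.head(10) to read off the ten countries with the most volcanos.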

Furthermore, we can add more information to our histogram. Looking at our dataset, we can see that there is one feature, ‘Status’, which might be relevant to our graph. So let’s first visualize a pie chart of all the statuses and their share of the total (as percentages):

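import plotly.graph_objects as go

# value_counts() returns each status (index) with its count, in matching order,
# so labels and values stay aligned slice by slice
status_counts = df['Status'].value_counts()
labels = status_counts.index
values = status_counts.values
fig = go.Figure(data=[go.Pie(labels=labels, values=values)])
fig.show()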
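
One detail worth checking when a pie chart is built from separate labels and values: both must be in the same order, otherwise a slice ends up with the wrong name. The sketch below uses made-up rows (the "Other" status is a placeholder of mine) to show that unique() and value_counts() do not necessarily agree on the order, and that taking both labels and values from value_counts() keeps them aligned:

```python
import pandas as pd

# Made-up rows (illustrative only); "Other" is a placeholder status.
sample = pd.DataFrame(
    {"Status": ["Historical", "Holocene", "Holocene", "Other", "Holocene", "Historical"]}
)

# unique() keeps the order in which each status first appears ...
first_seen = sample["Status"].unique().tolist()

# ... while value_counts() sorts the statuses by how often they occur.
counts = sample["Status"].value_counts()

# Taking labels and values from the same Series keeps each pair together.
labels = counts.index.tolist()
values = counts.tolist()

print(first_seen)  # order of first appearance
print(labels)      # order by frequency
```

These labels and values can then be passed to go.Pie(labels=labels, values=values).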

Now let’s add that information to our histogram:

fig = px.histogram(df, x="Country",color="Status").update_xaxes(categoryorder="total descending")
fig.show()

At first glance, we can see that the majority of volcanos have either Holocene or Historical status.
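
To back that visual impression with numbers, a cross-tabulation gives the table that the stacked histogram draws. This is only a sketch on made-up rows; the column names come from the dataset, everything else is illustrative:

```python
import pandas as pd

# Made-up rows (illustrative only), using the dataset's column names.
sample = pd.DataFrame({
    "Country": ["Japan", "Japan", "Chile", "Japan", "Chile", "Italy"],
    "Status": ["Holocene", "Historical", "Holocene", "Holocene", "Historical", "Historical"],
})

# One row per country, one column per status, cells hold the number of volcanos.
status_by_country = pd.crosstab(sample["Country"], sample["Status"])

# Sort the countries by their total, like categoryorder="total descending".
order = status_by_country.sum(axis=1).sort_values(ascending=False).index
status_by_country = status_by_country.loc[order]
print(status_by_country)
```

Each row of the table corresponds to one stacked bar, and each column to one color in the legend.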

We can also focus a bit more on the volcanos themselves and see which are the highest. For this purpose, we will use and customize a bar plot:

import plotly.express as px
fig = px.bar(df, x='Volcano Name', y='Elev', color='Type', height=400)
fig.show()

Since we have 1454 volcanos, the bar plot is hard to interpret, so I’m providing here just two zoomed extracts of our graph:

Now let’s focus on another part of our dataset: the geolocation of our volcanos. Looking at our data, we can see that there are two features, latitude and longitude, which allow us to plot our volcanos on maps.

Let’s implement it with Plotly:

import plotly.graph_objects as go

fig = go.Figure(data=go.Scattergeo(
    lon=df['Longitude'],
    lat=df['Latitude'],
    mode='markers'
))
fig.update_layout(
    title='Volcanos',
    geo_scope='world',
)
fig.show()

The same result can be obtained with a different layout. The following code shows how to display your volcanos on a 3D globe map:

fig = go.Figure(data=go.Scattergeo(
    lon=df['Longitude'],
    lat=df['Latitude'],
    mode='markers',
    showlegend=False,
    marker=dict(color="crimson", size=4, opacity=0.8),
))
fig.update_geos(
    projection_type="orthographic",
    landcolor="white",
    oceancolor="MidnightBlue",
    showocean=True,
    lakecolor="LightBlue"
)
fig.show()

Nice, now let’s proceed with something a bit more complicated. Suppose we got some new data about one volcano’s shape:

df_v = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/volcano.csv")

We can plot a 3D reproduction of that volcano as follows:

fig = go.Figure(data=[go.Surface(z=df_v.values)])
fig.update_layout(title='Volcano', autosize=False,
                  width=500, height=500,
                  margin=dict(l=65, r=50, b=65, t=90))
fig.show()

As you can see, with just a few lines of code we obtained beautiful and meaningful representations of our data, even before investigating them with complex statistical models.

If you want to explore all the graphs Plotly offers, I recommend visiting the official website (the code is also available in R).

Written by Valentina Alto: Machine Learning and Statistics enthusiast, currently pursuing an MSc in Data Science at Bocconi University.
