Data Visualization with Plotly Part 1
In data science, visual presentation of the data is a first-class citizen. We combine various charts to better understand the data and the relationships it hides. If you work in a data analysis role, using appropriate visualization tools is indispensable.
In this series of articles, we give a detailed presentation of a popular plotting library in the Python ecosystem: Plotly. In this first part, we explain why we chose Plotly over other libraries and describe the basic architecture of the library, which will help you understand how it works.
The pleasure of visualization: PLOTLY
Plotly is an open-source library that provides a wide range of chart types as well as tools to create dynamic dashboards. You can think of Plotly as a suite of tools, since it works together with tools such as Dash and Chart Studio to provide interactive dashboards. To give a sense of what we’re talking about, click on the link below and an animated graph will welcome you:
Plotly's main emphasis is on interactivity and visual quality, so it supports dynamic charts and animations as a first principle. This is the main difference from other visualization libraries like matplotlib or seaborn. Charts produced with Plotly are interactive by default, and the library offers a versatile playground for building dynamic dashboards on ever-changing streams of data.
Its main properties can be summarized as follows:
- Plotly can be used with many languages including Python, R and Julia.
- No JavaScript knowledge is required at all. You code Plotly in your choice of supported languages.
- An important feature of Plotly rests on how it handles graphs as JSON. Each Plotly visual is a JSON object. Why is this important? In this way, the visual can be accessed and used in different programming languages.
- With Plotly you can not only plot ad hoc charts but also build dynamic dashboards. Dash is a useful Plotly extension for building web applications. You can use it to create dashboards that support user interactions.
- Chart Studio allows you to create and update the graphics you want without any coding. It has a very simple and useful interface. It is especially useful in areas such as business intelligence.
- Plotly allows you to view the entire dataset in the same figure which is very important for the user experience. Many different graphic interactions can be created from a single screen using the “Add Custom Controls” features.
- Transforming Matplotlib charts to Plotly charts is supported.
- Plotly is available as a plotting backend for pandas (this requires pandas 0.25 or later and Plotly 4.8 or later), so we can plot directly from a DataFrame without having to import Plotly Express.
How to install Plotly
Throughout this series of articles, we’ll be using Plotly with Python 3, and our environment of choice is Jupyter Notebook. Once you have installed Python and Jupyter, install the Plotly package.
You can install the latest version of Plotly by opening a terminal (or command prompt) and typing the command below:
pip install plotly
If your choice of package manager is conda, then you can install it using:
conda install -c plotly plotly
And if you’d like to install it from a Jupyter Notebook, then you can do it by running the following command in a cell of your notebook:
!pip install plotly
Basic Architecture of the Plotly Library
The Plotly library has the following modules:
- Graph Objects (plotly.graph_objects, also importable as plotly.graph_objs): This module contains the objects, or shape templates, used to build visualizations. It is the low-level interface to figures, traces and layout. Graph objects can be turned into their Python dictionary representation. Similarly, you can turn a JSON representation back into a graph object.
- Plotly Express (plotly.express): Plotly Express is Plotly's high-level API, and it makes drawing charts much easier. We can even draw a whole figure with a single line of code. That said, it is relatively new, so the documentation may not cover every case in depth.
- Subplots (plotly.subplots): This module contains helper functions, such as make_subplots, for laying out multi-plot figures. They return figures with predefined subplots configured in the layout.
- Figure Factories (plotly.figure_factory): This module provides special figure types that are quite difficult to draw with graph objects or Plotly Express alone. With Figure Factories they can be plotted easily. These charts are: Annotated Heatmaps, Dendrograms, Gantt Charts, Quiver Plots, Streamline Plots, Tables, Ternary Contour Plots, Triangulated Surface Plots.
- I/O (plotly.io): This module is the low-level interface for displaying, reading and writing figures in formats such as static images, JSON and HTML.
Usage Types
We can work with the Plotly library online or offline.
Online Usage
Figures drawn online are saved in the Plotly cloud. If you have a free account, they are always shared publicly. This is not the usage type we favor, and we will not use it in the rest of this series, but we describe it briefly here for reference.
For online use, you must first create a Plotly account, generate an API key with it, and set both in the code you run.
import chart_studio
chart_studio.tools.set_credentials_file(username=' ', api_key=' ')
Offline Usage
In offline mode, we get figures much as we do with matplotlib or seaborn, and the resulting figures are stored locally. To use this mode, we call the plot function from the plotly.offline module as follows:
import plotly
import plotly.offline as offline

offline.plot([{'x': [7, 5, 1, 0],
               'y': [11, 2, 5, 5]}])
Once the figure is created, you can save it as a PNG file or open it in Chart Studio (Plotly’s own editing tool) from the Plotly menu in the corner of the figure.
If you want to display the figure in an output cell of your notebook, initialize notebook mode and draw the figure with iplot (plot would write a separate HTML file instead) as follows:
import plotly
import plotly.offline as offline

offline.init_notebook_mode(connected=True)
offline.iplot([{'x': [7, 5, 1, 0],
                'y': [11, 2, 5, 5]}])
Figures in Plotly
The basic unit of work in Plotly is the figure. A figure created with the Plotly library consists of 2 components: data and layout.
Data is a list of traces. Each trace is a dictionary type object that holds the values to be drawn. Each element of the figure is identified by a trace. A trace consists of a collection of data and the type of this data.
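# Trace examples
import plotly.graph_objects as go

y0 = [3, 5, 4, 8, 6]  # sample values, assumed for illustration
fig = go.Figure()
trace = go.Box(y=y0, name="employee")
trace1 = go.Scatter(x=[1, 2, 3, 4], y=[16, 5, 11, 9])
fig.add_trace(trace)
fig.add_trace(trace1)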
Layout is also a dictionary that describes the properties of the figure showing the data. Unlike trace’s configuration options, layout options are applied to the figure as a whole.
# Layout example
fig.update_layout(
    height=700,
    width=1000,
    title_text="Log graphs",
    showlegend=False,
)
Conclusion
Now that we’ve completed the overview of the Plotly library, we can continue with how we can draw our charts with it. In the next articles of this series, we will be working with Plotly and Plotly Express. Specifically, we’ll cover the following chart types:
- Basic Charts (line, bar, scatter, pie chart, ...)
- Statistical Charts (histogram, box plot, ...)
- Scientific Charts (contour plots, heatmaps, ...)
- Financial Charts (time series and date axes, waterfall charts, ...)
- Add Custom Controls (dropdown menus, custom buttons, ...)
- Maps (bubble maps, ...)
- 3D Charts (3D scatter plots, 3D surface plots, ...)
- Animations
If you don’t want to miss the next articles, please follow Bootrain on Medium and other platforms as well. Thanks for reading!