Python-Data Projects — Data Analysis UI Reinforced by Ipywidgets

Erol Mesut Gün
Published in Analytics Vidhya
Oct 1, 2019

Jupyter Notebook, together with the ipywidgets extension, offers a comfortable and interactive user experience.

In this post, we will be constructing an ipywidget framework that:

  • retrieves source data (in CSV format) and parses the content,
  • supplements statistical and meta information, and
  • plots data with common matplotlib graphs
Expected Design and Functionality in Action

Let’s face it! Anyone who works with CSV data, even while developing a teeny-tiny function, has to accept that it is a pretty troublesome process: every file has its own peculiar format, delimiter, meta or top headers, grand totals, etc. A function that ignores this generalises poorly. Thus, we should define and stick to an approach that lets users interactively choose among alternatives.

A Typical Scenario Flow

Let’s start by digesting the general ipywidget framework and widget types.

Ipywidget Layout Structure and Widget Types

While developing an ipywidget implementation, first things first: create the layout and distribute the widgets into it. The following picture summarises the look we aim to achieve.

Expected Layout Preview

Time to apply the reverse engineering technique and scrutinise its components.

Tab Structure

The layout is built on a three-part Tab structure, and each Tab contains distinct child widgets that cooperate to achieve the project targets: retrieve, supplement and plot.

tab = widgets.Tab()
children = [...]    # to be introduced
tab.children = children
tab.set_title(0, "Upload")
tab.set_title(1, "Describer")
tab.set_title(2, "Plotter")

The children property takes the list of widgets to display, one per tab, and set_title() names each tab by its index.

Accordion Widget

In the Upload Tab, we face an accordion widget which is super user friendly especially if a sequential process is to be followed.

accordion = widgets.Accordion(children=[...])    # to be introduced
accordion.set_title(0, 'File Selection')
accordion.set_title(1, 'Delimiter')
accordion.set_title(2, 'Skip Rows')

Just as with the Tab widget, the hierarchy is introduced via the children property and the set_title() method.

Button Widget

Button widgets are trigger elements that the user clicks to run the expected action.

button_preview = widgets.Button(
    description='Preview',
    disabled=False,
    button_style='info',
    tooltip='Click to Preview',
    icon='search')

def preview():
    ...  # to be introduced

def preview_clicked(b):
    preview()

button_preview.on_click(preview_clicked)

We can reinforce the visual aspect of Buttons via the description, button_style and icon properties, and the functionality via methods such as on_click(). Let’s save the functionality for the relevant section below, for now.

Output Widget

The name is self-explanatory: the Output widget presents a snapshot of the data on the fly.

out = widgets.Output(layout={'border': '1px solid black'})
# ...
with out:
    out.clear_output()
    print('\n -----Now this is how your DF looks like:----- \n')
    if df is not None:
        print(df.head(10))
    else:
        print('Configuration is wrong/missing...')
Output Widget Preview

FileUpload Widget

Again, as the name suggests, the FileUpload widget is used to feed the system with raw data.

up = widgets.FileUpload(accept="", multiple=False)
FileUpload Widget Preview

RadioButtons Widget

RadioButtons widget is used to make a singular selection among alternatives.

# RadioButtons widget instantiation
delim = widgets.RadioButtons(
    options=[';', ',', ' '],
    description='Separator: ',
    disabled=False
)
RadioButtons and SelectMultiple Widget Preview

SelectMultiple Widget

SelectMultiple widget allows multiple selection among different options.

# SelectMultiple widget instantiation
eraser = widgets.SelectMultiple(
    options=['tab', '"'],
    value=['tab'],
    #rows=10,
    description='Eraser: ',
    disabled=False
)

IntSlider Widget

The IntSlider widget lets the user pick an integer on a slider within a predefined min-max range.

# IntSlider widget instantiation
rows = widgets.IntSlider(
    value=0,
    step=1,
    description='# of lines:',
    disabled=False,
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='d'
)
IntSlider Widget Preview
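
The slider above does not set min or max, so it falls back to the ipywidgets defaults (0 and 100). As a minimal sketch, the upper bound could instead be derived from the uploaded file, so that the user cannot skip more lines than the file contains. The helper name max_skippable_rows and the sample bytes are made up for illustration; they are not part of the original implementation.

```python
def max_skippable_rows(content: bytes, encoding: str = "utf-8") -> int:
    """Return the largest sensible 'skip rows' value for an uploaded file.

    At least one line has to remain for the header, so the bound is the
    number of lines minus one, and never below zero.
    """
    line_count = len(content.decode(encoding).splitlines())
    return max(line_count - 1, 0)


# Illustrative content: a title line and a blank line precede the real header
sample = b"Report generated 2019-10-01\n\nIndex,Year,Age\n1,1928,44\n2,1929,41\n"
print(max_skippable_rows(sample))

# Hypothetical wiring with the slider defined above:
#     rows.max = max_skippable_rows(raw_bytes)
```

Assigning the result to the slider's max property keeps the "Skip Rows" choice within the file's actual length.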

ToggleButtons Widget

ToggleButtons are strong visual elements because they combine the functionality of a standard Button with the possibility of selecting among different options.

# ToggleButtons widget instantiation
# Note the trailing space in each option: desc() compares against these exact strings
toggle = widgets.ToggleButtons(
    options=['Preview ', 'Info ', 'Stats '],
    description='Options',
    disabled=False,
    button_style='warning',
    icons=['search', 'info', 'tachometer']
)
ToggleButtons Widget Preview

Dropdown Widget

The Dropdown widget is yet another example of a single-selection widget. In typical usage, the user is asked to pick one of the options, which are either pre-defined by the developer or dynamically updated.

# Dropdown widget instantiation
graph_type = widgets.Dropdown(
    options=['Bar Chart', 'Line Chart'],
    value='Bar Chart',
    description='Chart Type:',
    disabled=False,
)
Dropdown and ColorPicker Widgets Preview and HBox/VBox Usage

ColorPicker Widget

The ColorPicker widget enters the scene once a color-related value has to be defined. The color is set either by choosing a tone on the color palette or by typing a color name into the widget.

# ColorPicker widget instantiation
color_picker = widgets.ColorPicker(
    concise=False,
    description='Color Picker: ',
    value='lightblue',
    disabled=False
)

HBox/VBox

Although HBox and VBox are both widgets, they have no stand-alone visual form. Instead, they serve as containers: the widgets grouped inside them are laid out horizontally (HBox) or vertically (VBox).

# 4 rows stacked vertically (VBox); the first two rows hold 2 widgets side by side (HBox)
widgets.VBox([
    widgets.HBox([graph_type, color_picker]),
    widgets.HBox([x_axis, y_axis]),
    button_plot,
    out
])

With the visual architecture of the exercise complete, it is time to give the Buttons their functionality.

Data Retrieval and Parsing

Steps to be followed during data retrieval and parsing the content:

  • Accessing the content of the CSV that the user has shared through the FileUpload widget (“up” object);
  • Cleansing the data in accordance with the parameters introduced by the user (“delim”, “eraser” and “rows” objects) and converting the data into a pandas DataFrame; and
  • Empowering the Buttons with the “Preview” functionality, where the user can observe a moment-to-moment snapshot of the data, and the “Upload” functionality, which stores the data in the system.
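
Before wiring anything to widgets, the first two steps above can be sketched with the standard library alone. This is only an illustration: it uses the csv module instead of pandas, and the function name parse_upload, its parameters and the sample bytes are assumptions made for this example.

```python
import csv
from io import StringIO


def parse_upload(raw, delimiter=",", erase=("\t",), skip_rows=0):
    """Decode uploaded bytes, strip unwanted characters, skip leading
    lines and split the remainder into rows of cells."""
    text = raw.decode("utf-8")
    for token in erase:                        # plays the role of "eraser"
        text = text.replace(token, "")
    lines = text.splitlines()[skip_rows:]      # plays the role of "rows"
    reader = csv.reader(StringIO("\n".join(lines)), delimiter=delimiter)
    return [row for row in reader if row]      # drop empty lines


# Illustrative content: one junk line on top, ";" as separator, a stray tab
raw = b"exported by tool\nYear;Age;Name\n1928;44;\tEmil Jannings\n1929;41;Warner Baxter\n"
parsed = parse_upload(raw, delimiter=";", skip_rows=1)
header, records = parsed[0], parsed[1:]
print(header)
print(records)
```

Each keyword argument corresponds to one compartment of the Accordion, which is why the widget values can later be passed straight into the parsing functions.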

Accessing the content of CSV over FileUpload Widget

Once the data obtained through the FileUpload widget is inspected, it is easy to see that the raw data is stored under the “content” field.

>>> print(up)
FileUpload(value={'oscar_male.csv': {'metadata': {'name': 'oscar_male.csv', 'type': 'text/csv', 'size': 4413, 'lastModified': 1555765537290}, 'content': b'"Index", "Year", "Age", "Name", "Movie"\n 1, 1928, 44, "Emil Jannings", "The Last Command, The Way of All Flesh"\n 2, 1929, 41, "Warner Baxter", "In Old Arizona"\n 3, 1930, 62, "George Arliss", "Disraeli"\n 4, 1931, 53, "Lionel Barrymore", "A Free Soul"\n 5, 1932, 47, "Wallace Beery", "The Champ"\n 6, 1933, 35, "Fredric March", "Dr. Jekyll and Mr. Hyde"\n 7, 1934, 34, "Charles Laughton", "The Private Life of Henry VIII"\n 8, 1935, 34, "Clark Gable", "It Happened One Night"\n 9, 1936,...)

The remaining parts carry meta information about the source file, such as the file type.

def content_parser():
    if up.value == {}:
        with out:
            print('No CSV loaded')
    else:
        typ, content = "", ""
        up_value = up.value
        for i in up_value.keys():
            typ = up_value[i]["metadata"]["type"]
            if typ == "text/csv":
                content = up_value[i]["content"]
                content_str = str(content, 'utf-8')
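
The dictionary layout printed above is the one produced by the ipywidgets 7.x series. To the best of my knowledge, the 8.x series changed FileUpload.value to a tuple of per-file dictionaries (with "name", "type" and "content" keys, the content being a memoryview). The sketch below shows one way to read both shapes; iter_uploaded_files is a hypothetical helper, and the two sample values are hand-built stand-ins rather than real widget output.

```python
def iter_uploaded_files(value):
    """Yield (name, mime_type, content_bytes) for each uploaded file.

    Accepts both layouts of FileUpload.value:
      * 7.x style: {filename: {"metadata": {...}, "content": bytes}}
      * 8.x style: ({"name": ..., "type": ..., "content": memoryview}, ...)
    """
    if isinstance(value, dict):
        for name, item in value.items():
            yield name, item["metadata"]["type"], bytes(item["content"])
    else:
        for item in value:
            yield item["name"], item["type"], bytes(item["content"])


# Hand-built stand-ins for the two layouts (assumed shapes, not widget output)
old_style = {
    "oscar_male.csv": {
        "metadata": {"name": "oscar_male.csv", "type": "text/csv"},
        "content": b"Index,Year\n1,1928\n",
    }
}
new_style = (
    {"name": "oscar_male.csv", "type": "text/csv",
     "content": memoryview(b"Index,Year\n1,1928\n")},
)

for value in (old_style, new_style):
    for name, mime, content in iter_uploaded_files(value):
        print(name, mime, content.decode("utf-8").splitlines()[0])
```

With such a helper, the rest of the parsing code only ever sees plain bytes and does not depend on the installed widget version.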

Parsing and Transferring the Content into a DataFrame

The fact mentioned at the very beginning of this article repeats itself: the structure of each CSV varies. The delimiter picked by default is “;”, whereas in this file “,” separates the cells. The Output widget reveals the mismatch.

Snapshot of Data over Preview Button and Selection of Delimiter and Eraser Inputs

To that end, the “Delimiter” compartment of the Accordion widget has to be readjusted, as well as the “Eraser”. With that, the content is parsed properly.

Output Widget after Proper Parsing

All values introduced by the user can be captured through the “value” property of the widgets and re-used inside functions.

def content_parser():
    if up.value == {}:
        with out:
            print('No CSV loaded')
    else:
        typ, content = "", ""
        up_value = up.value
        for i in up_value.keys():
            typ = up_value[i]["metadata"]["type"]
            if typ == "text/csv":
                content = up_value[i]["content"]
                content_str = str(content, 'utf-8')

                # Remove every character the user selected in the "Eraser" list
                if eraser.value != {}:
                    for val in eraser.value:
                        if val == "tab":
                            content_str = content_str.replace("\t", "")
                        else:
                            content_str = content_str.replace(val, "")
                if content_str != "":
                    str_io = StringIO(content_str)
                    return str_io

def df_converter():
    content = content_parser()
    if content is not None:
        df = pd.read_csv(
            content,
            sep=delim.value,
            index_col=False,
            skiprows=rows.value)
        return df
    else:
        return None
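
The delimiter mismatch described above is resolved by hand in this article. As an optional aid, a suggestion could be computed from the first few lines of the file. The following is a rough heuristic sketch, not part of the original implementation; guess_delimiter and its parameters are hypothetical names, and quoted cells that contain the delimiter can still mislead it.

```python
import csv
from io import StringIO


def guess_delimiter(text, candidates=(";", ",", " "), sample_lines=5):
    """Return the candidate that splits the first lines into the same
    number (more than one) of cells, preferring the widest split.
    Returns None when no candidate fits."""
    lines = [ln for ln in text.splitlines() if ln.strip()][:sample_lines]
    best, best_cols = None, 1
    for cand in candidates:
        rows = list(csv.reader(StringIO("\n".join(lines)), delimiter=cand))
        widths = {len(r) for r in rows}
        if len(widths) == 1:                 # every line has the same width
            cols = widths.pop()
            if cols > best_cols:
                best, best_cols = cand, cols
    return best


text = "Index,Year,Age,Name\n1,1928,44,Emil Jannings\n2,1929,41,Warner Baxter\n"
print(guess_delimiter(text))

# Hypothetical wiring: preselect the radio button when a guess is available
#     guess = guess_delimiter(content_str)
#     if guess in delim.options:
#         delim.value = guess
```

The user keeps the final say through the RadioButtons; the guess only changes which option is selected initially.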

Step-by-Step Process Monitoring and Uploading the Final CSV

Thanks to the “Preview” button, interim results can be monitored, and via the “Upload” button the final CSV can be loaded into the system.

Preview Button Usage in the Flow

The main difference between the “Upload” and “Preview” buttons is that, after the “Upload” action, the axis options in the “Plotter” tab are updated in accordance with the loaded data. As the scenario shows, there is dynamic communication among the widgets, and between the widgets and the user.

Dynamic Update of Dropdown Option Values Subsequent to Loading a CSV

def preview():
    df = df_converter()
    with out:
        out.clear_output()
        print('\n -----Now this is how your DF looks like:----- \n')
        if df is not None:
            print(df.head(10))
        else:
            print('Configuration is wrong/missing...')

def upload():
    df = df_converter()
    with out:
        out.clear_output()
        print('\n -----Your uploaded DF looks like:----- \n')
        if df is not None:
            print(df)
            x_axis.options = df.columns  # Dropdown Widget update
            y_axis.options = df.columns  # Dropdown Widget update
        else:
            print('Configuration is wrong/missing...')

def preview_clicked(b):
    preview()

def upload_clicked(b):
    upload()

# Assigning functionality to buttons
button_preview.on_click(preview_clicked)
button_upload.on_click(upload_clicked)

Reaching to Statistical and Meta-Information

In the scope of this article, what we mean by “statistical” and “meta” information is restricted to basic DataFrame operations such as “head()”, “info()” and “describe()” and their display in the Output widget.

Describer Tab Preview

The action behind each ToggleButtons alternative is listed below. With ToggleButtons, rather than the “on_click()” method we used with Buttons, we rely on the “observe()” method, which notifies us whenever the current value changes.

def desc():
    info_level = toggle.value
    if info_level != {}:
        df = df_converter()
        with out:
            out.clear_output()
            print('\n ------Your {} looks like:------ \n'.format(
                info_level))
            if df is not None:
                if info_level == 'Info ':
                    # info() prints by itself and returns None, so no print() here
                    df.info(verbose=True)
                elif info_level == 'Stats ':
                    print(df.describe())
                elif info_level == 'Preview ':
                    print(df.head(5))
            else:
                print('Configuration is wrong/missing...')

def desc_clicked(b):
    # observe() passes a change dictionary, which is not needed here
    desc()

toggle.observe(desc_clicked, 'value')
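
Unlike an on_click() callback, which receives the button itself, an observe() callback receives a dictionary-like change object whose "new" key holds the freshly selected value. The sketch below simulates such a notification with a plain dictionary so that it runs without a notebook; describe_choice and the action names it returns are made up for illustration.

```python
def describe_choice(change):
    """Map an observe() change notification to the DataFrame method to call.

    `change` is shaped like the object passed to observers: the "new"
    key holds the selected option, the "old" key the previous one.
    """
    level = change["new"].strip()        # tolerate the trailing space in labels
    actions = {"Preview": "head", "Info": "info", "Stats": "describe"}
    return actions.get(level)            # None for an unknown option


# Simulated notification, shaped like what toggle.observe(..., 'value') delivers
fake_change = {"name": "value", "old": "Preview ", "new": "Stats ", "type": "change"}
print(describe_choice(fake_change))
```

Reading the selection from the change object instead of from toggle.value keeps the handler independent of the global widget, which also makes it easy to test.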

Data Visualisation

We have already covered how the X and Y axis options are populated when the raw file is uploaded. Users can start analysing graphs by selecting dimensions and metrics. In addition, the graphical representation can be adjusted through color and plotting style.

To keep the focus on the ipywidget theme, I restrict the visualisation options to simple line and bar graphs. The “Chart Type” dropdown defines the chart style.

Bar Chart Visualisation Preview
Line Chart Visualisation Preview

def plot():
    graph = graph_type.value
    if graph != {}:
        df = df_converter()
        with out:
            out.clear_output()
            print('\n -----Your {} looks like:----- \n'.format(
                graph))
            if (df is not None):
                df = df.head(5)  # only the first 5 rows are plotted
                height = df[y_axis.value]
                bars = df[x_axis.value]
                y_pos = np.arange(len(height))
                plt.figure(figsize=(10,4))
                if graph == 'Bar Chart':
                    plt.bar(
                        y_pos,
                        height,
                        color=color_picker.value)
                    plt.xticks(y_pos, bars)
                elif graph == 'Line Chart':
                    plt.plot(
                        bars,
                        height,
                        color=color_picker.value,
                        marker='o',
                        linestyle='solid'
                    )
                    plt.xticks(bars)
                plt.show()

def plotter_clicked(b):
    plot()

button_plot.on_click(plotter_clicked)

Considering the variety of usage options, the implementation introduced above is open to development and improvement. Adapting this framework to peculiar CSV formats, niche parsing use-cases and flashy charts is nothing but easy peasy lemon squeezy.

To access Implementation Video (only available in Turkish)👇


To access complete Python code 👇

import ipywidgets as widgets
import pandas as pd
from io import StringIO
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

tab = widgets.Tab()
out = widgets.Output(layout={'border': '1px solid black'})
up = widgets.FileUpload(accept="", multiple=False)
delim = widgets.RadioButtons(
    options=[';', ',', ' '],
    description='Separator: ',
    disabled=False)
eraser = widgets.SelectMultiple(
    options=['tab', '"'],
    value=['tab'],
    #rows=10,
    description='Eraser: ',
    disabled=False)
rows = widgets.IntSlider(
    value=0,
    step=1,
    description='# of lines:',
    disabled=False,
    continuous_update=False,
    orientation='horizontal',
    readout=True,
    readout_format='d')
button_upload = widgets.Button(
    description='Upload',
    disabled=False,
    button_style='warning',
    tooltip='Click to Upload',
    icon='check')
button_preview = widgets.Button(
    description='Preview',
    disabled=False,
    button_style='info',
    tooltip='Click to Preview',
    icon='search')
button_plot = widgets.Button(
    description='Plot',
    disabled=False,
    button_style='danger',
    tooltip='Click to Plot',
    icon='pencil')
graph_type = widgets.Dropdown(
    options=['Bar Chart', 'Line Chart'],
    value='Bar Chart',
    description='Chart Type:',
    disabled=False)
x_axis = widgets.Dropdown(
    options=[''],
    value='',
    description='X-Axis:',
    disabled=False)
y_axis = widgets.Dropdown(
    options=[''],
    value='',
    description='Y-Axis:',
    disabled=False)
color_picker = widgets.ColorPicker(
    concise=False,
    description='Color Picker: ',
    value='lightblue',
    disabled=False)
toggle = widgets.ToggleButtons(
    options=['Preview ', 'Info ', 'Stats '],
    description='Options',
    disabled=False,
    button_style='warning',
    icons=['search', 'info', 'tachometer'])
accordion = widgets.Accordion(children=[
    up,
    widgets.VBox([delim, eraser]),
    rows])
accordion.set_title(0, 'File Selection')
accordion.set_title(1, 'Delimiter')
accordion.set_title(2, 'Skip Rows')
accordion_box = widgets.VBox([
    accordion,
    widgets.HBox([button_preview, button_upload]),
    out
])
children = [
    accordion_box,
    widgets.VBox([toggle, out]),
    widgets.VBox([
        widgets.HBox([graph_type, color_picker]),
        widgets.HBox([x_axis, y_axis]),
        button_plot,
        out
    ])]
tab.children = children
tab.set_title(0, "Upload")
tab.set_title(1, "Describer")
tab.set_title(2, "Plotter")
tab

def content_parser():
    if up.value == {}:
        with out:
            print('No CSV loaded')
    else:
        typ, content = "", ""
        up_value = up.value
        for i in up_value.keys():
            typ = up_value[i]["metadata"]["type"]
            if typ == "text/csv":
                content = up_value[i]["content"]
                content_str = str(content, 'utf-8')

                if eraser.value != {}:
                    for val in eraser.value:
                        if val == "tab":
                            content_str = content_str.replace("\t", "")
                        else:
                            content_str = content_str.replace(val, "")
                if content_str != "":
                    str_io = StringIO(content_str)
                    return str_io

def df_converter():
    content = content_parser()
    if content is not None:
        df = pd.read_csv(content, sep=delim.value, index_col=False, skiprows=rows.value)
        return df
    else:
        return None

def preview():
    df = df_converter()
    with out:
        out.clear_output()
        print('\n -----Now this is how your DF looks like:----- \n')
        if df is not None:
            print(df.head(10))
        else:
            print('Configuration is wrong/missing...')

def upload():
    df = df_converter()
    with out:
        out.clear_output()
        print('\n --------Your uploaded DF looks like:-------- \n')
        if df is not None:
            print(df)
            x_axis.options = df.columns
            y_axis.options = df.columns
        else:
            print('Configuration is wrong/missing...')

def desc():
    info_level = toggle.value
    if info_level != {}:
        df = df_converter()
        with out:
            out.clear_output()
            print('\n ------Your {} looks like:------ \n'.format(
                info_level))
            if df is not None:
                if info_level == 'Info ':
                    # info() prints by itself and returns None, so no print() here
                    df.info(verbose=True)
                elif info_level == 'Stats ':
                    print(df.describe())
                elif info_level == 'Preview ':
                    print(df.head(5))
            else:
                print('Configuration is wrong/missing...')

def plot():
    graph = graph_type.value
    if graph != {}:
        df = df_converter()
        with out:
            out.clear_output()
            print('\n ------Your {} looks like:------ \n'.format(
                graph))
            if (df is not None):
                df = df.head(5)
                height = df[y_axis.value]
                bars = df[x_axis.value]
                y_pos = np.arange(len(height))
                plt.figure(figsize=(10,4))
                if graph == 'Bar Chart':
                    plt.bar(
                        y_pos,
                        height,
                        color=color_picker.value)
                    plt.xticks(y_pos, bars)
                elif graph == 'Line Chart':
                    plt.plot(
                        bars,
                        height,
                        color=color_picker.value,
                        marker='o',
                        linestyle='solid'
                    )
                    plt.xticks(bars)
                plt.show()

def preview_clicked(b):
    preview()

def upload_clicked(b):
    upload()

def desc_clicked(b):
    desc()

def plotter_clicked(b):
    plot()

button_preview.on_click(preview_clicked)
button_upload.on_click(upload_clicked)
toggle.observe(desc_clicked, 'value')
button_plot.on_click(plotter_clicked)
