3 Python Packages for Every Data Scientist

Jake from Mito
Published in trymito
3 min read · May 19, 2022


1. Mito

Mito is a spreadsheet front-end for Python. You can call Mito into your Jupyter Notebook and each edit you make in the front-end will generate the equivalent Python. With Mito, you don’t have to spend any time looking for syntax on Stack Overflow or Google. The code is generated for you, so you can ensure that you are creating correct, clean code.

Here is a video demo:

To install Mito, use these two commands in your terminal:

python -m pip install mitoinstaller
python -m mitoinstaller install

Then to open the Mitosheet interface:

import mitosheet
mitosheet.sheet()

Here is a link to the full install instructions.

You can configure a Mito pivot table by selecting the Pivot button from the toolbar and then choosing your rows, columns, values and aggregation types.

Each edit in Mito generates the equivalent Python in the code cell below. It is a much faster way of producing code than constantly heading to Stack Overflow to find the correct syntax.

The pivot table above generates this code and auto-comments it as well!
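To give a rough idea of what pandas code for a pivot table looks like, here is a hand-written sketch. It is not Mito's actual output, and the DataFrame and its column names (`region`, `product`, `revenue`) are made up for illustration.

```python
import pandas as pd

# Hypothetical sales data; the column names are illustrative only.
df = pd.DataFrame({
    "region": ["East", "East", "West", "West", "West"],
    "product": ["A", "B", "A", "A", "B"],
    "revenue": [100, 150, 200, 50, 300],
})

# Rows = region, columns = product, values = revenue, aggregation = sum
pivot = pd.pivot_table(
    df,
    index="region",
    columns="product",
    values="revenue",
    aggfunc="sum",
)

print(pivot)
```

The four arguments map onto the choices you make in the Pivot menu: rows, columns, values and aggregation type.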

Mito does not just generate the code for pivot tables. In Mito, you can merge datasets, filter, sort, use functions, look at summary statistics, and more, and Mito will generate the equivalent Python for each of these edits.
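For reference, this is roughly what those operations look like when written by hand in pandas. It is a minimal sketch with made-up tables (`customers`, `orders`), not code generated by Mito.

```python
import pandas as pd

# Two small, made-up tables that share a customer_id column.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
orders = pd.DataFrame({"customer_id": [1, 1, 2, 3], "amount": [20, 35, 15, 60]})

# Merge: attach every order to its customer.
merged = customers.merge(orders, on="customer_id", how="left")

# Filter: keep only orders above 18.
filtered = merged[merged["amount"] > 18]

# Sort: largest orders first.
result = filtered.sort_values("amount", ascending=False).reset_index(drop=True)

# Summary statistics for the remaining amounts.
summary = result["amount"].describe()

print(result)
print(summary)
```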

To create a Plotly chart, all the user has to do is click the graph button and select their axes.

Here is Mito’s full documentation.

2. Streamlit

Streamlit is an open source Python package that lets you quickly build data apps for machine learning and data science with very little code.

The analysis part of data science is important, but you also need to be able to communicate your findings. Interactive apps that are easy for end users to engage with are becoming increasingly popular.

To install Streamlit, run:

pip install streamlit
streamlit hello

The second command launches Streamlit's demo app in a new browser tab.

In Streamlit, you can do things like create a line chart:

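import numpy as np
import pandas as pd
import streamlit as st

# 20 rows of random values in three columns
chart_data = pd.DataFrame(
    np.random.randn(20, 3),
    columns=['a', 'b', 'c'])
st.line_chart(chart_data)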

Or add interactive widgets:

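# Assumes streamlit, pandas and numpy are imported as st, pd and np
if st.checkbox('Show dataframe'):
    chart_data = pd.DataFrame(
        np.random.randn(20, 3),
        columns=['a', 'b', 'c'])
    # Streamlit "magic": a variable on its own line is rendered in the app
    chart_data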

Here is their full getting started documentation.

Here is a full course that goes over different apps you can build with Streamlit:

3. Plotly

Plotly is, in my view, the graphing library of the future. It is my favorite package for quickly and easily making interactive charts and graphs. Packages like matplotlib and seaborn are certainly intuitive as well, but their charts are static by default, so they lack the interactivity that makes Plotly so strong.

To install Plotly, run this command:

pip install plotly==5.2.1

In Plotly, there are a whole host of interactive charts to choose from. Some are simple, like a bar chart where the interactivity is limited to changing the color of the bars:

import plotly.graph_objects as go
fig = go.Figure(data=go.Bar(y=[2, 3, 1]))
fig.show()

https://plotly.com/python/

They also offer more advanced, dynamic charts:

https://plotly.com/python/

I hope you find these packages helpful :)
