Top 10 Nice-To-Have Data Science Libraries
Cool finds to make your life easier
Make your project pop with these easy-to-use libraries! Note that these aren’t data-science essentials like pandas or scikit-learn; they’re just fun, useful finds.
Missingno: Visualizes missing data.
!pip install missingno
import missingno as msgn

# read in data here
msgn.matrix(data)
Plotly: Makes interactive plots, including maps and 3D graphs.
!pip install plotly cufflinks
import plotly.offline as py
import plotly_express as px
import cufflinks as cf
#example line graph
data.iplot(kind='line', title='Title', xTitle='Epoch', yTitle='Loss')
Selenium: Makes automatic mouse movements online (i.e. clicking, browsing, etc.).
!pip install selenium
from selenium import webdriver

browser = webdriver.Chrome(executable_path='/Users/User/chromedriver')
browser.get('https://xkcd.com/') # go to website
go_to_random_comic_button = browser.find_element_by_partial_link_text('Random')
Geopandas + Geopy: These are good for making maps.
!pip install geopandas
!pip install geopy
# You can make all sorts of different things with these!
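Since this item only shows the installs, here is a minimal sketch of building and plotting a GeoDataFrame with geopandas (the two cities and their coordinates are just illustrative):

```python
import geopandas as gpd
from shapely.geometry import Point

# two illustrative points as (longitude, latitude)
cities = gpd.GeoDataFrame(
    {'city': ['Paris', 'Berlin']},
    geometry=[Point(2.35, 48.86), Point(13.40, 52.52)],
    crs='EPSG:4326',  # standard lat/lon coordinate reference system
)

ax = cities.plot(marker='o')  # draws the points (requires matplotlib)
```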
Py_translator: Translates text between languages.
!pip install py_translator
from py_translator import Translator
translator = Translator()
output = translator.translate('Hello World!', dest='fr')
print(output.text)  # the translated string lives on .text
Graphviz: Visualizes tree-based models.
!pip install graphviz
!brew install graphviz
from sklearn.tree import export_graphviz

# Make and fit model first
tree_file = export_graphviz(model, out_file=None, feature_names=X.columns)
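The snippet above presupposes an already-fitted model and a feature matrix X. A self-contained sketch with a small scikit-learn decision tree (the dataset and max_depth are chosen only for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz

iris = load_iris()
model = DecisionTreeClassifier(max_depth=2, random_state=0)
model.fit(iris.data, iris.target)

# out_file=None returns the DOT source as a string instead of writing a file
dot_source = export_graphviz(model, out_file=None,
                             feature_names=iris.feature_names)
# render it with the graphviz package, e.g. graphviz.Source(dot_source)
```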
Jupyterlab_spellchecker: Spellchecks markdown text.
!jupyter labextension install @ijmbarr/jupyterlab_spellchecker
Nbextensions: This is not technically a library, it’s an extension. It lets you do a lot of nice things like code folding, auto-generating a table of contents, and “prettifying” code.
!pip install jupyter_contrib_nbextensions
!jupyter contrib nbextension install --user
#enable the features you want from your Jupyter homepage (an Nbextensions tab will appear)
Twitter scraper: Scrapes tweets based on date, location, words, etc. Make sure to include a time lag in your scrape to avoid being locked out of Twitter!
!pip install twitterscraper
import datetime

from twitterscraper import query_tweets

list_of_tweets = query_tweets("'Hello' OR 'Goodbye'",
                              limit=50_000,
                              enddate=datetime.date(2019, 9, 1),
                              begindate=datetime.date(2014, 1, 1),
                              poolsize=1)
Imbalanced-learn: Includes several automated sampling methods to balance classes.
!pip install -U imbalanced-learn
Comment your coolest finds below!