Data Science Essentials
I recently finished Lillian Pierson’s Data Science Essentials course, which can be found on LinkedIn and Lynda, and I highly recommend it! The course is a great introduction to many data science tools and methods, and it balances explaining concepts with showing when and how to apply them in practice. Lillian Pierson is a data scientist, a founder of Data-Mania, and the author of Data Science for Dummies. (*I don’t get any proceeds from purchases of the book or any web traffic from this link.) My GitHub profile will have Jupyter notebooks for each concept, with example code as well as explanations of the code. The concept each notebook is named after is explained in more detail at the top of that notebook.
Notebook index
Perhaps later I’ll do a deeper dive into an individual subject that’s covered in these notes. There’s a lot of breadth here, so there’s plenty of room to explore these subjects further.
1. Data Munging Basics
- Filter and select data
- Treat missing values
- Remove duplicates
- Concatenate and transform data
- Group and aggregate data
2. Data Visualization Basics
- Create standard line, bar, and pie plots
- Define plot elements
- Format plots
- Create labels and annotations
- Create visualizations from time series data
- Construct histograms, box plots, and scatter plots
3. Basic Math and Statistics
- Use NumPy arithmetic
- Generate summary statistics
- Summarize categorical data
- Parametric methods
- Non-parametric methods
- Transform dataset distributions
4. Dimensionality Reduction
- Introduction to machine learning
- Explanatory factor analysis
- Principal component analysis (PCA)
5. Outlier Analysis
- Extreme values analysis using univariate methods
- Multivariate analysis for outlier detection
- A linear projection method for multivariate data
6. Cluster Analysis
- K-means method
- Hierarchical methods
- Instance-based learning with k-nearest neighbors
7. Network Analysis with NetworkX
- Intro to network analysis
- Work with graph objects
- Simulate a social network
- Generate stats on nodes and inspect graphs
8. Basic Algorithmic Learning
- Linear regression model
- Logistic regression model
- Naive Bayes classifiers
9. Web-Based Data Visualizations with Plotly
- Create basic charts
- Create statistical charts
- Create Plotly choropleth maps
- Create Plotly point maps
10. Web Scraping with Beautiful Soup
- Introduction to Beautiful Soup
- Explore NavigableString objects
- Parse data
- Web scrape in practice
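
To give a flavor of what the notebooks cover, here is a minimal sketch of a few steps from the first topic, Data Munging Basics, using pandas. The toy DataFrame and its column names (`city`, `sales`) are made up for illustration and are not taken from the course, and filling with the column mean is just one of several ways to treat missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one missing value and one exactly repeated row.
df = pd.DataFrame({
    "city": ["Austin", "Austin", "Boston", "Boston", "Boston"],
    "sales": [10.0, np.nan, 7.0, 7.0, 3.0],
})

# Treat missing values: fill the gap with the column mean (an assumed strategy).
filled = df.fillna({"sales": df["sales"].mean()})

# Remove duplicates: drop rows that exactly repeat an earlier row.
deduped = filled.drop_duplicates()

# Filter and select: keep only rows with sales above 5.
selected = deduped[deduped["sales"] > 5]

# Group and aggregate: total sales per city.
totals = deduped.groupby("city")["sales"].sum()

print(selected)
print(totals)
```

Each step returns a new DataFrame or Series instead of modifying `df` in place, which makes it easy to inspect the intermediate results in a notebook cell.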

