Doug Steen in Towards Data Science | Seminal Papers in Data Science: A Relational Model for Large Shared Data Banks | 50 years later, a review of some main concepts from E.F. Codd’s 1970 paper that laid the groundwork for relational databases and SQL | Oct 17, 2020
Doug Steen | Beyond the F-1 score: A look at the F-beta score | Tailoring the F-beta score for specific binary classification problems | Oct 11, 2020
Doug Steen | How to code your first simple game using Python | All you need is a .py file and the command line! | Oct 4, 2020
Doug Steen in Towards Data Science | Progress bars for Python with tqdm | Track the execution of Python iterations with a smart progress bar | Sep 26, 2020
Doug Steen | Precision-Recall Curves | Sometimes a curve is worth a thousand words - how to calculate and interpret precision-recall curves in Python. | Sep 20, 2020
Doug Steen in Towards Data Science | Understanding the ROC Curve and AUC | These binary classification performance measures go hand-in-hand; let’s explore. | Sep 13, 2020
Doug Steen in Towards Data Science | How to build KNN from scratch in Python | … well, at least without sklearn’s KNeighborsClassifier. | Sep 5, 2020
Doug Steen in Towards Data Science | A Gentle Introduction to Self-Training and Semi-Supervised Learning | Coding an example of self-training in Python to utilize unlabeled data for classification | Aug 30, 2020
Doug Steen in Analytics Vidhya | Obtaining sports data from an API using Python requests | As a data scientist, the ability to obtain data through an API is a critical skill. In this post, I provide a brief tutorial on obtaining… | Aug 23, 2020
Doug Steen in Analytics Vidhya | Implementing PCA in Python with sklearn | Principal Component Analysis (PCA) is a commonly used dimensionality reduction technique for data sets with a large number of variables… | Aug 16, 2020