Python-Pandas Evangelism From a Collegiate Historian-turned-Programmer
In 1949, a personal resume that included a legitimate high school diploma and “Typing” was worth considerably more than its weight in gold. Such a resume was potentially a passport to a higher professional stratum, certainly for the millions of young men who had returned from the battlefields of Europe and Asia, and often for the women who had entered the workforce during that conflict.
It’s easy to see why typing was so valuable at the time — it represented a vector for information sharing that had finally been adopted almost universally in the post-war…
All my practical posts to date have used Python 3 in one way or another. Python is a great, adaptive language, but like any tool, it’s only as effective as the person using it!
One great way to improve your coding abilities and get the most out of Python (or any other language) is to establish or adopt a consistent, systematic “voice” that you use in all your future projects.
So what exactly is your Python “Voice” anyway?
Well, when you write code in Python, you will have to follow a general framework that the language can identify…
Unsupervised learning is often looked on as a little ‘unconventional’ in the data science world, especially when empirically provable results are desired. K-Means clustering enjoys some enduring popularity, however, because it is relatively simple to employ, and because it functions as a powerful, if temperamental, exploratory data analysis tool.
As promised in my last article, I’ll walk through some of the basics of simple K-means clustering below!
Review of Unsupervised Learning versus Clustering:
Unlike some other practical examples in this blog, discerning readers will notice that here we have no “true”, testable data to compare against! K-means (KM) and…
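To make the basic K-means loop concrete, here is a minimal sketch in plain Python 3. It is not the code from the article itself: the function name `kmeans`, its parameters, and the random-sample initialization are all assumptions chosen for illustration, and it only handles 2-D points.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2-D points into k groups (illustrative sketch only)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct observations picked at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)

    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]

        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    (
                        sum(x for x, _ in members) / len(members),
                        sum(y for _, y in members) / len(members),
                    )
                )
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])

        # Stop early once the centroids no longer move.
        if new_centroids == centroids:
            break
        centroids = new_centroids

    return centroids, labels
```

Because there are no “true” labels to score against, the returned labels are only a proposed grouping. Different seeds can give different groupings, which is part of what makes the method temperamental.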
Machine learning algorithms come in many forms. One of the simplest categories of machine learning model is the Classification model: a family of models that, at their core, try to apply known labels to the provided data.
In any case where data Classification is desirable, we proceed from an a priori understanding of how the data should break into groups, and then try to apply those groups to observations from our data. …
The Internet is home to a hugely diverse body of data. With this tool in hand, we have an avenue to massive stores of information; but extracting and using that information requires some special tools.
BeautifulSoup is a Python package that can help extract, or ‘scrape’, information directly from the HTML and XML files that form the bones of most web pages. In the remainder of this article, we’ll play with BeautifulSoup and build out a case example so you can try it for yourself.
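As a small taste of what scraping with BeautifulSoup looks like, here is a hedged sketch. The HTML snippet and the helper name `extract_links` are made up for illustration and are not the case example from the article. It assumes the `beautifulsoup4` package is installed and uses Python's built-in `html.parser` backend.

```python
from bs4 import BeautifulSoup

# Hypothetical page content, standing in for HTML fetched from a real site.
html = """
<html><body>
<h1>Sample Page</h1>
<ul>
  <li><a href="/first">First link</a></li>
  <li><a href="/second">Second link</a></li>
</ul>
</body></html>
"""


def extract_links(markup):
    """Return (link text, href) pairs for every anchor tag with an href."""
    soup = BeautifulSoup(markup, "html.parser")
    return [
        (a.get_text(strip=True), a["href"])
        for a in soup.find_all("a", href=True)
    ]


links = extract_links(html)
print(links)
```

The same pattern of parsing the markup and then searching for tags applies once the string comes from a real web page rather than a literal.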
Classification and Regression are two very important concepts for modeling with Machine Learning. In this short article, I’ll go over the basics of these concepts and how they can be applied to simple Data Science questions.
At the most basic level, Predictive Modeling is designed to use historical data to predict unknown data. The unknown data we predict can be almost anything: categorical attributes, approximations of incomplete data, and even future outcomes!
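The difference between the two ideas can be sketched with a toy example in plain Python 3. Everything here is an assumption made for illustration: the helper names (`fit_line`, `fit_class_means`, `classify`), the single-feature setup, and the nearest-class-mean rule, which is only one very simple way to classify.

```python
from statistics import mean


def fit_line(xs, ys):
    """Regression: least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def fit_class_means(xs, labels):
    """Classification: remember the average feature value of each label."""
    groups = {}
    for x, label in zip(xs, labels):
        groups.setdefault(label, []).append(x)
    return {label: mean(values) for label, values in groups.items()}


def classify(x, class_means):
    """Predict the label whose historical mean is closest to x."""
    return min(class_means, key=lambda label: abs(x - class_means[label]))


# Regression predicts a number from historical (x, y) pairs.
slope, intercept = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
predicted_value = slope * 5 + intercept

# Classification predicts a category from historical (x, label) pairs.
class_means = fit_class_means([1, 2, 8, 9], ["low", "low", "high", "high"])
predicted_label = classify(3, class_means)
```

In both cases the model is fit on known, historical observations and then asked about a value it has not seen. The only difference is whether the answer is a quantity or a label.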
Financial tech analyst and programmer; I believe that transparency through Data Science is the defining next step in human progress this century.