I created a small coding experiment in Python to see whether I could find semantic similarities between documents using basic bag-o’-words strategies, a.k.a. unstructured word counts. My attempt here is not particularly sophisticated, but I wanted a hands-on experiment to get me up to speed on some of the basics…


Beautiful Ebook Soup

If you leave with just one idea from this blog, please let it be that the Python module BeautifulSoup from bs4 seems to work, … well, beautifully with ebooks. I was recently introduced to the HTML parser module BeautifulSoup in my coursework for data science. And as a new data…


Or, Don’t despair; understanding how to predict the sale prices of houses in King County might actually help us acquire language.

One of the key factors that convinced me to study data science was the exhilarating moment when I made the connection between data science and linguistics. …


As a freshman data-scientist-in-training, a few thoughts came to mind in my first week of bootcamp regarding the potential for implicit bias in my approach to data science. And now with my first five immersive days of bootcamp solidly behind me, I can eagerly confirm the role we have as…

David Haase
