The term “Big Data” is one of the hot keywords in business and technology today. Data has become an immensely useful tool, evolving from basic customer or patient records to also include website clicks, images, audio, video, RF transmissions, or any other type of data that can be tracked and stored. Combine all of this with, say, a customer database with purchase history, and you have far more tools to determine how best to market and sell to a customer. Sounds simple enough, right?! Why wouldn’t a business or organization want to use this information? It sounds like…


I was recently asked to speak on a random Data Science topic, CARTs, and give a brief, semi-technical explanation of the subject matter. After spending some time consolidating what I already knew and doing some extra research to solidify my position, this is what I focused my high-level summary on.

CARTs, better known as Classification and Regression Trees, use the concept of decision making by determining what the next question should be, if any. Once the algorithm reaches the point where it doesn’t think it can ask any more questions (or it is cut off through…
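To make the "what should the next question be" idea concrete, here is a minimal sketch in plain Python. It assumes a single numeric feature and uses Gini impurity to score each candidate question; the function names (`gini`, `best_question`) and the toy data are hypothetical and chosen only for illustration, and a real CART implementation repeats this search over every feature and every node.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the group is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_question(values, labels):
    """Pick the threshold t for the question 'is value <= t?' that lowers
    the weighted Gini impurity the most. Returns (None, current impurity)
    when no question improves on the unsplit group."""
    best_t, best_score = None, gini(labels)
    # The largest value is skipped because it would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Hypothetical toy data: customer ages and whether they bought a product.
ages = [22, 25, 31, 40, 47, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_question(ages, bought)
print(f"Ask: is age <= {threshold}? (weighted impurity {impurity})")
```

When every label in a group is already the same, `best_question` returns `None` for the threshold, which corresponds to the point where the tree has no more useful questions to ask.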


In machine learning, building a model to predict an outcome is fairly easy. What complicates the process is finding the right features to pass to the model so that it predicts the outcome more efficiently. Feature engineering comes to the rescue by creating new variables from existing ones to give a more specialized look at the data. In Python, pandas has a get_dummies function that takes a feature, say someone’s class (upper, middle, lower), where all the values sit in the same field, and makes a new binary column for each class. Now the feature of upper…
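The dummy-variable step described above can be sketched with the `pd.get_dummies` function in pandas. The DataFrame, its column names, and its values are hypothetical toy data; `dtype=int` is passed so the new columns hold 0/1 rather than booleans.

```python
import pandas as pd

# Hypothetical toy data: one categorical column holding all three classes.
passengers = pd.DataFrame({
    "name": ["Ann", "Bob", "Cy", "Dee"],
    "class": ["upper", "middle", "lower", "upper"],
})

# Replace the single "class" column with one binary column per category.
encoded = pd.get_dummies(passengers, columns=["class"], dtype=int)
print(encoded)
```

Each row ends up with exactly one 1 across the `class_lower`, `class_middle`, and `class_upper` columns, so a model can treat each class as its own feature.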


I recently became familiar with using website APIs and web scraping to extract words or tables from websites as a source of data for machine learning. That in itself was pretty interesting, but what you can do with all that information, particularly with words, is fascinating. Welcome to the world of Natural Language Processing (NLP).

NLP has become one of my favorite subjects in my Data Science studies. Being able to take a post from Reddit, or comments from Amazon, or an article from a webpage, and to create…


Data Science: Perceptions vs. Reality

For the past 20+ years I have worked as a corporate accountant, in roles ranging from accounts payable to general ledger to controller to a pseudo-CFO. Data has always been an integral part of my job, both in creating journal entries and in preparing financial statements. I have spent many, many hours weeding through raw data from a point-of-sale system to get inventory movement, sales data, etc. (nothing was integrated in an ERP or CRM system), so examining, extrapolating, and drawing conclusions from tens of thousands of rows of data was normal. I became well-versed…

Sam Lundberg
