NLP is ubiquitous in technology we use daily, such as email filters, search results, predictive text, and smart assistants. There are also a few NLP data science project ideas floating around Ro, and I have a few personal projects in mind where NLP/data mining would be very handy, so I figured now would be a great time to learn about basic techniques through this course!

Overall, I think that the original lessons (Getting an Idea of NLP and its Applications) in this course are comprehensive and well structured, although they focus more on data mining (superficial text analysis). The instructor…

Basic machine learning concepts are useful for any analyst! I followed a popular Intro to ML course on Udemy and compiled my high-level notes from the training below.

A few basic definitions

  • Artificial Intelligence — The idea that a computer can complete tasks historically thought to require a human, such as speech recognition
  • Machine Learning — A subset of Artificial Intelligence; a method by which computers build models from data by “learning”
  • Mean — Expected value, or average
  • Variance — Measures how far a set of numbers is spread out from its average value
  • Covariance —…
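To make the mean and variance definitions above concrete, here is a minimal sketch using Python's standard `statistics` module. The `values` list is made-up illustration data, not something from the course. Note that there are two common variance conventions: population variance divides by n, while sample variance divides by n - 1.

```python
from statistics import mean, pvariance, variance

# Hypothetical sample data, chosen only for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Mean: the sum of the values divided by how many there are.
average = mean(values)

# Population variance: average squared distance from the mean (divide by n).
population_variance = pvariance(values)

# Sample variance: same idea, but divide by n - 1.
sample_variance = variance(values)

print(average, population_variance, sample_variance)
```

For this data the mean is 5 and the population variance is 4; the sample variance is slightly larger because of the smaller divisor.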

What is the probability of sharing birthdays?

The other day, a few coworkers and I were having lunch and we found out that two people in the group had the same birthday. Someone exclaimed, “What are the chances!?”

“Welllll”, I declared as a former statistics major, “I don’t know right now for a group this size, but if you have a group of 23 people, there is about a 50% chance that at least two people have the same birthday.” What?? How can that be?

I then decided to ask the data team: How many people do you need before you have over a 50% chance that two people…
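The 23-person claim can be checked with a short calculation. The sketch below is not taken from the original post; it assumes 365 equally likely birthdays (no leap years) and independent birthdays, and the function names are hypothetical. The idea is to compute the chance that everyone has a different birthday and subtract it from 1.

```python
def shared_birthday_probability(group_size: int, days: int = 365) -> float:
    """Chance that at least two of `group_size` people share a birthday.

    Assumes every day is equally likely and birthdays are independent.
    """
    if group_size > days:
        return 1.0  # more people than days, so a match is guaranteed
    p_all_distinct = 1.0
    for i in range(group_size):
        # The (i + 1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def smallest_group_for(threshold: float = 0.5, days: int = 365) -> int:
    """Smallest group size whose match probability exceeds `threshold`."""
    n = 1
    while shared_birthday_probability(n, days) <= threshold:
        n += 1
    return n


print(round(shared_birthday_probability(22), 4))
print(round(shared_birthday_probability(23), 4))
print(smallest_group_for(0.5))
```

Under these assumptions, 22 people fall just short of 50% and 23 people land just over it, which is where the figure of 23 comes from.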

Sarah Harrison

