For data scientists at Ro to gain a practical understanding of NLP and relevant algorithms.

Natural Language Processing (NLP) uses algorithms to understand and manipulate human language, and it is a highly specialized branch of Artificial Intelligence (AI) research. Many theories and practical approaches have been developed solely to process language data, and because NLP is one of the most popular fields of AI, its state-of-the-art techniques are evolving very fast. The NLP Specialization hosted by deeplearning.ai provides a systematic review of the major challenges in NLP and the technical solutions behind each type of application.

4 courses under the NLP specialization

This course requires a working knowledge of machine learning as well as a good understanding of deep learning frameworks. You can take the Deep Learning Specialization first before diving into this topic. …


For data scientists at Ro to gain a practical understanding of Deep Learning

This course is the first part of the 5-course Deep Learning Specialization. It is available on Coursera, and Andrew Ng gives all the lectures in this course.


Estimated time for completion: 20 hrs (~2.5 days)

Coding exercises are available at the end of each section. They guide you through building neural networks from scratch using NumPy, without Keras or TensorFlow.

One more note: this course is dense with linear algebra concepts. If you are familiar with matrix operations, especially matrix multiplication, you will progress much faster. If you are not, don’t worry; the course explains them thoroughly as well.

More about the other courses later.


NLTK.Vader is one of the more popular tools for sentiment analysis

Sentiment analysis is one of the most popular fields in Natural Language Processing (NLP). It automatically identifies and extracts opinions from text, transforming large-scale unstructured text data into structured, quantitative measurements of the opinions the text expresses. Sentiment analysis has been widely applied to monitor sentiment trends in product reviews, social media comments, news and blog articles.
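To make "unstructured text in, quantitative score out" concrete, here is a minimal sketch of a lexicon-based scorer. It is a toy and not NLTK.Vader: the word list `TOY_LEXICON`, its weights, and the function name `sentiment_score` are all made up for illustration.

```python
import re

# Hypothetical word weights, chosen only for illustration.
TOY_LEXICON = {
    "great": 2.0, "good": 1.0, "love": 2.0,
    "bad": -1.0, "terrible": -2.0, "slow": -1.0,
}

def sentiment_score(text, lexicon=TOY_LEXICON):
    """Average the weights of the lexicon words found in the text.

    Returns 0.0 when no lexicon word appears (treated as neutral).
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[token] for token in tokens if token in lexicon]
    return sum(hits) / len(hits) if hits else 0.0

reviews = [
    "Great product, love it",
    "Terrible and slow",
    "It arrived on Tuesday",
]
scores = [sentiment_score(review) for review in reviews]
```

A real tool handles much more than this sketch does (negation, intensifiers, punctuation and so on), but the output has the same shape: one number per piece of text that can be aggregated and tracked over time.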

Researchers have devoted more than a decade to solving this problem, and a few NLP-based sentiment analysis algorithms are readily available. In this article, I will review one of the most popular sentiment analysis tools, NLTK.Vader, …


Example Business Applications

If you suffered through a statistics class in college, you have probably heard the term Markov Chain. A Markov Chain is essentially a sequence of events in which the probability of each event depends only on the state attained in the previous event. Sounds abstract? It is actually a very powerful statistical model that helps us understand how the business is performing. In this article, I will go over the basic concept of the Markov Chain and how we can apply the Markov Model to unlock some business insights.
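As a minimal sketch of the definition, consider a two-state chain in which next month's state depends only on this month's state. The state names and the transition probabilities in `TRANSITIONS` are hypothetical, chosen only for illustration and not taken from any real data.

```python
# Hypothetical monthly transition probabilities: TRANSITIONS[current][next].
# Each row sums to 1.
TRANSITIONS = {
    "subscribed": {"subscribed": 0.8, "unsubscribed": 0.2},
    "unsubscribed": {"subscribed": 0.1, "unsubscribed": 0.9},
}

def step(distribution, transitions):
    """Advance a probability distribution over states by one period."""
    next_distribution = {state: 0.0 for state in transitions}
    for state, probability in distribution.items():
        for target, p in transitions[state].items():
            next_distribution[target] += probability * p
    return next_distribution

def propagate(distribution, transitions, n_steps):
    """Apply `step` n_steps times; only the current state is ever needed."""
    for _ in range(n_steps):
        distribution = step(distribution, transitions)
    return distribution

start = {"subscribed": 1.0, "unsubscribed": 0.0}
after_two_months = propagate(start, TRANSITIONS, 2)
```

Under these made-up numbers, a user who starts subscribed has a 0.8 × 0.8 + 0.2 × 0.1 = 0.66 probability of being subscribed two months later.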

Introduction of the Business Problem

A Markov Chain is used to model a series of events. Each sequence is usually composed of various events, and the order and length of the sequence can vary drastically from sample to sample. Such chains of events are usually very hard to describe with deterministic statistics. For example, in a monthly subscription service, a user can choose each month whether or not to subscribe. One user’s subscription history may look something like…


Retention is a metric for understanding how many users return to your platform after the initial conversion. It is commonly used as a long-term health indicator in product development and improvement. Additionally, by examining how retention varies by geography, gender or other behavioral characteristics, the business gains a better understanding of the factors that contribute to retention and can shape data-driven business strategies.

Retention is usually reported as the percentage of users who return to the platform, which generally declines over time and can be charted as a retention curve. The figure below is an illustration of the concept: each trace represents a cohort starting at a different point in time, the x-axis is how long the cohort has been alive, and the y-axis is the percentage of users retained. …
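The calculation behind such a curve can be sketched in a few lines. The cohort labels and user counts below are hypothetical, and `retention_curve` is an illustrative helper name.

```python
def retention_curve(active_by_period):
    """Convert active-user counts per period into retention percentages.

    active_by_period[0] is the cohort size at the initial conversion.
    """
    cohort_size = active_by_period[0]
    if cohort_size == 0:
        raise ValueError("cohort must contain at least one user")
    return [100.0 * active / cohort_size for active in active_by_period]

# Hypothetical cohorts; newer cohorts have fewer periods observed.
cohorts = {
    "2019-01": [200, 150, 120, 100],
    "2019-02": [250, 200, 150],
}
curves = {name: retention_curve(counts) for name, counts in cohorts.items()}
```

Each list in `curves` is one trace on the chart: the index is the cohort's age in periods and the value is the percentage of the original cohort still active.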


Customer lifetime value (LTV) is a fairly common term that gets tossed around within startups. Depending on the context, the definition of lifetime value and the calculation needed to get the number can vary surprisingly. In general, LTV is a dynamic concept representing the net profit attributed to the entire relationship with a customer. LTV is a very influential metric in shaping business decisions because it promotes longer-term customer relationship management rather than a focus on immediate profitability. A more accurate understanding of LTV can also allow a company to confidently lower prices and offer more incentives. However, this metric is usually difficult to quantify, because the projection into the future depends on various assumptions and goals, which leads to methods ranging from crude heuristics to sophisticated machine learning models. …
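At the crude-heuristic end of that range, one common simplification assumes a constant margin per month and a constant monthly retention rate, so that expected margin decays geometrically. The sketch below illustrates that assumption only; it is not necessarily the method used in this article, and the function name and inputs are hypothetical.

```python
def simple_ltv(monthly_margin, monthly_retention, horizon_months=None):
    """Heuristic LTV under constant margin and constant monthly retention.

    With no horizon, the geometric series sums to margin / (1 - retention).
    With a horizon, only the first `horizon_months` months are counted.
    """
    if not 0 <= monthly_retention < 1:
        raise ValueError("monthly_retention must be in [0, 1)")
    if horizon_months is None:
        return monthly_margin / (1 - monthly_retention)
    return monthly_margin * sum(
        monthly_retention ** month for month in range(horizon_months)
    )

# Hypothetical inputs: $10 margin per month, 50% month-over-month retention.
lifetime_value = simple_ltv(10, 0.5)
three_month_value = simple_ltv(10, 0.5, horizon_months=3)
```

The gap between the unbounded and the three-month figure shows how much the result depends on the projection horizon, which is one reason the number varies so much between contexts.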


Ro Data Sip-and-Share Q1 2019

Feature selection, or feature pruning, is a crucial step in building a good prediction model and in understanding the connections among the features and the target. The goal of feature selection is two-fold: (1) identify and remove features with little or no predictive power for the target, to prevent overfitting, and (2) identify highly correlated or redundant features and suppress their negative impact on the model without losing critical information. Here, I will review the following approaches to feature selection in the context of linear and logistic regression:

  1. Statistical Inference
  2. Greedy Search
  3. Regularization

Statistical Inference

The statistical inference approach estimates the standard error of each coefficient in the regression model, and then constructs a confidence interval and a p-value to test whether the coefficient is significantly different from 0. If the null hypothesis that the coefficient is zero is rejected with a small p-value, the feature has a genuine effect on the target. …
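The steps can be sketched for the simplest case, a linear regression with a single feature. This is a minimal illustration under stated assumptions: it uses a large-sample normal approximation for the p-value and confidence interval (an exact test would use the t-distribution), and the data and the name `slope_inference` are made up.

```python
import math
from statistics import NormalDist, mean

def slope_inference(x, y, alpha=0.05):
    """Fit y = a + b*x by least squares and test H0: b = 0 (normal approx.)."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    z = b / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return {
        "coef": b,
        "se": se,
        "p_value": p_value,
        "ci": (b - z_crit * se, b + z_crit * se),
    }

# Hypothetical data: y is roughly 2*x plus a small alternating disturbance.
x = list(range(1, 21))
y = [2 * xi + (0.5 if i % 2 == 0 else -0.5) for i, xi in enumerate(x)]
result = slope_inference(x, y)
```

If the confidence interval in `result["ci"]` excludes 0 and the p-value is small, the feature is kept; a feature whose interval comfortably contains 0 is a candidate for removal.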


For Discrete, Continuous, and Standardized Variables

The logit function

If you’re trying to learn machine learning nowadays, chances are that you have encountered logistic regression at some point. As one of the most popular and approachable machine learning algorithms, logistic regression has had its theory explained inside and out by many people. One area that is explained less often, however, is how to translate coefficients into exact measures of impact size. …
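As a minimal sketch of that translation, the standard relationships are that a coefficient is a change in log-odds, so its exponential is an odds ratio, and the change in probability depends on the baseline. The function names below are illustrative and the numbers are hypothetical.

```python
import math

def odds_ratio(coef, delta=1.0):
    """Multiplicative change in the odds when the feature increases by delta."""
    return math.exp(coef * delta)

def probability_after_change(baseline_probability, coef, delta=1.0):
    """New probability after shifting the baseline log-odds by coef * delta."""
    baseline_logit = math.log(baseline_probability / (1 - baseline_probability))
    new_logit = baseline_logit + coef * delta
    return 1 / (1 + math.exp(-new_logit))

# Hypothetical coefficient of ln(3): a one-unit increase triples the odds.
coef = math.log(3)
ratio = odds_ratio(coef)
new_probability = probability_after_change(0.5, coef)
```

Starting from a 50% baseline (odds of 1), tripling the odds gives odds of 3, which is a probability of 0.75. The same coefficient moves the probability by a different amount from a different baseline, which is why the odds ratio is the usual baseline-free summary.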

Ying Ma
