Data-driven Product Management | All views expressed here are my own | 🐦 @scar_nyc27

An OCR to NLP capability that extracts insights from unstructured text using a combination of open-source & cloud-based tools.

Image for post
Image for post

In as little as 3 months, the Healthfirst Data Science team developed a robust Optical Character Recognition (OCR) to Natural Language Processing (NLP) capability combining open-source tools and brand new state-of-the-art AI services available from Amazon. We are using Amazon Textract (OCR) and Amazon Comprehend Medical (NLP) to extract insights from unstructured text that were previously trapped in Patient Medical Records.

Patient Medical Records contain a treasure trove of clinically relevant information (such as important demographic features, medical conditions, and medications)…


Observe the effect of Hubble’s Law using OLS Regression, Pairs Bootstrap Resampling, and Hypothesis Testing.

Image for post
Image for post

In this post, I am going to revisit Hubble’s Law and analyze the original dataset Edwin Hubble used and conduct Ordinary Least Squares Linear Regression on Hubble’s 24 measurements of distances and recessional velocities of extra-galactic nebulae. Then, I will use a pairs bootstrap resampling and perform a hypothesis test on the measured effect of galactic distance on recessional velocities to determine if Hubble’s Law still holds true.

Based on the results of the hypothesis test we can conclude with a high degree of statistical…

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store