Python Machine Learning

After 10 months of inching forward, Jupyter notebook by notebook and lesson by lesson through 149 lessons, I completed the Udemy Python for Data Science and Machine Learning course by Jose Portilla. It was a great follow-on to Andrew Ng’s Intro to Machine Learning course, which I finished in January 2018. Midway through the course, at the beginning of this year, I took a break to run an exploratory data analysis (EDA) on the CDC population health dataset. Jose’s course is well put together, and I recommend it to anyone looking for a broad, Python-based introduction.

More good things to come soon …


Detailed analysis of population health indicators, social determinants of health, and effects on gender and race

Photo by CMDR Shane on Unsplash

In part 2 of our series on the CDC Chronic Disease Indicators dataset, the analysis revealed several areas with highly correlated indicators: those within cardiovascular disease, chronic kidney disease, and diabetes, plus select indicators in the overarching conditions “social determinants” category. While there are also highly correlated relationships in other areas such as cancer and COPD, we’ll focus primarily on the former set in this final blog post.
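
As a rough illustration of how highly correlated indicator pairs like these might be surfaced, here is a minimal pandas sketch. The column names and numbers are made-up placeholders rather than values from the CDC dataset, and the use of Pearson correlation is an assumption, since the post does not say which measure was used.

```python
import numpy as np
import pandas as pd

# Toy wide-format table: one column per indicator, one row per observation.
# Names and values are hypothetical placeholders, not CDC data.
wide = pd.DataFrame({
    "diabetes": [1, 2, 3, 4, 5],
    "kidney": [2, 4, 6, 8, 11],
    "cardio": [5, 4, 3, 2, 1],
    "other": [1, 3, 2, 5, 4],
})


def top_correlation_pairs(df, n=3):
    """Return the n column pairs with the largest absolute Pearson correlation."""
    corr = df.corr()
    # Keep the strict upper triangle so each pair appears once and
    # the trivial self-correlations on the diagonal are dropped.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    # Rank by strength regardless of sign.
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)


top = top_correlation_pairs(wide, n=3)
print(top)
```

Ranking by absolute value keeps strong negative relationships in view alongside strong positive ones, which a plain descending sort would push to the bottom.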

Figure 1 from the previous post looked at the relationships among all the indicators, while Table 5 from that post listed a number of top correlation pairs. By looking at…


The ties among diabetes, chronic kidney disease, and cardiovascular disease

Photo by v2osk on Unsplash

In the last story, we started looking into a 15-year chronic disease dataset from the U.S. Centers for Disease Control and Prevention (CDC). The exploratory data analysis began with understanding the columns and rows of the data and deciding what was relevant for further analysis.

In this post, we are going to dig deeper into these 400K rows and 17 topic categories, which requires a bit of data wrangling to get the dataframe into a format suited to pivot table summaries and visualization. After looking at the previous df_new.head(), …
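
To give a sense of the kind of wrangling described here, the sketch below pivots a small long-format frame into a summary table. Only the name df_new comes from the post. The column names, the toy values, and the choice of a mean aggregation are assumptions made for illustration.

```python
import pandas as pd

# Toy stand-in for df_new. Column names and values are illustrative
# assumptions, not rows from the actual CDC dataset.
df_new = pd.DataFrame({
    "LocationAbbr": ["TX", "TX", "CA", "CA", "TX", "CA"],
    "Topic": ["Diabetes", "Diabetes", "Diabetes", "Asthma", "Asthma", "Asthma"],
    "YearStart": [2014, 2015, 2014, 2014, 2014, 2015],
    "DataValue": [10.0, 12.0, 9.0, 7.0, 8.0, 6.0],
})


def summarize_by_topic(df):
    """Pivot long-format rows into a location-by-topic table of mean values."""
    return df.pivot_table(
        index="LocationAbbr",
        columns="Topic",
        values="DataValue",
        aggfunc="mean",
    )


summary = summarize_by_topic(df_new)
print(summary)
```

A table shaped like this, with one row per location and one column per topic, can be passed more or less directly to a heatmap or bar chart.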


An exploratory data analysis of population health indicators using Python and data science techniques

Photo by Dan Gribbin on Unsplash

Recently, I’ve taken on a personal project to apply the Python and machine learning I’ve been studying. Since I have an interest in population health, I decided to start with a 15-year population health dataset I found on Kaggle. The dataset comes from the U.S. Centers for Disease Control and Prevention and covers chronic disease indicators. In this blog series, I want to show what is in the dataset through exploration. Later on, I’ll go further into data visualization. …


Photo: Erik Halvorsen, Joowon Kim, Kareem Barghouti, Matt Noojomi, Shadi Shariatnia, Ben Legum, Daniel Wu, Farzad Soleimani, and Bill McKeon. (Not available for photo: Peter McCaffrey and Kovi Bessoff)

Almost one year ago, I started a new journey into digital health, a relatively new field for me apart from participating in and leading a few community health initiatives years ago. I sought to use my background and skills in design research and product management to uncover unmet needs and create digital health innovations. This week, I graduated from the TMCx biodesign fellowship, having met so many incredible people along the way.

I want to thank the leadership and staff at Texas Medical Center Innovation Institute — Erik Halvorsen, Bill McKeon, Farzad Soleimani, Eric Richardson, Gwyn…


In Part 2, we presented the case of a company with existing products and services and its challenges in tackling new opportunities. For existing products, the digital analytics assessment gave a high-level overview, but with limitations. For new products, an approach based on ethnography and observation was more insightful because it addresses motivations and the whys. In this part, we continue to explore another case and dive deeper into two more methods.

Case: Company A is a technology company with a breadth of software and services in a mature industry. Three competitors directly play in…


TMCx06 Demo Day — June 7, 2018 at the Texas Medical Center Innovation Institute

As a TMC biodesign fellow, I attended the Demo Day for the TMCx06 Digital Health startups, along with an audience of over 800. The day wraps up a 4-month accelerator for 21 digital health startups from Texas, San Francisco, New York City, and around the world.

Recently, CEO Bill McKeon illustrated the Texas Medical Center’s grand vision and major investment in TMC3, the next 50-year vision for building an innovation ecosystem in Houston. It leverages the 60+ research, provider, and healthcare member organizations in this 2-square-mile medical complex. …


In Part 1, we talked about the significant cost of not understanding what your customers and users need. We presented the case of a company and its challenges, and how we used Net Promoter Score as one way to understand those needs. We also talked about creating personas as an illustration, and about the process of documenting and assessing what we do and don’t know about our customer or user. In this part, we continue to explore another case and additional analytical and qualitative methods.

Case: Company Q is a public company that had grown through mergers and acquisitions, and…


Do you know the cartoon often used for agile product development, where the intended design of the swing gets misconstrued into malformed shapes? OK, this is not yet another article about agile or waterfall. I want to focus on a broader and yet more practical topic: how does a product management professional identify customer and user needs practically and sustainably? Perhaps your organization or team is new to this. I want to say a few words that could help you get started. If you don’t have a PM on your team because you’re a startup or your…


Like a kid in a candy store, I’m happy to announce that I completed Professor Andrew Ng’s Introduction to Machine Learning course on Coursera. In this 11-week course, I sought to bridge my statistical knowledge and background in predictive analytics to artificial intelligence and machine learning (AI/ML). After spending more than 11 weeks on it, I can say that this is a good starter course on AI/ML, and I want to share my thoughts on getting started with machine learning courses such as this one:

  • Introductory statistics and a bit of calculus knowledge are helpful (although not required). As the course goes over gradients…

Daniel Wu

Digital Health, Product Management, Data Science, Analytics, and Innovation
