
Python Machine Learning

After 10 months of inching forward, notebook by Jupyter notebook and lesson by lesson through 149 lessons, I completed the Udemy course Python for Data Science and Machine Learning by Jose Portilla. It was a great follow-on to Andrew Ng’s Intro to Machine Learning course, which I finished in January 2018. Midway through the course, at the beginning of this year, I took a break to do exploratory data analysis (EDA) on the CDC population health dataset. Jose’s course was well put together, and I give it a solid recommendation for anyone looking for broad, Python-based learning.

More good things to come soon …


Detailed analysis of population health indicators, social determinants of health, and effects on gender and race

In part 2 of this series on the CDC Chronic Disease indicator dataset, our analysis revealed several areas with highly correlated indicators: cardiovascular disease, chronic kidney disease, diabetes, and select indicators in the overarching conditions “social determinants” category. Other areas, such as cancer and COPD, also show highly correlated relationships, but this final blog post focuses primarily on the former set.

While Figure 1 from the previous post looked at the relationships among all the indicators, Table 5 from the previous post showed there were a number of top correlation pairs. By looking at the recurring patterns of indicators by specific topic, we can narrow down the scope of the topics of interest. …
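As a rough illustration of how top correlation pairs can be pulled out of a correlation matrix, here is a minimal sketch in pandas. The data, the column names, and the helper `top_correlation_pairs` are all made up for this example and are not taken from the actual analysis.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: each row is a location, each column an indicator.
# Column names and values are placeholders, not the real CDC indicators.
indicators = pd.DataFrame({
    "diabetes": [9.1, 10.4, 8.2, 12.0, 11.1],
    "kidney_disease": [2.0, 2.6, 1.8, 3.1, 2.9],
    "cardiovascular": [5.5, 6.3, 5.0, 7.4, 6.8],
    "binge_drinking": [18.0, 15.5, 19.2, 13.1, 16.0],
})

def top_correlation_pairs(df, n=3):
    """Return the n column pairs with the largest absolute Pearson correlation."""
    corr = df.corr()
    # Keep only the upper triangle (k=1 drops the diagonal) so each pair appears once.
    upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(upper).stack().dropna()
    # Sort by magnitude so strong negative correlations are ranked too.
    return pairs.sort_values(key=lambda s: s.abs(), ascending=False).head(n)

top_pairs = top_correlation_pairs(indicators)
print(top_pairs)
```

Ranking by absolute value is a design choice here: it surfaces strongly negative relationships alongside strongly positive ones.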


The ties among diabetes, chronic kidney disease, and cardiovascular disease

In the last story, we started looking into a 15-year chronic disease dataset from the U.S. Centers for Disease Control and Prevention, or CDC. The exploratory data analysis began with understanding the columns and rows of data and what was relevant for further analysis.

In this post, we are going to dig deeper to understand these 400K rows and 17 categories of topics, which requires a bit of data wrangling of the dataframe into a format for pivot table summary and visualization. After looking at the previous df_new.head(), …
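To give a feel for this kind of wrangling, here is a minimal sketch of reshaping long-format rows into a pivot table summary with pandas. The small frame below only stands in for `df_new`; its column names (`year`, `location`, `topic`, `value`) and values are assumptions for illustration, not the real dataset schema.

```python
import pandas as pd

# Hypothetical long-format rows standing in for df_new. The column names
# and values are assumptions, not the actual CDC dataset schema.
df_new = pd.DataFrame({
    "year": [2014, 2014, 2015, 2015, 2014, 2015],
    "location": ["TX", "TX", "TX", "TX", "CA", "CA"],
    "topic": ["Diabetes", "Asthma", "Diabetes", "Asthma", "Diabetes", "Diabetes"],
    "value": [10.0, 8.0, 11.0, 7.5, 9.0, 9.5],
})

# One row per topic, one column per year, averaging values across locations.
summary = pd.pivot_table(
    df_new, index="topic", columns="year", values="value", aggfunc="mean"
)
print(summary)
```

A wide table like `summary` is usually easier to scan and to feed into a heatmap or line plot than the original long format.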


An exploratory data analysis of population health indicators using Python and data science techniques

Recently, I’ve taken on a personal project to apply the Python and machine learning I’ve been studying. Since I have an interest in population health, I decided to start by understanding a 15-year population health dataset I found on Kaggle. This dataset on chronic disease indicators comes from the US Centers for Disease Control and Prevention. In this blog series, I want to show what is in the dataset through exploration. Later on, I’ll go into more of the data visualization. …


Photo: Erik Halvorsen, Joowon Kim, Kareem Barghouti, Matt Noojomi, Shadi Shariatnia, Ben Legum, Daniel Wu, Farzad Soleimani, and Bill McKeon. (Not available for photo: Peter McCaffrey and Kovi Bessoff)

Almost one year ago, I started a new journey into digital health, which was a relatively new field for me. Apart from participating in and leading a few community health initiatives years ago, I sought to use my background and skills in design research and product management to uncover unmet needs and create digital health innovations. This week, I graduated from the TMCx biodesign fellowship, having met so many incredible people along the way.

I want to thank the leadership and staff at Texas Medical Center Innovation Institute (Erik Halvorsen, Bill McKeon, Farzad Soleimani, Eric Richardson, Gwyn Ballentine, Melissa Feldman, and Ariel Rogg) for the guidance, vision, and organization of the program. To the 40+ startup entrepreneurs and founders I met in the accelerator this year, you all are awesome to be making such progress. Many advisors supported us and helped tremendously with their feedback and guidance. I appreciate all of them sharing their time so generously, including Pedram Mokrian and Mike Lyons, Marta Zanchi, Dr. Billy Cohn and Tom Luby (JLABS), Hank Safferstein and Marissa Kuzirian (PLSG), and many others. To my biodesign colleagues and VastBiome teammates, thank you for the opportunity to work with and learn from you. …



In Part 2, we presented a case of a company with existing products and services and its challenges in tackling new opportunities. For existing products, the digital analytics assessment gives a high-level overview, but with limitations. For new products, an approach based on ethnography and observation was more insightful because it addresses motivations and the whys. In this part, we explore another case and dive deeper into two more methods.

Case: Company A is a technology company with a breadth of software and services in a mature industry. Three competitors compete directly with the company’s primary offering. Additionally, the organization has typically been in a reactive mode when it comes to product feature development. One internal reason was an over-reliance on sales and business development to understand customer needs deeply. …


TMCx06 Demo Day — June 7, 2018 at the Texas Medical Center Innovation Institute

As a TMC biodesign fellow, I was at the Demo Day for TMCx06 Digital Health startups with an audience of over 800. Demo Day wraps up a 4-month-long accelerator for 21 digital health startups from Texas, San Francisco, New York City, and around the world.

Recently, CEO Bill McKeon illustrated the Texas Medical Center’s grand vision and major investment in TMC3, its 50-year vision for building an innovation ecosystem in Houston. TMC3 leverages the 60+ research, provider, and healthcare member organizations in this 2-square-mile medical complex. …



In Part 1, we talked about the significant cost of not understanding what your customers and users need. We presented a case of a company and its challenges, and how we used Net Promoter Score as one way to understand those needs. We also talked about creating personas as an illustration, and about the process of documenting and assessing what we do and don’t know about our customers and users. In this part, we explore another case and additional analytical and qualitative methods.
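For readers new to the metric, here is a minimal sketch of the standard Net Promoter Score arithmetic: the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 through 6). The function name and the survey responses are made up for illustration and are not from the case in Part 1.

```python
def net_promoter_score(ratings):
    """Percent of promoters (9-10) minus percent of detractors (0-6)."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7-8) count toward the total but toward neither group.
    return 100.0 * (promoters - detractors) / len(ratings)

# Made-up survey responses on the usual 0-10 scale.
sample_ratings = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]
score = net_promoter_score(sample_ratings)
print(score)  # 5 promoters, 3 detractors, 10 responses -> 20.0
```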

Case: Company Q is a public company that had grown through mergers and acquisitions and, through its success, held the majority share of a very fragmented market. Its industry touched many customers but was not seen as innovative in any way. Q was a thought leader in trying new ideas. However, its approach to new product development relied on partnerships and collaboration with external consultants for product vision. Building in-house capabilities was imperative for sustainable growth of strategic initiatives. …



Do you know the cartoon often used for agile product development, where the intended design of the swing gets misconstrued into malformed shapes? OK, this is not yet another article about agile or waterfall. I want to focus on a broader and yet more practical topic: how does a product management professional identify customer and user needs practically and sustainably? Perhaps your organization or team is new to this. I want to say a few words that could help you get started. If you don’t have a PM on your team because you’re a startup or your org hasn’t been built with a product function, I want to offer a few thoughts on practices you can implement so you can wear that hat and better identify needs. Finally, perhaps your organization has this: box checked, you say. …



Like a kid in a candy store, I’m happy to announce that I completed Professor Andrew Ng’s Introduction to Machine Learning course on Coursera. In this 11-week course, I sought to bridge my statistical knowledge and background in predictive analytics into artificial intelligence and machine learning (AI/ML). After spending more than 11 weeks on it, I can say that this is a good starter course on AI/ML, and I want to share my thoughts on getting started with machine learning courses such as this one:

  • Introductory statistics and a bit of calculus knowledge are helpful (although not required). As the course goes over gradients and the sum of squared errors, it helps to understand why gradients are derivatives and why we use the sum of squared errors as the cost function. Sure, it’s possible to get through the course, but it may not be as easy without understanding the underlying mathematical concepts. …
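To make the link between gradients and derivatives concrete, here is a minimal sketch in plain Python of a squared-error cost for a straight line, its partial derivatives, and a few thousand gradient descent steps. The toy data, the learning rate, and the function names are assumptions chosen for illustration and are not taken from the course materials.

```python
def cost(theta0, theta1, xs, ys):
    """Half the mean of squared errors for the line y = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def gradient(theta0, theta1, xs, ys):
    """Partial derivatives of cost with respect to theta0 and theta1."""
    m = len(xs)
    errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    d_theta0 = sum(errors) / m
    d_theta1 = sum(e * x for e, x in zip(errors, xs)) / m
    return d_theta0, d_theta1

# Toy data lying exactly on y = 1 + 2x (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

theta0, theta1 = 0.0, 0.0
learning_rate = 0.1  # assumed step size, small enough to converge on this data
for _ in range(5000):
    g0, g1 = gradient(theta0, theta1, xs, ys)
    # Step against the gradient, the direction in which the cost decreases.
    theta0 -= learning_rate * g0
    theta1 -= learning_rate * g1

print(round(theta0, 3), round(theta1, 3))
```

Each gradient component is just the derivative of the cost with respect to one parameter, which is why a little calculus makes the update rule feel less like magic.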

About

Daniel Wu

Digital Health, Data Science, Analytics, Product Management, and Innovation
