Thinking like a (business) Data Scientist

Andrew Wong
Published in Human Science AI, Mar 19, 2019

I am about to finish my Flatiron School Module 2 Final Project, a Jupyter Notebook with streams of code and comments on the code.

I kept asking myself: how can I communicate better through a Jupyter Notebook? Can I advance a reader’s knowledge through what I have written in it?

There are a few principles that I am (trying to) uphold. First, my Jupyter Notebook should be readable (and enjoyable to read). Second, my Jupyter Notebook should have a consistent flow from one idea to the next. I want to bring the reader on an interesting journey of discovery, not drag them through the mud. Third, my Jupyter Notebook should have an impact on the business.

I would like to elaborate a little more on the third point. I have a background in sociology and project management, so I know a bit about storytelling and managing stakeholders’ expectations. Let’s get going.

First and foremost, it is important to identify the key business initiatives (in your organization). These will determine the areas of focus for the next few years, and where management’s attention is. There is (probably) little value in focusing on the periphery.

Second, with that focus in place, we turn to identifying the key business drivers (that is, the metrics, and the kinds of data the business needs and wants). This is where we start to make sense of the data.

Third, once we have some sense of the business direction (or at least of how the business perceives it), we can start to identify the key business questions and decision points. This is an important step because it allows us to go further in understanding the business.

Fourth, this is the fun part (or the part where you pull your hair out in frustration). This is where we put data science into practice by:

  1. Identifying data sources (what data is available right now).
  2. Identifying the valuable data inside or outside the company (or data that we can scrub to make it valuable, like gold mining).
  3. Taking an iterative approach: we continue to explore the data, do analysis, and report findings.

Fifth, run through a business prioritization matrix to ensure alignment among business stakeholders around the top-priority use cases (as an outcome of running data science activities). This allows data scientists to structure the discussion and frame further decisions about what to focus on next.
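To make the prioritization matrix a bit more concrete, here is a minimal sketch of one common way to build it: score each use case on business value and on feasibility, then sort the use cases into four quadrants. Everything in it is an assumption for illustration only: the `UseCase` class, the 1 to 10 scales, the threshold of 5, the quadrant labels, and the example use cases and their scores are all hypothetical, and a real matrix would use criteria agreed with your stakeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """A candidate data science use case (hypothetical structure)."""
    name: str
    business_value: int  # assumed scale: 1 (low) to 10 (high)
    feasibility: int     # assumed scale: 1 (hard) to 10 (easy)


def quadrant(use_case, threshold=5):
    """Place a use case in one of four quadrants of the matrix."""
    high_value = use_case.business_value > threshold
    high_feasibility = use_case.feasibility > threshold
    if high_value and high_feasibility:
        return "do first"
    if high_value:
        return "plan for later"
    if high_feasibility:
        return "quick win, low impact"
    return "deprioritize"


def prioritize(use_cases, threshold=5):
    """Rank use cases by combined score, breaking ties on business value."""
    ranked = sorted(
        use_cases,
        key=lambda u: (u.business_value + u.feasibility, u.business_value),
        reverse=True,
    )
    return [(u.name, quadrant(u, threshold)) for u in ranked]


# Hypothetical use cases and scores, made up for this example.
use_cases = [
    UseCase("churn prediction", business_value=9, feasibility=7),
    UseCase("pricing optimization", business_value=8, feasibility=3),
    UseCase("dashboard refresh", business_value=3, feasibility=9),
    UseCase("legacy data migration", business_value=2, feasibility=2),
]

for name, label in prioritize(use_cases):
    print(f"{name}: {label}")
```

The scores themselves matter less than the conversation they force: asking stakeholders to agree on a number for each use case is a simple way to structure the discussion.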

There you go, enjoy being a (business) data scientist!
