The Pyramid of Data Needs (and why it matters for your career)

Every company has a pyramid of data needs, and your role as a data scientist/analyst will fall somewhere along this spectrum. Understanding this framework is key to properly articulating your current skills/responsibilities and where you want to go with your career.

Data needs as illustrated by Monica Rogati
My (practically equivalent) visualization of data needs

Before we get too deep into the subject, let me give credit where it’s due. This entire concept is based on Maslow’s Hierarchy of Needs, and applying it to data science is not new. In fact, Monica Rogati gave an exceptional description and visualization of data needs about six months ago. So why rehash the subject, especially given that my knowledge of the subject and writing skills pale in comparison to Monica’s?

Well, Monica concisely framed the discussion in terms of advising startups/companies, but my motivation is to write about how this pyramid impacts the careers of individual data scientists. I’ve had the same conversation many times, trying to help people map their interests to a role, and I keep coming back to this visualization. It is super helpful for conveying skills and job responsibilities in a generalized but still meaningful way. So rather than wasting more marker ink on a whiteboard, I’m following Rachel Thomas’ principle of putting it in a blog.

So with that, a couple of points to get across…

(Please) call it a pyramid, not a hierarchy

You may notice three differences between my visualization and Monica’s:

  1. My overly aggressive color scheme (apologies for creating what has been described as “a bastardized version of the pride flag”).
  2. It’s upside down (relative to Monica’s).
  3. I use the word “pyramid” instead of “hierarchy.”

The latter two choices are purposeful. By flipping the pyramid and replacing the word “hierarchy,” I am attempting to eliminate the perception that working in the narrower parts of the pyramid is better, cooler, or more impactful for a business. Too often people chase the fanciest data science technique (e.g., deep learning, multi-armed bandits) rather than focusing on the building blocks that create a solid foundation. Heck, rarely do we even stop to question whether something like deep learning is needed. Using wording and visuals that avoid implying superiority is crucial to dehyping the narrower parts of the pyramid.

Every company is different

Let’s use an example. Take a brick-and-mortar store. Maybe they have 50 transactions a day. Logging every transaction is still critical to knowing what inventory is selling and what isn’t, but this doesn’t require any sophisticated data management. It might get logged in Excel, and maybe they do some business intelligence-like work to gain insights. But the point is they don’t need anything more advanced for their operations than the first building block of the data pyramid.
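To give a feel for how little tooling that first building block can take, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `transactions` log, the `inventory` list, and the `units_sold` helper are hypothetical stand-ins for whatever the store actually keeps in its spreadsheet.

```python
from collections import Counter

# Hypothetical transaction log: (item, quantity) pairs, as they might be
# exported from a spreadsheet. The items and numbers are made up.
transactions = [
    ("coffee beans", 3),
    ("mugs", 1),
    ("coffee beans", 2),
    ("filters", 4),
    ("mugs", 2),
]

# Hypothetical list of everything the store stocks.
inventory = ["coffee beans", "mugs", "filters", "teapots"]


def units_sold(log):
    """Return total units sold per item."""
    totals = Counter()
    for item, quantity in log:
        totals[item] += quantity
    return totals


totals = units_sold(transactions)

# Items ranked from most to least units sold.
best_sellers = totals.most_common()

# Stocked items that never appear in the log.
not_selling = [item for item in inventory if totals[item] == 0]

print(best_sellers)
print(not_selling)
```

A spreadsheet pivot table would answer the same two questions (what is selling, what isn’t), which is really the point: nothing here needs a database, a pipeline, or a model.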

On the flip side, a company that gives out loans (even a small local credit union) will need to build multiple layers in order to ensure the loans are being repaid. In fact, the local credit union will likely want to move as close to modeling as possible to minimize the risk of its loans by predicting each borrower’s likelihood of default.
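For contrast, here is a rough sketch of what “moving toward modeling” might look like at its very simplest. This is an illustration under loud assumptions, not how any real lender scores loans: the `history` data is made up, a single debt-to-income feature is assumed, and the model is a plain logistic regression fit by gradient descent using only the standard library.

```python
import math

# Toy, made-up history: (debt_to_income_ratio, defaulted) pairs.
history = [
    (0.10, 0), (0.15, 0), (0.20, 0), (0.30, 0),
    (0.35, 1), (0.45, 1), (0.50, 1), (0.60, 1),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, learning_rate=0.5, epochs=5000):
    """Fit p(default) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        # Gradients of the average log-loss with respect to w and b.
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in data) / n
        grad_b = sum((sigmoid(w * x + b) - y) for x, y in data) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = fit_logistic(history)


def default_probability(ratio):
    """Estimated probability of default for a given debt-to-income ratio."""
    return sigmoid(w * ratio + b)


print(default_probability(0.12), default_probability(0.55))
```

Even this toy version only works if the lower layers exist: someone has to have collected and cleaned the repayment history before there is anything to fit.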


As companies’ needs differ, so do their staffing strategies. Some prefer specialists (using different people for each layer of the pyramid) whereas others prefer generalists (asking 1–2 people to own large portions of the project). The specialist model gets stronger results at each stage but requires huge communication overhead to keep many people building in the same direction. The small overlap in responsibilities makes ownership clear but also makes it extremely difficult to speak the same language throughout the entire pyramid. Meanwhile, the generalist model allows people to build data science products (e.g., dashboards, models) more quickly but can often land on local maxima. Honestly, a high-quality generalist team is also really hard to hire for, since it’s so rare for someone to have such a diverse background.

I’ve seen both strategies work firsthand, but welcome your thoughts on the pros/cons of each.

Use the pyramid to frame conversations on skills and responsibilities

No one role is “better” than another

The best folks I’ve worked with are those who can identify exactly where the gaps are for their team and work with their team to fill them. This ability to find the biggest opportunities typically isn’t a skill taught in graduate school or online courses, and it rarely shows up on your bi-annual feedback. So no matter what your role, I strongly encourage you to always be thinking about what part of the data pyramid will have the largest impact for your team/company. Once you figure that out, I trust you’ll wrangle the necessary people or skills to find a solution that works.

Written by

Data Science Manager
