The age of Data Product Managers: How to prepare!

Lucas Fonseca Navarro
The Startup
8 min read, Dec 20, 2019

Do you think the role of product manager is on the rise around the world? It sure is! Vacancies have been growing rapidly and, on the other hand, it is difficult to find professionals to fill them (even with exorbitant salaries). What if I told you there is a role that is even more in demand? It is what I call a Data Product Manager: a professional who, in addition to product skills, has advanced knowledge of Data Science. Nowadays this is a huge opportunity!

Data is the main fuel for the statistical and machine learning models that underpin a huge variety of products, and it is growing more important to businesses every day. Whether it is personalizing the user experience (Netflix, Spotify, YouTube), recommending relevant products to increase sales (Amazon, Walmart), or deciding how much credit to extend to a customer (any fintech), there are applications for virtually every technology company. As a result, the number of professionals who build these products (the Data Scientists) has increased and, of course, the need arose for qualified product managers to lead these teams.

The Data Product Manager was born! In addition to the skills of a regular PM, they need advanced soft skills to handle more senior teams (these professionals are very different from other developers) and a little technical Data Science expertise to follow the evolution of the product within the team. I was lucky enough to join this movement early in my career, and that was definitely what allowed me to grow so fast!

My personal journey as a Data Product Manager

I started my career on the other side of the coin, as a Data Scientist developing data products at GetNinjas while finishing a master's degree with a specialization in Machine Learning. At a certain point we decided to create a multidisciplinary team focused on building data products and, as I had interest in and aptitude for the product area, I became the product manager of this data team. That is when I started my role as a Data Product Manager.

Right off the bat I discovered that the daily routine of this newly formed team would be very different from that of other product development teams. Data products have a different life cycle, with different steps and rhythm; the problems have a different nature; and the scientists have their own characteristics. Everything changed! My technical data skills combined with my facilitation skills proved to be an extremely powerful combination for leading the team, and together we were able to deliver results for the company at incredible speed.

With the experience I gained, after a while I started to receive proposals from many countries to work as a PM in Data teams, and I have been working with data products to this day (even as a CEO now). This role is in very high demand and there is a lack of skilled professionals across the whole market. This is a huge opportunity for you, reader, if you like both areas, and I will help you prepare in the rest of this article.

Cool, but how can I prepare to be a Data Product Manager?

To work in this role, you need to understand the complexities of a Data Team, learn how to lead it, facilitate its ceremonies, and manage its development processes, all backed by some background in specific data science techniques.

Learn how to manage the complexities of a Data Team

A Data Science team is usually composed of senior people, and it is very common to have people with master's degrees or PhDs on it. These professionals tend to be lone wolves: they like to do everything on their own and have strong, inflexible opinions. The PM needs to manage this by acting as a strong facilitator: keeping the team environment healthy during the ceremonies, giving everyone the opportunity to talk, and steering discussions to avoid endless rounds of argument. Sometimes it takes an external Decider to close a discussion loop, and it is important to keep track of time in conversations or they will drag on. Being able to speak the same vocabulary as these professionals becomes very important here, which reinforces the need for technical data knowledge.

Because the life cycle of data products is different, development processes must also be adapted. The first distinction occurs in the modeling phase, where a solution to the problem must be proposed. Here the team needs to choose a statistical (or ML) model to solve the problem, but this involves a lot of research and experimentation, which makes the time required hard to estimate. That makes some agile methodologies more difficult to apply. It is important for the PM to give scientists space to work autonomously, but to avoid losing too much time and over-engineering I recommend setting checkpoint deadlines. It is possible, for example, to use an adapted Scrum with Sprints, where at the end of each Sprint the partial modeling results are presented and then re-discussed or re-prioritized (there is an excellent reference on this from Doug Rose at the end of the article).

The other difference is that the system needs to be evaluated constantly and its model kept up to date. It is important for the PM and the team to keep track of the performance of these models so they can detect when an update is due. The world changes faster every day; as it does, the data changes and the model needs to be adapted.
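
As a rough illustration of what keeping track of model performance can look like, the sketch below compares a model's weekly accuracy against the accuracy measured at launch and flags the weeks where it dropped by more than a tolerance. The function name, the weekly numbers, and the 0.05 tolerance are all hypothetical, chosen only for this example; a real team would pick the metric and threshold that fit its product.

```python
def weeks_needing_review(weekly_accuracy, baseline, tolerance=0.05):
    """Return the weeks whose accuracy fell more than `tolerance`
    below the baseline accuracy measured at launch."""
    flagged = []
    for week, accuracy in weekly_accuracy.items():
        if baseline - accuracy > tolerance:
            flagged.append(week)
    return flagged


# Hypothetical numbers: accuracy measured on fresh labeled data each week.
weekly_accuracy = {
    "week 1": 0.91,
    "week 2": 0.90,
    "week 3": 0.88,
    "week 4": 0.84,
    "week 5": 0.82,
}

flagged = weeks_needing_review(weekly_accuracy, baseline=0.91)
print(flagged)  # ['week 4', 'week 5']
```

A PM does not need to write this, but knowing that such a check exists makes it easier to ask the team when the model was last evaluated and what would trigger a retraining.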

Structuring a data team within a company is also a big challenge. Consider two extremes. At one extreme, the team is only responsible for the "pieces" of the product that require data modeling, building for example a Machine Learning black box that another team will consume to add value for the final customer, directly or indirectly. At the other extreme, the team is completely autonomous and builds its products end to end, from the models to the end-user interface. I personally think it is best for the team to work on as many product steps as possible, that is, as autonomously as possible (my experience at both GetNinjas and OLX was like that), but it depends on the structure of your company. Either way, the PM must invest heavily in communication and alignment with stakeholders from other teams and areas of the company, as these models tend to impact, or be part of, other pieces of the company's overall product.

Study a lot! Technical knowledge is a must-have

I will divide the tech knowledge into basic (Statistics, SQL, Processes) and advanced (Machine Learning, Technologies, Emotional Intelligence) topics. I think it’s important for a Data Product Manager to study at least a little bit of each of these topics.

Statistics: This is the most basic topic of all (even for regular PMs). You need basic notions of statistics and probability: being able to interpret a summary of the data with mean, median, and quartiles, understanding correlation between variables, and knowing the most common probability distributions, such as the normal and the binomial.
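
To make these summary measures concrete, here is a minimal sketch using Python's standard `statistics` module. The two lists are made-up numbers (monthly orders and ad spend for eight hypothetical customer segments), and the `pearson` helper is written out by hand only to show what a correlation coefficient computes.

```python
import math
import statistics

# Hypothetical data, invented for this example.
orders = [12, 15, 11, 30, 14, 13, 16, 45]
ad_spend = [100, 120, 90, 260, 115, 110, 130, 400]

mean_orders = statistics.mean(orders)
median_orders = statistics.median(orders)
# With n=4, quantiles returns the three cut points Q1, Q2 (the median) and Q3.
q1, q2, q3 = statistics.quantiles(orders, n=4)


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


r = pearson(orders, ad_spend)

print("mean:", mean_orders)      # mean: 19.5
print("median:", median_orders)  # median: 14.5
print("quartiles:", q1, q2, q3)
print("correlation:", round(r, 3))
```

Here the mean (19.5) sits well above the median (14.5) because two segments have unusually high order counts, which is exactly the kind of signal a PM should be able to read from a summary.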

SQL: One step beyond interpreting statistics is being able to extract them. The most common way to do this is to use SQL to pull data and simple measurements directly from your company's database. SQL is a language with a small vocabulary and few rules, so it is relatively quick to learn and very intuitive. It is worth looking for platforms that let you practice and get your hands dirty.
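
If you want to practice without access to a company database, one option is Python's built-in `sqlite3` module, which can create a throwaway database in memory. The sketch below is only an illustration: the `orders` table, its columns, and its rows are invented for the example.

```python
import sqlite3

# In-memory database with a hypothetical `orders` table, just for practice.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, category TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("ana", "books", 20.0),
        ("ana", "games", 60.0),
        ("bruno", "books", 35.0),
        ("carla", "games", 45.0),
        ("carla", "books", 15.0),
    ],
)

# Number of orders and average amount per category, largest average first.
query = """
    SELECT category, COUNT(*) AS n_orders, AVG(amount) AS avg_amount
    FROM orders
    GROUP BY category
    ORDER BY avg_amount DESC
"""
rows = conn.execute(query).fetchall()
conn.close()

for category, n_orders, avg_amount in rows:
    print(category, n_orders, round(avg_amount, 2))
```

The query uses only a handful of keywords (`SELECT`, `FROM`, `GROUP BY`, `ORDER BY`) and two aggregate functions, which already covers a large share of everyday product questions.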

Processes: As mentioned above, the products of a data team require different processes than usual. It is important for the PM to study as many process methodologies as possible and to develop an experimental mindset, shaping these processes until they find something that works for their team, while always maintaining agility and quality in deliveries as well as a healthy and inclusive environment for all team members.

Machine Learning: A more advanced topic than statistics, but in the same family, is knowledge of Machine Learning techniques and systems. It is hard to talk about Data Science today without linking it directly to ML, a very broad field with applications in just about any segment. A little bit of math is necessary to understand these techniques, but don't worry: as a PM, the basics are enough. I was tempted to list this topic as basic, because I think it is imperative nowadays that a PM can spot opportunities to apply ML to their product and make it much better (it's a huge advantage).

Technologies: If you’re the type of PM who likes to know a little about the code and technologies your team works on, there are other things to learn besides SQL here. The most widely used language in the area today is Python, mainly for building ML systems (luckily it’s also easy to understand). Another widely used language for both systems and data manipulation is Scala.

Emotional Intelligence: As mentioned in the previous section, Data Science team members tend to be more senior and to have some particularities that may demand greater emotional intelligence in day-to-day meetings and situations. I find it very important to prepare in this area too. Although plenty of practice is the best way, there are some good books and articles to help you in this field.

Conclusion

The role of Data Product Manager is even newer than those of PM and Data Scientist, and this rising tide in the market is just beginning. It is a huge opportunity for you, reader, if you work in one of these areas, to explore an alternative path and join a movement that can enhance your career. I finish this article with the main references I recommend to prepare for this role, and with an invitation to an open channel to discuss ideas together if you want to delve into this movement!

References

Together, the following books cover all of the technical dimensions above (except SQL). I have read them and strongly recommend them to anyone who is starting in this role:

  • Naked Statistics: Stripping the Dread from the Data, Charles Wheelan ($7.00)
  • Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, Provost & Fawcett ($16.00)
  • Data Science from Scratch: First Principles with Python, Joel Grus ($21.00)
  • Data Science: Create Teams That Ask the Right Questions and Deliver Real Value, Doug Rose ($15.00)
  • Machine Learning, Tom M. Mitchell ($54.00)
  • Multipliers: How the Best Leaders Make Everyone Smarter, Liz Wiseman ($12.00)

To learn SQL on your own I recommend using a practical platform like W3Schools, but an easier way is to ask a data analyst or scientist at your company for help and start practicing with simple day-to-day analyses.

Lucas Fonseca Navarro
Co-Founder & CEO at Já Vendeu, helping people sell their stuff without any effort