The Changing Role of the Data Scientist

In a recent report by VentureBeat (VB) Insight entitled “The State of Marketing Analytics: Insights in the age of the customer,” analyst Jon Cifuentes comments on the growing need for data scientists: “Gartner famously predicted 4.4 Million data science jobs to be available by this year, with only 1/3 being filled. If the findings of this research are any indication, that may be too conservative of an estimate.”

Even back in 2012, analytics guru Tom Davenport and now-U.S. Chief Data Scientist D.J. Patil were commenting on the shortage of skilled data scientists and on a cart-before-the-horse problem: demand for the role had outpaced the university programs offering appropriate degrees. Here they shed light on the growing importance of the role: “At ease in the digital realm, they are able to bring structure to large quantities of formless data and make analysis possible. In a competitive landscape where challenges keep changing and data never stop flowing, data scientists help decision makers shift from ad hoc analysis to an ongoing conversation with data.”

Here, I interview Data Scientist Frank Jing and Software Engineer Ryan Rapp to get their unique perspectives on the changing data science landscape and the evolution of this sought-after role.

Rebekah Iliff: What influence has the “age of the customer” had on data science?

Ryan Rapp: What consumers want from products isn’t changing. They’re trying to get basic needs met, and a lot of products are getting better at providing that. Take radio, for example. It’s an old technology that used to be good enough. You used to choose between stations based on genre. Now, we want the songs chosen for us and the technology to be smarter, which is why there’s a need for Spotify, Pandora, and so on. People are becoming increasingly selective, and they’re turned off when overwhelmed with too many options.

Frank Jing: I agree. New needs require new technology. There’s a need for curated content in this chaotic world, and data science provides best practices for that.

RI: Many analytics vendors out there are beginning to cater more to the average business user. How is this changing the role of the data scientist?

RR: As the data science field matures, I’d anticipate that home-baked approaches to well-understood data science problems would become less prevalent. (Why build a team to tune your sales conversion funnel, for example, when a vendor with a proven track record can provide that service?) However, for companies whose products and core competencies include data science, including said vendors, they’ll always need the best and the brightest data scientists to continue to enhance their offerings or get a leg up on their competition. In that sense, data scientists will have a role of increasing importance.

FJ: Automated analytics are important to have, just like automatic transmission, power steering, or blind spot notification in any modern car. But I don’t see a (responsible) driver sleeping while driving. And, yes, I’ve seen those Google SUVs around Mountain View. But I doubt I can buy one of those and ask it to win a race. I wonder if running a company is more like “getting home safely” or “win me the Daytona 500.” Unless every C-suite member wants to be Jimmie Johnson, I bet they will want to hire at least one data scientist.

RI: Should high-level leaders be handling their own big data analytics and if so, what tools can help them?

FJ: I would think everybody has to do some simple data analytics, such as making a pivot table, from time to time. But I believe the more important thing is to have a data-driven mindset. Big data analytics is only useful when the leaders understand and believe the results. Books such as Data Science for Business may be helpful. Excel should be more than enough for most tasks at this level.

RR: The ideal toolset depends on how your organization stores and queries its data and on the type of analysis you’d like to do, but a few big names are Apache Hadoop/MapReduce, Apache Spark, and Neo4j.

RI: What are the benefits of working with an analytics product vendor?

RR: It’s often more effective to solve a specific analytics problem for a wide range of companies than to solve a wide range of analytics problems for a single company. Companies that develop analytics products can home in on a single analytics problem and solve that problem in a robust way. As they gain experience working with many different types of companies, they gain greater visibility into what is likely to work and not work for a company, something that can take years of experimenting to figure out if you’re working in a silo.

FJ: There’s a lot to be said for being an expert at one thing versus being an expert at everything. Before Ford developed the conveyor belt, they were creating everything. Then they had the idea to build a tool that did one thing well to contribute to the whole and provide a much better service to everyone. In a way, you do want a curation of all of the services but they need to be harmonious. You need a person on staff who knows how to pull everything together and connect the dots.

RI: How are the roles of data scientists changing?

RR: The role is still quite young, and the whole world is trying to figure out the new sets of challenges being presented by the new world. It’s similar to PR. There are just so many channels to take into consideration (self-publishing, owned media, etc.) that it can be overwhelming to decide what to sift through and how to pivot. With data science, it’s not about changing; it’s more about adjusting to the magnitude of new problems.

As the field matures, products are becoming more integrated with each other too. AirPR is integrated with Google Analytics, Salesforce, etc. Because they can integrate into each other, there’s more opportunity for effectiveness and effective management. That connectedness helps evolve the space.

RI: How do you anticipate that the role may continue to change over the next 5 years?

RR: I suspect that many self-hosted platforms will continue to evolve or be reincarnated into fully managed cloud services. If a cloud service can provide the same interface as self-hosted solutions in a way that is stable, maintained, and endlessly scalable, that will be a win for businesses and cloud service providers.

FJ: Many more people will come into this field because the need is there. There are more channels, more information about every single customer, and more actions being taken. Knowing more about people creates more work for data scientists because businesses will need additional people to dig through the data and make sense of it.

Plus, human nature hasn’t changed. We’re not looking at new interactions between people; we’re just measuring them differently. For example, customer Z likes to buy a certain type of cookie, but not too long ago it would have been virtually impossible to determine why she or he likes them. How could the cookie company tell how well their business was doing? Sales. But now, we know all of this information about our customers. We know their demographic information, where they live, and so forth. With that data, the company can start to form a story that informs the next type of cookie they create for that cluster of people, versus conducting a test group where you’re basically placing a bet.

RI: So we’ll all benefit from more cookies catered to our taste thanks to data science?

FJ: Exactly.

Test your own data literacy with this handy quiz or learn the nuts and bolts of applied PR data. I can’t guarantee you an immediate cookie for reading, but the benefits for your business sure will taste sweet.