Data Science & IT: Finding common ground

Trisha Mahoney
Inside Machine learning
3 min read · Mar 13, 2018

Increasingly, the ability to infuse machine learning into your business is separating successful organizations from those falling behind their competition. But putting data science into production can be a difficult task when you’re struggling to align stakeholders, keep up with the latest open source tools, and build models at a rapid pace. Today’s data science teams often measure their success by the number of models they put into production, but the reality is that most companies still have very few models deployed.

So why are so few companies successfully implementing machine learning at scale? One major reason is that there’s a disconnect between the development and the production of models, and much of this disconnect occurs between data science and IT. So why can’t these two teams find common ground?

The Data Science Leader’s Perspective

You often hear that data scientist is the sexiest job of the 21st century, but managing a data science team is another story entirely. Data science leaders, who are often practitioners themselves, manage teams that might never have operated in a production environment. Data scientists should spend more time aligning with IT, but the bulk of their time is spent accessing, wrangling, and cleansing data. On top of that, IT needs data science leaders to address governance, legal, and compliance risks, but those leaders often have little experience in these areas.

Another point of contention between these two teams is the use of open source. Open source is a must-have for today’s data scientists, but waiting months for IT approval of widely used open-source packages has forced many data scientists to download unapproved software onto their desktops.

The IT Leader’s Perspective

For many IT leaders, working with data science teams can be a nightmare because IT just doesn’t have the level of control it had with traditional software development. Older software development tools are production-ready, but today’s data science teams use open-source tools like Python and R, which are difficult to put into production. The sheer number of Python and R packages has made package management and version control difficult for IT.

To make matters worse, many data science teams barely document their data, packages, or results, so reproducing models becomes a huge issue. IT may be responsible for monitoring deployed models, but it often doesn’t know whether the models are still accurate or are being used in the right way. This often leads to poorly performing models that produce inaccurate results and, in turn, bad business decisions.
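As a rough illustration of what documenting data, packages, and results could look like in practice, here is a minimal Python sketch that writes a small "manifest" alongside a model. It is not taken from the session or from any particular product. The `build_manifest` helper, the package names, the sample data, and the accuracy figure are all made up for the example; only the standard-library calls (`importlib.metadata.version`, `hashlib.sha256`, `platform.python_version`) are real.

```python
import hashlib
import json
import platform
from importlib import metadata


def build_manifest(package_names, data_bytes, metrics):
    """Collect the facts someone would need to reproduce a model run.

    Hypothetical helper for illustration only.
    """
    packages = {}
    for name in package_names:
        try:
            # Installed version of the package in the current environment.
            packages[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap explicitly instead of failing.
            packages[name] = None

    return {
        "python_version": platform.python_version(),
        "packages": packages,
        # A hash identifies exactly which data the model was trained on.
        "data_sha256": hashlib.sha256(data_bytes).hexdigest(),
        "metrics": metrics,
    }


# Illustrative inputs: the data and the accuracy value are invented.
manifest = build_manifest(
    package_names=["pip", "a-package-that-is-not-installed"],
    data_bytes=b"id,label\n1,0\n2,1\n",
    metrics={"accuracy": 0.91},
)
print(json.dumps(manifest, indent=2, sort_keys=True))
```

A file like this, stored next to each deployed model, gives IT a starting point for rebuilding the environment and for checking later whether the model still performs as it did at hand-off.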

Find Common Ground

So how do data science & IT find common ground? Check out our THINK 2018 session where we bring a real data science leader and IT leader on stage for a “couples therapy” session to talk about their perspectives, goals and how they can work together more effectively.

If you’re a data science or IT leader struggling to get your data science practice into production, click HERE to watch the video now, or learn more about IBM Watson Studio at https://www.ibm.com/cloud/watson-studio.
