Data Scientists, Training Job Description, Purple Squirrel and Unicorn Problem — Part II

Jaganadh Gopinadhan
5 min read · Sep 2, 2020

This article is a continuation of my previous article posted on LinkedIn and Medium.

https://medium.com/@jaganadhg/dsunicorns-8fa01b1de79

https://www.linkedin.com/pulse/data-scientists-trainings-job-description-purple-gopinadhan-jagan-/

Recap

In the last article, we discussed the importance of creating the right JD for hiring a Data Science professional. Two significant job profiles, Machine Learning Scientist and Data Scientist, were discussed along with tips for writing a JD. In this article, we will discuss other job roles: Machine Learning Engineer, ML Platform Support/DevOps, AI/ML Developer, and Data Engineer for ML/Data Science.

Machine Learning Engineer

A Machine Learning Engineer is a person who can work with Data Scientists and Software Engineers to develop and deploy scalable AI/ML solutions. An ideal candidate should be familiar with Machine Learning algorithms in general (the person need not be a Mathematician). The person should also be familiar with the Data Science process flow in order to work in tandem with Data Scientists. The major strength areas should be Software Engineering capabilities such as writing clean, modular code, writing unit tests, incorporating architecture patterns into the end-to-end solution, and helping with deployment. Exposure to container technologies, database systems, and data engineering, along with familiarity with infrastructure, is desirable. More weightage should be given to software engineering concepts.
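To make "clean, modular code" and "unit tests" concrete, here is a minimal sketch in Python. The function name `scale_features` and the min-max scaling step are my own illustrative choices, not something a JD must prescribe. The point is only that a small, single-purpose function is easy to test in isolation.

```python
import unittest


def scale_features(values, lower=0.0, upper=1.0):
    """Min-max scale a list of numbers into the range [lower, upper].

    Hypothetical helper used only for illustration.
    """
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map everything to the lower bound.
        return [lower for _ in values]
    span = upper - lower
    return [lower + (v - lo) / (hi - lo) * span for v in values]


class ScaleFeaturesTest(unittest.TestCase):
    def test_scales_to_unit_interval(self):
        self.assertEqual(scale_features([2, 4, 6]), [0.0, 0.5, 1.0])

    def test_empty_input(self):
        self.assertEqual(scale_features([]), [])

    def test_constant_input(self):
        self.assertEqual(scale_features([3, 3]), [0.0, 0.0])
```

Saved in a file, the tests can be run with `python -m unittest`. During an interview, asking a candidate to write or review something of this size can reveal more about engineering habits than a list of algorithms.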

If your enterprise is planning ML product and project deployments, the ML Engineer is an indispensable role. At a minimum, a 1:10 ratio of ML Engineers to Data Scientists is desirable in a large, busy team. Rather than the number or types of models developed, the number of models deployed is the right criterion for assessing a candidate. The tech stack for such a candidate should be governed by the IT strategy. Cloud-heavy AI/ML teams should focus on cloud-specific skills.

As we speak, we are witnessing the adoption of AutoML in the enterprise. Even though AutoML is not a silver bullet in the AI/ML area, the ML Engineer role will eventually become more significant than that of the Data Scientist. In a mature AI/ML strategy, the enterprise will leverage experienced Data Scientists and an AutoML platform to build models. Such models will be operationalized and maintained by ML Engineers. The focus is shifting from the Mathematician to a sound Software Engineer who understands Data Science.

ML Platform/DevOps

There are many AI/ML platforms available in the market. Some are cloud-based, some are on-premises, and some are hybrid solutions. Like any other IT platform, an AI/ML platform requires a governance and maintenance strategy. Sometimes the platform may be as simple as RStudio, or it may be as extensive as Databricks or Dataiku. Each of these platforms needs its own governance and controls. Installing new software packages, creating code environments, controlling user limits, and creating custom Docker images are some of the high-level tasks required. If a platform provides model monitoring and deployment, there will be additional tasks to add to the list. This job suits a person with good IT platform management skills who has been upskilled in Data Science. The person may not create models but will skim through the process to handle platform tweaks and troubles. Awareness of the popular ML libraries is an added advantage. Finding an experienced candidate in this domain is relatively difficult, as the role is still evolving across many enterprises.

As discussed in our first article, writing the JD for this role needs a vision of the AI/ML IT strategy. The platforms used by the company and the IT vision are critical. The person's infrastructure skills are crucial, and the ability to architect infrastructure solutions is essential. Here, rather than a Mathematician, we are looking for a Data Science aware IT infrastructure professional, even though the person might sometimes have some training as a Data Scientist.

AI/ML Developer

Some of the popular tasks in the AI/ML space are now available as ready-to-use API services. Chatbots or conversational agents, text insight extraction, and speech services are some examples. Almost all the players in the space, large and small, deliver excellent services. At the same time, some of the powerful AI/ML frameworks have started providing JavaScript versions, such as TensorFlow.js and OpenCV.js. Such frameworks help the enterprise create API-mashup-based AI/ML solutions for various business needs. For such a role, most of the time, enterprises rely on their smartest developers. Eventually, we will see more API services in the space. A fundamental understanding of AI/ML concepts is an added advantage for developers trying to enter this space. The JavaScript Machine Learning world is still evolving slowly. Knowledge of Machine Learning frameworks, their purpose, and their underlying principles may be required. An AI/ML course at the business-executive level is a good starting point for learners.

When it comes to the JD part, we need to think very objectively. What we are planning to achieve in the project/program should drive the JD.

Data Engineer for ML/Data Science

Buy or build is an important question when it comes to an AI/ML platform. Even though there are plenty of Data Science platforms on the market, enterprises tend to build their own. Some mission-critical ML systems may depend on heavy data crunching before the data reaches the ML models. Scalable, reliable, and blazing-fast pipelines are the need of the hour. Such pipelines will sometimes be free from the data frame puddles. A Data Engineer who understands the Machine Learning/Data Science workflow is desired here. Most of the time, enterprises with a mature AI strategy hire and retain such professionals. Strong software engineering skills are an added advantage for such professionals. It is tricky to craft a one-size-fits-all JD for such a case. The JD should be based on the development strategy and vision for the team. Tech stacks should be aligned with the architecture team.
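As a rough illustration of a pipeline that stays out of the data frame puddle, here is a small sketch that streams records through generator stages instead of loading everything into an in-memory data frame. The column names (`customer`, `amount`), the sample data, and the function names are all hypothetical; a production pipeline would read from a real source and likely run on a distributed engine.

```python
import csv
import io


def read_rows(handle):
    """Yield one dict per CSV row without loading the whole file."""
    yield from csv.DictReader(handle)


def clean(rows):
    """Drop rows with a missing amount and convert types."""
    for row in rows:
        if row["amount"].strip() == "":
            continue
        yield {
            "customer": row["customer"].strip(),
            "amount": float(row["amount"]),
        }


def total_by_customer(records):
    """Aggregate amounts per customer; only the totals are kept in memory."""
    totals = {}
    for rec in records:
        totals[rec["customer"]] = totals.get(rec["customer"], 0.0) + rec["amount"]
    return totals


# Hypothetical sample input standing in for a large file or stream.
SAMPLE = "customer,amount\nacme,10.5\nacme,4.5\nglobex,\nglobex,7\n"

totals = total_by_customer(clean(read_rows(io.StringIO(SAMPLE))))
print(totals)
```

Each stage handles one row at a time, so memory use depends on the number of customers rather than the number of rows. A Data Engineer for ML should be comfortable reasoning about trade-offs like this one.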

Unicorns or Purple Squirrel?

So far, we have discussed six different job roles and JDs, all of which fall under the modern umbrella term ‘Data Scientist.’ If you are looking for a Unicorn, you are looking for a combination of all six of the JDs discussed so far. This makes the unicorn search difficult for an HR professional.

The purple squirrel is a puzzling problem. Here, the right combination of education, experience, and skills has to meet in one candidate. Sometimes the JDs have rigorous education and experience requirements, which makes finding such a candidate tricky.

Happy Hunting !!!!

To be contd…

Jaganadh Gopinadhan

Artificial Intelligence and Analytics Leader | Sr. Manager Projects at Cognizant