How to Build a Data Science Team in the Public Sector?

In accordance with Singapore’s vision to become the world’s first Smart Nation, the Smart Nation initiative was launched in 2014. The initiative strives to rally the collective effort of various stakeholders, including citizens, government, businesses and educational institutions, to co-create a future of better living for all through tech-enabled solutions.

The application of data science is at the heart of a truly Smart Nation, and we see a rising trend of staffing data science teams within the public sector. However, to build a successful data science team, it is important to recognise that attaining the right talent composition is not a trivial matter.

In an interview with The Straits Times, Mr Daljit Sall, director of human resources firm Randstad Technologies Singapore, said that the need for organisations to derive valuable insights from the vast amounts of data they accumulate would continue to drive demand for data scientists, analysts and similar roles. He observed that Randstad has seen demand for such data science-related roles increase by 50 per cent in Singapore over the last two years.

To develop in-house data science capability despite the shortage of data science talent, numerous organisations have adopted one or both of the following approaches:

  1. Recruit applicants with related/partial data science skillsets (e.g. statisticians, optimisation experts, computer scientists, and self-trained IT professionals); and/or
  2. Purchase commercial off-the-shelf (COTS) data science platforms to compensate for the skills gap.

In the article “Staffing Data Science Teams”, Gartner has identified the following personas in a functional data science team. These personas may not all be permanent staff of the data science team, but they must be potentially accessible at various stages of a data science project.

  • Data Scientists: Critical key staff members that can extract various knowledge from data, have overview of the end-to-end process and can solve data science problems.
  • Citizen Data Scientists aka “Data Analysts”: Not formally trained data scientists, but can still execute a variety of “simpler” data science tasks, especially with the help of so-called smart data discovery tools (data science COTS solutions).
  • Data Engineers: Make the appropriate data accessible and available for data scientists and, as such, can be instrumental in big productivity gains.
  • Business Experts: Individuals that understand the business domain really well. This can sometimes be the business leaders, or sometimes a range of key specialists.
  • Source System Experts: Those that have intimate knowledge of the data at the business application level.
  • Software Engineers: Needed sporadically when custom coding is required (special visualization, data integration or deployment of certain results, for example).

In addition to the above, Gartner has also identified two other variations of the “Data Scientists” role.

  • Quant Geeks: Excel in a specific range of quantitative skills. In certain situations, they are a “nice-to-have,” in rare situations a “must-have.”
  • Unicorns: Data scientists that are extremely well-versed in the whole range of skills — they are those “know-it-alls,” that are romanticized every now and then in the literature. They are super rare.

The following diagram illustrates the “minimal” level of skill required for each role in the typical business disciplines of “Domain Understanding”, “IT Skills” and “Quantitative Skills”.

Source: Staffing Data Science Teams

In essence, do not rush to ride the data science hype. To avoid “white elephant” projects, platforms and teams, start by thoroughly understanding the value that data science could bring to your organisation, and then evaluate the roles and skills required for a successful data science team.

Build data science teams steadily. One could start by outsourcing to, or collaborating with, external organisations (e.g. educational or research institutes, or commercial data science providers).