The Psychology of Data Science

The who’s who game in a data science team

I have recently published a piece on what it means and what it takes to be a data scientist. I want to add a further consideration, which lies at the intersection of science and psychology.


I. Data Scientists’ Characteristics

No two scientists are exactly alike, and the same is true of data scientists. Even if data science seems to be a field run mainly by white American males with PhDs (which is what I inferred from King and Magoulas, 2015), this says nothing conclusive about the ideal candidate to hire. My suggestion is to value skills and capabilities more than titles or formal education (few academic programs are structured well enough to signal the right set of competencies to potential employers).

So far, the paths to becoming a data scientist have been varied and often unconventional, so it is important to assess abilities rather than simply decide based on the type of background or degree level. Never forget that one of the real sources of added value in data science is cross-contamination between fields and cross-sectional applications.

But there is another relevant aspect to take into account when choosing the right candidate for your team, and that is psychology.


A card from the Rorschach test.

II. How Personality Impacts Your Data Team

Not all data specialists are the same (Liberatore and Luo, 2012; Kandel et al., 2012). They can be clustered into four groups (Harris et al., 2013) based on their actual role within the company (“Archetypes”) and, for deeper granularity, into four personalities based on personal features (“Personality”, according to the Keirsey Temperament Sorter).

Correctly identifying the personality type of a data scientist is crucial to amplifying their contribution and efficiency within the company, as well as to making the most of the resources spent on recruiting them.

Data scientists’ personality assessment and classification

The table above gives a full breakdown of data scientist types. The colors roughly represent the partition between the three main skill sets they possess, based on the survey run by Harris et al. (2013): mathematics, statistics, and modeling skills (blue), business competencies (green), and coding abilities (red).

This classification could be dismissed as a merely speculative and useless labeling exercise, but it is in fact extremely relevant to increasing team efficiency. Identifying personal inclinations and aspirations makes it possible to allocate the best people to the best job roles. Common complaints and problems, such as the lack of time for analysis, poor data quality, and the excessive time spent collecting and cleaning data (King and Magoulas, 2015), would be eliminated or, better, assigned to the right people.


This classification is not, of course, meant as a quality assessment of which type of data scientist is better than the others. Rather, it is a framework that also helps organizations identify the minimum team structure to start with: the main diagonal contains the basic figures needed to properly establish a fully functional data science team.

The Gardener (usually also known as a data engineer) is in charge of maintaining the architecture and making the data available to the Groundbreakers, who are usually identified as the data scientists proper. They try to answer research questions and draw insights from data, once these have been verified through model testing. The insights are then passed to Advocates (business intelligence) and Catalysts (customer intelligence team), who respectively communicate that information to executives and use it to increase customer satisfaction.

Data Science Value Chain
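For readers who prefer to see the chain written down, here is a minimal Python sketch of the hand-offs described above. It is only an illustration: the names `Role`, `ROLES`, `HANDOFFS`, `downstream`, and `missing_roles` are hypothetical and do not come from the book or from any library, and the responsibilities are paraphrased from the paragraph above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    archetype: str        # name used in this article
    common_title: str     # the more familiar label given in this article
    responsibility: str   # paraphrased from the text


# The four basic figures, as described in the text (illustrative encoding).
ROLES = {
    "Gardener": Role("Gardener", "data engineer",
                     "maintains the architecture and makes data available"),
    "Groundbreaker": Role("Groundbreaker", "data scientist",
                          "answers research questions and draws insights"),
    "Advocate": Role("Advocate", "business intelligence",
                     "communicates insights to executives"),
    "Catalyst": Role("Catalyst", "customer intelligence",
                     "uses insights to increase customer satisfaction"),
}

# Who hands work to whom: data flows from the Gardener to the Groundbreakers,
# and insights flow from the Groundbreakers to Advocates and Catalysts.
HANDOFFS = {
    "Gardener": ["Groundbreaker"],
    "Groundbreaker": ["Advocate", "Catalyst"],
    "Advocate": [],
    "Catalyst": [],
}


def downstream(role: str) -> list[str]:
    """Return every role that eventually receives work from `role`."""
    seen: list[str] = []
    queue = list(HANDOFFS[role])
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(HANDOFFS[current])
    return seen


def missing_roles(team: set[str]) -> set[str]:
    """Return the basic figures that a given team does not yet cover."""
    return set(ROLES) - team
```

A call such as `missing_roles({"Gardener", "Groundbreaker"})` shows which of the four basic figures a young team still has to fill, while `downstream("Gardener")` lists everyone who depends on the data being made available.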

III. Conclusions

Having these four basic team figures guarantees an efficient data-driven business and sharp delivery of outcomes, as long as communication across teams and departments is well implemented.

It is common practice to hold short (five minutes at most) internal stand-up meetings every morning to run through the day’s objectives, tasks, and expected outcomes, as well as weekly meetings with other departments to align the work.

Structuring the process as proposed above would also increase the scalability of data projects.

If you want to have some fun and take a Data Science Personality test, check it out here! It is not a professional test, but it is quick and dirty and might give you an idea of which category fits you best.

References

Harris, H. D., Murphy, S. P., & Vaisman, M. (2013). Analyzing the Analyzers. O’Reilly Publishing.

Kandel, S., Paepcke, A., Hellerstein, J. M., & Heer, J. (2012). Enterprise data analysis and visualization: An interview study. In Proceedings of IEEE Visual Analytics Science & Technology (VAST).

King, J., & Magoulas, R. (2015). 2015 data science salary survey. United States: O’Reilly Media, Inc.

Liberatore, M., & Luo, W. (2012). ASP, the art and science of practice: A comparison of technical and soft skill requirements for analytics and OR professionals. Interfaces, 43(2), 194–197.

Note: the above is an adapted excerpt from my book “Big Data Analytics: A Management Perspective” (Springer, 2016).