Why Substantive Expertise in Data Science Isn’t Just Experience
The popular Venn diagram used to explain what constitutes Data Science and its variants (e.g. analysis in big data) has one circle that resonates with me in particular: Substantive Expertise. For the uninitiated, the other two circles are Math & Statistics Knowledge and Hacking Skills. In my view, substantive expertise could mean knowledge acquired by someone who’s worked in a particular industry for over a decade. Such a person has acquired a set of heuristics that aid analysis: not just knowing the right questions to ask, but also knowing how to navigate towards reasonable solutions (if they exist). The other kind is classroom expertise: extrapolating from one’s own education, which includes both abstract and practical work. However, the two are not mutually exclusive. To be sure, there are plenty of data scientists out there who rely on both: what they learnt in school and what they’ve learnt over the course of their careers (before doing data science).
Substantive expertise surely refers to working knowledge and skills as defined by different domains, ranging from, say, chemical engineering to product management, or from economics to healthcare. The universal set is immense. But are there similarities, regardless of the career? This is why mentioning heuristics at the beginning was important. Rules of thumb that act as shortcuts through innumerable complexities, and that are even used to handle Donald Rumsfeld’s ‘unknown unknowns’, are not uncommon across disciplines. There is also the occasional reliance on history: what’s happened in the past, and an individual’s experience of either dealing with a problem or learning to replicate a success. Admittedly, however, ‘history’ here simply falls back on the definition of work experience.
The medical drama House MD, whose protagonist is known for coming up with novel albeit socially unacceptable solutions, serves as an interesting example. In one episode, at the penultimate stage, when House and his team of affable doctors have tried innumerable tests and procedures to determine the malady, a flash of insight occurs. With a test tube in hand, House proceeds to hold it under a lamp with a strong light. The team, however, believes he has lost his mind (rightly so), insisting he is wasting precious time they need for more critical tasks. House quips that light acts as a catalyst; in other words, it speeds up a change that would not have been observed in time under normal circumstances. And sure enough, the liquid in the tube changes color, confirming his diagnosis. Something similar is seen in the movie Limitless, whose plot revolves around a pill that unleashes super-human mental faculties in Eddie Morra (played by Bradley Cooper). Most notable is Morra’s explanation, as the unofficial narrator, of the pill’s effects: it allows access to the most remote memories and makes the necessary connections in real time, selecting the best choices.
While the aforementioned examples are largely fictional, the scenarios aren’t far removed from our day-to-day workplace challenges, where seemingly random flashes of insight are explicable simply because we made a connection or analogy to a past event. Moreover, this past event may not necessarily be related to our career. It could just as naturally come from interactions with our families or other mundane activities. It shouldn’t be a stretch, then, to say that substantive expertise, as an important constituent of data science, should not be defined solely by a career spent working with data.