Using GPT-4-1106-preview and BERTopic topic modeling, we've classified the wide range of duties in the DSA realm into six key areas of responsibility (note: a single position may require multiple skills):

  1. Computer Vision/Deep Learning (CV/DL): Focused on developing algorithms for computer vision and applications in deep learning.
  2. AI/ML: Around 50% of these roles involve developing and deploying AI models, with some applying machine learning in business contexts. Interestingly, 20% are related to natural language processing and entity recognition.
  3. Data Project/ETL (DP/ETL): Here, 25% of roles are dedicated to data platform maintenance and to conducting needs-assessment interviews.
  4. Database/SQL (DB/SQL): Most responsibilities in this category are linked to database performance optimization, with a minority in management and maintenance.
  5. Report/Business/Visualization (R/B/V): This includes user behavior analysis, product overview reporting and analysis, and statistical analysis techniques.
  6. ADs/GA: About 30% of these tasks are assigned duties, including digital ad optimization and proficiency in Google Analytics.

Overall, 76% of roles demand skills in DP/ETL, followed by 37% in R/B/V. Roles in ADs/GA and AI/ML are less prevalent. Across the four job types (Data Analyst, Data Engineer, Data Scientist, and Algorithm Engineer), 60% encompass DP/ETL tasks, while Data Analyst roles show a pronounced need for R/B/V. Both Data Scientist and Data Engineer roles find DB/SQL responsibilities crucial, whereas AI/ML is particularly vital for Algorithm Engineers and Data Scientists.

In essence, every job type demands proficiency in DP/ETL. Data Engineers tend to focus more on DB/SQL responsibilities, Data Analysts on R/B/V, and Algorithm Engineers often delve into AI/ML and CV/DL. Data Scientists balance their roles across DB/SQL, R/B/V, and AI/ML.
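Because a single position can carry several areas at once, the shares above do not have to add up to 100%. Here is a minimal sketch of how such multi-label shares could be computed, overall and per job type. The `postings` list is hypothetical toy data, and the helper names `area_shares` and `shares_by_job` are ours for illustration, not the pipeline used for the actual analysis.

```python
from collections import Counter

# Hypothetical toy postings: each has a job type and a set of responsibility areas.
postings = [
    {"job": "Data Analyst", "areas": {"DP/ETL", "R/B/V"}},
    {"job": "Data Engineer", "areas": {"DP/ETL", "DB/SQL"}},
    {"job": "Data Scientist", "areas": {"AI/ML", "DB/SQL", "R/B/V"}},
    {"job": "Algorithm Engineer", "areas": {"AI/ML", "CV/DL"}},
    {"job": "Data Analyst", "areas": {"R/B/V"}},
]


def area_shares(rows):
    """Return {area: fraction of rows that list that area}."""
    counts = Counter(area for row in rows for area in row["areas"])
    return {area: n / len(rows) for area, n in counts.items()}


def shares_by_job(rows):
    """Group rows by job type, then compute area shares within each group."""
    groups = {}
    for row in rows:
        groups.setdefault(row["job"], []).append(row)
    return {job: area_shares(group) for job, group in groups.items()}


overall = area_shares(postings)
by_job = shares_by_job(postings)

for area, share in sorted(overall.items(), key=lambda kv: -kv[1]):
    print(f"{area}: {share:.0%}")
```

Since each posting counts once for every area it lists, the overall shares in this toy example sum to well over 100%, which is the same effect seen in the figures above.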

Further, we group these six areas into three broader types: Data Processing (DP/ETL, DB/SQL), Presentation and Application (R/B/V, ADs/GA), and Algorithmic (CV/DL, AI/ML). A Venn diagram shows the overlap and distinctness in job functions:

  • 82% of roles involve Data Processing; of these, 49% focus exclusively on this area, 37% also require Presentation and Application skills, and 20% need Algorithmic expertise.
  • 40% of roles require Presentation and Application skills; of these, 75% also handle Data Processing tasks and 13% need Algorithmic abilities, leaving only 9% exclusively focused on this area.
  • The least common, at 25%, are roles needing Algorithmic skills; of these, 64% also handle Data Processing tasks, 20% require Presentation and Application, and 32% specialize solely in Algorithmic.
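The pairwise figures above can be cross-checked with a little arithmetic. The sketch below assumes that each sub-percentage is a share of the roles in its own group (for example, 75% of the 40% of roles that need Presentation and Application). Under that reading, the overlap between two groups should come out roughly the same whichever side it is computed from. The short keys `DP`, `PA`, and `ALG` are our own abbreviations.

```python
# Share of all roles that involve each broader type (from the text).
total = {"DP": 0.82, "PA": 0.40, "ALG": 0.25}

# cond[(a, b)]: share of roles in group a that also need group b (from the text).
# Assumption: these percentages are relative to group a, not to all roles.
cond = {
    ("DP", "PA"): 0.37,
    ("DP", "ALG"): 0.20,
    ("PA", "DP"): 0.75,
    ("PA", "ALG"): 0.13,
    ("ALG", "DP"): 0.64,
    ("ALG", "PA"): 0.20,
}


def overlap(a, b):
    """Share of all roles in both a and b, computed from a's side."""
    return total[a] * cond[(a, b)]


for a, b in [("DP", "PA"), ("DP", "ALG"), ("PA", "ALG")]:
    print(f"{a} & {b}: {overlap(a, b):.3f} (from {a}) vs {overlap(b, a):.3f} (from {b})")
```

Both directions agree to within rounding for all three pairs (roughly 30%, 16%, and 5% of all roles), which supports reading the sub-percentages as within-group shares.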

It appears that roles including Presentation and Application are more likely to require cross-disciplinary skills, whereas positions mainly focusing on Data Processing and Algorithmic tend to be more specialized.
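To make "specialized" versus "cross-disciplinary" concrete, here is a small sketch that maps the six areas to the three broader types and measures, for each type, how many of its roles involve no other type. The mapping follows the grouping described above; the `toy` postings and the function names `groups_for` and `exclusive_share` are hypothetical and only for illustration.

```python
# Mapping from the six areas to the three broader types named in the text.
GROUP_OF = {
    "DP/ETL": "Data Processing",
    "DB/SQL": "Data Processing",
    "R/B/V": "Presentation and Application",
    "ADs/GA": "Presentation and Application",
    "CV/DL": "Algorithmic",
    "AI/ML": "Algorithmic",
}


def groups_for(areas):
    """Collapse a posting's areas into the set of broader types it touches."""
    return {GROUP_OF[area] for area in areas}


def exclusive_share(postings, group):
    """Among postings that involve `group`, the fraction involving no other type."""
    involved = [g for g in map(groups_for, postings) if group in g]
    if not involved:
        return 0.0
    return sum(1 for g in involved if len(g) == 1) / len(involved)


# Hypothetical toy postings, each given as a set of areas.
toy = [
    {"DP/ETL", "DB/SQL"},           # Data Processing only
    {"DP/ETL", "R/B/V"},            # Data Processing + Presentation
    {"R/B/V", "ADs/GA", "DB/SQL"},  # Presentation + Data Processing
    {"AI/ML", "CV/DL"},             # Algorithmic only
    {"AI/ML", "DP/ETL"},            # Algorithmic + Data Processing
]

for name in sorted(set(GROUP_OF.values())):
    print(f"{name}: {exclusive_share(toy, name):.0%} exclusive")
```

A low exclusive share for a type means its roles usually come bundled with another type, which is the pattern described above for Presentation and Application.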

Finally, the prioritized skills vary by industry. For instance, manufacturing places a higher emphasis on computer vision and algorithm development. Industries like ICT, professional services, and health and social welfare show only minor differences in roles, with the latter particularly valuing statistical analysis tools. For finance, insurance, advertising, media, and retail services, data-driven capabilities are paramount. Specifically, finance, insurance, and retail services emphasize customer behavior analysis and tagging, while retail services and media focus on data visualization.

ChunYu Ko
The whispers of a data analyst

Work is data, and hobby is also data, but I yearn for my roommate's two cats, lazily lounging at the doorway.