An analysis of the Self-Driving Car job market (US only)
Back in November 2016, I started the Udacity Self-Driving Car Nanodegree. During those 6 months, I learned about the technologies (hardware and software) used in Self-Driving Cars (SDC), and I now have a better sense of the kind of jobs that would be a good fit for me. But I am not sure which requirements I need to meet to land a job related to SDC. I have heard that most jobs require proficiency in C++, and that most SDC jobs are in the Bay Area (phew!!! …I live in San Jose).
Instead of relying on hearsay, why not use data to get a better idea of the SDC job market? So, here it is: my analysis of the SDC job market using the job postings on indeed.com!!!
Source of data
Indeed was the default choice: with its API, one can retrieve a list of job postings, which is not possible with LinkedIn’s API. For each job posting, we get the following information: a unique ID of the posting (jobkey), the job title, the company’s name, the country, the state and the GPS location of the company. Here is the pipeline that we used to build our dataset:
1. API: with the API, search for all jobs that are related to SDC. I used the following keywords: ‘self-driving car’, ‘autonomous vehicles’, ‘autonomous cars’ ... and more …
2. Web-scraping: get the full description of the job. This is done using the ‘jobkey’ and the urllib library. The requirements for the job are then extracted: education (CS, CE, Physics, …), programming skills (C++, Java, Python, etc.), area of expertise (Machine Learning, Radars, ...).
3. Outliers: we filter out the unrelated job postings by making sure that the full job description contains key expressions such as: “self-driving cars”, “autonomous vehicles”…
4. Manual filtering: I’d love to say that this pipeline is bulletproof, but it is not. There were still some unrelated job postings, and they were removed manually.
At the end of the data collection process, we have a dataset of about 765 job entries. For convenience, the data is stored in a SQLite database.
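To make steps 2 and 3 and the SQLite storage more concrete, here is a minimal sketch. It is not the code used for this analysis: the posting URL pattern, the table layout, the shortened list of key expressions and all function names are assumptions made for illustration.

```python
import sqlite3
import urllib.parse
import urllib.request

# Assumed URL pattern for a posting page (not taken from the original pipeline).
VIEW_JOB_URL = "https://www.indeed.com/viewjob?"

# Shortened, illustrative list of key expressions for the outlier filter.
KEY_EXPRESSIONS = ("self-driving car", "autonomous vehicle", "autonomous car")


def build_job_url(jobkey):
    """Build the page URL of a posting from its jobkey."""
    return VIEW_JOB_URL + urllib.parse.urlencode({"jk": jobkey})


def fetch_description(jobkey, timeout=10):
    """Download the raw HTML of a posting (step 2). Needs network access."""
    with urllib.request.urlopen(build_job_url(jobkey), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def is_sdc_related(description):
    """Step 3: keep a posting only if it contains one of the key expressions."""
    text = description.lower()
    return any(expr in text for expr in KEY_EXPRESSIONS)


def store_jobs(conn, jobs):
    """Store the postings that pass the filter; return how many passed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "jobkey TEXT PRIMARY KEY, title TEXT, company TEXT, "
        "state TEXT, description TEXT)"
    )
    kept = [job for job in jobs if is_sdc_related(job["description"])]
    # The jobkey is the primary key, so a posting seen twice is stored once.
    conn.executemany(
        "INSERT OR IGNORE INTO jobs "
        "VALUES (:jobkey, :title, :company, :state, :description)",
        kept,
    )
    conn.commit()
    return len(kept)
```

In practice, `fetch_description` would be called once per jobkey returned by the API, and the resulting dictionaries passed to `store_jobs` with a connection such as `sqlite3.connect("jobs.db")` (the file name is also just an example).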
Now, let’s dig in and get some useful information.
Where are the SDC jobs?
We report the percentage of SDC-related job postings for each state (see map below). The state with the highest number is California: about 50% of the job postings are for positions in California. Florida and Michigan rank 2nd and 3rd respectively, but they are significantly behind California. The number of SDC-related job openings is about 390 for California and 100 for Michigan.
California: the epicenter of the effort on autonomous vehicles.
What’s interesting is that California has only 1 car assembly plant: the Tesla plant. The traditional car manufacturers have plants in other states. Michigan has the highest number of plants, but lags well behind California when it comes to SDC jobs. We could argue that the traditional car manufacturers are not the most active on the SDC job market.
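For readers who want to reproduce the per-state percentages, the computation can be sketched as follows. The function name and the toy input are made up for illustration; the real input would be the state of each posting in the dataset.

```python
from collections import Counter


def share_by_state(states):
    """Return (state, count, percentage) tuples, most frequent state first."""
    counts = Counter(states)
    total = sum(counts.values())
    return [
        (state, n, round(100 * n / total, 1))
        for state, n in counts.most_common()
    ]


# Toy example with invented data, one entry per job posting.
for state, n, pct in share_by_state(["CA", "CA", "MI", "FL"]):
    print(f"{state}: {n} postings ({pct}%)")
```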
Who is hiring?
Based on the data from Indeed alone, there are 229 entities (companies, startups, etc.) in the US that have job postings for self-driving cars or related products.
Check out this link for the full list of companies!
Below is a list of companies that have more than 10 job postings.
+----------------------------------------+------------------------+
| Company Name                           | Number of job postings |
+----------------------------------------+------------------------+
| Daimler                                | 67                     |
| Luminar                                | 47                     |
| NXP SemiConductors                     | 33                     |
| NVidia                                 | 31                     |
| HERE                                   | 25                     |
| General Motors                         | 20                     |
| Google                                 | 17                     |
| Delphi                                 | 15                     |
| Uber / Drive.ai                        | 12                     |
| Applied Minds / Tesla Motors / Argo AI | 11                     |
+----------------------------------------+------------------------+
Do you really need C++?
Let’s look at the programming skills that are most frequently found in the job descriptions. We do not make a distinction between the Requirements section and the Preferred section: for example, if the string “C++” is found anywhere in the job description, it is added to the database. Also, we do not take into account how many times the string occurs in the same job description: each posting counts at most once per skill.
About a third of the job postings mention C++
C++ is mentioned in about 30% of the job descriptions. Python is 2nd with 21.7%, and C is 3rd with 19.2%. The slice Others includes programming languages/OS/frameworks that have a score below 5%: TensorFlow, Keras, PHP, OpenCV, Ruby.
Most jobs require at least a bachelor’s or a master’s degree. A few jobs also ask for candidates with a PhD. However, the mention of a PhD is more often found in the ‘Preferred’ section than in the ‘Requirements’ section.
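The counting rule described above (a skill counts once per posting, wherever it appears) can be sketched like this. The regular expressions are my own assumption, not the matching used for the chart: a plain substring search for “C” would also hit “C++” or any word containing the letter, so the sketch uses lookarounds to separate the two.

```python
import re
from collections import Counter

# Illustrative patterns; the list of skills in the analysis is longer.
SKILL_PATTERNS = {
    "C++": re.compile(r"(?<![A-Za-z0-9])C\+\+(?![A-Za-z0-9])"),
    # A standalone "C": not part of a word, and not followed by "+" or "#".
    "C": re.compile(r"(?<![A-Za-z0-9+#])C(?![A-Za-z0-9+#])"),
    "Python": re.compile(r"\bpython\b", re.IGNORECASE),
    "Java": re.compile(r"\bjava\b", re.IGNORECASE),
}


def skills_in(description):
    """Return the set of skills mentioned at least once in a description."""
    return {
        name for name, pattern in SKILL_PATTERNS.items()
        if pattern.search(description)
    }


def skill_percentages(descriptions):
    """Percentage of postings that mention each skill (presence, not frequency)."""
    counts = Counter()
    for text in descriptions:
        # A set is used so that repeated mentions in one posting count once.
        counts.update(skills_in(text))
    n = len(descriptions)
    if n == 0:
        return {}
    return {skill: round(100 * c / n, 1) for skill, c in counts.items()}
```

This is still a rough heuristic: a standalone capital “C” in an unrelated context (a grade, an initial) would be counted as the C language.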
What about the field of study?
A degree in Computer Science is the most popular request.
It is followed by ‘Electrical Engineering’, which is requested almost half as often as Computer Science. There is also demand, to a lesser extent, for candidates with a degree in Physics or Mechanical Engineering.
Fields of expertise
I’ll spare you another pie chart here! To extract the fields of expertise, I created a dictionary of keywords. This is how the fields rank in terms of frequency across the entire dataset:
- Robotics: 47%
- Machine Learning / Predictive analytics: 34.9%
- Radar: 15.9%
- Parallel programming: 2.1%
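The dictionary approach behind this ranking could look like the sketch below. The keywords listed for each field are hypothetical examples, since the actual dictionary is not reproduced in this post.

```python
# Hypothetical dictionary: each field maps to keywords that signal it.
FIELD_TERMS = {
    "Robotics": ("robotics", "robot"),
    "Machine Learning / Predictive analytics": (
        "machine learning", "deep learning", "predictive analytics",
    ),
    "Radar": ("radar",),
    "Parallel programming": ("parallel programming", "cuda", "openmp"),
}


def fields_in(description):
    """Return the fields whose keywords appear in a job description."""
    text = description.lower()
    return [
        field for field, terms in FIELD_TERMS.items()
        if any(term in text for term in terms)
    ]


def rank_fields(descriptions):
    """Rank fields by the percentage of postings that mention them."""
    totals = {field: 0 for field in FIELD_TERMS}
    for description in descriptions:
        for field in fields_in(description):
            totals[field] += 1
    n = len(descriptions) or 1  # avoid dividing by zero on an empty dataset
    ranked = [(field, round(100 * c / n, 1)) for field, c in totals.items()]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

Grouping several keywords under one field is what lets “deep learning” and “machine learning” count toward the same line of the ranking.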
From this analysis, we could draw the profile of the candidate that best matches most of the job postings:
- Bachelor/Master in Computer Science
- Proficient in C++
- Expertise in Robotics
- Willing to relocate to CA
Are you surprised?
Let me know what you think!
I am currently looking into expanding the source of job listings to other websites, using either an API or web scraping to collect additional job postings.
Evolution of the job market for SDC: with the development of educational programs exclusively dedicated to SDC, like the Udacity SDCND or the MIT SDC class, it would be interesting to see how the job requirements evolve over time, and whether they become targeted towards more specialized profiles.