Landing an entry-level job in the NLP/NLU industry

Giuseppe Della Corte
Published in CodeX · 14 min read · Aug 25, 2021

Natural Language Processing Career Path: Knowledge Engineer, Computational Linguist, Speech Technologist, NLP Scientist… which job position should you look for after graduation?


If you are reading this article, chances are that you are graduating in the Natural Language Processing or Computational Linguistics domain and looking for an internship or a junior position in the NLP area. This is exactly what I went through a couple of years ago, and you are probably asking yourself:

What kind of career to pursue in the Natural Language Processing and/or Computational Linguistics industry?

The honest answer is… it depends. Rather than focusing exclusively on the set of skills needed, I believe that to carefully choose a career path in the NLP industry, you should ask yourself the following questions:

a) What do I enjoy most among the options the NLP job market can currently offer?

b) What NLP-related skills do I currently have?

c) What skills do I currently lack and would like to acquire?

d) What do I want to get out of the work experience?

One of the main issues people face when starting a career path in the NLP domain is not being aware of what the NLP industry and job market offer. In addition, if you have recently graduated, there might be discrepancies between the terminology and definitions learnt in academia and the way (some) companies use the same exact words. Let’s start there: definitions and nomenclature.

Knowledge Engineer, Computational Linguist, Speech Technologist, Language Data Scientist… why all those names?


You have looked for job positions on LinkedIn. You have witnessed chaos! One job title, multiple meanings. You have looked for clarifications in manuals, asked other people, browsed the Internet. Someone told you a Computational Linguist is expected to work more on the linguistic side of a project, while a Language Data Scientist is basically a programmer who only works with Machine Learning and Deep Learning. That is false, both in research and in industry. The reality is that each company can mean different things by the same job title in a job announcement. The only way to actually understand what you are meant to do, and whether you might like the job, is to read carefully what the company expects you to already know and what your tasks will be. Forget anything else, and do not limit your search to the job title you think best suits your skills. Carefully read the job announcement details and visit the company websites. Try to figure out what services and/or products they offer. If possible, try to infer their development and production pipeline even before the interview.

Artificial Intelligence can mean basically anything


If you have a graduate-level university education in NLP, you already know that “artificial intelligence” is quite an ambiguous term, and it can often lead to dense debates. However, the chances that you conflate “artificial intelligence” exclusively with “machine learning”, “neural networks”, and “deep learning” are extremely high. If so, you might be surprised that several AI companies actually do little machine learning and no deep learning at all. If this shocks you, maybe it is time to introduce some further terminology that might help you understand the market.

Symbolic vs Statistical AI

Historically, there have been mainly two ways to approach artificial intelligence: symbolic AI (also called GOFAI, good old-fashioned artificial intelligence, or classical AI) and statistical AI. Recently, companies like IBM or Expert.ai have been focusing their research work on hybrid approaches. If you are interested in hybrid NLP/AI, I suggest reading the work of Perez (Expert.ai) et al., “A Practical Guide to Hybrid Natural Language Processing: Combining Neural Models and Knowledge Graphs for NLP”, and the work of Alexander Gray (IBM), “Logical Neural Networks: Toward Unifying Statistical and Symbolic AI”.

Statistical AI might be considered an umbrella term that includes frameworks and methods ranging from purely statistical ones (e.g. Moses Statistical Machine Translation, GIZA++) to machine learning and deep learning methods and techniques. However, please be aware that machine learning is not just glorified statistics; there is a difference between classical statistical methods and ML/DL.

The main difference between symbolic and statistical artificial intelligence can be understood, in layman’s terms, by referring to the distinction between deductive and inductive thinking. Symbolic AI might be considered deductive: given a set of rules interpreting symbols, a conclusion is deduced. Conversely, machine learning and deep learning might be considered inductive: they find hidden patterns in the data. More precisely, tensors are passed through a function, and a trend is inferred.

For this reason, statistical AI needs a relatively large amount of training data and high computational power. By contrast, developing symbolic AI models (often referred to as expert systems) requires less computational power, and their accuracy is not directly correlated with the amount of available training data.
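The deductive/inductive contrast can be sketched in a few lines of Python. Everything here is an illustrative assumption: the cue lexicon, the toy training sentences, and the count-based “learner”, which is a deliberately simplified stand-in for real model fitting.

```python
from collections import Counter

# Symbolic (deductive): hand-written rules over known symbols deduce a label.
# This tiny cue lexicon is a made-up toy example, not a real rule base.
NEGATIVE_CUES = {"refund", "broken", "complaint"}

def symbolic_label(text):
    return "negative" if NEGATIVE_CUES & set(text.lower().split()) else "other"

# Statistical (inductive): induce per-word evidence from labeled examples.
# A simple word-count score stands in for actual parameter estimation.
train = [
    ("great product love it", "positive"),
    ("broken item want refund", "negative"),
    ("love the fast delivery", "positive"),
]
counts = {"positive": Counter(), "negative": Counter()}
for text, label in train:
    counts[label].update(text.lower().split())

def statistical_label(text):
    scores = {lbl: sum(c[w] for w in text.lower().split())
              for lbl, c in counts.items()}
    return max(scores, key=scores.get)

print(symbolic_label("I filed a complaint"))          # negative
print(statistical_label("love this great delivery"))  # positive
```

Note the trade-off the paragraph above describes: the symbolic model needs zero training data but only covers what its authors wrote down, while the statistical one generalizes from data and improves as the data grows.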

Ok, but why do those definitions matter when choosing a career path in NLP?

For a simple reason: classical NLP and rule-based NLP are not dead; they are heavily used in the industry, often as part of complex pipelines where rule-based approaches and ML are intertwined. And if you aim to get into research, it is worth knowing that symbolic approaches might be useful for Quantum Natural Language Processing, Neuro-Symbolic Artificial Intelligence, and explainability. Rule-based approaches are also useful when building ontologies, knowledge graphs, lexicons, and pronunciation disambiguation models for TTS software. So, if you want to answer the question “what do I enjoy most among the options the NLP job market can currently offer?”, you should start to dig into the current situation, trends, and possible developments in the NLP domain. Here is some food for thought.

Rule-based NLP (classical NLP) is not dead, nor will it die soon


TensorFlow, Keras, GPT-3, Scikit-Learn, neural networks, deep learning frameworks, spaCy. They are great tools, but you should not forget old-school rule-based NLP. Companies were in fact developing NLP solutions long before the era of pre-trained language models and Hugging Face models available on GitHub. How was it done? With rule-based expert systems. Most people have no idea what rule-based natural language processing is. Even worse, a lot of people without any linguistics background (and little or no work experience) unfortunately believe that classical AI and rule-based expert systems are exclusively for non-technical linguists. This is simply wrong.

Do you want to know the most common, and likely one of the most efficient, approaches to disambiguating non-homophone homographs to improve the pronunciation correctness of a TTS voice? Rule-based approaches.
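A minimal sketch of what such a rule can look like: given a part-of-speech tag from an upstream tagger, pick the pronunciation of a heteronym like “lead”. The mini-lexicon, the tag names, and the ARPAbet-style pronunciation strings are all illustrative assumptions, not a production TTS front end.

```python
# Hedged sketch: rule-based pronunciation disambiguation for heteronyms
# (non-homophone homographs). Real TTS front ends use much larger lexicons
# and richer context rules; this toy table is an assumption.
HETERONYMS = {
    ("lead", "NOUN"): "L EH D",        # the metal
    ("lead", "VERB"): "L IY D",        # to guide
    ("bass", "NOUN_MUSIC"): "B EY S",  # low-pitched sound
    ("bass", "NOUN_FISH"): "B AE S",   # the fish
}

def pronounce(word, pos_tag, default=None):
    """Return a pronunciation for (word, POS tag), or a fallback."""
    return HETERONYMS.get((word.lower(), pos_tag), default)

print(pronounce("lead", "NOUN"))  # L EH D
print(pronounce("lead", "VERB"))  # L IY D
```

The point is that the disambiguation decision is fully inspectable: a mispronunciation traces back to one rule or one missing lexicon entry, which is exactly why rules dominate this task.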

A client asks for a specific in-domain categorization and NER/extraction/data mining model. They give you an extremely small set of documents, let’s say 50 to 100 PDFs. Do you expect good results from a neural network under these conditions? I would say no. Conversely, if you need high precision, quick results, and the fast development of a model that must reach around 80 to 90% F1 exclusively on a specific type of in-domain documents, rule-based expert systems are the solution. You might think this cannot happen that frequently, but it is quite common, even when the client is a big corporation.

In addition, legislators seem to be gradually introducing “explainability” as a legal requirement for ethical AI. Rule-based NLP is explainable, while neural networks are black boxes. Therefore, you should understand what the benefits of rule-based approaches to NLP tasks are and when to use them. You might think that low computational power and little or unlabelled training data are uncommon scenarios, but they are common. And currently, classical AI deals with those scenarios quite efficiently.

Does that mean that rule-based NLP is better than ML/DL?

That is absolutely not what I am saying. What I am saying is: as an NLP/Computational Linguist you must know which method is better suited to the specific scenario you are facing, especially if you want to work in consultancy firms or startups/early scale-up companies.

Are you asked to develop an abstractive automatic summarization model? Focus your efforts on testing different neural networks and sentence embeddings, and make use of pre-trained language models. Are you asked to develop a Named Entity Recognition model with a few documents, a short deadline, a fixed template, and a list of fixed entities provided by the client? All you need might be well-written regexes. Not as fancy as a fine-tuned transformer, sure. But cheaper, quicker, and effective.
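To make the regex point concrete, here is a sketch of a tiny rule-based extractor for a fixed-template document. The entity types and patterns (invoice IDs, ISO dates, amounts) are hypothetical examples, not a real client specification.

```python
import re

# Hypothetical fixed-template entities: the formats below are assumptions
# standing in for whatever the client's template actually guarantees.
PATTERNS = {
    "INVOICE_ID": re.compile(r"\bINV-\d{6}\b"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "AMOUNT": re.compile(r"\b\d+(?:\.\d{2})? EUR\b"),
}

def extract_entities(text):
    """Return (label, surface form) pairs for every pattern match."""
    return [(label, m.group())
            for label, rx in PATTERNS.items()
            for m in rx.finditer(text)]

doc = "Invoice INV-004217 issued on 2021-08-25 for a total of 1499.00 EUR."
print(extract_entities(doc))
# [('INVOICE_ID', 'INV-004217'), ('DATE', '2021-08-25'), ('AMOUNT', '1499.00 EUR')]
```

On a genuinely fixed template, a handful of such patterns can reach high precision in an afternoon, with every extraction traceable to a single rule.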

In a nutshell: if you want to develop and deploy NLP solutions for customers, or you must quickly find efficient yet affordable solutions in companies that are growing and have limited early-stage investment, it is critical not to be affected by your own bias towards a specific type of technology.

Get acquainted with NLP/NLU business solution development toolkits (examples: IBM Watson, IntelliJ Expert.ai Studio Core, Amazon Comprehend, Expert.ai Platform), when it is convenient to use them, and when it is better to rely on Scikit-Learn or PyTorch/TensorFlow. I am not saying that you should master everything, nor create mockup projects for each technology in your portfolio. But be aware of the great variety of NLP toolkits and NLU development solutions in the market.

Pre-trained language models, Ontologies, Knowledge Graphs, Linked Data


While it is important to take rule-based expert systems and hand-coded knowledge engineering into consideration, it is also crucial to know what is going on with pre-trained language models, Transformers, BERT, state-of-the-art NLP research and technologies, ontologies, and Knowledge Graphs as Linked Open Data.

Plenty of material has already been written, and you should already be familiar with those keywords if you want to land a career in the Natural Language Processing or Natural Language Understanding domain. In case you want a quick catch-up, I suggest reading “Evolution of Language Models: N-Grams, Word Embeddings, Attention & Transformers”. It summarizes the main developments in NLP from the 1950s up to today, “the era of pre-trained language models”, though it does not dig into Knowledge Graphs, Knowledge Bases, and Linked Open Data.

There is a chance you already have some knowledge of what DBpedia, Wikidata, and SPARQL are, since Knowledge Graphs have recently been getting a lot of attention in the NLP/NLU domain, both in research and in industry. But despite what many people think, they are not recent inventions. The attempt to structure natural language in a graph data format to enhance machine comprehension is much older than you might think. Just to give you an idea, companies like Cyc (founded in 1984) and Expert.ai (founded in 1989) have been selling NLP/NLU business solutions that leverage proprietary knowledge graphs for decades.
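At its core, a knowledge graph is a set of (subject, predicate, object) triples queried by pattern matching, which is what SPARQL does over RDF stores like DBpedia and Wikidata. The sketch below mimics that idea in plain Python; the entity and relation names are illustrative (the founding years come from the paragraph above), and a real system would use an RDF store and a SPARQL endpoint instead.

```python
# A toy triple store: real knowledge graphs use RDF and SPARQL endpoints.
triples = {
    ("Expert.ai", "foundedIn", "1989"),
    ("Cyc", "foundedIn", "1984"),
    ("Expert.ai", "industry", "NLP"),
    ("Cyc", "industry", "NLP"),
}

def query(subject=None, predicate=None, obj=None):
    """Match triples against an optional (s, p, o) pattern, SPARQL-style."""
    return sorted(
        (s, p, o) for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    )

# Analogous to: SELECT ?s WHERE { ?s :industry "NLP" }
print([s for s, _, _ in query(predicate="industry", obj="NLP")])
# ['Cyc', 'Expert.ai']
```

The value of the graph format is exactly this: once knowledge is stored as triples, any slot of the pattern can be left open and filled by matching, which is what makes ontologies and Linked Open Data machine-queryable.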

This is to say: you should consider natural language as data and equip yourself with a set of hard skills broad enough to let you think critically about the best approach to solve an NLP problem in each scenario. Keep yourself informed, read papers. This will help you keep a grasp of what is going on in the NLP domain.

Research vs NLP-Business to Non-NLP-Business (a chain of B2B in the NLP industry)


If you have the habit of reading papers and following the main NLP and Computational Linguistics conferences, you have most likely noticed one important thing: corporations excel in NLP research. Companies like Google and Amazon have a significant number of important papers, often exceeding what academic research can do, due to a huge difference in economic resources. At the same time, companies like Hugging Face are disrupting the NLP world by replicating papers and providing free code and pre-trained models that businesses can use. The canonical difference between “academic research” and “industrial research” seems to be more and more a gray area.

In my opinion, this is resulting in a new dichotomy: research vs. NLP-based businesses offering products and services to non-NLP businesses. Universities and extremely big companies are leading the research area. The latter category also gains from lending hardware, servers, cloud computing, virtual machines, and APIs to other NLP-based firms. Those NLP-based firms then develop and sell NLP solutions to non-NLP companies that need digitalization and automation. Depending on its size, an NLP firm may or may not own proprietary technology. If it does not, it relies exclusively on standard NLP libraries, pre-trained models, and ML/DL training from scratch. If it does, common open-source frameworks are used in combination with the firm’s core technology. Because of this phenomenon, there are mainly five categories of entry-level jobs in the NLP world:

Annotator/Linguist

Someone with extremely good linguistic skills and not necessarily a programming/scripting background. Extremely valuable for companies to maintain knowledge bases, build ontologies, and improve annotations of both audio and text data. If the company has proprietary rule-based languages or frameworks, the linguist may also be trained on those instruments, approaching computational linguistics methods.

Knowledge Engineer/Computational Linguist/Language Technologist

Bigger companies have a portfolio of big clients and relatively high income. They need someone who can quickly learn and master proprietary technology while also knowing the main Python NLP libraries, with some knowledge of machine learning and deep learning. As the company is big, web developers usually deal with backend and frontend.

NLP as MLOps

Companies where NLP is regarded mainly as a machine learning subdomain, so the entry-level work resembles general MLOps.

Backend Web/Software Developer with NLP skills

These can be early-stage businesses or, more often, relatively small consultancy firms that need to hire someone with expertise and experience in NLP/NLU. However, it is not uncommon that, in this scenario, the job duties also require good knowledge of SQL and APIs, plus web and/or software development skills.

NLP Engineer/Researcher

It can be industrial research or a PhD. Good knowledge of mathematics, machine learning, and deep learning is required. The main activity consists of researching complex issues crucial to the employer. Finding a research project you like might be complex, but it can allow you to work on emerging NLP sub-fields and non-canonical tasks, such as end-to-end speech translation, multi-modal NLP, hybrid NLP, and explainable deep learning.

What about working in NLP startups?

Tough question, and I think it really depends on the specific startup: the funds they have, their growth plan, and whether their exit strategy is to be acquired by a bigger company (in which case the focus is on R&D) or to provide B2B or B2C products and services.

Choose your Career Path in Natural Language Processing and Understanding (NLP and NLU)


That said, now let’s focus on what matters most: yourself and the choice to be made. Knowing the industry trends, the possible developments, and the current market situation is only half of the work. You also have to figure out what skills you currently have and what your goal is in the long run. Ask yourself these two questions:

1. What is my background: languages and/or linguistics, programming, or both?

2. What skills do I want to get from the job I am applying to?

There are two benefits to honestly answering the questions above. The first is that, by realistically assessing your skills, you can stop batch-sending your CV and focus on job announcements that are compatible with your current skill set. This will increase the number of interviews and answers you get. Secondly, by knowing what skills you want to acquire, you increase your chances of keeping the job and growing professionally. In addition, having the answers to those questions gives you the data required to assess your interest in being part of a specific company, which leads to the final question:

What to look for in a company: research papers, experience with corporate clients?


If you recently finished university, you might consider research papers an extremely important metric. If a company/institution has publications in important conferences, such as EMNLP or SemEval, this already gives you an insight into the importance your employer gives to scientific and public research in the NLP/NLU domain. However, this is not the only metric to look at. There might be companies without scientific publications because of the need to protect proprietary technology that gives them a disruptive advantage in the B2B market. But how can you know? My opinion is: look at your prospective employer’s data with the mindset of an investor.

Would you be more willing to invest your money:

1. in a company that has a few extremely good research papers and almost no clients, income, or scalable and efficient cash-flow strategy?

2. in a company listed in the public market, founded decades ago, with big corporate clients and a stable cash flow?

There might be companies that have years of resilience, corporate clients, a public-market listing, and are also leading scientific research and presiding over NLP conferences, but how many of them? Not that many. And usually, they are quite big. Three examples: Apple, Google, Facebook.

How did I decide which job offer to accept?

a) What do I enjoy most among the options the NLP job market can currently offer? Working with clients, developing ad hoc NLP solutions.

b) What NLP-related skills do I currently have? Python and standard NLP-related libraries, basic SQL, phonological and linguistic annotation, dataset creation and cleaning, forced alignment, text manipulation, regexes, theoretical knowledge of machine learning and deep learning, beginner-level machine learning implementation, usage of pre-trained models/embeddings in pipelines and projects.

c) What skills do I currently lack and would like to acquire? Advanced machine learning and deep learning implementation skills, and expertise in at least one more programming language, potentially useful also for web development (like JavaScript).

d) What do I want to get out of the work experience? Experience with big corporate clients, tasks compatible with my current skills (point b), the possibility to gradually acquire the new skills I am interested in, and being in a company that is researching hybrid NLP technologies.

Conclusion

In this article I explained the decision-making process, the conceptual framework, and the considerations that led me to land my current job at Expert.ai.

Let me know your thoughts on this brief guide to looking for a junior job position in the Natural Language Processing industry: whether you agree or disagree with my viewpoints regarding the current NLP job market situation and industry trends, and, more importantly, whether you found this content helpful.

Cheers,

Giuseppe

Knowledge Engineer at Expert.ai
