Who Stands Behind Your New Voice Assistant, or What I Learned as a Language Annotator for Big Tech

Have you ever caught yourself wondering what is behind the new “Captions” feature on Instagram, or how your bright new Alexa understands you right off the bat?

Natalia Kuzminykh
3 min read · Feb 12, 2022
Photo by Ivan Bandura on Unsplash

Echo, HomePod, Alexa, Cortana, Bixby, Siri: this list has grown significantly in the last few years, and with the shift to a hybrid model of working, even more such technologies await us in the near future.

But what makes it all possible, and why does it stay hidden from the ordinary customer? I am going to share what I learned while working as a Language Annotator for Big Tech companies, and how you too can make money from it.

What is language annotation?

The growing number of machine learning algorithms creates a steady demand for high-quality data. However, the true challenge comes from data annotation, which is an essential step in building machine learning models and has to be done by human annotators. Typically, it involves transcribing and/or labelling large amounts of raw audio material, followed by analysis.

Depending on the project, an annotator may be asked to focus on different tasks, such as:

  • Speech-to-Text Transcription improves an ML model's ability to capture varied utterances once it has been trained on the transcripts. In particular, this method made it possible for TikTok and Instagram to display the speech from short videos on screen;
  • Audio Classification, by contrast, focuses on distinguishing real voices from other sounds and background noise (such as radio, TV or animal sounds). This type of audio annotation is crucial for the development of voice and digital assistants, as it helps them recognize who is giving the voice command;
  • Natural Language Utterance involves categorizing raw speech (from semantics and context to dialects and intonation) in order to train virtual assistants and chatbots to recognize the user's intent;
  • Speech Labelling is now widely used by music streaming services, such as Spotify, to match a given recording with its lyrics.
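
To make these tasks a little more concrete, here is a minimal Python sketch of what a single annotated audio clip might look like once an annotator has finished with it. It is only an illustration: the class names, field names and label values (`Segment`, `sound_type`, `"play_music"` and so on) are my own assumptions, not the format used by any particular company or tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One labelled stretch of an audio clip (all field names are hypothetical)."""
    start_s: float                # segment start, in seconds
    end_s: float                  # segment end, in seconds
    sound_type: str               # audio classification: "speech", "tv", "animal", ...
    transcript: str = ""          # speech-to-text transcription, empty for non-speech
    intent: Optional[str] = None  # utterance label, e.g. "play_music"


@dataclass
class AnnotatedClip:
    clip_id: str
    segments: List[Segment] = field(default_factory=list)

    def speech_only(self) -> List[Segment]:
        """Keep only the segments the annotator marked as real speech."""
        return [s for s in self.segments if s.sound_type == "speech"]

    def full_transcript(self) -> str:
        """Join the transcripts of the speech segments in time order."""
        ordered = sorted(self.speech_only(), key=lambda s: s.start_s)
        return " ".join(s.transcript for s in ordered)


# A made-up clip: background TV, a voice command, a dog, another command.
clip = AnnotatedClip(
    clip_id="clip-001",
    segments=[
        Segment(0.0, 1.2, "tv"),
        Segment(1.2, 3.0, "speech", "play some jazz", intent="play_music"),
        Segment(3.0, 4.1, "animal"),
        Segment(4.1, 5.0, "speech", "a bit louder", intent="volume_up"),
    ],
)

print(clip.full_transcript())
print([s.intent for s in clip.speech_only()])
```

In this sketch the `sound_type` label plays the role of audio classification, `transcript` holds the speech-to-text output, and `intent` stands in for the utterance categorization. A real project would define its own label set and guidelines.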

How to become an annotator?

Although annotation is vital for developing numerous algorithms, it is highly repetitive and time-consuming. Therefore, tech giants choose not to assign such tasks to highly qualified engineers, preferring to outsource data annotation to third-party companies.

Thus, if you are looking for remote job opportunities in annotation, I would recommend looking first at:

  • Telus International AI specializes in many languages (from English to Vietnamese) and provides psychological assistance for its annotators;
  • Lionbridge: its AI division recently became part of Telus International AI. Hence, it is also ready to work with annotators from any country. Among its other services are translation, linguistics, and game testing;
  • Appen is now working closely with the growing Chinese market. It is also possible to contribute to one-time tasks there, including on projects that seek only experts;
  • RWS Moravia Group has recently started offering freelance opportunities for linguistics graduates.

If you enjoy reading stories like these and want to support me as a writer, consider signing up to become a Medium member. It’s $5 a month, giving you unlimited access to stories on Medium. If you sign up using my link, I’ll earn a small commission.

You can also support the sleepless nights I spend creating content by buying me a coffee.

Natalia Kuzminykh

NLP Developer & Conversational AI | A linguist from Italy who is learning to navigate her passion for technology