How can we train a computer to feel?

A Look into the R&D Process for Sentiment Analysis

Resultid Team
Resultid Blog
6 min read · Aug 29, 2022


Dall-E mini result for “machine learning”

Computers are great at many things. Your laptop can perform billions of computations each second, render entire cityscapes at hairline resolution, and store massive libraries of information on a hard drive barely the size of your palm. But despite these strengths, there are many seemingly mundane things that computers can’t do. Computers can’t feel. Nor can they intuit the world as easily as we can. Consider the task of sentiment analysis, for instance, where we aim to use automated processes to swiftly identify whether the opinions in a body of text are positive or negative. Ask any person to read a product review and summarize what the writer is feeling; for them it’s a trivial task that can be completed with high accuracy. “The reviewer liked the product,” our human reader can swiftly conclude. It’s not so simple for a computer, which has no “natural propensity” for anything, let alone language and emotional perception. Everything a computer does must be taught to it. So how do we teach it to “feel”, or at least to recognize the patterns that indicate feelings? How can we translate this task into something a computer can learn to do? Let’s explore some of the considerations and principles behind designing a natural language processing (NLP) model.
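To make the task concrete before we dig in, here is a minimal sketch of what asking a computer for sentiment looks like in practice. It uses the open-source Hugging Face transformers library as one popular off-the-shelf option, not necessarily the tooling described in this post:

```python
# A minimal sentiment-analysis sketch using Hugging Face `transformers`
# (one common open-source option; not necessarily our stack).
from transformers import pipeline

# Downloads a default pretrained sentiment model on first use.
classifier = pipeline("sentiment-analysis")

reviews = [
    "This product exceeded all of my expectations.",
    "It broke after two days. Very disappointing.",
]

for review in reviews:
    result = classifier(review)[0]  # e.g. {'label': 'POSITIVE', 'score': 0.99}
    print(f"{review!r} -> {result['label']} ({result['score']:.2f})")
```

A pipeline like this wraps exactly the steps unpacked in the rest of this post: transforming text into numbers, and learning patterns over those numbers from labeled examples.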

Transforming Language

Dall-E mini result for “computer transforming language”

It’s important to recognize that computers interpret the world very differently from how we do. Words and sentences are intrinsically meaningless to a computer, so to leverage its many advantages for textual interpretation, we must first transform text into a form it can “understand”: a number or a collection of numbers. Exactly how this transformation occurs varies from model to model, but many NLP modelers opt for transformer architectures, which leverage self-attention to identify structural and contextual features of the text at hand. Machine learning jargon can be confusing, but the guiding principle behind this transformation step is simple: generate useful numerical representations of the text that allow the computer to recognize patterns, and thereby discern meaning and extract insights more quickly, and possibly more thoroughly, than a human reader could.

The BERT transformer model
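As a rough illustration of this text-to-numbers step, the sketch below runs a sentence through a pretrained BERT model via the Hugging Face transformers library; the checkpoint name is just a common public default, not a recommendation:

```python
# A sketch of the "text -> numbers" transformation with a pretrained BERT.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

text = "The reviewer liked the product."
inputs = tokenizer(text, return_tensors="pt")
print(inputs["input_ids"])  # the sentence as a sequence of integer token IDs

with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per token: shape (1, num_tokens, 768) for BERT-base.
print(outputs.last_hidden_state.shape)
```

Each token’s vector depends on its neighbors via self-attention, which is what lets these representations capture context rather than just vocabulary.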

Understanding Structure

Dall-E mini result for “syntax”

An important property of language is its structure (syntax, grammar, and so on), so successful models must learn not only from individual words, but from their interactions and positions within sentences and larger passages. Individual words carry meaning, but sentences multiply in complexity because of the number of possible readings. “I never said I gave her the money” has seven different meanings depending on which word the speaker emphasizes. “This is great” and “This is not great” differ structurally by only a single word, but their meanings are opposite. Comparing “I love my chair because it is soft” with “I’m able to enjoy my book because it is quiet,” we can see that “it” functions differently in each sentence, despite sitting identically after “because”. With brains wired for grammar, we understand that “it” in the first sentence refers to the chair, while “it” in the second refers to the reader’s surroundings. While language learning just sort of happens for us as humans, “teaching” it to a computer takes a fine degree of mapping and describing a language’s underlying rules. It’s far more difficult for us to formalize these rules and communicate them to computers than it is to teach a baby its first language. And while learning a new language is a somewhat daunting task for most adults, teaching language to a computer is all the more technically difficult.

A grammarian’s classic sentence.
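A toy example makes the structural point: if we represent sentences as unordered “bags” of words, as many early NLP systems did, sentences with different meanings can collapse into identical representations. The sentences below are invented for illustration:

```python
# A toy illustration: a pure bag-of-words view throws away word order,
# so differently ordered (and differently meaning) sentences look identical.
from collections import Counter

a = "the dog chased the cat"
b = "the cat chased the dog"

bag_a = Counter(a.split())
bag_b = Counter(b.split())

print(bag_a == bag_b)  # True: identical "bags", opposite meanings
```

Position-aware architectures like the transformers described above avoid this collapse by encoding where each word sits relative to the others.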

Applying Machine Learning

Dall-E mini result for “computer following rules to rapidly sort through huge amounts of data”

This is where machine learning comes into play. A huge advantage of machine learning algorithms is that they can approximate solutions fairly well for tasks that humans can’t easily develop formal solutions for. In sentiment analysis, for instance, we can intuitively discern between positive and negative sentiments in text, but it’s difficult to formalize that thought process into a series of well-defined rules, and even if we tried, we’d have to account for all sorts of possibilities that might arise. Instead of writing the rules ourselves, we can let a machine learning model infer them from labeled examples, and then use that model to rapidly interpret large collections of text. In a sense, the machine helps us teach it. As long as we can define the task at hand, we can leverage these structures and algorithms to develop powerful and effective solutions.
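As a minimal sketch of this learn-from-examples idea, the snippet below fits a simple scikit-learn classifier on a handful of made-up labeled sentences rather than hand-coding any sentiment rules:

```python
# Instead of hand-writing sentiment rules, fit a simple model on labeled
# examples and let it infer the patterns. All data here is made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "I had a great time", "Absolutely loved it", "What a wonderful day",
    "Today was a bad day", "This was terrible", "I hated every minute",
]
labels = [1, 1, 1, -1, -1, -1]  # 1 = positive, -1 = negative

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

print(model.predict(["What a great day"]))  # a learned rule, not a coded one
```

The model here is far simpler than a transformer, but the principle is the same: the rules come from the data, not from a programmer’s pen.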

Datasets & Distributions

Dall-E mini result for “gigantic data set blowing away”

In the context of sentiment analysis specifically, we can use datasets of sentences paired with human-labeled sentiment values (e.g., “I had a great time”: 1, “Today was a bad day”: -1). These datasets can either be found online or curated by hand, but regardless of their source, it’s important to recognize their strengths, their limitations, and their relevance to what you’re trying to teach a computer to analyze. Training a model on a dataset of car reviews and testing it on a set of tweets about a political candidate, for example, is like reading Macbeth to prepare for a calculus test. Models should learn from data that resembles what they’ll ultimately analyze. Developing and iterating on models is an arduous process, and a significant consideration throughout is ensuring that the training examples relate to the end goal.
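To illustrate the Macbeth problem, here is a hypothetical sketch in which a model trained on placeholder car-review data is scored on placeholder political tweets; in a real experiment, with far more data per domain, you would expect the out-of-domain score to drop noticeably:

```python
# A hypothetical sketch of domain mismatch. All sentences and labels are
# placeholders; a real experiment would use hundreds of examples per domain.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

car_reviews = [
    ("Smooth ride and great mileage", 1),
    ("Handles beautifully on the highway", 1),
    ("The engine rattles constantly", -1),
    ("Brakes felt spongy and unsafe", -1),
]  # ...plus many more pairs in practice
political_tweets = [
    ("Strong, honest debate performance tonight", 1),
    ("Another broken campaign promise", -1),
]  # ...likewise

train_texts, train_labels = zip(*car_reviews)
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# Scoring on a different domain than we trained on: Macbeth before calculus.
tweet_texts, tweet_labels = zip(*political_tweets)
print("out-of-domain accuracy:", model.score(tweet_texts, tweet_labels))
```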

Training & Iteration

Dall-E mini result for “iterative training model in sunlight computer lovesong”

The training process itself leverages a computer’s ability to perform millions of computations each second and runs without direct human supervision. After each training cycle, however, developers return to modify and refine their models in pursuit of ever higher performance. There are all sorts of factors to tweak, and each variation has benefits and drawbacks: in changing the learning rate, we weigh training speed against precision; in changing model complexity, we weigh representational power against overfitting. Developers continually iterate on and evaluate their models to validate performance before settling on a final version. Only after myriad iterations layered upon each other does a single model development lifecycle come to a close.
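A schematic PyTorch training loop shows where these knobs live; the model, data, and values below are all placeholders for illustration, not an actual production setup:

```python
# A schematic training loop highlighting the knobs developers iterate on
# between runs. The model, data, and numbers are placeholders.
import torch
import torch.nn as nn

learning_rate = 1e-3   # too high: unstable training; too low: slow convergence
hidden_size = 64       # more capacity: better fit, but greater overfitting risk

model = nn.Sequential(
    nn.Linear(768, hidden_size),  # 768: stand-in for an embedding size
    nn.ReLU(),
    nn.Linear(hidden_size, 2),    # two classes: positive / negative
)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
loss_fn = nn.CrossEntropyLoss()

features = torch.randn(32, 768)        # stand-in for sentence representations
targets = torch.randint(0, 2, (32,))   # stand-in sentiment labels

for epoch in range(10):                # one "training cycle"
    optimizer.zero_grad()
    loss = loss_fn(model(features), targets)
    loss.backward()
    optimizer.step()
    # Between runs, developers inspect curves like this one and adjust knobs.
    print(epoch, loss.item())
```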

What’s Next?

Dall-E mini result for “computer perceiving feelings”

These are just some of the considerations at play when developing successful NLP models. This isn’t meant to be a comprehensive course, but it should shed light on some of the challenges and opportunities in the NLP research and development process. Machine learning and NLP are still relatively young fields, with room for improvement from any number of angles. We hope you’re as excited as we are to see where this field will take us!
