“Remember that you are a human being with a soul and the divine gift of articulate speech.” With these words, George Bernard Shaw has Professor Higgins remind Eliza Doolittle of one of the greatest skills she possesses as a human: the gift of speech and language. The line comes from Shaw’s play Pygmalion, which was adapted into My Fair Lady, the famous musical whose film version starred Audrey Hepburn.
Now, imagine someone taking this quote as a challenge and bestowing a computer with the same gift. That’s exactly what MIT professor Joseph Weizenbaum did when he developed the very first chatbot: ELIZA.
To set the context: in the play, Professor Higgins, a snobbish phonetician, accepts a wager to give speech lessons to Eliza Doolittle, a Cockney flower girl, so that she may pass as a duchess. The story then follows Eliza as she learns the skill of articulate elocution. Weizenbaum took this as the basis for his program and named it ELIZA. According to him, ELIZA could speak in a way that created an illusion of understanding the person on the other end, reflecting a linguistic transformation like Eliza’s. He also noted that, just as Eliza Doolittle grew through constant conversational exchange, ELIZA could be incrementally improved as its users taught it by extending the scripts it ran on.
ELIZA was intended to explore communication between humans and machines. It came as a surprise, even a shock, to Weizenbaum when users attributed human-like feelings of understanding and empathy to ELIZA.
Let’s now delve into the technical details of the model to satiate our inquisitive NLP minds. ELIZA was developed to play the role of a psychotherapist; its most famous script, DOCTOR, simulated a Rogerian therapist. At its core it was a very simple model built on pattern matching and substitution. The model was fed scripts that told it how to respond to a user’s input, creating a conversational interaction that imitated what happens in an actual doctor’s office. ELIZA also differed notably from its human namesake: it wasn’t human enough to learn patterns beyond those fed into it through interaction alone.
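The flavor of these interactions is captured by the oft-quoted opening of the sample conversation in Weizenbaum’s 1966 paper (the user types in lowercase; ELIZA replies in capitals):

> Men are all alike.
> IN WHAT WAY
> They’re always bugging us about something or other.
> CAN YOU THINK OF A SPECIFIC EXAMPLE
> Well, my boyfriend made me come here.
> YOUR BOYFRIEND MADE YOU COME HERE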
Weizenbaum wrote ELIZA in MAD-SLIP (the Michigan Algorithm Decoder extended with his SLIP list-processing library) on an IBM 7094 running MIT’s time-sharing system of the era. When developing the model, Weizenbaum had to identify and combat the fundamental problems that a computerized program would have to overcome in order to seem convincingly human:
- The program should be able to identify the most important keyword.
- The program should be able to discover some minimal context from the identified keyword.
- The program should be able to choose appropriate transformations to apply to the words.
- The program should be able to generate responses intelligently in the absence of keywords.
- The program should be able to edit or extend the script fed to it.
The pattern matching was done through an algorithm that analyzes the input sentences using decomposition rules, which are triggered by keywords in the sentences. To break it down: the program reads a transformation rule for each keyword from its script. When it identifies a keyword in the text provided by the user, it applies the transformation associated with that keyword.
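To make this concrete, here is a minimal sketch of that decomposition-and-reassembly step in Python. The rule, the wildcard notation, and the slot syntax are simplified stand-ins for ELIZA’s mechanism, not Weizenbaum’s original code:

```python
# A sketch of ELIZA-style decomposition and reassembly.
# The rule at the bottom is illustrative, not one of Weizenbaum's originals.

def decompose(pattern, words):
    """Match pattern tokens against words; '*' matches any run of words.
    Returns the word runs captured by each '*', or None on failure."""
    if not pattern:
        return [] if not words else None
    if pattern[0] == '*':
        for i in range(len(words), -1, -1):  # try the longest capture first
            rest = decompose(pattern[1:], words[i:])
            if rest is not None:
                return [words[:i]] + rest
        return None
    if words and words[0].lower() == pattern[0]:
        return decompose(pattern[1:], words[1:])
    return None

def reassemble(template, parts):
    """Fill numbered slots like '(2)' with the captured word runs."""
    out = []
    for token in template.split():
        if token.startswith('(') and token.endswith(')'):
            out.extend(parts[int(token[1:-1]) - 1])
        else:
            out.append(token)
    return ' '.join(out)

# The keyword 'mother' triggers this decomposition/reassembly pair.
# (Real ELIZA also reflected pronouns, e.g. 'me' -> 'you'.)
parts = decompose(['*', 'my', 'mother', '*'],
                  'well my mother cooks for me'.split())
if parts is not None:
    print(reassemble('Your mother (2) ?', parts))
    # -> Your mother cooks for me ?
```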
To elaborate, the keywords have a rank, or precedence number. As the algorithm scans the sentence for keywords from left to right, it drops an already-found keyword in favor of a higher-ranked one. The algorithm also treats a period or comma as a delimiter: when one is encountered after a keyword has already been found, it discards the rest of the input text and keeps only the portion containing that keyword.
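A sketch of that scanning logic might look like the following; the keywords and their ranks here are invented for illustration:

```python
# Illustrative keyword ranks (invented for this sketch); higher wins.
RANKS = {'computer': 50, 'mother': 10, 'i': 0}

def pick_keyword(text):
    """Scan left to right for the highest-ranked keyword.
    A ',' or '.' seen after a keyword has been found discards the rest."""
    best = None
    kept = []
    for token in text.lower().replace(',', ' , ').replace('.', ' . ').split():
        if token in {',', '.'}:
            if best is not None:
                break  # keyword already found: ignore the remaining text
            continue
        kept.append(token)
        if token in RANKS and (best is None or RANKS[token] > RANKS[best]):
            best = token
    return best, ' '.join(kept)

print(pick_keyword("I like my mother, but computers scare me"))
# -> ('mother', 'i like my mother')
```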
For each class of conversation, the algorithm uses a SCRIPT, which consists of all the keywords and their associated transformations. The script is not an inherent part of ELIZA; rather, it is data that ELIZA uses. Thus, ELIZA is not bound to any fixed set of recognition patterns, or even to a particular language.
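That separation of engine and data can be pictured as something like the structure below. The entries are a guess at the flavor of a script, not excerpts from the real DOCTOR script:

```python
# A sketch of a script as pure data: keyword -> (rank, rules), where each
# rule pairs a decomposition pattern with candidate reassembly templates.
# All entries are illustrative, not taken from the real DOCTOR script.
SCRIPT = {
    'mother': (10, [
        (['*', 'my', 'mother', '*'],
         ['Tell me more about your family .',
          'Your mother (2) ?']),
    ]),
    'always': (5, [
        (['*', 'always', '*'],
         ['Can you think of a specific example ?']),
    ]),
}

# Swapping SCRIPT for different data changes the persona, or even the
# language, without touching the matching engine at all.
```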
ELIZA’s pattern-matching capabilities were similar to regular-expression and string-matching techniques. Although these methods look rudimentary now, they were groundbreaking at the time, laying the foundation for more advanced NLP algorithms. But because of these primal algorithms, ELIZA’s conversational flow was far more scripted than the dynamic flow seen in today’s chatbots.
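In modern terms, a single decomposition rule maps fairly directly onto a regular expression with capture groups. A rough Python equivalent (the pattern itself is invented for illustration) would be:

```python
import re

# Roughly the same decomposition rule expressed as a modern regex.
rule = re.compile(r'.*\bmy mother\b\s*(.*)', re.IGNORECASE)

match = rule.match('Well, my mother cooks for me')
if match:
    print(f'Your mother {match.group(1)}?')
    # -> Your mother cooks for me?
```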
The finer details of the pattern-matching machinery would lead us into a labyrinth of transformation algorithms and transition matrices, more mathematical depth than I intended for this article.
So, to conclude: building the first chatbot with NLP techniques in the 1960s, inspired by a popular play, was a very ambitious move in the history of NLP. ELIZA continues to inspire developers working on AI and chatbots. The model has earned lasting recognition and was one of the first programs to attempt the Turing Test. In a world subsumed by ChatGPT and GenAI, the intriguing history of the first-ever chatbot piqued my curiosity and led me here to share it with the world. My interest in literature was the icing on the cake: learning how a play helped a professor build this early legacy in NLP only added to the appeal.
If anyone feels the need to delve into deeper mathematical details, you can read Weizenbaum’s original paper, “ELIZA—A Computer Program for the Study of Natural Language Communication Between Man and Machine” (Communications of the ACM, 1966).
We can take away from the whimsical world of ‘My Fair Lady’ that language is the ultimate accessory and transformation knows no bounds, for in the end, as Professor Higgins might jest, ‘The rain in Spain stays mainly in the plain, but the magic of the theater lingers long after the curtain falls.’