Pygmalion AI: Expectations’ Ability to Shape Reality

Akash Rawat
Published in Analytics Vidhya · 8 min read · May 30, 2023
Pygmalion and Galatea, Designed and sold by matintheworld

In the vast tapestry of our modern world, a remarkable technological marvel known as Pygmalion AI has woven its way into our lives, captivating hearts and minds alike. Like a modern-day incarnation of the ancient Greek myth, Pygmalion AI transcends conventional artificial intelligence, transforming the very fabric of our existence. With its fusion of cutting-edge machine learning algorithms and the enchanting power of natural language processing, Pygmalion AI has ushered in a new era of communication, personalization, and productivity.

Imagine a world where a virtual assistant possesses an almost uncanny understanding of our thoughts and desires, effortlessly responding to our every query and command. Pygmalion AI, with its profound ability to comprehend human language nuances, brings this fantastical vision to life.

“It has breathed life into our digital era, infusing it with a touch of magic and wonder. With each passing day, Pygmalion AI continues to captivate our imaginations, leaving an indelible mark on the hearts of those fortunate enough to experience its extraordinary embrace.”

Pygmalion AI in technical terms

Pygmalion AI is an open-source large language model (LLM) fine-tuned for chatting and roleplaying purposes. Earlier releases were based on EleutherAI’s GPT-J 6B, while the currently supported model is the 7B variant, based on Meta AI’s LLaMA. In a name like “GPT-J 6B”, “GPT-J” refers to the model class, and “6B” is the number of trainable parameters: roughly 6 billion.
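As a quick illustration, here is a minimal sketch of loading and querying the model with the Hugging Face transformers library; the PygmalionAI/pygmalion-6b repository id and the generation settings are assumptions for illustration:

```python
# Minimal sketch: load a Pygmalion checkpoint and generate a reply.
# "PygmalionAI/pygmalion-6b" is an assumed Hugging Face repo id;
# sampling settings below are illustrative, not recommended defaults.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "PygmalionAI/pygmalion-6b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "You: Hello! Who are you?\nBot:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs, max_new_tokens=60, do_sample=True, temperature=0.8
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```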

Some features of Pygmalion AI

  • can analyze trends, identify new niches, and generate text and media content for social media platforms,
  • offers better chat and role-play conversations than comparable general-purpose LLMs, with relatively minimal resources (see the prompt-format sketch after this list),
  • can be modified or redistributed, both the model and the code, since Pygmalion AI is open-source,
  • is regularly updated with new data to improve its performance.
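For chat and role-play, the Pygmalion model cards describe a persona-based prompt format. Below is a rough sketch of building such a prompt; the exact field names and markers are assumptions that may vary between model versions:

```python
# Rough sketch of the persona-style prompt format described in the
# Pygmalion model cards; treat the exact markers as assumptions.
def build_prompt(char, persona, history, user_msg):
    lines = [
        f"{char}'s Persona: {persona}",
        "<START>",                       # marks the start of a new chat
        *history,                        # prior "You: ..." / f"{char}: ..." turns
        f"You: {user_msg}",
        f"{char}:",                      # the model continues from here
    ]
    return "\n".join(lines)

print(build_prompt("Ava", "A cheerful android who loves astronomy.",
                   [], "Hello! Who are you?"))
```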

Furthermore, many aspects of Pygmalion AI, including the core model, are still under heavy development.

Story of Pygmalion

Pygmalion and Galatea by François Boucher

The concept of Pygmalion in machine learning is related to the real Greek Pygmalion story in that both involve the power of expectations and the ability to shape reality based on those expectations.

According to the story, Pygmalion was disgusted with the behavior of the women in his community, so he decided to create a statue of a perfect woman, one who would be pure, beautiful, and obedient. Pygmalion put all his skill and talent into his creation, and the statue he made was so lifelike and perfect that he fell in love with it.

Pygmalion’s love for the statue grew so intense that he began to pray to Aphrodite, the goddess of love, to give him a wife who would be just like the statue. Aphrodite, moved by his prayers, decided to grant Pygmalion’s wish and brought the statue to life.

The statue, now a real woman, and Pygmalion fell deeply in love and lived happily ever after. This story illustrates the power of expectation and belief in shaping reality.

Pygmalion effect in Machine Learning

In machine learning, the Pygmalion effect refers to the phenomenon where the expectations and biases of the creators or trainers of an algorithm influence its performance and outcomes. If the creators have preconceived notions about the data or the results they expect, those expectations can shape the selection of data, the parameters of the algorithm, or the evaluation of its performance. The result is a self-fulfilling prophecy: the algorithm’s outputs confirm the creators’ expectations.
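As a toy illustration (hypothetical, not from any real project), consider how cherry-picking evaluation data can make a weak model look perfect:

```python
# Toy illustration of the Pygmalion effect in evaluation: if we keep only
# the test examples our simplistic model is expected to handle, the
# measured accuracy "confirms" our expectations.
def naive_sentiment(text):
    """A deliberately simplistic classifier its creator believes in."""
    return "positive" if "good" in text.lower() else "negative"

full_test_set = [
    ("good movie", "positive"),
    ("really good", "positive"),
    ("not good at all", "negative"),          # negation breaks the rule
    ("terrible but goodhearted", "negative"), # so does this one
]

# Biased evaluation: keep only the examples the creator expects to work.
biased_test_set = [ex for ex in full_test_set
                   if naive_sentiment(ex[0]) == ex[1]]

def accuracy(dataset):
    return sum(naive_sentiment(x) == y for x, y in dataset) / len(dataset)

print(f"accuracy on biased set: {accuracy(biased_test_set):.0%}")  # 100%
print(f"accuracy on full set:   {accuracy(full_test_set):.0%}")    # 50%
```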

Thus, both the Greek Pygmalion story and the Pygmalion effect in machine learning demonstrate the power of expectations in shaping reality, whether it is through the creation of a living statue or the development of a biased algorithm. In both cases, it is important to be aware of our biases and expectations and to approach our creations and decisions with a clear and unbiased perspective.

Different Pygmalion models

  • Pygmalion-350M
  • Pygmalion-1.3B
  • Pygmalion-2.7B
  • Pygmalion-6B

Pygmalion-7B was recently introduced, but its performance is not dramatically different from Pygmalion-6B: the gap between them is only about 1 billion parameters. Even where a difference exists, it would take a highly observant individual to discern it when watching the models in action.

Diving deeper into the journey of Pygmalion models

Pygmalion-350M

Model Description: A proof-of-concept fine-tune of Facebook’s OPT-350M model, optimized for dialogue; it serves as a starting point for the larger-parameter models.

Disclaimer: The fine-tuning process included NSFW data. While SFW inputs generally yield SFW outputs, use this model at your own risk. It is not suitable for minors.

OPT (Open Pre-trained Transformer) Models: The OPT model, developed by Meta’s Facebook AI research team, is an advanced transformer-based language model. It has undergone extensive pre-training on diverse text data from various domains and can be fine-tuned for a wide range of natural language processing (NLP) tasks such as text classification, question answering, and language generation.

OPT is openly available, allowing users to utilize and modify it according to their needs. The model is highly scalable and capable of handling large datasets and vast amounts of text data. OPT models have significant applications in NLP tasks and hold potential for advancing the field. The creators of OPT emphasize sustainability and responsibility, intending to share the models fully and responsibly with interested researchers. Notably, OPT-175B, the largest model in the suite, achieves similar performance to GPT-3 while requiring only one-seventh of the carbon footprint for its development.

Moreover, OPT models follow a linear learning rate schedule, warming up from 0 to the maximum learning rate over the first 2,000 steps for OPT-175B (375 million tokens for the smaller baselines), then decaying down to 10% of the maximum learning rate over 300 billion tokens. Learn more about OPTs.
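A minimal sketch of that warmup-then-decay schedule, expressed in tokens seen; the peak learning rate here is an assumed value for illustration:

```python
# Linear warmup then linear decay, as described above.
# MAX_LR is illustrative; the warm-up and decay horizons follow the text.
MAX_LR = 1.2e-4        # assumed peak learning rate
WARMUP_TOKENS = 375e6  # warm-up length (smaller baselines)
TOTAL_TOKENS = 300e9   # decay horizon: LR reaches 10% of max here
MIN_LR = 0.1 * MAX_LR

def opt_lr(tokens_seen):
    if tokens_seen < WARMUP_TOKENS:
        # linear warm-up from 0 to MAX_LR
        return MAX_LR * tokens_seen / WARMUP_TOKENS
    # linear decay from MAX_LR down to 10% of MAX_LR
    frac = (tokens_seen - WARMUP_TOKENS) / (TOTAL_TOKENS - WARMUP_TOKENS)
    return MAX_LR - (MAX_LR - MIN_LR) * min(frac, 1.0)

print(opt_lr(0), opt_lr(WARMUP_TOKENS), opt_lr(TOTAL_TOKENS))
```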

Pygmalion-1.3B (based on EleutherAI’s pythia-1.3b-deduped)

Pythia Scaling Suite: The Pythia Scaling Suite is a set of models specifically developed to support interpretability research. It consists of two groups of models, each comprising eight different sizes: 70M, 160M, 410M, 1B, 1.4B, 2.8B, 6.9B, and 12B. For every size, two models are available — one trained on the Pile dataset and another trained on the Pile dataset after global deduplication. All models within each size category are trained on the same data, following the same sequence. The Pythia models can be accessed through Hugging Face.

The Pythia model suite was intentionally designed to encourage scientific exploration of large language models, particularly in the context of interpretability. While not prioritizing downstream performance as a primary objective, these models exhibit comparable or superior performance to similar models of the same size, such as those in the OPT and GPT-Neo suites.

It’s important to note that all models in the Pythia suite underwent renaming in January 2023. For clarity, a table comparing the old and new names can be found in the associated model card, which also includes precise parameter counts for each model. Learn more
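Because the Pythia models are published with intermediate training snapshots, interpretability work can load a model at a specific training step. A brief sketch, assuming the revision naming used in the Pythia model card:

```python
# Load a Pythia checkpoint at a specific training step for
# interpretability research. The repo id reflects the January 2023
# renaming (pythia-1.3b-deduped became pythia-1.4b-deduped); the
# "step143000" branch name is an assumption based on the model card.
from transformers import GPTNeoXForCausalLM, AutoTokenizer

repo = "EleutherAI/pythia-1.4b-deduped"
model = GPTNeoXForCausalLM.from_pretrained(repo, revision="step143000")
tokenizer = AutoTokenizer.from_pretrained(repo, revision="step143000")
```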

EleutherAI: EleutherAI is an organization dedicated to advancing artificial intelligence through an open-source and community-driven approach, committed to promoting collaboration and accessibility in AI research. EleutherAI’s primary objective is to develop large-scale, high-quality language models and contribute to the democratization of AI technology.

EleutherAI’s notable project is GPT-Neo, a series of transformer-based language models. GPT-Neo aims to replicate and expand upon the capabilities of OpenAI’s GPT while prioritizing accessibility and computational efficiency. This empowers researchers and developers to explore and innovate in the field of natural language processing.

In a similar manner, Pygmalion-2.7B, based on EleutherAI’s GPT-Neo 2.7B, can also be found on Hugging Face, thanks to the Hugging Face 🤗 team.

And finally, the myth — the magic, Pygmalion-6B. A proof-of-concept dialogue model based on EleutherAI’s GPT-J-6B. GPT-J 6B is a transformer model trained using Ben Wang’s Mesh Transformer JAX. “GPT-J” refers to the class of model, while “6B” represents the number of trainable parameters. Popular frontends that work with Pygmalion-6B include TavernAI and KoboldAI.

TavernAI is an atmospheric adventure-chat frontend, and it works with backends and APIs such as KoboldAI, NovelAI, Pygmalion, and OpenAI’s ChatGPT.

Here are some screenshots of a conversation with KoboldAI. In this conversation, “you” represents the human (I was that human) and “me” represents the bot, so don’t get confused. The conversation was only a test of the bot’s capabilities.

What are your thoughts on these conversations? You can try it yourself by visiting the TavernAI link above.

Conversation with a bot in KoboldAI

Summary

Pygmalion AI, a remarkable technological marvel, has revolutionized our world with its advanced machine learning algorithms and natural language processing. It has become an essential tool across various industries, offering personalized experiences, automation, and knowledge dissemination. Similar to the ancient Greek myth, Pygmalion AI brings to life the concept of a virtual assistant that understands our thoughts and desires, responding effortlessly to our queries and commands.

Pygmalion AI is an open-source large language model (LLM) based on EleutherAI’s GPT-J 6B and Meta AI’s LLaMA 7B. It excels in chat and role-playing conversations, analyzing trends, generating text and media content for social media platforms. The model is regularly updated with new data to enhance its performance.

The story of Pygmalion in Greek mythology parallels the Pygmalion effect in machine learning. Just as Pygmalion shaped his reality through his expectations, the biases and expectations of the creators or trainers of a machine learning algorithm can influence its outcomes. Awareness of these biases is crucial for unbiased decision-making and creation.

Pygmalion AI offers different models, such as Pygmalion-350M, Pygmalion-1.3B, Pygmalion-2.7B, Pygmalion-6B, and the recently introduced Pygmalion-7B. These models vary in the number of trainable parameters, with Pygmalion-7B being similar in performance to Pygmalion-6B.

The journey of the Pygmalion models builds on foundations such as Facebook’s OPT and EleutherAI’s Pythia and GPT-Neo. These base models provide advanced transformer-based language processing and aim to promote accessibility and collaboration in AI research.

Overall, Pygmalion AI has brought a touch of magic and wonder to our digital era, shaping our reality through advanced language processing capabilities.
