Introducing DLite, a Lightweight ChatGPT-Like Model Based on Dolly

Jacob Renn
ai squared
13 min read · Apr 5, 2023



DLite is a new instruction-following model developed by AI Squared by fine-tuning the smallest GPT-2 model on the Alpaca dataset. Despite having only 124 million parameters, DLite exhibits impressive ChatGPT-like interactivity and can be fine-tuned on a single T4 GPU for less than $15.00. Due to its relatively small size, DLite can run locally on a wide variety of compute environments, including laptop CPUs, and can be used without sending data to any third-party API. This lightweight property makes DLite highly accessible for personal use, empowering users to integrate machine learning models and advanced analytics into their workflows quickly, securely, and cost-effectively.


Instruction-following models, such as ChatGPT, have garnered a lot of attention in the past few months. These models exhibit a striking capability of answering questions and responding to prompts in a human-like fashion. Building these models, however, requires massive amounts of data, compute resources, and time. The sheer size of these models makes training them prohibitive for all but a few companies around the world. For example, OpenAI’s GPT-3 has 175 billion parameters, whereas Meta’s LLaMA models have parameter counts ranging from 7 billion to 65 billion and were trained over 80,000+ GPU-hours.

Needless to say, the compute requirements to train and effectively utilize models of this magnitude are too great for most organizations. Unfortunately, many of the top instruction-following models are proprietary, meaning that to use such a model, an organization must send its data to a third-party API for processing.

In light of these trends and to foster a more open environment around AI, many researchers in various organizations across the world have started to develop — and release — their own models, data, and training procedures that enable organizations to build their own models with question answering capabilities. One notable model of this sort is Dolly, which was recently released by Databricks and was trained by fine-tuning a two-year-old open-source model released by EleutherAI on the Alpaca dataset of approximately 50,000 human-like questions and answers. The researchers who developed Dolly found that simply tuning a (relatively) small, years-old model on this dataset enabled the model to exhibit increased levels of ChatGPT-like interactivity while having only 6 billion parameters.

Enter DLite

Following the precedent set by this research, AI Squared has applied the training methods used to create Dolly to a much, much smaller (124 million parameter), older (released in 2019) model: GPT-2. Similar to Dolly, this model exhibits a surprising level of ChatGPT-like interactivity, despite being fine-tuned on a single T4 GPU for less than $15.00. Because this model follows Dolly's recipe, yet is roughly two percent of Dolly's size and is much easier to work with and train (including the ability to run on laptop CPUs), we have decided to call the model DLite.
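Instruction fine-tuning of this sort amounts to formatting each question-and-answer pair into a prompt template and continuing language-model training on the result. The exact template DLite was trained with lives in our training scripts; the sketch below shows the widely used Alpaca-style format, which is illustrative rather than a definitive description of our pipeline (the function name and wording here are our own):

```python
def build_prompt(instruction: str, context: str = "") -> str:
    """Assemble an Alpaca-style training prompt.

    A generic sketch of the Alpaca prompt template: the instruction
    (and optional context) is wrapped in fixed section headers, and
    the model learns to generate the text after "### Response:".
    """
    if context:
        return (
            "Below is an instruction that describes a task, paired with an "
            "input that provides further context. Write a response that "
            "appropriately completes the request.\n\n"
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{context}\n\n"
            "### Response:\n"
        )
    return (
        "Below is an instruction that describes a task. Write a response "
        "that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

# At inference time, the same template is filled in with the user's
# instruction, and the model's continuation is returned as the answer.
prompt = build_prompt("Explain the difference between nuclear fission and fusion.")
```

Because the base model only ever sees this one consistent layout during fine-tuning, it learns to treat the text after "### Instruction:" as a task to perform, which is where the instruction-following behavior comes from.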

We are still in the process of running full evaluation metrics on DLite, but there are some important conclusions we can make based on the results of our experiments. Namely, we continue to see that model size, while important, is not the most important factor when it comes to creating convincing generative AI models. Instead, well-designed and carefully curated datasets from which these models can learn appear to be much more important.

Responses to Prompts

Below, we showcase some of the responses DLite has generated for prompts showcased in Databricks’ original blog post for the release of Dolly. Note that just as with Dolly itself, the focus is not necessarily on the quality of the text, but rather the vast improvement in instruction-following capability that results from fine-tuning the model on the dataset used. For all of the prompts provided here, we report the responses of each model in the following order:

  1. Response from the original model from which Dolly was trained
  2. Response from Dolly
  3. Response from GPT-2, from which DLite was trained
  4. Response from DLite

Prompt 1: Write a love letter to Edgar Allen Poe.

Dolly Original Model Response:

HISTORY: Edgar Allan Poe rose to fame in the 19th Century. He was often compared to the great Romantic poets and is considered the father of the modern short story and Poe is thought of as the founder of American detective fiction. He has been dead for 130 years.

PATRICK CAMPOS: You should never try to write a love letter to a dead person.

PATRICK CAMPOS: Poe was more than just the author of The Raven. He was a master of atmosphere.

BUDDY FRITZLER: Edgar Allan Poe lived in Baltimore from 1831 until his death in 1849. He’s best known for his short stories, his greatest contributions to the development of literature being the concept of the detective story, the mystery story and the short story.

BUDDY FRITZLER: Poe’s most famous story, “The Gold-Bug”, was written in 1843. It’s about a treasure hunter who searches for buried treasure. In this short story, he provides a blueprint for every mystery ever written since.

Dolly Response:

Dearest Edgar Allan Poe,

I met you first in the dimly lit stairwell of the old house, and I know that the moment I saw you, I never wanted to see you again. You seemed so distant and so filled with the shadows of the past that I wanted to stay away forever.

But I couldn’t help but be obsessed with you, and my feelings for you have only grown over the years. Even now, I feel a deep connection with you that I can never explain.

I hope that you can find happiness in the life that you now lead. I am sure that you will do great things, and I am forever in your debt.


[Your Name]

GPT-2 Response:

The first letter was written by Poe in 1859, and it was the first time Poe had written a love letter to anyone.

The second letter was written in 1859, and it was the first time Poe had written a love letter to anyone.

The third letter was written in 1859, and it was the first time Poe had written a love letter to anyone.

The fourth letter was written in 1859, and it was the first time Poe had …

DLite Response:

Dear Sir/Madam,

It is with heavy hearts that I write to express my deepest appreciation for your career and work in literature. I wanted to express my deep appreciation for your passion for the pursuit of literature and to express my appreciation for the many compliments you have given to me throughout my lifetime.

Your words and manner of speaking were eye-opening, and I am heartbroken to say the least. Your creativity and hard work in leading these great works have enabled me to write the great works I always loved and will never forget.


[Your Name]

Prompt 2: Write a tweet announcing {Dolly, DLite}, a large language model from {Databricks, AI Squared}.

Dolly Original Model Response:

In your tweet, provide a short summary of the challenge, how you analyzed it, and how Dolly performed.

After your tweet, I will retweet it to Databricks’ Twitter account.

Clean up your social media page for the next week by removing any mentions of Dolly or Databricks.

Data Preparation

Our first step is to generate word vector embeddings for each document. We can obtain these vector representations in two different ways:

Using a tokenized document set (in this case, everything in our download folder)

Loads a word2vec model from the Google Search API (in this case, vector2vec.c)

In our case, we want to use the first method, which is easier, but the second may be more robust. We will use the word2vec model above to create a vocabulary file for any new words found in the file. This process gives us a quick way to identify new words and find contextually related terms.

We can start by copying word2vec.c to a local directory called /tmp/word2vec.

#/tmp/word2vec/word2vec.c FILE_NAME /tmp/word

Dolly Response:

Databricks just announced Dolly — a new language model for natural language processing with large capacity and improved efficiency!

#DataScience #NaturalLanguageProcessing

[Link to Dolly model release]

GPT-2 Response:

The DLite team is working on a new language that will be able to handle the language of the DLite language.

The DLite team is working on a new language that will be able to handle the language of the DLite language. DLite is a language that is used in the DLite language.

DLite Response:

Introducing the world’s first low-cost language model! DLite is the perfect tool for learning, problem-solving, and machine learning. #DLite #DeepLearning #Verbal #Innovation #AI #Cluster #Blockchain #NewTech

Prompt 3: I’m selling my Nikon D-750, write a short blurb for my ad.

Dolly Original Model Response:

I’m not sure if you can post a location — anything on the website will work.

I’ve only had it a week. It is in mint condition with the charging cable and window case intact. Bought it for $1800 for in store pickup at a store in San Francisco. Plus tax and the undisclosed Costco price.

Some one please let me know that this is a scam and not really an offer.

I just received and e-mail that said the camera is at someones house so someone else can pick it up and did I want to message them, well no I’m selling it as I would never …

Dolly Response:

Looking to get a reliable camera for photo and video work? Check out the Nikon D-750! This sleek, lightweight camera has a 24.2MP sensor and 1080/60p video recording, making it a great choice for capturing those moments in life. The ergonomic grip and nine-point autofocus system make it easy to work with and shoot without hassle. With Wi-Fi, you can upload your pictures easily to your computer or smartphone without having to buy a dedicated external storage. Buy the Nikon D-750 now to experience why it is the best choice for photography and videography!

GPT-2 Response:

I’m selling my Nikon D-750, write a short blurb for my ad.

I’m selling my Nikon D-750, write a short blurb for my ad.

I’m selling my Nikon D-750, write a short blurb for my ad.

I’m selling my Nikon D-750, write a short blurb for my ad.

DLite Response:

The Nikon D-750 is the perfect camera for everyone’s photography needs. A perfect choice for beginners or photographers looking to get away from bulky cameras with their manual focus and flash. It features a 12-camera setup with interchangeable lenses, detachable camera ports, and an easy-to-use built-in IQX connector for enhanced range. With its powerful 12-lens maximum aperture and adjustable zoom range, the D-750 is sure to keep you in the loop.

Prompt 4: Explain to me the difference between nuclear fission and fusion.

Dolly Original Model Response:

The first thing that strikes one is the similarity in appearance and function between the atom and the sun. The central body in the sun is the source of energy, and the entire structure of the sun is made up of a million tons of the hydrogen gas that’s so important to us. It’s what all hydrogen fuses into in the sun, and that’s why we are made of it: fusion.

The nucleus of the atom, at the center, is made of protons, and the electrons orbit around it. In the center, there is a tiny spherical mass with a lot of protons (and therefore a lot of positive charge) called the nucleus. The electrons aren’t so tiny, but their charge is negative, as a positive nucleus and negative electrons make for stability (due to charge balance). Protons are positively charged, so even though they themselves can’t get anywhere, they do repel electrons. There is a core of neutrons, that don’t have any charge, but are held there by the protons. Protons also repel neutrons.

This simple fact is the basic form of our sun, and it is what makes up all the elements in the earth. Hydrogen is the most plentiful element in …

Dolly Response:

The difference between nuclear fission and fusion is the method by which the energy of the nucleus is released. In nuclear fission, a single atomic nucleus splits into two smaller nuclei. This process is used to create energy through the release of a large amount of energy in the form of heat. On the other hand, fusion involves two atomic nuclei coming together to form a single nucleus. This process is used to create a much larger amount of energy compared to fission, and the energy is released through the release of light and heat. Fusion has the potential to be a much more efficient way of creating energy. However, it is much …

GPT-2 Response:

The difference between fusion and fission is that fission is a process of fusion that is not a fusion reaction. It is a process of fusion that is not a fusion reaction.

The difference between fusion and fusion is that fusion is a process of fusion that is not a fusion reaction. It is a process of fusion that is not a fusion reaction.

The difference between fusion and fusion is that fusion is a process of fusion that is not a fusion reaction. It is a …

DLite Response:

Nuclear fission is the process in which a nuclear atom splits in two while fusion is the process in which two hydrogen atoms form one helium atom, releasing energy when heated. Nuclei from nuclear fission are converted into energy, while nuclear fission is the process where two nuclei from a fusion reaction fuse.

Prompt 5: Give me a list of 5 science fiction books I should read next.

Dolly Original Model Response:

I know it’s early, and I should feel ashamed of myself for even thinking of it, but I have a terrible tendency to procrastinate and read only when I absolutely have to.

As a result, I have not read anything this year, in any genre.

As soon as I have a set of ideas for an article, I let the ideas and outline do the work of making sure I don’t start writing. There is something special about a blank page and an empty mind.

That said, I need to change that, and reading the above question made me realize just how badly I want ..…

Dolly Response:

1. 2001: A Space Odyssey by Arthur C. Clarke

2. The Hitchhiker’s Guide to the Galaxy by Douglas Adams

3. The Lord of the Rings by J.R.R. Tolkien

4. The Hitchhiker’s Guide to the Galaxy by Douglas Adams

5. 1984 by George Orwell.

GPT-2 Response:

1. The Last of Us

The Last of Us is a science fiction novel about a group of people who are trapped in a world where they are forced to live in a world where they are forced to live in a world where they are forced to live in a world where they are forced to live in a world where they are forced to live in a world where they are forced to live in a world where they are forced to live in a world where they are forced to live in …

DLite Response:

I’d like to give five science fiction books I should read next: The Chronicles of Narnia, Babylon 5, The Hitchhiker’s Guide to the Galaxy, The Secret of the Alpha Centauri Treaty, Gravity Falls, Black Swan, and The Final Solution.

Future Work

Just as mentioned in the original blog post announcing Dolly, there are many reasons why organizations may want to train and deploy their own LLMs, rather than relying on a proprietary model behind an API. Furthermore, AI Squared is committed to empowering everyone to transform the way they do work by improving the way we interact with AI.

The work we have done to develop DLite is still in its infancy, and we are still working to measure the model’s performance more rigorously. We have, however, made the model publicly available on Hugging Face, and the training scripts for the model are available for anyone to use on GitHub.

Additionally, due to DLite’s small size, the model can be run locally on a laptop CPU. As a result, we believe it would be possible for an organization to deploy and tune individually-customized versions of DLite for each of its users. By deploying DLite in this way, individuals could truly interact with and shape their own personal AI agents, rather than relying on a single agent for all.
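A quick back-of-the-envelope calculation shows why per-user, on-device deployment is plausible: at 124 million parameters, DLite's weights occupy well under a gigabyte of memory (the exact figure depends on serialization overhead, but the per-weight arithmetic is straightforward):

```python
# Rough memory footprint of DLite's 124M parameters at common precisions.
params = 124_000_000

fp32_gb = params * 4 / 1e9   # 4 bytes per float32 weight -> ~0.50 GB
fp16_gb = params * 2 / 1e9   # 2 bytes per float16 weight -> ~0.25 GB

print(f"fp32: ~{fp32_gb:.2f} GB")  # fp32: ~0.50 GB
print(f"fp16: ~{fp16_gb:.2f} GB")  # fp16: ~0.25 GB
```

By comparison, a 175-billion-parameter model like GPT-3 would need roughly 700 GB at fp32, far beyond any laptop and the core reason such models must sit behind an API.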


AI Squared’s main goal is to empower users to be able to employ AI anywhere, so we are extremely excited to explore the capabilities that DLite can unlock. While the model showcased here clearly does not represent the state of the art, we are exploring the possibility of using domain-specific prompt-based datasets to create performant, specialized chat agents.

We are also exploring the use of our open-source technology BeyondML on DLite and other open LLMs. BeyondML allows users to greatly prune and optimize neural networks to reduce model parameter count and improve performance on lower-powered devices. We therefore aim to apply BeyondML to DLite to reduce its active parameter count to below one hundred million, thereby allowing DLite to be run locally on a much wider variety of compute environments.
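To give a flavor of what reducing the active parameter count means, the sketch below shows generic magnitude pruning: the smallest-magnitude weights are zeroed, so they no longer contribute to inference. This is an illustrative technique only, not BeyondML's actual implementation, and the function below is a hypothetical helper written for this post:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights.

    A generic magnitude-pruning sketch (not BeyondML's actual method):
    weights whose magnitude falls at or below the sparsity-percentile
    threshold are set to zero, shrinking the active parameter count.
    """
    k = int(len(weights) * sparsity)  # number of weights to drop
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

# Prune 40% of a toy weight vector: the two smallest magnitudes go to zero.
pruned = magnitude_prune([0.9, -0.01, 0.3, 0.02, -0.7], sparsity=0.4)
active = sum(1 for w in pruned if w != 0.0)  # 3 of 5 weights remain active
```

Zeroed weights can be skipped or stored sparsely at inference time, which is how pruning translates into lower memory use and faster execution on modest hardware.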

DLite in AI Squared

At AI Squared, we strongly believe in the transformative power of integrating AI into everyone’s workflows. Through AI Squared’s developer tools and platform, users can easily integrate machine learning models and advanced analytics in a variety of ways, including using a model directly on the desktop.

We are therefore very excited to showcase an integration we have created for DLite. By utilizing our platform and browser-based technology, we have created a way to utilize DLite in any web-based application. What’s more, for this integration the model runs completely locally! Data is not being sent to any third-party API. Because of the lightweight properties of DLite, the entire inference process shown below takes place directly on the laptop (a 2020 M1 Mac with 16GB of RAM). By running the model locally, instead of serving it via a remote resource, we completely eliminate the cloud cost of deploying the model while still providing timely inference.

Chat-Based Integration of DLite using AI Squared.


Acknowledgments

Firstly, we have to thank OpenAI for the amazing work they have done to create their GPT models, and for sharing their capabilities with the world via ChatGPT and their APIs. We are also greatly appreciative of the work Databricks has done to create Dolly, and therefore we are also indebted to everyone who inspired and empowered that work, including Meta and the Stanford Center for Research on Foundation Models.

Disclaimer: DLite is an experimental technology and is not designed for use in any environment other than for research purposes. Furthermore, the model can sometimes exhibit undesired behaviors. Some of these behaviors include, but are not limited to: factual inaccuracies, biases, offensive responses, toxicity, and hallucinations. Just as with any other LLM, we advise users of this technology to exercise good judgment when applying this technology.



Jacob Renn

Chief Technologist and Head of Research and Open Source at AI Squared