May 23

With the release of ChatGPT, OpenAI set off one of the most significant shifts in AI to date. Everybody was amazed by the possibilities this generative AI opened up.

Organizations started using this technology to accelerate their work and increase the value they bring to their customers: chatbots, writing assistants, task automation, etc.

However, using OpenAI models comes at a price not all organizations are ready to pay: the lack of data privacy. Indeed, the text that users provide can be used to improve the model.

The recent leak of confidential Samsung data through ChatGPT drew attention to this major issue.

At the same time, the success of this AI led to the emergence of open-source Large Language Models (LLMs), kicked off by Meta with LLaMA: Vicuna, Alpaca, GPT4All, and others.

However, even though LLaMA’s weights leaked after its release, allowing anybody to use the pre-trained model (which cost around $5M to train), it’s important to remember that its license prohibits any commercial use.

Therefore, it was not possible to legally use this kind of LLM for business purposes.

Until recently.

StableLM, Dolly-2, and MPT-7B are open-source models that achieve competitive results, and they were released under licenses that permit commercial use.

That means any organization can use them for business purposes.

It also means these models can be fine-tuned on private data, allowing any organization to get the most out of LLMs without leaking that data to an external organization like OpenAI. Bloomberg followed a similar logic when it trained its own LLM on its financial data: BloombergGPT.

In this article, I will show you how to train your own LLM on your own data. For this example, I’ll fine-tune Bloom-3B on the text of “The Lord of the Rings”.

