Comparison & Cost Analysis: Should we go with open-source or closed-source LLMs?

Alaa Dania Adimi
11 min readJan 10, 2024

--

With the explosion of Large Language Models (LLMs), a crucial decision arises for businesses: Should we invest in closed-source or open-source models? This isn’t a straightforward decision. Both options offer distinct benefits and drawbacks that require careful consideration.

In this article, we will first dissect the advantages and disadvantages of both closed-source and open-source LLMs within the enterprise context. Then, we will talk about the different ways to deploy and integrate open-source models. By the end, we will provide a cost comparison for some closed-source and open-source models that can aid you in decision-making regarding LLM implementation.

This article is a fruit of research conducted by me and my partner in crime Rania Fatma-Zohra Rezkellah as part of our final year internship as Artificial intelligence developers.

Note: We’re constantly learning and want to make this guide the best it can be. If you have any experiences, tips, or questions about using LLMs in business or projects, please share them in the comments below or send us an email — I will leave them below! We appreciate your help in making this resource even more valuable.

Disclaimer: Prices and property details may change in the future. This article was last updated on June 28th, 2024. We’ve linked all our information sources below for your reference.

1. Open-source Large Language Models

1.1. What are open-source Large Language Models?

Open Source LLMs are language models whose source code is publicly available and can be freely accessed, used, modified, and distributed by anyone. Open-source options include Llama by Meta, Flan T5 and T5 by Google AI, and Mistral by Mistral AI.

“When you’re doing research, you want access to the source code so you can fine-tune some of the pieces of the algorithm itself. With closed models, it’s harder to do that” says Alireza Goudarzi, a senior researcher of machine learning at GitHub. [Source: GitHub Blog]

1.2. Advantages

Some of the advantages of using open-source Large Language Models include:

  • Control: With open-source LLMs, you have control over the model, its training data, and its applications.
  • Customization: Open-source LLMs are easier to run and customize because their underlying architecture and weights are publicly available.
  • Community support: Open-source LLMs are often supported by a large community of developers who contribute to their development and improvement.
  • Innovation: The open-source community is known for its innovation and ability to quickly adapt to new technologies.
  • Transparency: With open-source LLMs, we have full visibility into the model’s inner workings, which can help build trust with customers.

1.3. Disadvantages

Some of the disadvantages of using open-source Large Language Models include:

  • Limited resources: Open-source projects may have limited resources compared to closed-source projects backed by large corporations.
  • Dependency on community: The development and improvement of open-source LLMs depend on the contributions of the community, which may not always be reliable.

1.4. Some popular open-source Large Language Models

There’s an increasing trend in the number of released open-source LLMs. The chart below from the paper “A Comprehensive Overview of Large Language Models” illustrates the increasing trend towards instruction-tuned models and open-source models in the last year.

Figure 01 — Chronological display of LLM releases: blue cards represent ‘pre-trained’ models, while orange cards correspond to ‘instruction-tuned’ models. Models on the upper half signify open-source availability, whereas those on the bottom half are closed-source.

There are so many open-source models that we can use, let’s list a selection of them:

  1. Llama — Large Language Model Meta AI (Llama) is Meta’s LLM released in 2023. The largest version is 65 billion parameters in size. Llama comes in smaller sizes that require less computing power to use, test, and experiment with (7B, 13B, 33B, and 65B).
  2. Mistral — Mistral 7B is a large language model created by Mistral AI. According to the Paris-based startup, Mistral 7B outperforms other open-source LLMs like LLaMA 2 on many metrics. Just a month ago, the team released a newer model called Mixtral 8x7B, a high-quality sparse mixture of expert models (SMoE) with open weights.
  3. T5 — T5 is developed by Google, it’s a text-to-text transfer transformer. It is one of the first popular models to be able to boast of such a feat. The model comes in many variants like T5 small, base, large, 3B, and 11B.
  4. Flan T5 — Flan T5, where Flan stands for “Fine-tuned Language Net” and T5 stands for “Text-To-Text Transfer Transformer”, is an enhanced version of T5 that has been finetuned in a mixture of tasks. For the same number of parameters, these models have been fine-tuned on more than 1000 additional tasks covering also more languages. The model also comes with five variants such as the original T5 model: Flan T5 small, Flan T5 base, Flan T5 large, Flan T5 XL and Flan T5 XXL
  5. Falcon — Falcon is a transformer-based, decoder-only model developed by the Technology Innovation Institute (TII). The model is available in three variants: Falcon 7B and 40B (7 billion, 40 billion parameters) and even a larger variant too which is Falcon 180B.

These are just some examples of open-source models that are being used currently. You can always check the Open LLM leaderboard by HuggingFace which aims to track, rank, and evaluate LLMs and chatbots as they are released.

1.5. Licensing of open-source

Awesome! There are many open-source LLMs that we can play with, however, it’s very important to check the licensing of models before using them. OSS licensing refers to the legal agreements that govern the use, modification, and distribution of open-source software (OSS). These licenses provide a framework for how OSS can be shared, developed, and commercialized while protecting the rights of both contributors and users. To keep it simple, we can categorize them into:

  • Permissive licenses like Apache 2.0 let you do more or less what you want with the model, including commercial use.
  • Restricted licenses like CC BY-SA 3.0 place restrictions on commercial use, but don’t prohibit it.
  • Non-commercial licenses like Facebook’s proprietary ones or Creative Commons CC BY-NC-SA 4.0 explicitly prohibit commercial use and are a bad choice for building apps.

To read more about licenses for Large Language Models. We suggest checking the following documentation on GitHub.

2. Closed-source Large Language Models

2.1. What are closed-source Large Language Models?

Closed-source LLMs are large language models whose source code is not publicly available. They are often developed by large corporations and may be proprietary. Examples of closed-source LLMs are the GPT series from OpenAI, Cohere’s LLMs, and Claude from Anthropic.

“Closed, off-the-shelf LLMs are high quality. They’re often far more accessible to the average developer” says Eddie Aftandilian, a principal researcher at GitHub. [Source: GitHub Blog]

2.2. Advantages

  • Development: Closed-source LLMs are often designed with developers in mind, providing well-documented APIs that make it easier to incorporate LLMs into applications without requiring extensive expertise in machine learning or natural language processing.
  • Deployment: Closed-source LLMs are generally easier to put into production than open-source LLMs. Enterprises can rely on the vendor to provide guidance and assistance throughout the deployment process, ensuring a smoother transition from development to production.
  • Support: Closed-source LLMs may come with dedicated support from the company that developed them.

2.3. Disadvantages

  • Limited customization: The underlying architecture and weights of closed-source LLMs are not publicly available, making customization and fine-tuning impossible.
  • Lack of transparency: With closed-source LLMs, we have limited visibility into the model’s inner workings.
  • Unanticipated Changes: The closed-source model architecture can change behind the API that developers use. This can pose challenges to ensure the stability and consistency of the behavior of the application that relies on the LLM.

2.4. Some popular closed-source models

There are many organization that offers APIs to closed-source LLMs that businesses can easily use to build applications. Some of the leading players in the sector include OpenAI, Cohere, Google PaLM, Anthropic, and AI21.

  • OpenAI is an AI research laboratory that was founded in 2015 as a non-profit dedicated to developing “safe” artificial intelligence.
    — The most popular ones are GPT-3.5-Turbo, GPT-4, GPT-4 Turbo and GPT-4o.
  • Cohere founded in 2019, is an LLM company focused on building AI for the enterprise. Cohere’s LLMs are trained on massive datasets of text and code, and they can be used for a variety of tasks, including machine translation, text summarization, and question-answering. Cohere has a variety of models that cover many different use cases.
    Command is Cohere’s default text generation model which is trained to follow user commands and to be instantly useful in practical business applications.
  • Google PaLM was first announced in April 2022. It has already been used to achieve groundbreaking results in several NLP areas, including machine translation, question answering, and code generation. Google PaLM offers a variety of models, the most known are:
    text-bison which is fine-tuned to follow instructions and can also be used for summarization, classification, and more.
    chat-bison which is fine-tuned for multi-turn conversation use cases.
  • Gemini is the LLM launched by Google in 2023. It is a set of large language models (LLMs) that leverage training techniques taken from AlphaGo, including reinforcement learning and tree search. Gemini’s combination of multi-modal abilities, use of reinforcement learning, text and image generation capabilities, and Google’s proprietary
    data are all the ingredients that Gemini needs to outperform GPT-4.
    — The most popular ones at the moment are Gemini 1.5 Flash, Gemini 1.5 Pro, and Gemini 1.0 Pro.
  • Anthropic was founded in 2021 as an AI research company building general AI systems and language models. Anthropic currently offers two families of models:
    Claude which is their most powerful model, excels at a wide range of tasks from sophisticated dialogue and creative content generation to detailed instruction.
    Claude instant which is a faster, cheaper yet still very capable model, that can handle a range of tasks including casual dialogue, text analysis, summarization, and document comprehension.
  • AI21 is an AI lab & product company whose mission is to reimagine the way we read and write by making the machine a thought partner to humans.
    — The Jurassic-2 series has three models Jurassic-2 Ultra for an unmatched quality, Jurassic-2 Mid for an optimal balance of quality, speed, and cost and Jurassic-2 Light for speed and cost-efficiency.

3. Leveraging open-source Large Language Models for production

Leveraging open-source Large Language Models offers a multitude of options for integration into your project. When considering how to incorporate these models, there are several choices available, each with its advantages and considerations.

3.1. Self-Hosting Open-source Large Language Models

Self-hosting open-source Large Language Models on-premise provides a high degree of control and security. Organizations with strict data privacy requirements or sensitive information may prefer this option as it allows them to manage the infrastructure, ensuring compliance with their standards. However, it demands significant hardware resources and expertise to set up and maintain the infrastructure effectively.

3.2. Deploying Open-source Large Language Models in the Cloud

Deploying open-source Large Language Models in the cloud offers scalability and accessibility. Cloud-based solutions enable easy scaling based on computational needs, allowing for flexible usage. They also often provide additional services such as auto-scaling, backup, and load balancing, reducing the administrative burden.

There are several cloud providers available, including Azure, Google Cloud Platform, and AWS, among others, that you can consider.

You can read this article “The Pros and Cons of Using LLMs in the Cloud Versus Running LLMs Locally” offered by DataCamp.

3.3. Using Managed Open-source Large Language Models

Utilizing services like Replicate, TogetherAI, Anyscale, and RunPod allows immediate access to the capabilities of many open-source models without worrying about deployment — You can check their official websites for more information. You can easily use their endpoints to make use of some awesome open-source Large Language Models like Llama-2 and Mistral.

With Replicate, for example, you have a bunch of open-source models ready to go and let you host public or private models on your private instances. For public models on shared systems, you pay for execution time, but with private instances, there are additional charges for booting and idle time.

Remark: As mentioned by one of my colleagues on LinkedIn Berriche Aymen, the first two options, namely “Self-Hosting” and “Deploying in the cloud,” require the assistance of individuals with expertise in model deployment. Therefore, there are additional costs (besides the resources) that are associated with hiring these individuals compared to the third option, which does not require extensive expertise.

4. Cost Comparison

Selecting the most suitable type of Large Language Model involves considering various factors like quality, latency, security, and crucially cost — particularly for side projects.

While open-source large language models are often perceived as more cost-effective, this isn’t always true. According to The Information, numerous startups are investing approximately 50% to 100% more in running Meta’s Llama 2 compared to competing OpenAI’s GPT-3.5 Turbo, despite the significantly higher expenses for the top-tier GPT-4.

Figure 02 — Cost comparison of GPT-4, GPT-3.5, and Llama 2 7B for generating a million tokens.

The cost differentials can occasionally skyrocket. In a striking example, the founders of the chatbot startup Cypher conducted tests using Llama 2 in August, incurring expenses amounting to $1,200. In a direct comparison, executing identical tests with GPT-3.5 Turbo costs a mere $5, showcasing a substantial divergence in operational expenses between the two models.

Special thanks to AI Business for crafting an insightful article detailing significant statistics regarding the genuine expenses associated with AI. You can find the referenced article linked below for further exploration and insight!

In this article, our exploration didn’t encompass the cost analysis of deploying open-source LLMs on remote GPUs. Instead, we concentrated solely on examining the expenses associated with some closed-source and open-source models served by Replicate, TogetherAI, and RunPod.

We chose these models due to their ease of integration and swift inclusion in projects. We aimed to provide a clear overview of costs to assist in understanding the potential expenses involved.

4.1. Closed-source Large Language Models

The Artificial Analysis Leaderboard provides a key comparison of OpenAI, Cohere, Gemini, and Anthropic models as shown in the table below. This table highlights factors like context window, pricing (per 1M token), latency, and available models, aiding provider selection.

Figure 03 — Comparison between some closed-source models based on different factors

4.2. Open-source Large Language Models

Below is a table summarizing the cost breakdown for Llama-2 models and Mistral models open-source models served by Replicate, TogetherAI and RunPod.

Figure 04— Comparison between some open-source models served by Replicate, RunPod, and Together

5. Conclusion

To sum things up, the choice between open-source and closed-source Large Language Models involves weighing specific advantages and drawbacks. Open-source LLMs provide control, customization, and community support but may have resource limitations. Closed-source LLMs offer ease of development and deployment with dedicated support but lack customization and transparency.

Regarding cost, the perceived cost-effectiveness of open-source LLMs might not always hold true, as some startups invest significantly more in running certain open-source models. Ultimately, the decision depends on the business needs, resources, and the desired level of control.

We hope that this article has provided valuable insights into the diverse array of options available for consideration. Feel free to drop a comment below, send us an email, or connect with us on LinkedIn. Your feedback and contributions are highly appreciated.

Emails:

  • ja_adimi@esi.dz
  • jf_rezkellah@esi.dz

Linkedin accounts:

6. Additional links

[1] Open-source vs Closed-source Large Language Models.

[2] A Comprehensive Overview of Large Language Models.

[3] OpenAI models.

[4] Cohere’s models.

[5] Google PaLM 2 models.

[6] Anthropic models.

[7] AI21 Juraissic-2 models.

[8] Open Source vs. Closed Models: The True Cost of Running AI

--

--

Alaa Dania Adimi

Here to share some of the different things I am learning.