Tongyi Qianwen open-sources two large language models, 72B and 1.8B, and its first large audio model

Piyush C. Lamsoge
6 min read · Dec 1, 2023


On December 1, Alibaba Cloud held a Tongyi Qianwen press conference and open-sourced Qwen-72B, the 72-billion-parameter Tongyi Qianwen model. Qwen-72B achieved the best results among open-source models on 10 authoritative benchmarks, making it the industry's most powerful open-source large model: its performance exceeds the open-source benchmark Llama 2-70B and most commercial closed-source models. Going forward, high-performance applications at the enterprise and research level will also have open-source large models as an option.

Tongyi Qianwen also open-sourced the 1.8-billion-parameter model Qwen-1.8B and the large audio model Qwen-Audio. Tongyi Qianwen has now open-sourced four large language models, with 1.8 billion, 7 billion, 14 billion, and 72 billion parameters, as well as two large multimodal models for visual and audio understanding, achieving "full-size, full-modality" open source at a pace unmatched in the industry.

The industry’s strongest open source model fills the gap in China’s LLM open source field

Qwen-72B was trained on 3T tokens of high-quality data and carries forward the strong performance of the Tongyi Qianwen pre-trained models. It achieved the best open-source-model results in 10 authoritative benchmark evaluations, and in some evaluations it surpassed the closed-source GPT-3.5 and GPT-4.

On English tasks, Qwen-72B achieved the highest score among open-source models on the MMLU benchmark. On Chinese tasks, it led C-Eval, CMMLU, GaokaoBench, and other benchmarks, with scores exceeding GPT-4. In mathematical reasoning, it leads other open-source models on the GSM8K and MATH evaluations. In code understanding, its performance on HumanEval, MBPP, and other evaluations has improved substantially, a qualitative leap in coding ability.

On 10 authoritative benchmark evaluations, Tongyi Qianwen's 72-billion-parameter model achieved the best scores among open-source models.

Some results of Tongyi Qianwen's 72-billion-parameter open-source model surpassed the closed-source GPT-3.5 and GPT-4

Qwen-72B can handle long text inputs of up to 32K tokens, and it outperforms GPT-3.5-16k on the long-text-understanding test set L-Eval. The R&D team has optimized Qwen-72B's instruction following, tool use, and other skills so it integrates better with downstream applications. For example, Qwen-72B ships with a powerful system-prompt capability: with a single prompt, users can customize their own AI assistant, asking the large model to play a certain role or perform a specific reply task.
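The system-prompt customization described above can be sketched with the ChatML-style prompt format used by the Qwen chat models (a minimal illustration only; the persona text and helper function are hypothetical, not from the article):

```python
# Minimal sketch of assembling a ChatML-style prompt with a custom system
# message, the format used by Qwen chat models. The persona and question
# below are hypothetical examples.

def build_chatml_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a single-turn ChatML prompt string."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# One system prompt is enough to define the assistant's role.
prompt = build_chatml_prompt(
    "You are a pirate-themed customer-service assistant.",
    "What are your shipping options?",
)
print(prompt)
```

In practice this string is what the chat template produces before tokenization; changing only the system message is what lets one prompt redefine the assistant's role.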

Users can create their own AI assistant with just one prompt

Previously, China's large-model market had no high-quality open-source model that could compete with Llama 2-70B; Qwen-72B fills that domestic gap. With high performance, high controllability, and high cost-effectiveness, it offers an option no less capable than commercial closed-source large models. Based on Qwen-72B, large and medium-sized enterprises can develop commercial applications, and universities and research institutes can carry out research such as AI for Science.

From 1.8B to 72B, Tongyi Qianwen takes the lead in realizing full-size open source

If Qwen-72B "reaches upward," raising the size and performance ceiling of open-source large models, then Qwen-1.8B, the other model open-sourced at the press conference, "reaches downward": it is the smallest Chinese open-source large model, requires only about 3 GB of GPU memory to run inference on 2K-token inputs, and can be deployed on consumer-grade devices.
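The roughly 3 GB figure can be sanity-checked with a back-of-the-envelope estimate (a rough sketch only; the int8 quantization, the fp16 KV cache, and the layer/hidden dimensions are assumptions not stated in the article, and real runtimes add overhead on top):

```python
# Rough inference-memory estimate for a 1.8B-parameter model.
# Assumptions (not from the article): int8 weights, fp16 KV cache,
# and typical Qwen-1.8B-like dimensions (24 layers, 2048 hidden size).

params = 1.8e9
weight_bytes = params * 1                  # int8: 1 byte per weight

layers, hidden, seq_len = 24, 2048, 2048   # assumed architecture, 2K context
# KV cache: K and V tensors per layer, fp16 (2 bytes per element)
kv_cache_bytes = 2 * layers * seq_len * hidden * 2

total_gb = (weight_bytes + kv_cache_bytes) / 1e9
print(f"~{total_gb:.1f} GB before runtime overhead")
```

Under these assumptions the weights and KV cache come to a little over 2 GB, which is consistent with the article's claim of about 3 GB once runtime overhead is included.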

From 1.8 billion and 7 billion to 14 billion and 72 billion parameters, Tongyi Qianwen has become the industry's first "full-size open-source" large model family. Users can try the Qwen models directly in the ModelScope community, call the model APIs through Alibaba Cloud's DashScope platform, or build custom large-model applications on Alibaba Cloud's Bailian platform. Alibaba Cloud's AI platform PAI has been deeply adapted to the entire Tongyi Qianwen series, offering services such as lightweight fine-tuning, full-parameter fine-tuning, distributed training, offline inference and validation, and online service deployment.

Alibaba Cloud was the first technology company in China to open-source its self-developed large models. Since August, it has open-sourced Qwen-7B, Qwen-14B, and the visual understanding model Qwen-VL. These models have repeatedly appeared on the Hugging Face and GitHub large-model trending lists and are popular with small and medium-sized enterprises and individual developers: cumulative downloads have exceeded 1.5 million, and more than 150 new models and applications have been built on them. At the press conference, many developer partners shared how they use Qwen to build their own dedicated models and applications.

Alibaba Cloud CTO Zhou Jingren said that the open-source ecosystem is crucial to advancing the technology and real-world application of China's large models. Tongyi Qianwen will continue to invest in open source, aiming to become "the most open large model of the AI era" and to build the large-model ecosystem together with its partners.

Tongyi Qianwen’s base model continues to evolve, leading the industry in multi-modal exploration

Tongyi Qianwen's exploration of multimodal large models is also a step ahead of the industry. On the same day, Alibaba Cloud open-sourced its large audio understanding model, Qwen-Audio, for the first time.

Qwen-Audio can perceive and understand various audio signals such as human speech, natural sounds, animal sounds, and music. Users can input a piece of audio and ask the model to interpret it, and even perform literary creation, logical reasoning, or story continuation based on it. Audio understanding gives large models near-human hearing.

Tongyi's large models can both "listen" and "see." Tongyi Qianwen open-sourced the large visual understanding model Qwen-VL in August, and it quickly became one of the best practices in the international open-source community. The press conference also announced a major update to Qwen-VL, greatly improving its general OCR, visual reasoning, and Chinese text understanding. It can handle images of various resolutions and aspect ratios, and can even answer questions about an image. In both authoritative evaluations and hands-on testing, Qwen-VL's Chinese text understanding greatly surpasses GPT-4V.

The Tongyi Qianwen closed-source model also continues to evolve. Tongyi Qianwen 2.0, the closed-source model released a month ago, has recently been upgraded to version 2.1: its context window has been extended to 32K tokens, and its code understanding and generation, mathematical reasoning, Chinese and English encyclopedic knowledge, and resistance to induced hallucinations have improved by 30%, 10%, nearly 5%, and 14%, respectively. Users can try the latest closed-source model for free in the Tongyi Qianwen app.


Piyush C. Lamsoge

I'm a highly motivated and dedicated student of Machine Learning, Natural Language Processing, and Deep Learning.