Alibaba Cloud Unveils‍ Qwen2.5, the World’s Largest ⁤Open-Source Model

At the Yunqi‌ Conference on September 19, Alibaba Cloud CTO Zhou Jingren released the new generation of open-source models Qwen2.5 ⁢of Tongyi Qianwen. The flagship model Qwen2.5-72B surpassed Llama 405B in performance and once again became‌ the world’s largest open-source model.

The full series of Qwen2.5 covers large ⁤language models, multimodal⁤ models,⁢ mathematical models, and code models⁢ of multiple sizes.⁣ Each size has a basic version, ⁢an instruction-following⁤ version, and a quantized version. A total of ⁣more than 100 models are⁤ available, setting a new industry ‍record.

Improved Performance and‍ Capabilities

All Qwen2.5 models are ⁢pre-trained on⁢ 18T tokens data. Compared with Qwen2, the overall performance is improved ⁤by more than 18%,⁣ with more knowledge, stronger programming and mathematical capabilities. The Qwen2.5-72B model scores as high as 86.8, 88.2, and 83.1 on the MMLU-rudex benchmark⁣ (testing general knowledge),⁣ MBPP benchmark (testing ⁤coding ability), and MATH ⁢benchmark (testing mathematical ability).

Qwen2.5 supports context lengths up to 128K and can generate up to⁣ 8K content. The model has strong multilingual capabilities and supports more than 29⁣ languages, including Chinese, English,⁢ French, Spanish, Russian, Japanese, Vietnamese, Arabic, etc. The ‌model can smoothly respond⁤ to a ‌variety⁤ of system prompts and⁤ implement tasks such as role-playing and ⁣chatbots.

Language Models and⁤ Specialized Models

In terms of‌ language models, Qwen2.5 has ‍open-sourced 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. They have achieved the best results in the industry‌ in the same parameter track. The model setting ⁢fully considers the ⁣different‌ needs of downstream scenarios.

72B is the flagship model of ‍the Qwen2.5⁤ series. Its instruction follower‍ version Qwen2.5-72B-Instruct has performed well in authoritative evaluations ⁢such as MMLU-redux, ‍MATH, MBPP, LiveCodeBench, Arena-Hard, ‍AlignBench, MT-Bench, and MultiPL-E.

In terms of specialized‌ models, Qwen2.5-Coder for programming and Qwen2.5-Math for mathematics have made substantial progress over their predecessors. Qwen2.5-Coder was trained on up to 5.5T tokens of programming-related data, and the‍ 1.5B and 7B versions were open-sourced⁤ on⁢ the same ‌day, and the 32B version will be open-sourced in the future.

Multimodal Models

The much-anticipated visual language model Qwen2-VL-72B has been officially open-sourced. Qwen2-VL can recognize images of different resolutions⁣ and aspect ratios, understand‌ videos ‍longer than ‌20 minutes, ⁤and has the ability ⁢to autonomously operate mobile phones and⁣ robots as a visual agent.

Recently, the ⁢authoritative evaluation LMSYS Chatbot Arena Leaderboard released the latest visual model performance⁢ evaluation results, and Qwen2-VL-72B became⁣ the world’s highest-scoring⁣ open-source model.

Ecological Network and Downloads

Since its open-source release in August 2023, Tongyi has become the preferred model for ⁢developers, especially Chinese developers, in the field of global open-source big models. In terms of⁣ performance, Tongyi’s big models ‌have gradually surpassed the strongest open-source model in the United‍ States,‌ Llama, and have topped the‌ Hugging Face global big model list many times.

As of mid-September 2024, the download volume of Tongyi Qianwen open-source models exceeded 40 million, and‍ the total number of‍ Qwen series⁣ derivative models exceeded 50,000, becoming a world-class model group‍ second only‌ to ⁣Llama.

Qwen2-VL-72B becomes the world's highest-scoring open-source visual understanding model

HuggingFace data shows⁢ that as of mid-September, the total number of Qwen series native models and derivative models exceeded 50,000

Qwen2.5 Dethrones Meta: The Unstoppable Rise of the World’s Top Open-Source Large Model

Alibaba Cloud Unveils‍ Qwen2.5, the World’s Largest ⁤Open-Source Model

Improved Performance and‍ Capabilities

Language Models and⁤ Specialized Models

Multimodal Models

Ecological Network and Downloads

Related

Qwen2.5 Dethrones Meta: The Unstoppable Rise of the World’s Top Open-Source Large Model

Alibaba Cloud Unveils‍ Qwen2.5, the World’s Largest ⁤Open-Source Model

Improved Performance and‍ Capabilities

Language ​Models and⁤ Specialized Models

Multimodal Models

Ecological Network and Downloads

Share this:

Related

Language Models and⁤ Specialized Models