On September 19, at the Apsara Conference, Zhou Jingren, CTO of Alibaba Cloud, released Qwen2.5, the new generation of Tongyi Qianwen open-source models. The flagship Qwen2.5-72B surpasses Llama 405B in performance, once again taking the top spot among global open-source large models. The Qwen2.5 series covers large language models, multimodal models, mathematical models, and code models in multiple sizes; each size comes in a base version, an instruction-following version, and quantized versions, for a total of more than 100 released models, a new industry record.
All Qwen2.5 models are pre-trained on 18T tokens of data, and overall performance improves by more than 18% over Qwen2, with more knowledge and stronger programming and mathematical skills. The Qwen2.5-72B model scores 86.8 on MMLU-redux (general knowledge), 88.2 on MBPP (coding ability), and 83.1 on MATH (mathematical ability).
Qwen2.5 supports context lengths of up to 128K tokens and can generate up to 8K tokens. The models have strong multilingual capabilities, supporting more than 29 languages, including Chinese, English, French, Spanish, Russian, Japanese, Vietnamese, and Arabic. They respond smoothly to a variety of system prompts, enabling tasks such as role-playing and chatbot building. Qwen2.5 also makes significant progress in instruction following, understanding structured data (e.g., tables), and generating structured output (especially JSON).
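As a concrete illustration of the system-prompt and JSON-output behavior described above, here is a minimal sketch using the Hugging Face transformers chat interface. The checkpoint name Qwen/Qwen2.5-7B-Instruct and the prompts are illustrative assumptions, not taken from the announcement.

```python
# Minimal sketch: steering a Qwen2.5 instruct model toward JSON output
# with a system prompt, via Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-7B-Instruct"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    # System prompt constraining the output format.
    {"role": "system",
     "content": "You are a helpful assistant that replies only with valid JSON."},
    {"role": "user",
     "content": "Extract the model name and release date from: "
                "'Qwen2.5 was released on September 19, 2024.'"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same pattern applies to any of the instruct checkpoints in the series; only the model ID changes.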
In terms of language models, Qwen2.5 has been open-sourced in seven sizes, 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B, each setting state-of-the-art results for its parameter class. The 32B is the "price/performance king" most anticipated by developers, striking the best balance between performance and resource consumption; overall, Qwen2.5-32B outperforms Qwen2-72B.
In more than ten benchmarks including MMLU-redux, Qwen2.5-72B outperforms Llama3.1-405B
The 72B is the flagship model of the Qwen2.5 series. Its instruction-following version, Qwen2.5-72B-Instruct, performs strongly on authoritative benchmarks including MMLU-redux, MATH, MBPP, LiveCodeBench, Arena-Hard, AlignBench, MT-Bench, and MultiPL-E. On several core tasks, it surpasses Llama3.1-405B and its 405 billion parameters with less than one-fifth as many, holding its position as the "world's strongest open-source large model."
In terms of specialized models, Qwen2.5-Coder for programming and Qwen2.5-Math for mathematics are both substantial improvements over their predecessors. Qwen2.5-Coder is trained on up to 5.5T tokens of programming-related data; the 1.5B and 7B versions were open-sourced on the same day, with a 32B version to follow. Qwen2.5-Math was open-sourced in three sizes, 1.5B, 7B, and 72B, together with a mathematical reward model, Qwen2.5-Math-RM.
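To give a sense of how a coder variant is typically used, below is a minimal plain-completion sketch with transformers. The checkpoint name Qwen/Qwen2.5-Coder-7B is assumed to follow the series' naming convention, and the prompt is illustrative.

```python
# Minimal sketch: plain code completion with a Qwen2.5-Coder base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-Coder-7B"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The base model continues raw text, so a function signature plus a
# docstring is enough to elicit an implementation.
prompt = (
    "def fibonacci(n: int) -> int:\n"
    '    """Return the n-th Fibonacci number."""\n'
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```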
In terms of multimodal models, the widely anticipated visual language model Qwen2-VL-72B is now officially open-sourced. It can recognize images of different resolutions and aspect ratios, understand videos longer than 20 minutes, and act as an agent that autonomously operates mobile phones and robots. Recently, the authoritative LMSYS Chatbot Arena Leaderboard released its latest visual model evaluation results, and Qwen2-VL-72B emerged as the world's highest-scoring open-source model.
Qwen2-VL-72B became the world's highest-scoring open-source visual understanding model on the authoritative LMSYS Chatbot Arena Leaderboard
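For reference, here is a minimal image-understanding sketch using the transformers interface for Qwen2-VL. The smaller Qwen/Qwen2-VL-7B-Instruct sibling and the image URL are illustrative stand-ins; the open-sourced 72B checkpoint is assumed to expose the same interface.

```python
# Minimal sketch: asking a Qwen2-VL model to describe an image.
# Requires a recent transformers release that ships Qwen2VLForConditionalGeneration.
import requests
from PIL import Image
from transformers import AutoProcessor, Qwen2VLForConditionalGeneration

model_id = "Qwen/Qwen2-VL-7B-Instruct"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Placeholder URL; substitute any accessible image.
image = Image.open(requests.get("https://example.com/photo.jpg", stream=True).raw)
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Describe this image in one sentence."},
    ]},
]
text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=64)
# Decode only the newly generated tokens.
print(processor.batch_decode(
    output_ids[:, inputs.input_ids.shape[-1]:], skip_special_tokens=True
)[0])
```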
Since going open source in August 2023, Tongyi has come from behind in the global open-source large-model field to become the preferred model for developers, especially Chinese developers. In performance terms, the Tongyi models have steadily caught up with Llama, the strongest open-source models from the United States, and have repeatedly topped Hugging Face's global large-model leaderboards. As of mid-September 2024, downloads of the Tongyi Qianwen open-source models had exceeded 40 million, and Qwen-series derivative models exceeded 50,000 in total, making it a world-class model family second only to Llama.
According to Hugging Face data, as of mid-September the total number of original and derivative Qwen-series models exceeded 50,000