qwen Archives - AI News

Alibaba Cloud targets global AI growth with new models and tools

Ryan Daws — Tue, 08 Apr 2025 17:56:13 +0000

Alibaba Cloud has expanded its AI portfolio for global customers with a raft of new models, platform enhancements, and Software-as-a-Service (SaaS) tools.

The announcements, made during its Spring Launch 2025 online event, underscore the drive by Alibaba to accelerate AI innovation and adoption on a global scale.

The digital technology and intelligence arm of Alibaba is focusing on meeting increasing demand for AI-driven digital transformation worldwide.

Selina Yuan, President of International Business at Alibaba Cloud Intelligence, said: “We are launching a series of Platform-as-a-Service（PaaS) and AI capability updates to meet the growing demand for digital transformation from across the globe.

“These upgrades allow us to deliver even more secure and high-performance services that empower businesses to scale and innovate in an AI-driven world.”

Alibaba expands access to foundational AI models

Central to the announcement is the broadened availability of Alibaba Cloud’s proprietary Qwen large language model (LLM) series for international clients, initially accessible via its Singapore availability zones.

This includes several specialised models:

Qwen-Max: A large-scale Mixture of Experts (MoE) model.
QwQ-Plus: An advanced reasoning model designed for complex analytical tasks, sophisticated question answering, and expert-level mathematical problem-solving.
QVQ-Max: A visual reasoning model capable of handling complex multimodal problems, supporting visual input and chain-of-thought output for enhanced accuracy.
Qwen2.5-Omni-7b: An end-to-end multimodal model.

These additions provide international businesses with more powerful and diverse tools for developing sophisticated AI applications.

Platform enhancements power AI scale

To support these advanced models, Alibaba Cloud’s Platform for AI (PAI) received significant upgrades aimed at delivering scalable, cost-effective, and user-friendly generative AI solutions.

Key enhancements include the introduction of distributed inference capabilities within the PAI-Elastic Algorithm Service (EAS). Utilising a multi-node architecture, this addresses the computational demands of super-large models – particularly those employing MoE structures or requiring ultra-long-text processing – to overcome limitations inherent in traditional single-node setups.

Furthermore, PAI-EAS now features a prefill-decode disaggregation function designed to boost performance and reduce operational costs.

Alibaba Cloud reported impressive results when deploying this with the Qwen2.5-72B model, achieving a 92% increase in concurrency and a 91% boost in tokens per second (TPS).

The PAI-Model Gallery has also been refreshed, now offering nearly 300 open-source models—including the complete range of Alibaba Cloud’s own open-source Qwen and Wan series. These are accessible via a no-code deployment and management interface.

Additional new PAI-Model Gallery features – like model evaluation and model distillation (transferring knowledge from large to smaller, more cost-effective models) – further enhance its utility.

Alibaba integrates AI into data management

Alibaba Cloud’s flagship cloud-native relational database, PolarDB, now incorporates native AI inference powered by Qwen.

PolarDB’s in-database machine learning capability eliminates the need to move data for inference workflows, which significantly cuts processing latency while improving efficiency and data security.

The feature is optimised for text-centric tasks such as developing conversational Retrieval-Augmented Generation (RAG) agents, generating text embeddings, and performing semantic similarity searches.

Additionally, the company’s data warehouse, AnalyticDB, is now integrated into Alibaba Cloud’s generative AI development platform Model Studio.

This integration serves as the recommended vector database for RAG solutions. This allows organisations to connect their proprietary knowledge bases directly with AI models on the platform to streamline the creation of context-aware applications.

New SaaS tools for industry transformation

Beyond infrastructure and platform layers, Alibaba Cloud introduced two new SaaS AI tools:

AI Doc: An intelligent document processing tool using LLMs to parse diverse documents (reports, forms, manuals) efficiently. It extracts specific information and can generate tailored reports, such as ESG reports when integrated with Alibaba Cloud’s Energy Expert sustainability solution.
Smart Studio: An AI-powered content creation platform supporting text-to-image, image-to-image, and text-to-video generation. It aims to enhance marketing and creative outputs in sectors like e-commerce, gaming, and entertainment, enabling features like virtual try-ons or generating visuals from text descriptions.

All these developments follow Alibaba’s announcement in February of a $53 billion investment over the next three years dedicated to advancing its cloud computing and AI infrastructure.

This colossal investment, noted as exceeding the company’s total AI and cloud expenditure over the previous decade, highlights a deep commitment to AI-driven growth and solidifies its position as a major global cloud provider.

“As cloud and AI become essential for global growth, we are committed to enhancing our core product offerings to address our customers’ evolving needs,” concludes Yuan.

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Alibaba Cloud targets global AI growth with new models and tools appeared first on AI News.

Alibaba Qwen QwQ-32B: Scaled reinforcement learning showcase

Ryan Daws — Thu, 06 Mar 2025 09:14:13 +0000

The Qwen team at Alibaba has unveiled QwQ-32B, a 32 billion parameter AI model that demonstrates performance rivalling the much larger DeepSeek-R1. This breakthrough highlights the potential of scaling Reinforcement Learning (RL) on robust foundation models.

The Qwen team have successfully integrated agent capabilities into the reasoning model, enabling it to think critically, utilise tools, and adapt its reasoning based on environmental feedback.

“Scaling RL has the potential to enhance model performance beyond conventional pretraining and post-training methods,” the team stated. “Recent studies have demonstrated that RL can significantly improve the reasoning capabilities of models.”

QwQ-32B achieves performance comparable to DeepSeek-R1, which boasts 671 billion parameters (with 37 billion activated), a testament to the effectiveness of RL when applied to robust foundation models pretrained on extensive world knowledge. This remarkable outcome underscores the potential of RL to bridge the gap between model size and performance.

The model has been evaluated across a range of benchmarks, including AIME24, LiveCodeBench, LiveBench, IFEval, and BFCL, designed to assess its mathematical reasoning, coding proficiency, and general problem-solving capabilities.

The results highlight QwQ-32B’s performance in comparison to other leading models, including DeepSeek-R1-Distilled-Qwen-32B, DeepSeek-R1-Distilled-Llama-70B, o1-mini, and the original DeepSeek-R1.

Benchmark results:

AIME24: QwQ-32B achieved 79.5, slightly behind DeepSeek-R1-6718’s 79.8, but significantly ahead of OpenAl-o1-mini’s 63.6 and the distilled models.
LiveCodeBench: QwQ-32B scored 63.4, again closely matched by DeepSeek-R1-6718’s 65.9, and surpassing the distilled models and OpenAl-o1-mini’s 53.8.
LiveBench: QwQ-32B achieved 73.1, with DeepSeek-R1-6718 scoring 71.6, and outperforming the distilled models and OpenAl-o1-mini’s 57.5.
IFEval: QwQ-32B scored 83.9, very close to DeepSeek-R1-6718’s 83.3, and leading the distilled models and OpenAl-o1-mini’s 59.1.
BFCL: QwQ-32B achieved 66.4, with DeepSeek-R1-6718 scoring 62.8, demonstrating a lead over the distilled models and OpenAl-o1-mini’s 49.3.

The Qwen team’s approach involved a cold-start checkpoint and a multi-stage RL process driven by outcome-based rewards. The initial stage focused on scaling RL for math and coding tasks, utilising accuracy verifiers and code execution servers. The second stage expanded to general capabilities, incorporating rewards from general reward models and rule-based verifiers.

“We find that this stage of RL training with a small amount of steps can increase the performance of other general capabilities, such as instruction following, alignment with human preference, and agent performance, without significant performance drop in math and coding,” the team explained.

QwQ-32B is open-weight and available on Hugging Face and ModelScope under the Apache 2.0 license, and is also accessible via Qwen Chat. The Qwen team views this as an initial step in scaling RL to enhance reasoning capabilities and aims to further explore the integration of agents with RL for long-horizon reasoning.

“As we work towards developing the next generation of Qwen, we are confident that combining stronger foundation models with RL powered by scaled computational resources will propel us closer to achieving Artificial General Intelligence (AGI),” the team stated.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Alibaba Qwen QwQ-32B: Scaled reinforcement learning showcase appeared first on AI News.

Qwen 2.5-Max outperforms DeepSeek V3 in some benchmarks

Ryan Daws — Wed, 29 Jan 2025 10:03:48 +0000

Alibaba’s response to DeepSeek is Qwen 2.5-Max, the company’s latest Mixture-of-Experts (MoE) large-scale model.

Qwen 2.5-Max boasts pretraining on over 20 trillion tokens and fine-tuning through cutting-edge techniques like Supervised Fine-Tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF).

With the API now available through Alibaba Cloud and the model accessible for exploration via Qwen Chat, the Chinese tech giant is inviting developers and researchers to see its breakthroughs firsthand.

Outperforming peers

When comparing Qwen 2.5-Max’s performance against some of the most prominent AI models on a variety of benchmarks, the results are promising.

Evaluations included popular metrics like the MMLU-Pro for college-level problem-solving, LiveCodeBench for coding expertise, LiveBench for overall capabilities, and Arena-Hard for assessing models against human preferences.

According to Alibaba, “Qwen 2.5-Max outperforms DeepSeek V3 in benchmarks such as Arena-Hard, LiveBench, LiveCodeBench, and GPQA-Diamond, while also demonstrating competitive results in other assessments, including MMLU-Pro.”

(Credit: Alibaba)

The instruct model – designed for downstream tasks like chat and coding – competes directly with leading models such as GPT-4o, Claude-3.5-Sonnet, and DeepSeek V3. Among these, Qwen 2.5-Max managed to outperform rivals in several key areas.

Comparisons of base models also yielded promising outcomes. While proprietary models like GPT-4o and Claude-3.5-Sonnet remained out of reach due to access restrictions, Qwen 2.5-Max was assessed against leading public options such as DeepSeek V3, Llama-3.1-405B (the largest open-weight dense model), and Qwen2.5-72B. Again, Alibaba’s newcomer demonstrated exceptional performance across the board.

“Our base models have demonstrated significant advantages across most benchmarks,” Alibaba stated, “and we are optimistic that advancements in post-training techniques will elevate the next version of Qwen 2.5-Max to new heights.”

The burst of DeepSeek V3 has attracted attention from the whole AI community to large-scale MoE models. Concurrently, we have been building Qwen2.5-Max, a large MoE LLM pretrained on massive data and post-trained with curated SFT and RLHF recipes. It achieves competitive… pic.twitter.com/oHVl16vfje
— Qwen (@Alibaba_Qwen) January 28, 2025

Making Qwen 2.5-Max accessible

To make the model more accessible to the global community, Alibaba has integrated Qwen 2.5-Max with its Qwen Chat platform, where users can interact directly with the model in various capacities—whether exploring its search capabilities or testing its understanding of complex queries.

For developers, the Qwen 2.5-Max API is now available through Alibaba Cloud under the model name “qwen-max-2025-01-25”. Interested users can get started by registering an Alibaba Cloud account, activating the Model Studio service, and generating an API key.

The API is even compatible with OpenAI’s ecosystem, making integration straightforward for existing projects and workflows. This compatibility lowers the barrier for those eager to test their applications with the model’s capabilities.

Alibaba has made a strong statement of intent with Qwen 2.5-Max. The company’s ongoing commitment to scaling AI models is not just about improving performance benchmarks but also about enhancing the fundamental thinking and reasoning abilities of these systems.

“The scaling of data and model size not only showcases advancements in model intelligence but also reflects our unwavering commitment to pioneering research,” Alibaba noted.

Looking ahead, the team aims to push the boundaries of reinforcement learning to foster even more advanced reasoning skills. This, they say, could enable their models to not only match but surpass human intelligence in solving intricate problems.

The implications for the industry could be profound. As scaling methods improve and Qwen models break new ground, we are likely to see further ripples across AI-driven fields globally that we’ve seen in recent weeks.

(Photo by Maico Amorim)

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Qwen 2.5-Max outperforms DeepSeek V3 in some benchmarks appeared first on AI News.

Alibaba Cloud unleashes over 100 open-source AI models

Ryan Daws — Fri, 20 Sep 2024 13:08:45 +0000

Alibaba Cloud has open-sourced more than 100 of its newly-launched AI models, collectively known as Qwen 2.5. The announcement was made during the company’s annual Apsara Conference.

The cloud computing arm of Alibaba Group has also unveiled a revamped full-stack infrastructure designed to meet the surging demand for robust AI computing. This new infrastructure encompasses innovative cloud products and services that enhance computing, networking, and data centre architecture, all aimed at supporting the development and wide-ranging applications of AI models.

Eddie Wu, Chairman and CEO of Alibaba Cloud Intelligence, said: “Alibaba Cloud is investing, with unprecedented intensity, in the research and development of AI technology and the building of its global infrastructure. We aim to establish an AI infrastructure of the future to serve our global customers and unlock their business potential.”

The newly-released Qwen 2.5 models range from 0.5 to 72 billion parameters in size and boast enhanced knowledge and stronger capabilities in maths and coding. Supporting over 29 languages, these models cater to a wide array of AI applications both at the edge and in the cloud across various sectors, from automotive and gaming to scientific research.

Alibaba Cloud’s open-source AI models gain traction

Since its debut in April 2023, the Qwen model series has garnered significant traction, surpassing 40 million downloads across platforms such as Hugging Face and ModelScope. These models have also inspired the creation of over 50,000 derivative models on Hugging Face alone.

Jingren Zhou, CTO of Alibaba Cloud Intelligence, commented: “This initiative is set to empower developers and corporations of all sizes, enhancing their ability to leverage AI technologies and further stimulating the growth of the open-source community.”

In addition to the open-source models, Alibaba Cloud announced an upgrade to its proprietary flagship model, Qwen-Max. The enhanced version reportedly demonstrates performance on par with other state-of-the-art models in areas such as language comprehension, reasoning, mathematics, and coding.

The company has also expanded its multimodal capabilities with a new text-to-video model as part of its Tongyi Wanxiang large model family. This model can generate high-quality videos in various visual styles, from realistic scenes to 3D animation, based on Chinese and English text instructions.

Furthermore, Alibaba Cloud introduced Qwen2-VL, an updated vision language model capable of comprehending videos lasting over 20 minutes and supporting video-based question-answering. The company also launched an AI Developer, a Qwen-powered AI assistant designed to support programmers in automating tasks such as requirement analysis, code programming, and bug identification and fixing.

To support these AI advancements, Alibaba Cloud has announced several infrastructure upgrades, including:

CUBE DC 5.0, a next-generation data centre architecture that increases energy and operational efficiency.
Alibaba Cloud Open Lake, a solution to maximise data utility for generative AI applications.
PAI AI Scheduler, a proprietary cloud-native scheduling engine for enhanced computing resource management.
DMS: OneMeta+OneOps, a platform for unified management of metadata across multiple cloud environments.
9th Generation Enterprise Elastic Compute Service (ECS) instance, offering improved performance for various applications.

These updates from Alibaba Cloud – including the release of over 100 open-source models – aim to provide comprehensive support for customers and partners to maximise the benefits of the latest technology in building more efficient, sustainable, and inclusive AI applications.

(Image Source: www.alibabagroup.com)

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Alibaba Cloud unleashes over 100 open-source AI models appeared first on AI News.

Qwen2-Math: A new era for AI maths whizzes

Ryan Daws — Fri, 09 Aug 2024 12:46:18 +0000

Alibaba Cloud’s Qwen team has unveiled Qwen2-Math, a series of large language models specifically designed to tackle complex mathematical problems.

These new models – built upon the existing Qwen2 foundation – demonstrate remarkable proficiency in solving arithmetic and mathematical challenges, and outperform former industry leaders.

The Qwen team crafted Qwen2-Math using a vast and diverse Mathematics-specific Corpus. This corpus comprises a rich tapestry of high-quality resources, including web texts, books, code, exam questions, and synthetic data generated by Qwen2 itself.

Rigorous evaluation on both English and Chinese mathematical benchmarks – including GSM8K, Math, MMLU-STEM, CMATH, and GaoKao Math – revealed the exceptional capabilities of Qwen2-Math. Notably, the flagship model, Qwen2-Math-72B-Instruct, surpassed the performance of proprietary models such as GPT-4o and Claude 3.5 in various mathematical tasks.

“Qwen2-Math-Instruct achieves the best performance among models of the same size, with RM@8 outperforming Maj@8, particularly in the 1.5B and 7B models,” the Qwen team noted.

This superior performance is attributed to the effective implementation of a math-specific reward model during the development process.

Further showcasing its prowess, Qwen2-Math demonstrated impressive results in challenging mathematical competitions like the American Invitational Mathematics Examination (AIME) 2024 and the American Mathematics Contest (AMC) 2023.

To ensure the model’s integrity and prevent contamination, the Qwen team implemented robust decontamination methods during both the pre-training and post-training phases. This rigorous approach involved removing duplicate samples and identifying overlaps with test sets to maintain the model’s accuracy and reliability.

Looking ahead, the Qwen team plans to expand Qwen2-Math’s capabilities beyond English, with bilingual and multilingual models in the pipeline. This commitment to inclusivity aims to make advanced mathematical problem-solving accessible to a global audience.

“We will continue to enhance our models’ ability to solve complex and challenging mathematical problems,” affirmed the Qwen team.

You can find the Qwen2 models on Hugging Face here.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Qwen2-Math: A new era for AI maths whizzes appeared first on AI News.

Alibaba Cloud launches English version of AI model hub

Ryan Daws — Tue, 25 Jun 2024 12:16:49 +0000

Alibaba Cloud has taken a step towards globalising its AI offerings by unveiling an English version of ModelScope, its open-source AI model community. The move aims to bring generative AI capabilities to a wider audience of businesses and developers worldwide.

ModelScope, which embodies Alibaba Cloud’s concept of “Model-as-a-Service,” transforms AI models into readily available and deployable services. Since its launch in mainland China in 2022, the platform has grown to become the country’s largest AI model community, boasting over five million developer users.

With this international expansion, developers around the globe will now have access to more than 5,000 advanced AI models. The platform also welcomes user-contributed models, fostering a collaborative ecosystem for AI development.

The English version of ModelScope provides a comprehensive suite of tools and resources to support developers in bringing their AI projects to fruition. This includes access to over 1,500 high-quality Chinese-language datasets and an extensive range of toolkits for data processing. Moreover, the platform offers various modules that allow developers to customise model inference, training, and evaluation with minimal coding requirements.

Alibaba Cloud announced the English version of ModelScope during the 2024 Computer Vision and Pattern Recognition (CVPR) Conference in Seattle. This annual event brings together academics, researchers, and business leaders for a five-day exploration of cutting-edge developments in AI and machine learning through workshops, panels, and keynotes.

The company’s presence at CVPR was further bolstered by the acceptance of more than 30 papers from Alibaba Group, with six selected as oral and highlighted papers. This achievement underscores Alibaba’s commitment to advancing the field of AI research and development.

Conference attendees also had the opportunity to experience firsthand the capabilities of Alibaba’s proprietary Qwen model series at the company’s booth. The demonstration showcased the model’s impressive image and video generation capabilities, providing a glimpse into the potential applications of Alibaba’s AI technologies.

The launch of the English version of ModelScope represents a significant milestone in Alibaba Cloud’s strategy to expand its AI offerings globally.

As businesses and developers worldwide increasingly seek to harness the power of AI, platforms like ModelScope are set to play a crucial role in democratising access to advanced AI capabilities. With its extensive collection of models, datasets, and development tools, Alibaba Cloud’s ModelScope will help to accelerate AI innovation and adoption on a global scale.

(Image Source: www.alibabagroup.com)

Explore other upcoming enterprise technology events and webinars powered by TechForge here.

The post Alibaba Cloud launches English version of AI model hub appeared first on AI News.