Z.ai's GLM-4.5 Model Offers 87% Lower Output Token Cost Than DeepSeek R1

BEIJING – Chinese AI company Z.ai, formerly known as Zhipu, has announced the imminent release of its new flagship large language model, GLM-4.5, which it pitches as a low-cost model built for agentic use. The model and its lighter counterpart, GLM-4.5-Air, will be hosted under the newly established zai-org organization on both GitHub and Hugging Face, making them accessible to developers globally. The release positions Z.ai as a direct competitor to other open-source model developers.

Z.ai, founded in 2019, has rapidly grown into a prominent AI startup in China, backed by over $1.5 billion in funding from investors including Alibaba, Tencent, Qiming Venture Partners, and Aramco-backed Prosperity7 Ventures. The company has a history of developing advanced models, including the GLM pre-training framework launched in 2020, and has seen its previous models accumulate over 40 million global downloads. The introduction of GLM-4.5 coincides with a wave of new AI model releases from Chinese firms, highlighted at the World Artificial Intelligence Conference in Shanghai.

The GLM-4.5 series is designed as a foundation for intelligent agents, unifying reasoning, coding, and agentic capabilities. GLM-4.5 has 355 billion total parameters, of which 32 billion are active for any given token, while the more compact GLM-4.5-Air has 106 billion total parameters and 12 billion active parameters. Both models use a Mixture-of-Experts (MoE) architecture and support hybrid reasoning modes, including a "Thinking Mode" for complex tasks and a "Non-Thinking Mode" for instant responses.
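To put the parameter counts in perspective, the short Python sketch below computes what share of each model's weights is active per token. The figures are the ones quoted above; the dictionary layout and the `active_share` helper are illustrative names, not part of any Z.ai tooling.

```python
# Parameter counts in billions, as quoted in the article.
MODELS = {
    "GLM-4.5": {"total": 355, "active": 32},
    "GLM-4.5-Air": {"total": 106, "active": 12},
}

def active_share(model: str) -> float:
    """Return the percentage of total parameters that are active per token."""
    counts = MODELS[model]
    return counts["active"] / counts["total"] * 100

for name in MODELS:
    print(f"{name}: {active_share(name):.1f}% of parameters active per token")
```

By this calculation, roughly 9% of GLM-4.5's parameters and roughly 11% of GLM-4.5-Air's are used for any single token, which is the main reason an MoE model can be cheaper to run than a dense model of the same total size.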

A key differentiator for GLM-4.5 is its price. Z.ai plans to charge 11 cents per million input tokens, about 21% less than DeepSeek R1's 14 cents. More notably, output tokens for GLM-4.5 will cost 28 cents per million, an 87% reduction compared to DeepSeek R1's $2.19 per million. This aggressive pricing challenges established players and aims to make advanced AI more accessible.
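The percentages follow directly from the quoted prices. The Python sketch below reproduces them and estimates the cost of a sample workload; the `percent_cheaper` and `job_cost` helpers and the workload size are hypothetical, chosen only for illustration, and the prices are those reported here rather than values fetched from either provider.

```python
# Prices in US dollars per million tokens, as quoted in the article.
PRICES = {
    "GLM-4.5": {"input": 0.11, "output": 0.28},
    "DeepSeek R1": {"input": 0.14, "output": 2.19},
}

def percent_cheaper(new_price: float, old_price: float) -> float:
    """Return how much cheaper new_price is than old_price, in percent."""
    return (old_price - new_price) / old_price * 100

def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a workload at the quoted per-million rates."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000

glm, r1 = PRICES["GLM-4.5"], PRICES["DeepSeek R1"]
input_saving = percent_cheaper(glm["input"], r1["input"])
output_saving = percent_cheaper(glm["output"], r1["output"])
print(f"Input tokens:  {input_saving:.1f}% cheaper")
print(f"Output tokens: {output_saving:.1f}% cheaper")

# Hypothetical workload: 2 million input tokens and 1 million output tokens.
for model in PRICES:
    print(f"{model}: ${job_cost(model, 2_000_000, 1_000_000):.2f}")
```

For the sample workload, the quoted rates work out to $0.50 on GLM-4.5 against $2.47 on DeepSeek R1. Because output tokens carry most of the price gap, the savings grow with the share of output in a workload.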

In evaluations across 12 industry-standard benchmarks, GLM-4.5 achieved a score of 63.2, ranking third among all proprietary and open-source models. In specific evaluations, the model has performed comparably to, and in some cases better than, leading proprietary models such as Claude 4 Sonnet, Claude 4 Opus, and Gemini 2.5 Pro. This performance, coupled with its open-source nature and low cost, positions GLM-4.5 to influence the development of next-generation AI applications.

"🔜 GLM-4.5 is coming soon! This new chapter for our models will be hosted under the new zai-org organization on both GitHub and Hugging Face," Z.ai announced in a tweet, emphasizing the strategic shift towards broader accessibility and collaboration. The company's commitment to open-sourcing its MoE architecture marks a significant step in democratizing advanced AI research and development.