GLM-5: 744B Sparse Model with 40B Active Parameters
GLM-5 is a 744-billion parameter sparse language model that activates only 40 billion parameters per forward pass, achieving efficient performance through selective parameter activation.
Zhipu AI just dropped GLM-5, a massive sparse model that’s pretty interesting for anyone working on complex automation tasks.
The specs are wild:
- 744B total parameters (40B active at once)
- Trained on 28.5T tokens
- Uses DeepSeek Sparse Attention to keep costs reasonable
The sparse setup means only 40B of the 744B parameters do any work on a given forward pass, which cuts deployment costs without killing performance on long-context work.
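That "40B active out of 744B total" pattern is characteristic of mixture-of-experts routing: a small gating network picks a handful of expert subnetworks per token, so compute scales with the number of selected experts rather than the full parameter count. Here's a minimal sketch of top-k expert routing; all names and shapes are illustrative, not GLM-5's actual (unpublished) architecture:

```python
import numpy as np

def topk_moe_forward(x, expert_weights, gate_weights, k=2):
    """Illustrative top-k mixture-of-experts layer (hypothetical names).

    Only k experts run per token, so compute scales with k,
    not with the total number of experts."""
    logits = x @ gate_weights                # one score per expert
    topk = np.argsort(logits)[-k:]           # indices of the k best experts
    probs = np.exp(logits[topk] - logits[topk].max())
    probs /= probs.sum()                     # softmax over the selected experts
    # Weighted sum of the selected experts' outputs; the other
    # experts' weights are never touched on this forward pass.
    return sum(p * (x @ expert_weights[e]) for p, e in zip(probs, topk))

# Toy demo: 8 experts exist, but only 2 do any work per token
rng = np.random.default_rng(0)
d = 16
experts = rng.normal(size=(8, d, d))
gate = rng.normal(size=(d, 8))
x = rng.normal(size=d)
y = topk_moe_forward(x, experts, gate, k=2)
print(y.shape)  # (16,)
```

In a real deployment the unselected experts can even live on other devices or in slower memory, which is where the cost savings show up.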
Quick links to check it out:
- Blog: https://z.ai/blog/glm-5
- Hugging Face: https://huggingface.co/zai-org/GLM-5
- GitHub: https://github.com/zai-org/GLM-5
Turns out it’s specifically built for “long-horizon agentic tasks”: basically stuff where an AI needs to plan multiple steps ahead. Could be handy for complex coding projects or multi-step systems-engineering problems.