GLM-5: 744B Sparse Model with 40B Active Parameters

GLM-5 is a 744-billion-parameter sparse language model that activates only 40 billion parameters per forward pass, achieving efficient performance through sparse activation.

Someone noticed that Zhipu AI just dropped GLM-5, a massive sparse model that’s pretty interesting for anyone working on complex automation tasks.

The specs are wild:

  • 744B total parameters (40B active at once)
  • Trained on 28.5T tokens
  • Uses DeepSeek Sparse Attention to keep costs reasonable
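The post doesn't explain how sparse attention saves compute, so here's a toy sketch of the core idea: score every key cheaply, then run full attention over only the top-k positions, so cost scales with k rather than with context length. This is a simplified top-k selection, not DeepSeek's actual implementation (which uses a learned indexer); the dimensions and selection rule are made up for illustration.

```python
import numpy as np

def sparse_attention(q, K, V, k=4):
    """Attend over only the k highest-scoring keys instead of all of them.

    Toy top-k selection: the compute saving comes from softmaxing and
    mixing values over k positions instead of the whole context.
    """
    scores = K @ q                     # similarity of q to every key
    idx = np.argsort(scores)[-k:]      # keep only the top-k positions
    s = scores[idx] / np.sqrt(q.size)  # scaled scores for the kept keys
    w = np.exp(s - s.max())
    w /= w.sum()                       # softmax over the selected keys only
    return w @ V[idx]                  # mix only the selected values

rng = np.random.default_rng(1)
K = rng.normal(size=(64, 8))           # 64 cached keys, dim 8 (made up)
V = rng.normal(size=(64, 8))
q = rng.normal(size=8)
out = sparse_attention(q, K, V, k=4)   # attends over 4 of 64 positions
```

At long context lengths, the per-token attention work here is bounded by k, which is the kind of saving that keeps costs reasonable on long inputs.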

The sparse setup means each token is processed by only 40B of the 744B parameters per forward pass, which cuts inference compute (and serving cost) without killing performance on long-context work.
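To make "only 40B active" concrete, here's a minimal mixture-of-experts-style sketch: a gate scores all experts, only the top-k actually run, and the rest of the parameters sit idle for that token. The expert count, dimensions, and gating function are invented for illustration; this is not GLM-5's architecture, just the general sparse-activation pattern.

```python
import numpy as np

def moe_forward(x, experts, gate_w, k=2):
    """Route x through only the top-k experts (toy sparse activation).

    Only k of len(experts) weight matrices are multiplied per token,
    so most parameters contribute no compute on this forward pass.
    """
    logits = x @ gate_w                      # one gating score per expert
    topk = np.argsort(logits)[-k:]           # pick the k best experts
    w = np.exp(logits[topk] - logits[topk].max())
    w /= w.sum()                             # softmax over chosen experts
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, topk))

rng = np.random.default_rng(0)
d = 8
experts = [rng.normal(size=(d, d)) for _ in range(16)]  # 16 experts (made up)
gate_w = rng.normal(size=(d, 16))
x = rng.normal(size=d)
y = moe_forward(x, experts, gate_w, k=2)     # 2 of 16 experts run
```

With k=2 of 16 experts active, only ~1/8 of the expert parameters do work per token, which is the same ratio game as 40B active out of 744B total.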

Turns out it’s specifically built for “long-horizon agentic tasks” - basically work where an AI needs to plan and execute many steps in sequence. That could be handy for complex coding projects or multi-step systems-engineering problems.