Step-3.5-Flash: 11B Active Params Beats DeepSeek v3.2 on Code
Step-3.5-Flash, a mixture-of-experts model with 11B active parameters, outperforms DeepSeek v3.2 on coding tasks despite being a fraction of its size.
Someone dug into the new Stepfun Step-3.5-Flash model and found it beats DeepSeek v3.2 on coding benchmarks while being way smaller.
The size difference is wild - Step-3.5-Flash runs with just 196B total parameters (11B active), while DeepSeek v3.2 has 671B total (37B active). That’s roughly 3.4x fewer active parameters but better performance on agentic tasks.
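The ratio quoted above is just the active-parameter counts from the post divided out. A quick sanity check:

```python
# Parameter counts from the post, in billions
step_total, step_active = 196, 11        # Step-3.5-Flash
deepseek_total, deepseek_active = 671, 37  # DeepSeek v3.2

ratio = deepseek_active / step_active
print(f"{ratio:.1f}x fewer active parameters")  # 3.4x fewer active parameters
```

Active parameters are what actually run per token, so they're the number that tracks inference cost more closely than the total.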
Try it:
https://huggingface.co/stepfun-ai/Step-3.5-Flash
Pretty interesting for anyone running local models or dealing with API costs. Smaller usually means faster inference and lower memory requirements. The model uses a mixture-of-experts architecture, so only a fraction of its parameters activate per query, which explains how it stays efficient without sacrificing quality.
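The "only a fraction activates" part comes down to a learned router picking a few experts per token. Here's a minimal sketch of top-k expert routing; the expert count, top-k value, and dimensions are illustrative, not Step-3.5-Flash's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 32  # hypothetical expert count, for illustration only
TOP_K = 2         # experts activated per token (illustrative)
HIDDEN = 64

def moe_forward(x, router_w, expert_ws):
    """Route one token to its top-k experts and mix their outputs."""
    logits = x @ router_w                 # (experts,) routing scores
    top = np.argsort(logits)[-TOP_K:]     # indices of the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the selected experts only
    # Only TOP_K of the NUM_EXPERTS weight matrices are touched for this token,
    # so compute scales with active parameters, not total parameters.
    return sum(w * (x @ expert_ws[i]) for w, i in zip(weights, top))

x = rng.standard_normal(HIDDEN)
router_w = rng.standard_normal((HIDDEN, NUM_EXPERTS))
expert_ws = rng.standard_normal((NUM_EXPERTS, HIDDEN, HIDDEN))

y = moe_forward(x, router_w, expert_ws)
print(y.shape)  # (64,)
```

All experts still have to sit in memory (that's the 196B total), but each forward pass only multiplies through the selected ones (the 11B active), which is where the speed and cost advantage comes from.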
Worth testing if coding assistance or agent-style workflows are the main use case.