Step-3.5-Flash: 11B Active Params Beats DeepSeek v3.2 on Code
Step-3.5-Flash, a mixture-of-experts model with 11B active parameters, outperforms DeepSeek v3.2 on coding tasks despite being a fraction of its size.
Someone dug into the new Stepfun Step-3.5-Flash model and found it beats DeepSeek v3.2 on coding benchmarks while being way smaller.
The size difference is wild - Step-3.5-Flash runs with just 196B total parameters (11B active), while DeepSeek v3.2 has 671B total (37B active). That’s roughly 3.4x fewer active parameters but better performance on agentic tasks.
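The ratio quoted above is just the active-parameter counts from the post divided out. A quick sanity check:

```python
# Parameter counts from the post, in billions
step_total, step_active = 196, 11        # Step-3.5-Flash
deepseek_total, deepseek_active = 671, 37  # DeepSeek v3.2

ratio = deepseek_active / step_active
print(f"{ratio:.1f}x fewer active parameters")  # 3.4x fewer active parameters
```

Active parameters are what actually run per token, so they're the number that tracks inference cost more closely than the total.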
Try it:
https://huggingface.co/stepfun-ai/Step-3.5-Flash
Pretty interesting for anyone running local models or dealing with API costs. Smaller usually means faster inference and lower memory requirements. The model uses a mixture-of-experts architecture, so only a fraction of its parameters activate per query, which explains how it stays efficient without sacrificing quality.
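The "only a fraction activates" part comes down to a learned router picking a few experts per token. Here's a minimal sketch of top-k expert routing; the expert count, top-k value, and dimensions are illustrative, not Step-3.5-Flash's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 32  # hypothetical expert count, for illustration only
TOP_K = 2         # experts activated per token (illustrative)
HIDDEN = 64

def moe_forward(x, router_w, expert_ws):
    """Route one token to its top-k experts and mix their outputs."""
    logits = x @ router_w                 # (experts,) routing scores
    top = np.argsort(logits)[-TOP_K:]     # indices of the k best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the selected experts only
    # Only TOP_K of the NUM_EXPERTS weight matrices are touched for this token,
    # so compute scales with active parameters, not total parameters.
    return sum(w * (x @ expert_ws[i]) for w, i in zip(weights, top))

x = rng.standard_normal(HIDDEN)
router_w = rng.standard_normal((HIDDEN, NUM_EXPERTS))
expert_ws = rng.standard_normal((NUM_EXPERTS, HIDDEN, HIDDEN))

y = moe_forward(x, router_w, expert_ws)
print(y.shape)  # (64,)
```

All experts still have to sit in memory (that's the 196B total), but each forward pass only multiplies through the selected ones (the 11B active), which is where the speed and cost advantage comes from.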
Worth testing if coding assistance or agent-style workflows are the main use case.