
Evolution Beats Backprop for LLM Fine-Tuning

Researchers demonstrate that evolutionary algorithms can outperform traditional backpropagation methods for fine-tuning large language models, offering a faster, more memory-efficient alternative for RL-style post-training.

Someone found that evolution strategies can replace backpropagation for fine-tuning language models, which sounds ridiculous but actually works.

The paper at https://arxiv.org/abs/2509.24372 shows that as few as 30 random Gaussian perturbations per step can approximate the gradient well enough to beat GRPO on RLVR (reinforcement learning with verifiable rewards) tasks. The authors report essentially zero overfitting, and training is much faster since there's no backward pass.
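To make the idea concrete, here's a minimal sketch of the general evolution-strategies update, not the paper's exact algorithm: sample Gaussian noise around the current parameters, score each perturbation with a reward function, and combine the reward-weighted noise into a gradient estimate. The names (`es_step`, `reward_fn`) and hyperparameters are illustrative, and antithetic (+/-) sampling is a common ES variance-reduction trick that may differ from what the paper uses.

```python
import numpy as np

def es_step(params, reward_fn, n_perturb=30, sigma=0.01, lr=0.05, rng=None):
    """One evolution-strategies update (illustrative, not the paper's exact method).

    Estimates the gradient of reward_fn from reward differences under
    random Gaussian perturbations -- no backward pass required.
    """
    rng = rng or np.random.default_rng(0)
    grad_est = np.zeros_like(params)
    for _ in range(n_perturb):
        eps = rng.standard_normal(params.shape)
        # Antithetic sampling: score +eps and -eps to reduce variance.
        r_plus = reward_fn(params + sigma * eps)
        r_minus = reward_fn(params - sigma * eps)
        grad_est += (r_plus - r_minus) * eps
    grad_est /= 2.0 * sigma * n_perturb
    # Gradient *ascent* on the reward, using only forward evaluations.
    return params + lr * grad_est

# Toy usage: maximize reward = -||params - target||^2.
target = np.ones(5)
reward = lambda p: -np.sum((p - target) ** 2)
params = np.zeros(5)
for _ in range(50):
    params = es_step(params, reward)
```

For an LLM, `params` would be the (or a subset of the) model weights and `reward_fn` a rollout-and-verify loop, so memory stays at inference levels.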

A developer tested it themselves and got it working: https://github.com/Green0-0/propagate

The repo now includes LoRA support and pass@k training. Pretty wild that you can fine-tune a model just by adding random noise and keeping whatever scores better - no gradient computation needed.

Worth checking out if standard fine-tuning feels too slow or memory-heavy. The approach trades gradient precision for speed and memory, and apparently the tradeoff works surprisingly well for RL tasks.