general

LLMs Can Now Play Balatro Autonomously via API

An article discusses how large language models have gained the ability to autonomously play the poker-themed roguelike deck-building game Balatro through API

Someone built a framework that lets local LLMs play Balatro autonomously - like watching an AI struggle through poker hands in real time.

The setup uses two components: BalatroBot (a mod that exposes game state via HTTP API) and BalatroLLM (the bot framework). Works with any OpenAI-compatible endpoint:

# Clone and set up with Ollama, vLLM, etc.
git clone https://github.com/coder/balatrollm

The interesting part is custom strategies - Jinja2 templates that define how the LLM sees the game and makes decisions. Same model, completely different playstyles depending on the prompt strategy.

Performance varies wildly by model. Full benchmark results comparing open-weight and commercial models are tracked at https://balatrobench.com/

There’s even a Twitch stream (https://www.twitch.tv/S1M0N38) where you can watch models like Opus 4.6 make questionable decisions in real time. Pretty entertaining watching an LLM overthink whether to discard a 2 of clubs.