
20B Parameter Model Runs Locally in Browser

A 20 billion parameter AI language model has been optimized to run entirely within a web browser, enabling local deployment without requiring any server-side infrastructure.

Someone got a 20 billion parameter language model running completely in the browser using WebGPU. No server calls, everything processes locally.

The demo uses Transformers.js v4 (still in preview) with ONNX Runtime Web to make it work. Pretty wild that a model this size can run client-side now.

Try it here:

The whole setup runs on WebGPU, which explains how it handles the compute without melting the browser. Source code is available in the demo link if anyone wants to poke around the implementation.
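For anyone curious what the setup looks like in code, here's a minimal sketch of loading a text-generation pipeline on WebGPU with Transformers.js. The package name and options follow the current Transformers.js API, but the model id is a hypothetical placeholder; check the demo's source for the actual one it uses.

```javascript
// Sketch: load a text-generation pipeline on WebGPU via Transformers.js.
// The model id passed in is hypothetical; substitute the one from the demo.

async function loadGenerator(modelId) {
  // Fail fast if this environment has no WebGPU adapter.
  if (typeof navigator === "undefined" || !("gpu" in navigator)) {
    throw new Error("WebGPU not available in this environment");
  }
  // Dynamic import so the (large) library only loads when needed.
  const { pipeline } = await import("@huggingface/transformers");
  // device: "webgpu" runs inference on the GPU through ONNX Runtime Web;
  // dtype: "q4" requests a 4-bit quantized build, which is what makes a
  // 20B model plausible in browser memory at all.
  return pipeline("text-generation", modelId, { device: "webgpu", dtype: "q4" });
}
```

Usage would be something like `const generator = await loadGenerator("some-org/some-20b-model")`, after which prompts are processed entirely client-side.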

Main benefit is privacy - prompts never leave the machine. Performance obviously depends on your GPU, but the fact that it works at all for a 20B model is impressive.