coding

Running Meta's SAM-Audio on 4GB GPUs with AudioGhost

AudioGhost enables running Meta's SAM-Audio model on 4GB GPUs through memory optimization techniques, making advanced audio segmentation accessible on consumer hardware.

Someone got Meta’s SAM-Audio running on regular GPUs after the original version kept crashing with out-of-memory errors.

The trick was stripping out unused vision encoders and rankers that ship with the default setup. This dropped VRAM usage from 20GB+ down to 4-6GB for the Small model and around 10GB for Large - now it actually runs on laptop cards.
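The same trick applies to any multi-modal checkpoint: if the inference path only ever touches the audio branch, the other submodules can be dropped before anything lands on the GPU. Here's a rough, self-contained PyTorch sketch of that pattern; the module names (`audio_encoder`, `vision_encoder`, `ranker`) are illustrative placeholders, not SAM-Audio's actual internals.

```python
import torch
import torch.nn as nn

# Toy stand-in for a multi-modal checkpoint; these module names are
# placeholders, not the real SAM-Audio layout.
class MultiModalModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.audio_encoder = nn.Sequential(nn.Linear(1024, 1024), nn.GELU())
        self.vision_encoder = nn.Sequential(nn.Linear(4096, 4096), nn.GELU())  # unused for audio-only runs
        self.ranker = nn.Linear(4096, 1)                                       # unused for audio-only runs

def param_mb(m: nn.Module) -> float:
    """Total parameter size in megabytes."""
    return sum(p.numel() * p.element_size() for p in m.parameters()) / 1e6

model = MultiModalModel()
print(f"full model:    {param_mb(model):.1f} MB of weights")

# Drop the branches the audio-only pipeline never calls, *before* anything
# is moved to the GPU, so their weights never occupy VRAM.
for unused in ("vision_encoder", "ranker"):
    if hasattr(model, unused):
        delattr(model, unused)

# Half precision roughly halves what's left on cards that support it.
model = model.half().eval()
print(f"trimmed model: {param_mb(model):.1f} MB of weights")

if torch.cuda.is_available():
    model = model.to("cuda")
```

The ordering is the point: submodules deleted before the `.to("cuda")` call never reach the device, so they never count against VRAM at all.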

They built AudioGhost AI as a Windows-friendly wrapper with a one-click installer that handles all the FFmpeg and TorchCodec dependency nightmares automatically. The interface shows real-time waveforms and lets you mix extracted stems on the fly.
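I haven't read the installer, but the "handles FFmpeg and TorchCodec for you" part presumably comes down to probing for both before launch. A minimal sketch of that kind of check, assuming nothing about AudioGhost's actual setup code:

```python
import importlib.util
import shutil
import sys

def check_dependencies() -> list[str]:
    """Return the names of missing dependencies (sketch of what a setup script might probe)."""
    missing = []

    # FFmpeg has to be on PATH for audio decoding/encoding.
    if shutil.which("ffmpeg") is None:
        missing.append("ffmpeg")

    # TorchCodec is the decoder the post mentions; just probe for the package.
    if importlib.util.find_spec("torchcodec") is None:
        missing.append("torchcodec")

    return missing

if __name__ == "__main__":
    missing = check_dependencies()
    if missing:
        print(f"Missing dependencies: {', '.join(missing)}")
        sys.exit(1)
    print("All dependencies found.")
```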

Pretty useful if you want to isolate instruments from tracks using natural language prompts (like “extract the violin”) without needing a datacenter GPU.
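For a sense of what that workflow looks like in code, here's a toy sketch; `separate` is a stub standing in for the model call, not AudioGhost's real API, and the audio is random noise so the script runs without any files:

```python
import torch

def separate(waveform: torch.Tensor, sample_rate: int, prompt: str) -> torch.Tensor:
    """Placeholder for the text-prompted separation step -- NOT AudioGhost's real API.
    A real implementation would run the trimmed SAM-Audio model here."""
    # Return silence of the same shape just so the sketch runs end to end.
    return torch.zeros_like(waveform)

# Dummy 4-second stereo clip at 44.1 kHz standing in for a loaded song.
sample_rate = 44_100
mix = torch.randn(2, 4 * sample_rate)

# The workflow the post describes: one natural-language prompt per stem.
violin = separate(mix, sample_rate, prompt="extract the violin")
vocals = separate(mix, sample_rate, prompt="isolate the vocals")

# Crude on-the-fly remix of the extracted stems.
remix = 0.8 * violin + 0.6 * vocals
print(remix.shape)
```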

Runs 100% locally and processes a 4-minute song in under a minute on a 4090.

GitHub: https://github.com/0x0funky/audioghost-ai

The install.bat script does all the heavy lifting - just clone the repo and run it:
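For reference, the whole setup as described boils down to three commands (the directory name is inferred from the repo URL):

```
git clone https://github.com/0x0funky/audioghost-ai
cd audioghost-ai
install.bat
```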