Google Releases Gemma Scope 2 for Model Interpretability
Google releases Gemma Scope 2, an advanced interpretability tool that helps researchers understand and analyze the internal workings of AI language models
Google just dropped Gemma Scope 2, which is pretty useful for anyone trying to understand what’s actually happening inside language models.
It’s basically a collection of pre-trained sparse autoencoders (SAEs) that let researchers peek into Gemma 2’s internal workings. The models are available at https://huggingface.co/collections/google/gemma-scope-2 and cover the 2B, 9B, and 27B parameter versions.
Quick setup (assumes the weights load as a standard Hugging Face model; some SAE releases instead ship raw weight arrays that need a dedicated loader):

# Hypothetical loading path; check the model card for the supported loader.
from transformers import AutoModel

sae = AutoModel.from_pretrained("google/gemma-scope-2b-pt-res")
The cool part is that they've trained SAEs across multiple layers and attention heads, so you can see how different features activate for specific inputs. That makes interpretability research far more accessible, since you don't need to train your own SAEs from scratch.
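Conceptually, a sparse autoencoder projects a model activation into a much wider, mostly-zero feature vector and then reconstructs the activation from it; the nonzero entries are the "features" that fire for a given input. A minimal NumPy sketch of that encode/decode step, assuming random stand-in weights (not the released Gemma Scope parameters) and illustrative sizes:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_sae = 16, 64  # illustrative sizes; real SAEs are far wider

# Random stand-in parameters; a real SAE loads trained weights instead.
W_enc = rng.normal(scale=0.1, size=(d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(scale=0.1, size=(d_sae, d_model))
b_dec = np.zeros(d_model)

def encode(x):
    """Map an activation vector into the sparse feature space (ReLU zeroes out most entries)."""
    return np.maximum(x @ W_enc + b_enc, 0.0)

def decode(f):
    """Reconstruct the original activation from the feature vector."""
    return f @ W_dec + b_dec

x = rng.normal(size=d_model)  # stand-in for one residual-stream activation
features = encode(x)
x_hat = decode(features)

print(features.shape)  # wide feature vector: (64,)
print(x_hat.shape)     # reconstruction in model space: (16,)
```

In practice you'd run a prompt through the language model, grab the activation at the layer the SAE was trained on, and inspect which feature indices light up.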
Particularly handy for safety research: figuring out why models behave in certain ways, or what triggers specific outputs.