By Promptsicle Team

Local LLMs Filter Gmail Without Cloud Privacy Risks

A guide explaining how to use locally-run large language models to filter and organize Gmail messages while maintaining complete privacy by avoiding

Local LLM Screens Gmail for Smart Notifications

Cloud-based email filters like SaneBox and Hey rely on remote servers to process messages, raising privacy concerns for users handling sensitive correspondence. A growing number of developers are instead running local language models to screen Gmail accounts, keeping all analysis on personal hardware while delivering intelligent notification filtering.

The Story

Privacy-focused developers have begun deploying locally-hosted LLMs to analyze incoming Gmail messages and determine which emails warrant immediate attention. Unlike traditional rule-based filters that match keywords or sender addresses, these systems use natural language understanding to evaluate message urgency, context, and relevance.

The typical setup involves a small language model running on consumer hardware, often a 7B or 13B parameter model like Mistral or Llama 2, connected to Gmail through the Gmail API with OAuth. The model reads new messages, analyzes their content against user-defined criteria, and triggers notifications only for emails that meet specific thresholds. All analysis happens locally, meaning message content never leaves the user’s device for classification.
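As a rough illustration of the "reads new messages" step, the sketch below pulls unread messages through the Gmail API and extracts a plain-text body. It assumes `service` is an already-authorized Gmail API client; the function names `extract_plain_text` and `fetch_unread` and the `is:unread in:inbox` query are illustrative choices, not taken from any particular project.

```python
import base64


def extract_plain_text(payload):
    """Return the first text/plain body found in a Gmail API message payload."""
    mime_type = payload.get("mimeType", "")
    data = payload.get("body", {}).get("data")
    if mime_type == "text/plain" and data:
        # Gmail encodes body data as URL-safe base64; restore any stripped padding.
        padded = data + "=" * (-len(data) % 4)
        return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")
    # Multipart messages nest their bodies under "parts".
    for part in payload.get("parts", []):
        text = extract_plain_text(part)
        if text:
            return text
    return ""


def fetch_unread(service, max_results=10):
    """Yield (message_id, subject, body) for unread inbox messages."""
    listing = service.users().messages().list(
        userId="me", q="is:unread in:inbox", maxResults=max_results
    ).execute()
    for ref in listing.get("messages", []):
        msg = service.users().messages().get(
            userId="me", id=ref["id"], format="full"
        ).execute()
        headers = {
            h["name"].lower(): h["value"]
            for h in msg["payload"].get("headers", [])
        }
        yield ref["id"], headers.get("subject", ""), extract_plain_text(msg["payload"])
```

Messages that contain only HTML come back with an empty body in this sketch; a fuller version might strip tags from the `text/html` part instead.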

One implementation uses a Python script that polls Gmail’s API every few minutes, feeds new message content to a locally running Ollama instance, and sends push notifications through services like Pushover or ntfy.sh. Prompt engineering proves crucial: users typically instruct the model to identify urgent work requests, time-sensitive opportunities, or messages from specific relationship categories while ignoring newsletters, automated updates, and low-priority correspondence.

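import re

import ollama
# Used by the Gmail connection code (not shown in this excerpt).
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

def analyze_email(subject, body):
    """Ask the local model for a 1-10 urgency score; return 0 if none is found."""
    prompt = f"""Analyze this email and rate urgency (1-10):
    Subject: {subject}
    Body: {body}
    
    Consider: deadlines, requests for action, personal messages.
    Respond with only a number."""
    
    # Ollama serves the model locally, so the email text stays on this machine.
    response = ollama.generate(model='mistral', prompt=prompt)
    # Models sometimes wrap the number in extra words, so extract the first
    # integer instead of calling int() on the raw reply (which can raise ValueError).
    match = re.search(r'\d+', response['response'])
    return min(int(match.group()), 10) if match else 0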
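The polling and notification steps described above can be sketched roughly as follows. This is a minimal outline under stated assumptions: the topic name, the threshold of 7, and the five-minute interval are hypothetical, `fetch_fn` and `score_fn` stand in for the Gmail fetch and a scoring function such as `analyze_email`, and notifications are published with a plain HTTP POST to an ntfy topic URL.

```python
import time
import urllib.request

NTFY_TOPIC = "my-private-email-alerts"  # hypothetical topic name
URGENCY_THRESHOLD = 7                   # assumed cutoff on the 1-10 scale


def build_ntfy_request(topic, message, title="Urgent email"):
    """Build (but do not send) an HTTP POST that publishes to an ntfy topic."""
    return urllib.request.Request(
        f"https://ntfy.sh/{topic}",
        data=message.encode("utf-8"),
        headers={"Title": title},
        method="POST",
    )


def process_batch(messages, score_fn, seen_ids, notify_fn, threshold=URGENCY_THRESHOLD):
    """Score unseen messages and notify for those at or above the threshold."""
    notified = []
    for msg_id, subject, body in messages:
        if msg_id in seen_ids:
            continue  # already handled in an earlier poll
        seen_ids.add(msg_id)
        if score_fn(subject, body) >= threshold:
            notify_fn(subject, body[:200])  # send only a short snippet
            notified.append(msg_id)
    return notified


def send_via_ntfy(subject, snippet):
    """Publish the subject and snippet to the configured ntfy topic."""
    message = f"{subject}\n\n{snippet}"
    urllib.request.urlopen(build_ntfy_request(NTFY_TOPIC, message), timeout=10)


def run_forever(fetch_fn, score_fn, interval_seconds=300):
    """Poll, score, and notify in a loop; seen IDs are kept in memory only."""
    seen_ids = set()
    while True:
        process_batch(fetch_fn(), score_fn, seen_ids, send_via_ntfy)
        time.sleep(interval_seconds)
```

One design point worth weighing: the notification text itself travels through the push service, so the subject and snippet do leave the machine at this step. ntfy can be self-hosted, and the notification could also be reduced to a content-free alert, if that matters for a given setup.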

Significance

This approach addresses a fundamental tension in modern email management. Cloud-based intelligent filtering requires uploading message content to third-party servers, creating security risks for professionals handling confidential information. Legal, medical, and financial workers often cannot use these services due to compliance requirements.

Local LLM filtering solves this problem while delivering superior contextual understanding compared to traditional rules. A keyword filter might flag all messages containing “urgent,” but a language model can distinguish between genuine emergencies and marketing emails using urgency as a manipulation tactic.
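To make the keyword limitation concrete, here is a small sketch of a naive keyword filter. The keyword list and both sample emails are made up for illustration.

```python
def keyword_filter(subject, body, keywords=("urgent", "asap", "immediately")):
    """Flag a message if any keyword appears anywhere in its text."""
    text = f"{subject} {body}".lower()
    return any(word in text for word in keywords)


# Hypothetical examples: a promotional email and a genuine work request.
marketing = ("URGENT: 50% off ends tonight!", "Act immediately to claim your discount.")
outage = ("Prod database is rejecting writes", "Can you look at this before the 3pm client demo?")

print(keyword_filter(*marketing))  # flagged, although it is only a promotion
print(keyword_filter(*outage))     # not flagged, although it is time-sensitive
```

The filter gets both cases wrong because it sees words rather than intent, which is the gap a language model prompt is meant to close.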

The computational requirements remain modest. A quantized 7B parameter model running on a modern laptop can process dozens of emails per minute while consuming minimal resources. Users report battery impact comparable to running a music player, making the approach viable even on portable devices.

Performance metrics from early adopters show promising results. One developer tracking notification accuracy over three months reported 94% precision—meaning 94% of notifications represented genuinely important emails—compared to 67% with Gmail’s default priority inbox. False negative rates (important emails missed) stayed below 3%.
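For readers who want to track the same metrics on their own inbox, the two figures reduce to simple ratios. The sketch below uses made-up counts chosen only to show the arithmetic; they are not the developer's actual data.

```python
def precision(true_positives, false_positives):
    """Share of sent notifications that were genuinely important."""
    flagged = true_positives + false_positives
    return true_positives / flagged if flagged else 0.0


def false_negative_rate(false_negatives, true_positives):
    """Share of genuinely important emails that produced no notification."""
    important = true_positives + false_negatives
    return false_negatives / important if important else 0.0


# Illustrative counts only: 50 notifications sent, 47 of them important,
# and 1 important email that slipped through without a notification.
print(f"precision: {precision(47, 3):.2%}")
print(f"false negative rate: {false_negative_rate(1, 47):.2%}")
```

Measuring the false negative rate requires periodically reviewing the emails that did not trigger a notification, since missed messages are by definition the ones nobody was alerted to.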

Industry Response

The open-source community has embraced this application, with several GitHub repositories gaining traction. Projects like “gmail-llm-filter” and “local-mail-assistant” provide ready-to-deploy solutions requiring minimal technical knowledge. These tools typically bundle model downloading, Gmail authentication, and notification configuration into single-command installations.

Ollama (https://github.com/ollama/ollama) has become the de facto standard for running these local models, offering simple APIs that email filtering scripts can easily integrate. The project’s focus on efficient inference and model quantization makes it particularly suitable for always-on filtering tasks.

Privacy advocates have highlighted these implementations as practical examples of on-device AI, demonstrating that useful intelligence doesn’t require cloud infrastructure. The approach aligns with broader movements toward local-first software and data sovereignty.

Some users have extended the basic concept, using local LLMs to generate email summaries, draft responses, or categorize messages into custom taxonomies—all without external API calls or subscription fees.
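Categorizing into a custom taxonomy could look something like the sketch below. The category list and both helper names are hypothetical; the prompt would be sent to the local model the same way as the urgency prompt, for example through `ollama.generate`.

```python
import re

CATEGORIES = ("work", "finance", "personal", "newsletter", "other")  # hypothetical taxonomy


def build_category_prompt(subject, body, categories=CATEGORIES):
    """Build a prompt asking the model to pick exactly one category."""
    options = ", ".join(categories)
    return (
        f"Classify this email into exactly one category: {options}.\n"
        f"Subject: {subject}\n"
        f"Body: {body}\n"
        "Respond with only the category name."
    )


def parse_category(raw_response, categories=CATEGORIES, fallback="other"):
    """Return the first known category word in the reply, or the fallback."""
    for token in re.findall(r"[a-z]+", raw_response.lower()):
        if token in categories:
            return token
    return fallback
```

Parsing by whole words rather than trusting the raw reply keeps the pipeline stable when the model adds punctuation or a short explanation around its answer.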

Next Steps

Developers interested in implementing local Gmail filtering should start with Ollama and a lightweight model like Mistral 7B. The setup requires Gmail API credentials (obtained through Google Cloud Console), a Python environment, and approximately 8GB of disk space for the quantized model. The exact footprint depends on the quantization level: a 4-bit build of a 7B model is roughly 4GB, while an 8-bit build approaches 8GB.
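Obtaining an authorized Gmail client typically follows the pattern from Google's Python quickstart, sketched below. The file names `credentials.json` and `token.json` and the read-only scope are assumptions for illustration; the sketch also assumes the `google-api-python-client` and `google-auth-oauthlib` packages are installed.

```python
import os

# Read-only access is enough for screening; the filter never needs to modify mail.
SCOPES = ["https://www.googleapis.com/auth/gmail.readonly"]


def get_gmail_service(client_secrets="credentials.json", token_path="token.json"):
    """Return an authorized Gmail API client, running the OAuth flow if needed."""
    # Imported here so this module still loads before the Google packages are installed.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow
    from googleapiclient.discovery import build

    creds = None
    if os.path.exists(token_path):
        creds = Credentials.from_authorized_user_file(token_path, SCOPES)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())  # silently renew an expired token
        else:
            # Opens a browser window for the one-time consent screen.
            flow = InstalledAppFlow.from_client_secrets_file(client_secrets, SCOPES)
            creds = flow.run_local_server(port=0)
        with open(token_path, "w") as fh:
            fh.write(creds.to_json())  # cache the token for later runs
    return build("gmail", "v1", credentials=creds)
```

The cached token file grants access to the mailbox, so it deserves the same care as a password.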

Fine-tuning models on personal email patterns represents the next frontier. Users can create small datasets of labeled emails—marking which should trigger notifications—and use parameter-efficient fine-tuning methods to customize model behavior without requiring extensive computational resources.
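A labeled dataset for this purpose can be as simple as one JSON object per line. The sketch below shows one possible layout; the `prompt` and `completion` field names are an assumption, since the expected schema depends on whichever fine-tuning tool is used.

```python
import json


def to_training_record(subject, body, should_notify):
    """Turn one labeled email into a prompt/completion pair."""
    return {
        "prompt": (
            f"Subject: {subject}\nBody: {body}\n"
            "Should this email trigger a notification? Answer yes or no."
        ),
        "completion": "yes" if should_notify else "no",
    }


def to_jsonl(records):
    """Serialize records as JSON Lines text, one object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


# Hypothetical labeled examples.
labeled = [
    ("Contract needs signature by Friday", "Please sign and return.", True),
    ("Your weekly digest", "Top stories this week...", False),
]
dataset_text = to_jsonl(to_training_record(*row) for row in labeled)
```

Writing `dataset_text` to a `.jsonl` file gives a starting point that can be reshaped to match a specific trainer's format.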

The technique extends beyond Gmail to any email service offering API or IMAP access, including Outlook, ProtonMail (via the Proton Mail Bridge application), and self-hosted solutions. As local LLM performance continues improving while hardware requirements decrease, intelligent email filtering may become a standard feature of privacy-conscious computing setups.
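For services reachable over IMAP, the fetch step can be written with Python's standard library alone. The following is a minimal sketch; the host, credentials, and function names are placeholders, and the returned subject and body can be handed to the same scoring function used for Gmail.

```python
import email
import imaplib
from email import policy


def parse_raw_message(raw_bytes):
    """Parse RFC 822 bytes into a (subject, plain_text_body) pair."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return str(msg.get("Subject", "")), body.strip()


def fetch_unseen(host, username, password, mailbox="INBOX"):
    """Yield (subject, body) for unseen messages without marking them read."""
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(username, password)
        conn.select(mailbox, readonly=True)
        _, data = conn.search(None, "UNSEEN")
        for num in data[0].split():
            # BODY.PEEK[] fetches the full message without setting the \Seen flag.
            _, parts = conn.fetch(num, "(BODY.PEEK[])")
            yield parse_raw_message(parts[0][1])
```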