Privacy on a feed filter — what we send, what we don't

June 15, 2026

The reasonable concern

The first question anyone serious about privacy asks before installing a content-filter extension is: what does this thing send back to its servers? It's the right question. A feed-filter extension, by definition, has to be able to read the contents of your feed. The interesting question is what happens with that data after the extension reads it.

There are basically three architectures a feed filter can use, and they have very different privacy profiles. This post walks through each one, explains the trade-offs, and tells you what PureFeed specifically does, and doesn't do, so you can decide.

The three architectures

1. Pure local inference (everything happens on your device)

Some filters bundle a small machine learning model directly into the extension. Every post is classified locally, in your browser. Nothing leaves your device.

Pros: maximum privacy. Zero post text ever leaves your machine. Works offline.

Cons: the model has to be small enough to bundle (typically under 50MB), which means it's less accurate than a modern LLM. Updates require pushing a new extension version. Quality of classification on borderline cases is meaningfully worse.

Examples: AI Social Filter (Chrome Web Store).

2. Backend-classified with shared cache (PureFeed's approach)

The extension reads visible posts, computes a content hash, and sends only the public post content (no user identifier, no IP-linked metadata) to a backend, which uses a real LLM to classify it and caches the result by content hash. The second person to see that viral Reddit post doesn't re-send it: the cache already has the answer.

Pros: much higher classification accuracy. The cache means cost scales sub-linearly with users, which keeps the service affordable. A single backend update changes the model for everyone.

Cons: the post text does leave your device. It goes to a backend, which forwards it to an LLM provider. The mitigations are that no user identifier is attached, results are content-keyed rather than user-keyed, and the LLM provider's commercial API terms forbid training on the data.

3. Cloud-classified without caching (sends everything)

Some extensions send every post to a third-party AI API for every user every time. No dedup, no cache, just inference-as-a-service per request.

Pros: simplest implementation.

Cons: worst privacy posture (the third-party AI sees the same content over and over from different sources), worst economics (every classification billed separately), worst latency.

You generally don't want this one. It's worth checking what an extension is actually doing before installing.
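
The economic difference between architectures 2 and 3 can be shown with a toy count of upstream model calls. This is a minimal Python sketch with a made-up stream of views and hypothetical function names; it illustrates the deduplication idea only and is not any extension's real code.

```python
def upstream_calls_without_cache(views: list[str]) -> int:
    """Architecture 3: every view of every post triggers one upstream call."""
    return len(views)


def upstream_calls_with_cache(views: list[str]) -> int:
    """Architecture 2: only the first view of each distinct post triggers a call."""
    return len(set(views))


# Hypothetical stream: one viral post seen by four users, plus two niche posts.
views = ["viral post"] * 4 + ["niche post A", "niche post B"]
print(upstream_calls_without_cache(views))  # 6
print(upstream_calls_with_cache(views))     # 3
```

The more often the same posts are seen by many users, the wider the gap between the two numbers becomes.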

What PureFeed specifically sends

Concretely, for an uncached Reddit post that appears in your feed:

Sent: the post's title, the visible description/body, the public creator handle, the public URL. Over HTTPS, to purefeed.io/api/classify on Vercel.
NOT sent: your IP address as a user identifier (Vercel may briefly log request IPs for abuse prevention; they're not joined to anything), your install ID, your account info, your Reddit username, anything you posted yourself, anything behind authentication, cookies, history, DMs, drafts.
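
One way to enforce a sent / not-sent split like this is to build the request body from an explicit allowlist. The Python sketch below is an assumption about how that could look, not PureFeed's actual code: the field names (title, body, creator_handle, url) and the build_payload helper are hypothetical.

```python
import json

# Hypothetical field names for the four public items listed under "Sent".
ALLOWED_FIELDS = ("title", "body", "creator_handle", "url")


def build_payload(post: dict) -> str:
    """Return a JSON request body containing only allowlisted public fields."""
    payload = {name: post[name] for name in ALLOWED_FIELDS if name in post}
    return json.dumps(payload, sort_keys=True)


scraped = {
    "title": "Example title",
    "body": "Example body text",
    "creator_handle": "u/example_author",
    "url": "https://www.reddit.com/r/example/comments/abc123/",
    # Fields a content script could technically see but must never forward:
    "viewer_username": "me",
    "install_id": "1234",
}
body = build_payload(scraped)
print(body)
```

Because the payload is built from an allowlist rather than by deleting known-sensitive keys, a newly scraped field is not sent until someone explicitly allows it.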

The backend forwards the post text (and nothing else — no IP, no install ID, no user identifier) to OpenRouter, which routes the request to Google's Gemini Flash model. Google returns a category and numeric scores. The result is written to our cache keyed by a SHA-256 hash of the normalized text — not by any user identifier.

The next time anyone, whether you or someone else with the extension, sees the same content, the cache returns the existing classification: the post text is not sent from the device again, and nothing is forwarded from our backend to the LLM provider for that post.
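
The content-keyed cache described above can be sketched in a few lines of Python. The normalization rules here (Unicode NFKC, lowercasing, collapsing whitespace) are assumptions, since this post does not say how the text is normalized, and fake_llm is a placeholder standing in for the real model call.

```python
import hashlib
import unicodedata
from typing import Callable


def cache_key(text: str) -> str:
    """SHA-256 of normalized text. The normalization steps are assumed."""
    normalized = unicodedata.normalize("NFKC", text).lower()
    normalized = " ".join(normalized.split())  # collapse runs of whitespace
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class ContentCache:
    """Maps content hashes to classifications; stores no user identifiers."""

    def __init__(self, classify: Callable[[str], dict]) -> None:
        self._classify = classify
        self._store: dict[str, dict] = {}
        self.upstream_calls = 0

    def get(self, text: str) -> dict:
        key = cache_key(text)
        if key not in self._store:
            # Cache miss: this is the only path that reaches the model.
            self.upstream_calls += 1
            self._store[key] = self._classify(text)
        return self._store[key]


def fake_llm(text: str) -> dict:
    # Placeholder result; a real backend would call the model here.
    return {"category": "placeholder", "scores": {"negativity": 73}}


cache = ContentCache(fake_llm)
first = cache.get("Breaking:  Something Happened")
second = cache.get("breaking: something happened")  # same post, different case and spacing
print(cache.upstream_calls)  # 1
```

The key depends only on the content, so the cache cannot be used to look up what a particular user has seen.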

Per OpenRouter's and Google's API terms, content submitted via the paid Gemini API isn't used to train their models.

What PureFeed never sends

Things no feed filter should send under any architecture, though some unfortunately do:

Your browsing history outside the three supported platforms
Anything from a logged-in session that isn't a visible post (account settings, message inboxes, drafts)
Your IP address for tracking or analytics
Anything for marketing purposes — we don't have ads, the business model is paid subscriptions
Anything from your password manager, autofill data, or form contents

If a feed filter asks for the tabs, webRequest, cookies, or history permissions, or for <all_urls> host access, it has the technical capability to collect the things on this list. Chrome's permissions screen is the first thing to read.

PureFeed's manifest requests storage, alarms, and host permissions limited to reddit.com, youtube.com, x.com, and twitter.com. Nothing else.
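
A permission review like this is easy to script. The Python sketch below flags the broad permissions named above in a manifest. The narrow sample is reconstructed from the description in this post, and its match patterns (such as *://*.reddit.com/*) are assumptions rather than a copy of PureFeed's real file.

```python
# Permissions called out above as giving an extension broad capabilities.
BROAD_PERMISSIONS = {"tabs", "webRequest", "cookies", "history", "<all_urls>"}


def flag_broad_permissions(manifest: dict) -> list[str]:
    """Return the broad permissions a manifest requests, sorted."""
    requested = set(manifest.get("permissions", []))
    requested |= set(manifest.get("host_permissions", []))
    requested |= set(manifest.get("optional_permissions", []))
    return sorted(requested & BROAD_PERMISSIONS)


narrow = {
    "manifest_version": 3,
    "permissions": ["storage", "alarms"],
    "host_permissions": [
        "*://*.reddit.com/*",
        "*://*.youtube.com/*",
        "*://*.x.com/*",
        "*://*.twitter.com/*",
    ],
}
broad = {
    "manifest_version": 3,
    "permissions": ["storage", "tabs", "cookies"],
    "host_permissions": ["<all_urls>"],
}
print(flag_broad_permissions(narrow))  # []
print(flag_broad_permissions(broad))   # ['<all_urls>', 'cookies', 'tabs']
```

This check only matches the exact names listed; a wildcard host pattern that covers every site would need its own rule.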

How to evaluate any extension in five minutes

A short checklist you can run on any feed-filter extension before installing:

1. Permissions screen. Hosts should be limited to specific domains. Avoid <all_urls> unless the extension genuinely needs to work on every site. Avoid webRequest, cookies, history unless there's a clear stated reason for each.
2. Privacy policy. Specifically look for a sub-processor list (which third parties get your data and why). If there isn't one, that's a flag.
3. Open-source or not. Open-source doesn't guarantee privacy, since it's still possible to ship a malicious build, but it makes verification possible at all.
4. Network panel test. Install in a fresh Chrome profile. Open DevTools → Network. Browse a feed. Watch what gets sent and where. Five minutes of looking is more reliable than an hour of reading marketing copy.
5. Independent reviews. Search the extension name + "privacy" on Reddit and HackerNews. Real users will find the things the developer didn't disclose.

If you do this five-minute check before installing anything, you'll catch most of the genuinely problematic extensions before they reach your device.
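
For the network panel test in step 4, the Network panel can export a capture as a HAR file, which makes the "what gets sent and where" question scriptable. The Python sketch below counts requests per method and host; the capture is a tiny hand-written stand-in, and a real one would be loaded with json.load() from the exported file.

```python
from collections import Counter
from urllib.parse import urlsplit


def outbound_summary(har: dict) -> Counter:
    """Count requests per (method, host) in a HAR capture."""
    summary: Counter = Counter()
    for entry in har["log"]["entries"]:
        request = entry["request"]
        host = urlsplit(request["url"]).hostname or ""
        summary[(request["method"], host)] += 1
    return summary


# Hand-written example capture, not real traffic.
har = {"log": {"entries": [
    {"request": {"method": "GET", "url": "https://www.reddit.com/r/example/"}},
    {"request": {"method": "POST", "url": "https://purefeed.io/api/classify"}},
    {"request": {"method": "POST", "url": "https://purefeed.io/api/classify"}},
]}}
for (method, host), count in sorted(outbound_summary(har).items()):
    print(method, host, count)
```

Any host in the output that the privacy policy doesn't mention is worth a closer look. Requests made from an extension's background service worker may not show up in the page's Network panel, so it is worth inspecting the service worker from chrome://extensions as well.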

Why we're not local-only

The honest answer is that local inference doesn't currently work well enough for the kind of multi-axis scoring PureFeed does. A 40MB bundled model can do binary spam-detection competently; it can't do "negativity 73, sensationalism 61, usefulness 24, learning 8, NSFW 0" with the same accuracy as a modern LLM. We picked the architecture that produces a better classification with explicit privacy trade-offs we can spell out — instead of the architecture that's better-marketed for privacy but produces worse results.

If a small model that's both accurate enough and small enough to bundle exists in the future, we'll evaluate it. As of mid-2026, that model isn't there for this use case.

The honest endpoint

Every extension that classifies content faces this exact decision: bundle a weaker model to keep everything local, or use a real LLM and be explicit about what leaves the device. There's no right answer for everyone. For users who want maximum local-only privacy, AI Social Filter and similar extensions are the right pick. For users who want better classification with explicit, narrow data-sharing, PureFeed is a reasonable choice. Both are defensible.

What's not defensible — from any extension in this category — is being vague about what gets sent where. If you can't answer the question "what data leaves my browser" from an extension's docs in under two minutes, that's the actual privacy problem.
