AI Gateway vs AI Proxy vs LLM Router: What's the Difference?

Key points

LLM Router (LiteLLM): picks which model handles each request — failover, cost optimization, load balancing
AI Gateway (Portkey, Helicone): logs, analytics, rate limiting — but stores your prompts in a third-party system
Privacy Proxy (Privedge): intercepts content and strips PII before it reaches the LLM — the compliance layer
Best architecture: privacy proxy upstream + gateway downstream — observability on already-anonymized data

If you’ve spent any time in AI developer communities recently, you’ve noticed a proliferation of tools all claiming to sit “between your application and the LLM.” Portkey, Helicone, LiteLLM, Privedge, BrainTrust, Martian — each markets itself slightly differently, but the terminology overlaps enough to cause genuine confusion.

This article is a clean breakdown of three distinct architectural patterns: LLM routers, AI gateways, and privacy-focused AI proxies. They solve different problems. Knowing which one you need will save you time and prevent a class of compliance mistakes that no amount of observability tooling can fix.

The terminology problem

The confusion exists because all three tools look the same from 10,000 feet: your application sends a request to Tool X instead of directly to OpenAI, and Tool X does something before forwarding it.

But “doing something” covers a wide range of behaviors:

Picking which model to use based on cost or availability (routing)
Logging the request and response for debugging (observability)
Stripping PII from the request before forwarding it (privacy proxy)

These are fundamentally different operations with different security implications. A router that logs your prompts doesn’t protect privacy. A gateway that adds observability doesn’t help with failover. And a privacy proxy that strips PII doesn’t tell you which Claude model responded faster.

LLM Router: load balancing and fallback

An LLM router’s primary job is deciding which model handles a given request. The canonical example is LiteLLM: you configure a list of providers and models, and LiteLLM routes traffic to them based on availability, cost per token, latency, or your own custom logic.

Routing use cases:

Failover: if OpenAI returns a 429, automatically retry on Anthropic
Cost optimization: use GPT-4o for complex tasks, route simple tasks to cheaper models
Load distribution: spread traffic across multiple API keys or deployments
Model evaluation: A/B test responses from different models

What routers typically do NOT do:

Inspect or transform prompt content
Protect personal data in transit
Provide compliance documentation

LiteLLM, Martian, and OpenRouter are primarily routers. They’re excellent at what they do. They were not designed for data protection.

AI Gateway: observability and rate limiting

An AI gateway adds an operational layer on top of LLM API calls. The two most prominent examples are Portkey and Helicone (note: Helicone was acquired by Mintlify in March 2026). Core gateway features include:

Request/response logging — every prompt and completion stored for inspection
Rate limiting — prevent any single user or team from burning your token budget
Caching — return cached responses for identical prompts to cut costs
Analytics — latency percentiles, token costs by user, error rates
Prompt versioning — manage prompt templates and track performance over time
Guardrails — basic content filtering (block certain topics, enforce format)
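
As a rough illustration of how logging, rate limiting, and caching fit together, the following Python sketch wraps an upstream call in a toy gateway. MiniGateway and its parameters are hypothetical and do not reflect how Portkey or Helicone are implemented; the sketch assumes a fixed-window limit per user and an exact-match cache keyed by a hash of the prompt.

```python
import hashlib
from typing import Callable


class MiniGateway:
    """Toy gateway: per-user fixed-window rate limit, exact-match cache, request log."""

    def __init__(self, upstream: Callable[[str], str],
                 max_requests_per_window: int, window_seconds: float = 60.0):
        self.upstream = upstream
        self.max_requests = max_requests_per_window
        self.window_seconds = window_seconds
        self.cache: dict[str, str] = {}                    # prompt hash -> response
        self.windows: dict[str, tuple[float, int]] = {}    # user -> (window start, count)
        self.log: list[dict] = []                          # stores raw prompts and responses

    def handle(self, user: str, prompt: str, now: float) -> str:
        # Rate limiting: reset the counter once the window has elapsed.
        start, count = self.windows.get(user, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0
        if count >= self.max_requests:
            raise PermissionError(f"rate limit exceeded for {user}")
        self.windows[user] = (start, count + 1)

        # Caching: identical prompts are served without calling upstream again.
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        cache_hit = key in self.cache
        if not cache_hit:
            self.cache[key] = self.upstream(prompt)
        response = self.cache[key]

        # Logging: the full prompt and response are retained for inspection.
        self.log.append({"user": user, "prompt": prompt,
                         "response": response, "cache_hit": cache_hit})
        return response


upstream_calls = []


def fake_llm(prompt: str) -> str:
    upstream_calls.append(prompt)
    return prompt.upper()


gw = MiniGateway(fake_llm, max_requests_per_window=2)
first = gw.handle("alice", "hi", now=0.0)
second = gw.handle("alice", "hi", now=1.0)   # served from cache
```

The log entries in this sketch hold the prompt text verbatim, which is exactly what makes a gateway useful for debugging and what makes it a storage location for whatever the prompt contains.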

Gateways are invaluable for production AI applications. If you’re running a high-volume system and need to understand what’s happening, gateways give you the visibility to operate it well.

The important caveat: AI gateways log your prompts. That’s the entire point of the observability layer. If your prompts contain PII, PHI, or confidential business information, a gateway centralizes all of it in a database somewhere. Helicone’s HIPAA-compliant tier (introduced at $799/month in late 2025) handles this with a BAA and encrypted storage — but the data is still stored.

For certain compliance frameworks — particularly GDPR’s data minimization requirement and strict HIPAA interpretations — “encrypted storage” is still “storage.” The data still exists in someone else’s system.

AI Proxy with privacy: intercepting and transforming content

A privacy-focused AI proxy operates on the content of requests and responses, not just on routing or logging them. The core operation is PII interception: the proxy reads every prompt, identifies personal or sensitive data, replaces it with reversible tokens, and forwards the sanitized version to the LLM.

This is architecturally different from the first two categories because it changes what the LLM provider sees. A router decides where to send data. A gateway records and governs the data as it passes through. A privacy proxy decides what data is allowed to be sent at all.
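
The tokenize-and-restore cycle described above can be sketched in a few lines of Python. This is not Privedge's implementation: the two regular expressions, the token format, and the tokenize/detokenize function names are assumptions chosen for illustration, and real PII detection needs far broader coverage than emails and one phone format.

```python
import re

# Illustrative detectors only; a production system would cover many more types.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def tokenize(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace detected values with tokens; return the sanitized text and the vault."""
    vault: dict[str, str] = {}    # token -> original value (stays on your side)
    seen: dict[str, str] = {}     # original value -> token (repeats reuse one token)

    for label, pattern in PATTERNS.items():
        def replace(match: re.Match, label: str = label) -> str:
            value = match.group(0)
            if value not in seen:
                index = sum(1 for t in vault if t.startswith(f"<{label}_")) + 1
                token = f"<{label}_{index}>"
                seen[value] = token
                vault[token] = value
            return seen[value]

        prompt = pattern.sub(replace, prompt)
    return prompt, vault


def detokenize(text: str, vault: dict[str, str]) -> str:
    """Restore original values in a response that mentions the tokens."""
    for token, value in vault.items():
        text = text.replace(token, value)
    return text


original = "Email jane@example.com or call 555-123-4567; jane@example.com is preferred."
safe, vault = tokenize(original)
restored = detokenize(safe, vault)
```

Only the sanitized string would be forwarded to the provider. The vault is what makes the tokens reversible, so where it is stored determines who can map tokens back to real values.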

Privedge is built on this model. It runs on Cloudflare Workers at the edge, processes prompts in sub-millisecond time, and tokenizes personal data before the request is forwarded to the LLM provider. The provider receives tokens in place of the original values; the key that maps tokens back to real values never leaves your control.

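This architecture prevents leakage by construction rather than by policy: a value the proxy has tokenized is never transmitted to the provider, so there is nothing left to restrict by contract or encrypt at rest on the provider’s side. The guarantee is only as strong as the detection step, since any personal data the proxy fails to recognize is forwarded unchanged.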

Comparison table

Feature	LLM Router (LiteLLM)	AI Gateway (Portkey / Helicone)	Privacy Proxy (Privedge)
Multi-model routing	✅	✅ (limited)	✅ (pass-through)
Observability & logging	❌	✅	✅ (audit logs, no raw PII)
Rate limiting	❌	✅	✅
PII detection & masking	❌	❌	✅
Raw PII never reaches LLM	❌	❌	✅
GDPR Art. 5 minimization	❌	❌	✅
BAA available	❌	✅ ($799/mo tier)	✅
Edge-native (Cloudflare)	❌	❌	✅
Open-source SDK	✅	❌	✅
Drop-in OpenAI replacement	✅	✅	✅

Which one do you need?

Use a router if your primary pain point is reliability or cost. You want automatic failover between providers, or you need to route different request types to different models. LiteLLM is the standard choice.

Use a gateway if your primary pain point is visibility and operations. You want logs, analytics, caching, and rate limiting. Portkey and the new Mintlify/Helicone stack are mature choices here.

Use a privacy proxy if your primary pain point is compliance and data protection. You’re building for healthcare, legal, finance, or any context where personal data in prompts creates regulatory exposure. The difference between a gateway and a privacy proxy isn’t just features — it’s the fundamental question of whether personal data should reach the LLM at all.

Can you use a gateway and a privacy proxy together?

Yes — and this is often the right architecture for enterprise teams. Privedge sits before any downstream tooling. Your request flow becomes:

Application → Privedge (PII stripped) → Portkey (routing + logging) → OpenAI

What Portkey logs has already been tokenized. What OpenAI processes has already been tokenized. Because the mapping back to real values stays with you, the observability stack never holds raw personal data: the privacy problem was handled upstream, before any logging or routing occurred.
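
The ordering of the layers is the whole point, so here is a small Python sketch of the flow as nested handlers. Everything in it is a stand-in: fake_llm plays the provider, logging_gateway plays the gateway, privacy_proxy plays the upstream proxy, and only email addresses are detected. None of these names come from Privedge or Portkey.

```python
import re
from typing import Callable

Handler = Callable[[str], str]
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def fake_llm(prompt: str) -> str:
    """Stand-in for the LLM provider."""
    return f"Reply to: {prompt}"


def logging_gateway(next_handler: Handler, log: list[str]) -> Handler:
    """Stand-in gateway: records whatever prompt it receives, then forwards it."""
    def handle(prompt: str) -> str:
        log.append(prompt)
        return next_handler(prompt)
    return handle


def privacy_proxy(next_handler: Handler) -> Handler:
    """Stand-in proxy: tokenizes emails on the way out, restores them on the way back."""
    def handle(prompt: str) -> str:
        vault: dict[str, str] = {}

        def swap(match: re.Match) -> str:
            token = f"<EMAIL_{len(vault) + 1}>"
            vault[token] = match.group(0)
            return token

        response = next_handler(EMAIL.sub(swap, prompt))
        for token, value in vault.items():
            response = response.replace(token, value)
        return response
    return handle


gateway_log: list[str] = []
# Application -> privacy proxy -> gateway -> provider
app = privacy_proxy(logging_gateway(fake_llm, gateway_log))
answer = app("Draft a note to bob@example.com")
```

In this arrangement the application gets back a response with the real address restored, while the gateway's log contains only the token. Reversing the nesting order would put the raw address in the log.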

The tools are complementary, not competitive — they operate at different layers of the stack.

Conclusion

The distinction matters because choosing the wrong tool gives you false confidence. A gateway with excellent logging tells you exactly what PII leaked to OpenAI, but it doesn’t prevent the leak. A router with perfect failover guarantees that your PHI reaches some LLM provider; it has no say in whether that data should have been sent at all.

For applications handling personal data, the privacy proxy layer is the foundation. Everything else — routing, observability, caching — is valuable on top of that foundation, but cannot substitute for it.