Comparing Open Source LLM Gateways
Where they overlap, where they don't, and why connectivity matters
If you're running multiple LLM providers and want a single API in front of them, you have options. This post compares open source LLM gateways, including our own OpenZiti llm-gateway. We'll try to be honest about where each one is strong and where ours fits.
The Problem They All Solve
You have applications that talk to LLMs. Maybe OpenAI for general tasks, Claude for coding, and a private inference cluster for sensitive workloads. Each provider has a different API, different auth, and different SDKs; you want one endpoint that handles routing, translation, and access control so your applications don't have to.
The Options
LiteLLM
LiteLLM is the most widely adopted open source LLM proxy. It supports 100+ providers, has a polished admin UI, and handles translation between API formats. If your primary need is broad provider support, LiteLLM is hard to beat.
Strengths:
100+ provider integrations out of the box
Admin dashboard with spend tracking, rate limits, and team management
Virtual API keys with per-key budgets
Well-documented, large community
Python-based, easy to extend
Tradeoffs:
Security is at the application layer: API keys in headers, HTTPS for transport
Designed to be deployed on a network and accessed over HTTP
Requires a database (PostgreSQL) for the admin features
Python runtime and dependencies
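For reference, a minimal LiteLLM proxy config maps public-facing model names to provider-specific models. The model names and env-var references below are illustrative; check the LiteLLM docs for the current schema.

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
```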
Portkey
Portkey is a lightweight TypeScript gateway focused on reliability and observability. It supports caching, retries, fallbacks, and load balancing with a clean config format.
Strengths:
Simple, focused feature set
Caching and automatic retries built in
Fallback chains across providers
Lightweight Node.js runtime
Good observability with logging and analytics
Tradeoffs:
Fewer provider integrations than LiteLLM
No built-in access control or API key management in the open source version (available in their cloud offering)
Security model is standard HTTPS
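Portkey's fallback chains are driven by a JSON gateway config. A sketch of the shape (the provider names are real, the keys are placeholders, and the exact schema should be verified against Portkey's documentation):

```json
{
  "strategy": { "mode": "fallback" },
  "targets": [
    { "provider": "openai", "api_key": "sk-..." },
    { "provider": "anthropic", "api_key": "sk-ant-..." }
  ]
}
```

With this config, a request that fails against the first target is retried against the next one in order.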
Kong AI Gateway
Kong adds AI capabilities as plugins to the Kong API gateway. If you already run Kong, this is a natural extension.
Strengths:
Builds on Kong's mature API gateway infrastructure
Rate limiting, auth, and observability from the Kong ecosystem
Enterprise support available
Plugin architecture for extensibility
Tradeoffs:
Requires running the full Kong gateway stack
Significant operational complexity if you don't already use Kong
AI features are plugins, not a purpose-built gateway
Community edition has limitations; some features require Enterprise license
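As a rough sketch of what the wiring looks like, here is an ai-proxy plugin attached to a route in Kong's declarative config. The field names follow Kong's plugin documentation as we understand it, and the key value is a placeholder; verify against the current ai-proxy reference before using.

```yaml
_format_version: "3.0"
services:
  - name: llm-service
    url: https://api.openai.com
    routes:
      - name: chat
        paths: ["/chat"]
        plugins:
          - name: ai-proxy
            config:
              route_type: llm/v1/chat
              auth:
                header_name: Authorization
                header_value: Bearer <your-openai-key>
              model:
                provider: openai
                name: gpt-4o
```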
Cloudflare AI Gateway
Cloudflare AI Gateway runs on Cloudflare's edge network. It isn't open source or self-hosted, but it comes up often enough in this space to be worth including. It's fast to set up if you're already in the Cloudflare ecosystem.
Strengths:
Global edge deployment, low latency
Built-in caching and rate limiting
Simple setup if you use Cloudflare
Analytics dashboard
Tradeoffs:
Not self-hosted - your traffic goes through Cloudflare
Vendor lock-in to the Cloudflare ecosystem
Not open source
Limited customization compared to self-hosted options
llm-gateway (ours)
llm-gateway is our entry. It's an OpenAI-compatible proxy built on zrok and OpenZiti that adds zero-trust networking as a foundational layer.
Strengths:
Dark by default: the gateway can run with zero listening ports, invisible to network scanners
End-to-end encryption through the OpenZiti/zrok overlay (not just HTTPS)
Private model mesh: connect to private inference instances on other machines without opening ports or VPN
Semantic routing with a three-layer cascade (heuristics, embeddings, LLM classifier)
Weighted load balancing across multiple inference instances with health checks and failover
Single Go binary, no runtime dependencies, no database required
Apache 2.0
Tradeoffs:
Fewer provider integrations than LiteLLM (currently OpenAI, Anthropic, OpenAI-compatible local inference)
No admin UI (CLI and config file only)
Newer project, smaller community
The zero-trust features require zrok, which adds a concept to learn
Where Each One Fits
If you need broad provider support and an admin dashboard, LiteLLM is the pragmatic choice. It has the widest adoption and the most integrations.
If you want something lightweight with good retry/fallback behavior, Portkey keeps things simple.
If you already run Kong, the AI Gateway plugins are the path of least resistance.
If you're already on Cloudflare and don't need self-hosting, Cloudflare AI Gateway is quick to set up.
If your concern is security - specifically, if you don't want your LLM gateway to be a network-reachable endpoint, if you need to connect to models across networks without opening ports, or if you need cryptographic identity per client rather than shared API keys - that's where llm-gateway fits. The other gateways are good at what they do, and we're not trying to out-do them at those same things. What we bring is the connectivity layer underneath, and that security model is fundamentally different.
The Security Gap in LLM Gateways
This is the angle we think matters most, so it's worth expanding on.
Every other gateway on this list is designed to be deployed as an HTTP endpoint on a network. Clients connect to it over HTTPS, authenticate with API keys, and make requests. The security model is: put it behind a load balancer, use HTTPS, and manage API keys carefully.
This may be fine for a lot of scenarios, but it has some gaps. API keys can get committed to repos, leaked in logs, or shared among team members. The endpoint is scannable - anyone who can reach the network can find it and try to authenticate. If you're connecting to models on another network (e.g., an inference cluster on a different subnet), you need to either open firewall ports or set up a VPN.
llm-gateway takes a different approach. When you run it with zrok, the gateway doesn't listen on any port. It connects outbound to the zrok overlay and is reachable only by clients with the right cryptographic identity. You can connect to model instances on other machines the same way - no ports, no DNS, no VPN. The traffic is encrypted end to end through the overlay, not just to the gateway's TLS termination point.
This isn't better for everyone. If you're running a local development proxy, maybe you don't need this. But if you're deploying an LLM gateway for a team, connecting to models across networks, or operating in an environment where "put it on the network and secure it with API keys" doesn't meet your security requirements, the zero-trust model is worth looking at.
Try It
go install github.com/openziti/llm-gateway/cmd/llm-gateway@latest
The getting started guide walks through setup from a minimal Ollama config to a full deployment with semantic routing and zrok integration.
If you try it and have feedback, we'd appreciate it. Post on Discourse or open an issue on the repo. And of course a star is always appreciated.