HeadsUpAI

Warp Adds Custom Inference Endpoints and Bring Your Own Key Support

Warp, an agentic development environment combining a terminal and code editor, launched support for Bring Your Own API Key (BYOK) and custom inference endpoints (the connection points used to access AI models). This update allows users to connect the Warp Agent to their own OpenAI, Anthropic, or Google accounts.
Supported providers
OpenAI, Anthropic, Google
Compatible endpoints
OpenRouter, LiteLLM, DeepSeek, and more
Key storage
Local only
Availability
Free and eligible paid plans
Organization limit
10 or fewer employees for Free/Pro

This shift addresses the demand for infrastructure flexibility in agentic coding. By using their own keys, developers remove dependency on Warp's internal credit system and can access low-cost models like DeepSeek. It follows recent updates that integrated Gemini 3.5 Flash to speed up terminal-based agent loops.

You can enable BYOK by searching for "API keys" in Warp settings. Keys are stored locally and never synced to the cloud. Once configured, a key icon in the model picker indicates requests will route through your own provider account. This is available on Free and eligible paid plans.

Warp
Warp
@warpdotdev
X

You can also connect to inference endpoints that follow the OpenAI Chat Completions API. This includes @OpenRouter, @LiteLLM, @Zai_org, @deepseek_ai, and more. Here's engineer Dagm Assefa showing how to connect to DeepSeek and OpenRouter. Docs: https://t.co/28WXdLHWRR 🔖 https://t.co/wLNBeqGDrQ

1retweets12likes
View on X

Still wondering? A few quick answers below.

Bring Your Own API Key is a feature that allows Warp users to connect their own Anthropic, OpenAI, or Google API accounts to the terminal's AI agents. This gives developers full control over model selection and data routing while ensuring that agent requests are billed directly through their provider instead of consuming Warp credits.

When you add your own model API keys to Warp, they are stored locally on your device and are never synced to the cloud or stored on Warp's servers. Because these keys remain local, they are not available for cloud-hosted agent runs, which will continue to consume Warp's internal AI credits instead.

Warp supports any custom inference endpoint that follows the OpenAI Chat Completions API standard. This includes popular aggregators and gateways like OpenRouter, LiteLLM, and z.ai, as well as specific model providers like DeepSeek. Users can configure these endpoints to route terminal agent requests through their preferred infrastructure or internal gateways.

These features are available to all individual users on the Free plan and eligible paid plans. For organizations, BYOK is currently available to teams with 10 or fewer employees. Larger organizations require a Business or Enterprise plan, while centrally managed model routing is reserved for the Enterprise-level Bring Your Own LLM offering.

Warp's Auto models always consume Warp credits because the routing logic depends on internal infrastructure. To use your own API key, you must manually select a specific provider model, such as a specific version of Claude or GPT, from the model picker. These supported models will display a key icon once configured.

Share this update