Chutes Launches End-to-End Encryption for AI Inference Requests

Chutes

Mar 15, 2026 · Updated Jun 5, 2026

Chutes, an AI inference platform, now encrypts prompts on the client's machine using post-quantum cryptography — making requests unreadable to the platform, network, and GPU operators. Only the TEE-isolated GPU instance running the model can decrypt the payload.

Chutes, which runs AI inference on a decentralized GPU network, has shipped end-to-end encryption for inference requests. Prompts are encrypted client-side with three primitives: ML-KEM-768 (a NIST-standardized post-quantum key encapsulation mechanism) to establish a shared secret, HKDF-SHA256 to derive a symmetric key from it, and ChaCha20-Poly1305 to encrypt and authenticate the payload. The request then travels as ciphertext through Chutes' API and load balancers, and only the GPU instance inside a Trusted Execution Environment (TEE) sees the plaintext. A fresh ephemeral keypair per request provides forward secrecy.
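To make the key-derivation step concrete, here is a minimal sketch of HKDF-SHA256 (RFC 5869) using only the Python standard library. It is not Chutes' implementation: the random bytes stand in for the 32-byte shared secret an ML-KEM-768 encapsulation would produce, and the empty salt, the `example-e2ee-request` info label, and the `derive_request_key` helper are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """RFC 5869 extract step: condense input keying material into a PRK."""
    if not salt:
        salt = bytes(HASH_LEN)  # RFC 5869 default salt: HASH_LEN zero bytes
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """RFC 5869 expand step: stretch the PRK into `length` output bytes."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF-SHA256")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        # T(n) = HMAC(PRK, T(n-1) || info || n)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_request_key(shared_secret: bytes, info: bytes = b"example-e2ee-request") -> bytes:
    """Hypothetical helper: derive a 32-byte key, the size ChaCha20-Poly1305 expects."""
    prk = hkdf_extract(b"", shared_secret)
    return hkdf_expand(prk, info, 32)


# Stand-ins for per-request KEM shared secrets (assumption: real ones come from ML-KEM-768).
secret_request_1 = secrets.token_bytes(32)
secret_request_2 = secrets.token_bytes(32)

key_1 = derive_request_key(secret_request_1)
key_2 = derive_request_key(secret_request_2)
print(len(key_1), key_1 != key_2)
```

Because each request starts from a fresh shared secret, each derived key is independent of the others, which is the property the per-request ephemeral keypair is meant to provide.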

This shifts AI privacy from "trust the provider" to cryptographically excluding the provider from the trust chain. Teams handling sensitive legal, medical, or financial data can run inference on external infrastructure without exposing prompt content to the platform. The feature is available across all models today, with the strongest guarantees on TEE-enabled models.

Teams using the OpenAI Python SDK can enable encryption through Chutes' chutes-e2ee transport. Teams using other clients can run the e2ee-proxy Docker container, which supports both OpenAI-compatible APIs and Anthropic's Messages API. Both tools are open source under the MIT license.


Chutes (@chutes_ai), Mar 12:

Most AI providers ask you to trust them with your data. We just removed ourselves from the equation. Today we're shipping end-to-end encryption for AI inference on Chutes. Here's what that actually means: https://t.co/u6iWo0ZaoC

