🚀 New models just dropped on ASI:Cloud Now live for inference: • MiniMax M2.5 - crushing benchmarks • Qwen 3.5-35B-A3B - Sonnet-level performance, efficient MoE • GLM 4.7 Flash - Ultra-fast Permissionless AI inference. No waitlists. Try them now 👇 https://t.co/L0LuDk8BBv https://t.co/MZQdTWlQGx
ASI:Cloud Adds MiniMax M2.5, Qwen, and GLM Models for Inference
· Updated
ASI:Cloud, a serverless AI inference platform, added three new models for immediate use: MiniMax M2.5, Qwen 3.5-35B-A3B, and GLM 4.7 Flash. No waitlists — all three are live now via its OpenAI-compatible API.
MiniMax M2.5 (229B params, $0.26/1K input tokens), Qwen 3.5-35B-A3B (a Mixture-of-Experts model the team describes as Sonnet-level performance), and GLM 4.7 Flash (ultra-fast). All three are live immediately with no waitlist.ASI:Cloud positions itself as permissionless inference — no waitlists, no gating — with pay-per-token pricing on enterprise-grade NVIDIA GPUs. The platform also offers a path from serverless inference to dedicated API endpoints as usage scales.
All three models are immediately accessible via ASI:Cloud's OpenAI-compatible API — no new integration setup needed to add any of them to an existing pipeline.
Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →





