ASI:Cloud Adds MiniMax M2.5, Qwen, and GLM Models for Inference

CUDOSCUDOS

· Updated

ASI:Cloud, a serverless AI inference platform, added three new models for immediate use: MiniMax M2.5, Qwen 3.5-35B-A3B, and GLM 4.7 Flash. No waitlists — all three are live now via its OpenAI-compatible API.

ASI:Cloud, a serverless inference platform operated by the ASI Alliance (SingularityNET and CUDOS), added three open-source models to its inference catalog: MiniMax M2.5 (229B params, $0.26/1K input tokens), Qwen 3.5-35B-A3B (a Mixture-of-Experts model the team describes as Sonnet-level performance), and GLM 4.7 Flash (ultra-fast). All three are live immediately with no waitlist.

ASI:Cloud positions itself as permissionless inference — no waitlists, no gating — with pay-per-token pricing on enterprise-grade NVIDIA GPUs. The platform also offers a path from serverless inference to dedicated API endpoints as usage scales.

All three models are immediately accessible via ASI:Cloud's OpenAI-compatible API — no new integration setup needed to add any of them to an existing pipeline.

CUDOS
CUDOS
@CUDOS_
X

🚀 New models just dropped on ASI:Cloud Now live for inference: • MiniMax M2.5 - crushing benchmarks • Qwen 3.5-35B-A3B - Sonnet-level performance, efficient MoE • GLM 4.7 Flash - Ultra-fast Permissionless AI inference. No waitlists. Try them now 👇 https://t.co/L0LuDk8BBv https://t.co/MZQdTWlQGx

10retweets
View on X

Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →

Share this update