HeadsUpAI

ASI:Cloud Adds MiniMax M2.5, Qwen, and GLM Models for Inference

ยท Updated

ASI:Cloud, a serverless inference platform operated by the ASI Alliance (SingularityNET and CUDOS), added three open-source models to its inference catalog: MiniMax M2.5 (229B params, $0.26/1K input tokens), Qwen 3.5-35B-A3B (a Mixture-of-Experts model the team describes as Sonnet-level performance), and GLM 4.7 Flash (ultra-fast). All three are live immediately with no waitlist.

ASI:Cloud positions itself as permissionless inference โ€” no waitlists, no gating โ€” with pay-per-token pricing on enterprise-grade NVIDIA GPUs. The platform also offers a path from serverless inference to dedicated API endpoints as usage scales.

All three models are immediately accessible via ASI:Cloud's OpenAI-compatible API โ€” no new integration setup needed to add any of them to an existing pipeline.

CUDOS
CUDOS
@CUDOS_
X

๐Ÿš€ New models just dropped on ASI:Cloud Now live for inference: โ€ข MiniMax M2.5 - crushing benchmarks โ€ข Qwen 3.5-35B-A3B - Sonnet-level performance, efficient MoE โ€ข GLM 4.7 Flash - Ultra-fast Permissionless AI inference. No waitlists. Try them now ๐Ÿ‘‡ https://t.co/L0LuDk8BBv https://t.co/MZQdTWlQGx

10retweets
View on X

Share this update