HeadsUpAI

Gemini 3.1 Pro Tops ARC-AGI-2 Benchmark with 77.1% Score

Gemini 3.1 Pro is Google's upgraded reasoning model in the Gemini 3 series. On ARC-AGI-2, a benchmark that tests the ability to solve entirely novel logic patterns, it scored 77.1% in Thinking (High) mode, more than double Gemini 3 Pro's 31.1% and ahead of Claude Opus 4.6 (68.8%). The upgrade targets core reasoning rather than expanded multimodal features.

The practical difference shows in tasks where simple answers fall short: building a live aerospace dashboard from a public telemetry stream, generating interactive animated SVGs in pure code, or reasoning through literary tone to design a functional portfolio site. These are not new capabilities; they are existing ones strengthened by better underlying reasoning.

Gemini 3.1 Pro is in preview via the Gemini API in Google AI Studio, Gemini CLI, and Antigravity. Consumer access is rolling out through the Gemini app and NotebookLM on Pro and Ultra plans.
