Arena.ai Adds Poolside Laguna Models for Public Agentic Coding Evaluation

Arena

May 1, 2026 · Updated Jun 12, 2026

Arena.ai integrated Poolside's Laguna XS.2 and M.1 models into its frontend coding leaderboard for community-driven blind testing. These models are specifically architected for agentic coding and long-horizon software engineering tasks rather than general-purpose chat.

Arena.ai, a community-driven AI model evaluation platform, added Poolside AI's Laguna XS.2 and M.1 models to its Code Arena: Front-end leaderboard. The addition follows OpenRouter's hosting of the Laguna models and moves the model family from initial availability to public performance verification against established frontier systems.

Laguna XS.2 parameters: 33B total / 3B active MoE
Laguna M.1 parameters: 225B total / 23B active MoE
Laguna XS.2 license: Apache 2.0
Arena category: Code Arena: Front-end
Availability: Live for testing, scores coming soon

Adding these models to the Arena lets the community blind-test Poolside's agentic coding claims. While DeepSeek V4 Pro and GPT-5.5 have dominated recent rankings, Poolside's models are built specifically for long-horizon tasks (complex workflows requiring multiple steps and reasoning over time).

You can now test both models in the Code Arena by submitting web development prompts and voting. Laguna XS.2 is an open-weight 33B Mixture of Experts (MoE) model (an architecture activating only a subset of parameters per request) designed for local execution, while M.1 is a larger 225B proprietary model.
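To make the "subset of parameters per request" idea concrete, here is a minimal toy sketch of top-k expert routing, the general mechanism behind MoE layers. It is not Poolside's router: the expert count, the value of k, the scalar "experts", and the gate logits are all hypothetical and chosen only for illustration.

```python
import math

def top_k_route(gate_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the k experts with the largest gate logits.
    top = sorted(range(len(gate_logits)),
                 key=lambda i: gate_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the weights sum to 1.
    m = max(gate_logits[i] for i in top)
    exps = [math.exp(gate_logits[i] - m) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, gate_logits, k):
    """Combine outputs of the selected experts; the others are never called."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_logits, k))

# Hypothetical toy setup: 8 scalar "experts", 2 active per input.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
gate_logits = [0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.5, 0.3]

routes = top_k_route(gate_logits, k=2)
output = moe_forward(1.0, experts, gate_logits, k=2)
```

Only two of the eight toy experts run for this input, which is the sense in which an MoE model's "active" parameter count is much smaller than its total.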

View the full update on arena.ai

Arena.ai

@arena · May 1

Laguna XS.2 & M.1 by @poolsideai are ready in the Code Arena: Front-end. Come bring your toughest agentic webdev tasks and vote for the outputs that deliver best for your use case. Scores coming soon. https://t.co/0yx9lGoapX


Still wondering? A few quick answers below.

What are the Poolside Laguna AI models?

Laguna XS.2 and M.1 are foundation models developed by Poolside AI specifically for agentic coding and long-horizon software engineering tasks. Unlike general-purpose chat models, these are architected to function within autonomous loops to solve complex, multi-step programming problems. They are now available for public evaluation on the Arena.ai platform to verify their performance.

Is the Laguna XS.2 model open source?

Yes, Laguna XS.2 is an open-weight model released under the Apache 2.0 license. It is a 33B parameter Mixture of Experts model designed to run efficiently on a single GPU for local development. While the larger Laguna M.1 model remains proprietary, the open release of XS.2 allows developers to integrate Poolside's agentic coding capabilities locally.

How can I test the Laguna XS.2 and M.1 models?

You can test both Laguna XS.2 and M.1 through the Arena.ai Code Arena: Front-end category. The platform allows users to enter complex web development prompts and participate in blind battles where they vote on which model produces the best output. While the models are currently live for testing, their official Elo rankings will be published later.
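For readers unfamiliar with how pairwise votes become a ranking, the following is a minimal sketch of the textbook Elo update. The source does not describe Arena.ai's actual scoring pipeline, so treat this as an illustration only: the K-factor of 32, the starting rating of 1000, the model names, and the vote list are all assumptions.

```python
def expected_score(r_a, r_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_a, r_b, score_a, k=32.0):
    """Return new (r_a, r_b) after one battle.

    score_a is 1.0 if A won the vote, 0.0 if B won, 0.5 for a tie.
    k=32 is an assumed K-factor, not a value taken from Arena.ai.
    """
    delta = k * (score_a - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta

# Hypothetical models and votes, both starting at an assumed 1000.
ratings = {"model_x": 1000.0, "model_y": 1000.0}
votes = [
    ("model_x", "model_y", 1.0),  # voter preferred model_x
    ("model_x", "model_y", 0.0),  # voter preferred model_y
    ("model_x", "model_y", 1.0),  # voter preferred model_x
]
for a, b, score_a in votes:
    ratings[a], ratings[b] = update(ratings[a], ratings[b], score_a)
```

Each vote shifts points from the loser to the winner, with larger shifts when the result is more surprising given the current ratings. A stable score needs many votes, which is consistent with rankings being published some time after the models go live.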

What is the architecture of the Laguna models?

Both models use a Mixture of Experts architecture, which improves efficiency by only activating a subset of parameters for any given request. Laguna XS.2 has 33B total parameters with 3B active, making it suitable for local use. The larger Laguna M.1 features 225B total parameters with 23B active, providing higher capacity for demanding software engineering tasks.
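The parameter counts quoted above imply that roughly one tenth of each model is active per request. A short calculation, using only the figures from this article (the dictionary layout and function name are just illustrative), makes that explicit:

```python
# Total and active parameter counts as reported in the article.
models = {
    "Laguna XS.2": {"total": 33e9, "active": 3e9},
    "Laguna M.1": {"total": 225e9, "active": 23e9},
}

def active_fraction(total, active):
    """Share of parameters used for a single request."""
    return active / total

fractions = {
    name: active_fraction(p["total"], p["active"])
    for name, p in models.items()
}

for name, frac in fractions.items():
    print(f"{name}: {frac:.1%} of parameters active")
# Laguna XS.2: 9.1% of parameters active
# Laguna M.1: 10.2% of parameters active
```

Note that the active fraction describes compute per request. All parameters still have to be stored, so memory requirements track the total count rather than the active one.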

Every HeadsUpAI update is written based on its original source and reviewed before it's published.


Keep reading

OpenRouter Hosts Poolside AI Laguna Models for Free Agentic Coding

OpenRouter integrated the first public foundation models from Poolside AI, featuring the flagship Laguna M.1 and the efficient Laguna XS.2. These models are architected specifically for long-horizon software engineering and agentic loops rather than general conversation.

Arena.ai Launches Task Specific Leaderboards to Map Frontend Coding Strengths

Arena · May 8

Arena.ai introduced seven new leaderboard categories for its Code Arena to measure how AI models perform on specific frontend development tasks like gaming and analytics. The data shows that aggregate rankings hide significant performance gaps, with different models excelling at aesthetic design versus logical simulations.
