Anthropic: Agent-Friendly Infrastructure Crucial for AI in Biology

Anthropic

Jun 9, 2026 · Updated Jun 12, 2026

Anthropic published a new Science Blog post detailing why AI agents have advanced faster in coding than in biology. The research highlights that biological data infrastructure is often not designed for agents, leading to unreliable performance in scientific tasks. Building deterministic retrieval layers is crucial for agents to navigate scientific data effectively.

Anthropic's new Science Blog post, "Paving the way for agents in biology," argues that AI agents have advanced faster in coding than in biology because biological data infrastructure is not designed for them. Software development, by contrast, runs on structured digital workflows that make agentic coding possible, with AI systems that autonomously plan, reason, and act.

Benchmark: VirBench
Agent performance (without gget virus): 16.9% to 91.3% mean accuracy
Agent performance (with gget virus): >90% for all agents, peaking at 99.7% for GPT-5.5
Deterministic retrieval layer: gget virus
Models tested: Claude Sonnet 4, Claude Opus 4.7, Biomni OSS, Edison Analysis, GPT-5.2-pro, GPT-5.5

Even frontier models like Claude and GPT struggled to retrieve viral sequence data from NCBI Virus, achieving accuracies as low as 16.9% with high variability. Small errors in biological data retrieval can have severe consequences, invalidating downstream analyses. The bottleneck is not just agent reasoning, but the absence of dependable execution layers.
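To make "mean accuracy" and "variability" concrete, here is a minimal sketch of how repeated retrieval runs could be scored against verified counts. It assumes exact-match scoring (a query counts as correct only when the returned count equals the ground-truth count), which may differ from how VirBench actually scores results. The query IDs, counts, and the function name `exact_match_accuracy` are all hypothetical.

```python
from statistics import mean, pstdev


def exact_match_accuracy(returned_counts, truth_counts):
    """Fraction of queries whose returned count equals the verified count.

    Assumption: exact-match scoring; the real benchmark may score differently.
    """
    hits = [returned_counts.get(q) == n for q, n in truth_counts.items()]
    return sum(hits) / len(hits)


# Hypothetical ground truth: query id -> manually verified sequence count.
truth = {"q1": 120, "q2": 45, "q3": 8, "q4": 300}

# Hypothetical results from three repeated runs of the same agent.
runs = [
    {"q1": 120, "q2": 45, "q3": 8, "q4": 300},
    {"q1": 120, "q2": 44, "q3": 8, "q4": 12},
    {"q1": 118, "q2": 45, "q3": 0, "q4": 300},
]

per_run = [exact_match_accuracy(r, truth) for r in runs]
mean_accuracy = mean(per_run)   # average accuracy across runs
spread = pstdev(per_run)        # run-to-run variability

print(f"per-run accuracy: {per_run}")
print(f"mean accuracy: {mean_accuracy:.3f}, spread: {spread:.3f}")
```

In this toy data the same agent is fully correct on one run and half correct on the other two, so the mean alone hides how unreliable any single run is. That is why the spread is reported alongside it.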

Adding a deterministic retrieval layer, such as gget virus developed with NCBI, raised every agent's accuracy above 90%, peaking at 99.7%, and largely eliminated run-to-run variability. This suggests that making biological data infrastructure agent-friendly, with reliable access paths, matters more for scientific agents than raw model power. The post is part of Anthropic's Science Blog.
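The idea of a deterministic retrieval layer can be illustrated with a small sketch. This is not the gget virus API; the `VirusQuery` type, the `retrieve` function, and the in-memory records with made-up accessions are all hypothetical stand-ins. The sketch shows only the general pattern: a typed, validated query goes in, and the same sorted result comes out every time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VirusQuery:
    """Hypothetical structured query; fields are illustrative only."""
    virus: str
    host: Optional[str] = None
    min_length: int = 0


# Hypothetical in-memory records standing in for a remote database.
RECORDS = [
    {"accession": "HYPO0003", "virus": "Zika virus", "host": "Homo sapiens", "length": 10807},
    {"accession": "HYPO0001", "virus": "Zika virus", "host": "Aedes aegypti", "length": 10794},
    {"accession": "HYPO0002", "virus": "Dengue virus", "host": "Homo sapiens", "length": 10723},
    {"accession": "HYPO0004", "virus": "Zika virus", "host": "Homo sapiens", "length": 9500},
]


def retrieve(query, records=RECORDS):
    """Return matching accessions in a fixed order for a validated query."""
    # Reject malformed queries up front instead of guessing what was meant.
    if not query.virus.strip():
        raise ValueError("virus name is required")
    if query.min_length < 0:
        raise ValueError("min_length must be non-negative")

    wanted_virus = query.virus.strip().lower()
    wanted_host = query.host.strip().lower() if query.host else None

    hits = [
        r["accession"]
        for r in records
        if r["virus"].lower() == wanted_virus
        and (wanted_host is None or r["host"].lower() == wanted_host)
        and r["length"] >= query.min_length
    ]
    # Sorting makes the output independent of storage order.
    return sorted(hits)


query = VirusQuery(virus="Zika virus", host="Homo sapiens", min_length=10000)
print(retrieve(query))
```

With a layer like this, the agent's job shrinks to filling in a few well-defined fields. Filtering, ordering, and error handling happen in ordinary code, which behaves the same way on every call.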

View the full update on anthropic.com

Anthropic

@AnthropicAI · Jun 8

New Science Blog: Why has AI advanced faster in coding than in biology? To agents, bio databases are like cities built before cars—maddening to drive in because they're designed for different traffic. How do we build infrastructure agents can use? https://t.co/PQaNQ4GRJZ


Still wondering? A few quick answers below.

What is the main challenge for AI agents in biology?

The main challenge is that biological data infrastructure, unlike software development environments, is often not designed for autonomous AI agents. It features idiosyncratic formats, scattered databases, and complex, human-centric retrieval processes that agents struggle to navigate reliably.

How did Anthropic test agent performance in biology?

Anthropic developed VirBench, a benchmark with 120 realistic viral sequence queries across 40 pathogens. It tasked state-of-the-art scientific research agents (including Claude and GPT models) with retrieving data from NCBI Virus and compared their accuracy against manually verified ground-truth counts.

What is gget virus and how did it help?

gget virus is a deterministic retrieval layer developed in collaboration with NCBI. It translates complex, browser-based viral data retrieval workflows into an accurate and reproducible interface. When agents were given access to gget virus, their accuracy rose to nearly 100%, and run-to-run variability was largely eliminated.

Why is deterministic retrieval important for scientific agents?

Deterministic retrieval ensures that the underlying data access (gene identifiers, schemas, retrieval logic, and data paths) is reliably executed. This foundational reliability is crucial because even small errors in scientific data can have severe consequences for downstream analyses, making consistent and accurate data access more critical than raw model reasoning power.

Every HeadsUpAI update is written based on its original source and reviewed before it's published.

Keep reading

Anthropic Launches Science Blog for AI-Driven Scientific Research

Anthropic has launched a Science Blog to document how AI is reshaping scientific research. The blog will publish features, practical guides, and field notes covering work at Anthropic and across the broader research community. Two posts launch alongside the announcement.

Claude · Apr 24

Anthropic Launches Claude Managed Agents to Standardize Production Infrastructure

Anthropic launched Claude Managed Agents in public beta, providing a suite of APIs for building and hosting AI agents at scale. By handling the underlying infrastructure for sandboxing and session management, the platform allows teams to move from prototypes to production deployments in days.
