HeadsUpAI

Harvard Researchers Launch AutoScientists Using Self-Organizing Agent Teams

Harvard researchers have released AutoScientists, an open-source framework for decentralized AI agent teams designed for long-running scientific discovery. Unlike systems that follow a single research trajectory, it deploys specialized Analyst and Experiment agents that coordinate through a shared state (experiment logs, a discussion forum, and a dead-end registry) to explore multiple hypotheses in parallel.
BioML-Bench performance: 74.4% mean leaderboard percentile
GPT optimization speedup: 1.9x faster than autoresearch
ProteinGym improvement: +12.5% Spearman correlation
Biomedical tasks evaluated: 24 tasks
Availability: Open-source GitHub repository

This decentralized approach addresses the local optima problem, where single-agent loops such as Andrej Karpathy's autoresearch often plateau. By letting agents critique each other's proposals and share failed results, the system mimics human scientific discovery. It outperforms prior agents by 8.3% on biomedical benchmarks and optimizes GPT training 1.9x faster than autoresearch.

You can use the framework to automate research in drug discovery, protein engineering, and ML optimization. The project is open science, with the research paper and code available for deployment. It provides a blueprint for building resilient multi-agent systems that sustain autonomous experimentation over extended periods without human intervention.

Marinka Zitnik
@marinkazitnik
X

Scientific discovery is not a single chain of thought @GaoShanghua @AdaFang_. It is a long-running process of competing hypotheses, failed experiments, shared insights, and changing research directions. AutoScientists lets AI agents do the same https://t.co/JHAkbEx2Ac

🧪 We call this AutoScientists: self-organizing agent teams for long-running scientific experimentation

Open science:
Paper: https://t.co/mzEx5xwtSE
Code: https://t.co/1OLxN4AW94

@HarvardDBMI @harvardmed @broadinstitute @KempnerInst


Still wondering? A few quick answers below.

AutoScientists is a decentralized multi-agent framework developed by Harvard researchers for long-running scientific experimentation. It uses teams of AI agents that self-organize around research directions rather than following a central orchestrator. The system allows agents to independently interpret a shared experimental state, run parallel experiments, and reorganize their team structures when a specific research direction stops producing results.

The system coordinates specialized agents through a shared state containing experiment logs, a discussion forum, and a dead-end registry. Analyst agents propose hypotheses and rank them by effect size, while Experiment agents execute these proposals and record results. Instead of a central planner, agents use the forum to critique proposals and share findings across teams to avoid redundant exploration.
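The coordination pattern described above can be illustrated with a minimal Python sketch. This is not the AutoScientists code: every name here (`SharedState`, `Hypothesis`, `analyst_propose`, `experiment_run`, the `expected_effect` field) is a hypothetical stand-in, and the "experiment" is a placeholder function. It only shows the general idea of agents reading and writing one shared record instead of taking orders from a central planner.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    name: str
    expected_effect: float  # hypothetical field: the analyst's estimated effect size


@dataclass
class SharedState:
    experiment_log: list = field(default_factory=list)  # (hypothesis name, score) records
    forum: list = field(default_factory=list)           # free-text findings and critiques
    dead_ends: set = field(default_factory=set)         # names of abandoned hypotheses


def analyst_propose(state, candidates):
    """Skip candidates already in the dead-end registry, rank the rest by effect size."""
    live = [h for h in candidates if h.name not in state.dead_ends]
    return sorted(live, key=lambda h: h.expected_effect, reverse=True)


def experiment_run(state, hypothesis, run_fn):
    """Execute one proposal and record the result where every agent can read it."""
    score = run_fn(hypothesis)
    state.experiment_log.append((hypothesis.name, score))
    state.forum.append(f"{hypothesis.name}: score={score:.3f}")
    return score


# Illustrative data only; none of these hypotheses or numbers come from the paper.
state = SharedState()
state.dead_ends.add("bigger-batch")
candidates = [
    Hypothesis("bigger-batch", 0.9),
    Hypothesis("warmup-schedule", 0.4),
    Hypothesis("new-tokenizer", 0.7),
]
ranked = analyst_propose(state, candidates)
# The lambda stands in for a real training or evaluation run.
best_score = experiment_run(state, ranked[0], lambda h: h.expected_effect * 0.5)
```

In this sketch the dead-end registry does the work of avoiding redundant exploration: the highest-ranked candidate is skipped because another agent has already recorded it as a dead end.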

Most existing AI agents follow a single research trajectory or rely on a central planner with fixed objectives, which often leads to stagnation at local optima. AutoScientists differs by enabling parallel exploration across multiple research directions. It uses a decentralized architecture where agents can declare hypotheses dead and reorganize into new teams to find better research paths.
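As a rough illustration of declaring a direction dead and reorganizing, the sketch below uses a simple patience rule. The rule itself, the function names (`is_stagnant`, `reorganize`), the `patience` and `min_gain` parameters, and the assumption that higher scores are better are all illustrative choices, not details taken from the AutoScientists paper or code.

```python
def is_stagnant(scores, patience=3, min_gain=0.0):
    """True when the last `patience` results failed to beat the earlier best."""
    if len(scores) <= patience:
        return False  # not enough history to judge
    earlier_best = max(scores[:-patience])
    recent_best = max(scores[-patience:])
    return recent_best - earlier_best <= min_gain


def reorganize(teams, history, dead_ends, patience=3):
    """Mark stagnant directions dead and move their agents to the best live direction."""
    stagnant = [d for d in teams if is_stagnant(history.get(d, []), patience)]
    live = [d for d in teams if d not in stagnant]
    if not live:
        return teams  # nowhere to move agents; leave teams and registry unchanged
    # Pick the live direction with the best score so far (higher is assumed better).
    target = max(live, key=lambda d: max(history.get(d) or [float("-inf")]))
    new_teams = {d: list(teams[d]) for d in live}
    for d in stagnant:
        dead_ends.add(d)
        new_teams[target].extend(teams[d])
    return new_teams


# Illustrative data only: direction names, agent ids, and scores are made up.
teams = {"optimizer": ["a1", "a2"], "architecture": ["a3"]}
history = {
    "optimizer": [0.50, 0.52, 0.52, 0.51, 0.52],  # no gain over the last three runs
    "architecture": [0.48, 0.55, 0.58],           # too short to judge, still improving
}
dead_ends = set()
new_teams = reorganize(teams, history, dead_ends)
```

Here the "optimizer" direction is recorded as a dead end and its two agents join the "architecture" team. A real system would likely base this decision on agent discussion rather than a fixed threshold.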

AutoScientists is a fully open science project. The researchers have released the complete paper on arXiv and made the source code publicly available on GitHub. This allows other researchers and developers to deploy the framework for their own computational experiments in fields like drug discovery, protein engineering, and language model training optimization.

AutoScientists outperformed prior agentic systems across several benchmarks. On BioML-Bench, it achieved a 74.4 percent mean leaderboard percentile across 24 biomedical tasks. In GPT training optimization, it reached target performance 1.9 times faster than single-agent systems. It also discovered a new protein engineering method that improved ACE2-Spike binding prediction by 12.5 percent compared to previous state-of-the-art models.
