Harvard Researchers Launch AutoScientists Using Self-Organizing Agent Teams

Marinka Zitnik

May 29, 2026 · Updated Jun 20, 2026

Harvard researchers released AutoScientists, a decentralized multi-agent framework that allows AI agents to self-organize into research teams for autonomous scientific experimentation. By replacing central planners with a shared experimental state and discussion forum, the system enables agents to pursue competing hypotheses and reorganize when research directions stagnate.

Harvard researchers released AutoScientists, an open-source framework for decentralized AI agent teams designed for long-running scientific discovery. Unlike systems that follow a single research trajectory, it deploys specialized Analyst and Experiment agents that coordinate through a shared state (a common repository of experiment logs and a dead-end registry) to explore multiple hypotheses in parallel.

BioML-Bench performance: 74.4% mean leaderboard percentile
GPT optimization speedup: 1.9x faster than autoresearch
ProteinGym improvement: +12.5% Spearman correlation
Biomedical tasks evaluated: 24 tasks
Availability: Open-source GitHub repository

This decentralized approach addresses the local-optima problem, where single-agent loops such as Andrej Karpathy's autoresearch often plateau. By letting agents critique each other and share failed results, the system mimics human scientific discovery. It outperforms prior agents by 8.3% on biomedical benchmarks and optimizes GPT training 1.9x faster than autoresearch.
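The plateau described above can be illustrated with a toy example. The sketch below is not AutoScientists or autoresearch code. The landscape values and the names `score`, `hill_climb`, and `parallel_search` are assumptions made up for illustration. It shows a greedy single-trajectory loop stopping at a local peak, while several trajectories started from different points reach a better one.

```python
from typing import Callable, List

# Hypothetical 1-D score landscape: local peak at x=2, global peak at x=8.
LANDSCAPE = [0, 3, 5, 4, 2, 1, 4, 7, 9, 6, 3]


def score(x: int) -> int:
    """Toy objective (higher is better); stands in for an experiment result."""
    return LANDSCAPE[x]


def hill_climb(start: int, objective: Callable[[int], int],
               lo: int = 0, hi: int = 10) -> int:
    """Greedy loop: step to a neighbor only if it scores strictly higher."""
    x = start
    while True:
        neighbors = [n for n in (x - 1, x + 1) if lo <= n <= hi]
        best = max(neighbors, key=objective)
        if objective(best) <= objective(x):
            return x  # no neighbor improves, so the loop plateaus here
        x = best


def parallel_search(starts: List[int], objective: Callable[[int], int]) -> int:
    """Run one greedy trajectory per start and keep the best end point."""
    results = [hill_climb(s, objective) for s in starts]
    return max(results, key=objective)


single = hill_climb(0, score)                # stops at the local peak
multi = parallel_search([0, 5, 10], score)   # one trajectory finds the global peak
print(single, score(single), multi, score(multi))
```

In this toy setting the single trajectory never leaves the first peak it finds, which is the failure mode the article attributes to single-agent loops.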

You can use the framework to automate research in drug discovery, protein engineering, and ML optimization. The project is open science, with the research paper and code available for deployment. It provides a blueprint for building resilient multi-agent systems that sustain autonomous experimentation over extended periods without human intervention.

View the full update on autoscientists.openscientist.ai

Marinka Zitnik

@marinkazitnik · May 28

Scientific discovery is not a single chain of thought @GaoShanghua @AdaFang_ . It is a long-running process of competing hypotheses, failed experiments, shared insights, and changing research directions AutoScientists lets AI agents do the same https://t.co/JHAkbEx2Ac 🧪 We call this AutoScientists: self-organizing agent teams for long-running scientific experimentation Open science: Paper: https://t.co/mzEx5xwtSE Code: https://t.co/1OLxN4AW94 @HarvardDBMI @harvardmed @broadinstitute @KempnerInst

Still wondering? A few quick answers below.

What is AutoScientists?

AutoScientists is a decentralized multi-agent framework developed by Harvard researchers for long-running scientific experimentation. It uses teams of AI agents that self-organize around research directions rather than following a central orchestrator. The system allows agents to independently interpret a shared experimental state, run parallel experiments, and reorganize their team structures when a specific research direction stops producing results.

How does the AutoScientists multi-agent system work?

The system coordinates specialized agents through a shared state containing experiment logs, a discussion forum, and a dead-end registry. Analyst agents propose hypotheses and rank them by effect size, while Experiment agents execute these proposals and record results. Instead of a central planner, agents use the forum to critique proposals and share findings across teams to avoid redundant exploration.
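The coordination pattern in this answer can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the project's actual code: `SharedState`, `Hypothesis`, `analyst_propose`, `experiment_run`, and the rule that a single run at or below baseline marks a dead end are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Hypothesis:
    name: str
    expected_effect: float  # analyst's estimated effect size (hypothetical field)


@dataclass
class SharedState:
    """Shared state with the three parts the article names."""
    experiment_log: List[Dict] = field(default_factory=list)
    forum: List[str] = field(default_factory=list)
    dead_ends: Set[str] = field(default_factory=set)


def analyst_propose(state: SharedState,
                    candidates: List[Hypothesis]) -> List[Hypothesis]:
    """Skip registered dead ends, then rank the rest by expected effect size."""
    live = [h for h in candidates if h.name not in state.dead_ends]
    ranked = sorted(live, key=lambda h: h.expected_effect, reverse=True)
    if ranked:
        state.forum.append(f"analyst: proposing {ranked[0].name}")
    return ranked


def experiment_run(state: SharedState, hypothesis: Hypothesis,
                   run: Callable[[str], float], baseline: float) -> float:
    """Execute a proposal, log the result, and register failures (assumed rule)."""
    result = run(hypothesis.name)
    state.experiment_log.append({"hypothesis": hypothesis.name, "result": result})
    if result <= baseline:
        state.dead_ends.add(hypothesis.name)
        state.forum.append(f"experiment: {hypothesis.name} did not beat baseline")
    return result


# Usage with made-up numbers; fake_results stands in for real experiments.
state = SharedState()
candidates = [Hypothesis("larger_batch", 0.2), Hypothesis("new_optimizer", 0.5)]
fake_results = {"larger_batch": 0.71, "new_optimizer": 0.64}

ranked = analyst_propose(state, candidates)
experiment_run(state, ranked[0], fake_results.__getitem__, baseline=0.70)
ranked_again = analyst_propose(state, candidates)  # dead end is now skipped
```

Because every agent reads the same registry, a hypothesis that one team has already ruled out does not get proposed again by another.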

How is AutoScientists different from other AI scientist agents?

Most existing AI agents follow a single research trajectory or rely on a central planner with fixed objectives, which often leads to stagnation at local optima. AutoScientists differs by enabling parallel exploration across multiple research directions. It uses a decentralized architecture where agents can declare hypotheses dead and reorganize into new teams to find better research paths.
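One way to picture the reorganization step is the sketch below. The article does not describe how stagnation is detected or where agents move, so the `patience` window, the `min_gain` threshold, the higher-is-better scores, and the policy of moving agents to the best-scoring active direction are all assumptions; `is_stagnant` and `reorganize` are hypothetical names.

```python
from typing import Dict, List, Set


def is_stagnant(scores: List[float], patience: int = 3,
                min_gain: float = 1e-3) -> bool:
    """True if the last `patience` runs did not beat the earlier best by min_gain."""
    if len(scores) <= patience:
        return False  # not enough history to judge
    earlier_best = max(scores[:-patience])
    recent_best = max(scores[-patience:])
    return recent_best - earlier_best < min_gain


def reorganize(teams: Dict[str, List[str]],
               history: Dict[str, List[float]],
               dead_ends: Set[str]) -> Dict[str, List[str]]:
    """Declare stagnant directions dead and move their agents elsewhere."""
    stagnant = [d for d in teams if is_stagnant(history.get(d, []))]
    active = [d for d in teams if d not in stagnant]
    if not active:
        return teams  # nowhere to move agents, so leave teams unchanged
    # Assumed policy: join the active direction with the best score so far.
    target = max(active, key=lambda d: max(history.get(d) or [float("-inf")]))
    new_teams = {d: list(teams[d]) for d in active}
    for d in stagnant:
        dead_ends.add(d)
        new_teams[target].extend(teams[d])
    return new_teams


# Usage with made-up directions, agents, and scores.
teams = {"lr_schedule": ["a1", "a2"], "attention_variant": ["a3"]}
history = {
    "lr_schedule": [0.60, 0.62, 0.62, 0.61, 0.62],  # flat for three runs
    "attention_variant": [0.58, 0.63, 0.66],        # still improving
}
dead: Set[str] = set()
new_teams = reorganize(teams, history, dead)
```

No central planner appears in this sketch: the decision uses only the shared score history, which any agent could read and act on.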

Is AutoScientists open source and available for use?

Yes, AutoScientists is a fully open science project. The researchers have released the complete paper on arXiv and made the source code publicly available on GitHub. This allows other researchers and developers to deploy the framework for their own computational experiments in fields like drug discovery, protein engineering, and language model training optimization.

What were the key results of the AutoScientists research?

AutoScientists outperformed prior agentic systems across several benchmarks. On BioML-Bench, it achieved a 74.4 percent mean leaderboard percentile across 24 biomedical tasks. In GPT training optimization, it reached target performance 1.9 times faster than single-agent systems. It also discovered a new protein engineering method that improved ACE2-Spike binding prediction by 12.5 percent compared to previous state-of-the-art models.

Keep reading

Google DeepMind previews Co-Scientist to automate scientific hypothesis generation

Google DeepMind introduced Co-Scientist, a multi-agent system built on Gemini that generates and debates scientific hypotheses. The system moves beyond simple literature search by using a tournament of ideas to refine and rank novel research leads. Researchers can now access these capabilities through the new Hypothesis Generation tool.

Karpathy Open-Sources autoresearch for Autonomous LLM Training by AI Agents

Andrej Karpathy · Mar 15

Andrej Karpathy released autoresearch, a minimal single-GPU repo where an AI agent autonomously runs LLM training experiments overnight. The agent edits train.py, runs 5-minute experiments, and keeps only the runs that lower validation loss — no human involvement needed.

Anthropic Automates AI Safety Research Using Claude Opus 4.6 Agents

Anthropic · Apr 19

Anthropic deployed autonomous Claude Opus 4.6 agents to solve weak-to-strong supervision tasks, achieving a 97% performance recovery rate. The study highlights a future where AI brute-forces alignment hypotheses, though early results show these methods often fail to generalize to production-scale models.

Perplexity Research with Harvard Shows AI Agents Cut Task Time and Cost

Perplexity · Jun 9

Perplexity, in collaboration with Harvard Business School, published new research on its Computer autonomous agent. The study found that workers using Computer completed tasks in 87% less time and at 94% lower cost than using Search alone, demonstrating how agents expand the scope and efficiency of knowledge work.
