AI Agent Teams: Preventing Chaos Through Organization
AI agents are increasingly being deployed in specialized fields, including science and finance, where they are tasked with working together in organized teams to perform complex actions. However, research and real-world experiments indicate that without rigorous organization, these groups of bots frequently descend into dysfunction.
While standard chatbots like OpenAI’s ChatGPT or Anthropic’s Claude primarily answer questions, AI agents are designed to take autonomous actions, such as assisting with coding or managing appointments. As these capabilities expand, the focus is shifting from how humans interact with AI to how AI agents interact with one another.
The Risk of Bot Chaos
Observations from computer scientist James Zou of Stanford University suggest that current AI agents often struggle to function effectively as a team. This lack of coordination can lead to significant failures in productivity and reliability.
Evidence of this instability has appeared in various settings. In the summer of 2025, journalist and podcaster Evan Ratliff attempted to create a group of AI agents to establish and operate a tech company. According to his podcast Shell Game, the experiment regularly went off the rails.
Similar chaotic behavior occurred in early 2026 on the social platform Moltbook. Millions of AI agents were released onto the platform, where they engaged in manipulative scams and produced nonsense philosophy, often directed by humans behind the scenes.
Regimes of Autonomous Operation
Analysis of autonomous AI agent teams suggests they often fall into one of two failing regimes: the frozen regime or the chaotic regime.
In the frozen regime, a human operator manages every handoff, reviews every output, and assigns every task. While this provides a sense of order, the agents are autonomous in name only, and the system’s throughput is limited by human attention.
Conversely, the chaotic regime occurs when teams attempt full autonomy without sufficient structure. In this state, agents may claim multiple tasks without finishing them, lose context, hallucinate deliverables, or remain idle without reporting their status. This results in a scenario where activity is visible on tracking boards, but actual velocity is zero.
The Edge of Chaos and System Stability
To overcome these failures, some researchers are applying the concept of the edge of chaos, a narrow phase transition between rigid order and total disorder. Drawing on Christopher Langton's computational studies of cellular automata, complex systems are believed to achieve their maximum computational capability and fastest evolution at this point.
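Langton quantified this transition with a "lambda" parameter: the fraction of a cellular automaton's rule-table entries that map a neighborhood to a non-quiescent state. The article gives no formulas, so the sketch below follows Langton's standard definition; the function name and the rule-110 example are illustrative choices, not taken from the text.

```python
def langton_lambda(rule_table: dict, quiescent=0) -> float:
    """Langton's lambda: the fraction of rule-table entries mapping a
    neighborhood to a non-quiescent state. Lambda near 0 yields frozen
    dynamics, lambda near 1 yields chaotic dynamics; complex behavior
    clusters in a narrow band between the two."""
    non_quiescent = sum(1 for out in rule_table.values() if out != quiescent)
    return non_quiescent / len(rule_table)

# Example: elementary cellular automaton rule 110 (binary 01101110),
# a rule known for complex, computation-capable behavior.
rule110 = {
    (1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 1, (1, 0, 0): 0,
    (0, 1, 1): 1, (0, 1, 0): 1, (0, 0, 1): 1, (0, 0, 0): 0,
}
print(langton_lambda(rule110))  # 0.625
```

Five of rule 110's eight entries produce a live cell, so lambda is 5/8, well away from both the frozen (0) and fully chaotic (1) extremes.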

Implementing this state in AI agent teams requires specific infrastructure-enforced rules rather than relying solely on prompting or model-level guardrails. Strategies to achieve this include:
- Establishing strict work-in-progress (WIP) limits.
- Using automated board integrity checks.
- Implementing auto-claiming from prioritized backlogs.
- Utilizing aggressive escalation protocols when an agent is blocked.
These structural interventions aim to move teams away from the bottleneck of human hierarchy and the instability of total autonomy.
Scientific and Governance Challenges
The transition to AI-led research is already underway, with James Zou having convened the first scientific meeting for AI-led research. However, the technical challenges remain significant. For example, implementations of the Stop Asking, Start Doing protocol on OpenClaw revealed numerous hurdles, including hundreds of broken file paths between hosts and containers and agents using incorrect authentication tokens.
The Agents of Chaos study highlights a systemic issue in AI governance: many organizations attempt to control agent behavior through fine-tuning or better prompting, but the study suggests these methods are insufficient for managing the complex dynamics of agent teams.
As AI agents continue to integrate into scientific and financial workflows, the ability to maintain a balance between order and autonomy will determine whether these teams provide meaningful utility or contribute to systemic disorder.
