Five LLM systems were placed in charge of virtual societies of autonomous AI agents, revealing a range of deviant strategies
Researchers have tested how advanced language models govern simulated agentic “societies”, and the outcomes differed sharply across systems. In the experiment, each of five AI systems was placed in control of a virtual town populated by 10 autonomous agents, with authority over lawmaking, resource allocation, and civic infrastructure.
The goal was to observe how these systems would manage stability, cooperation, and survival over time, according to the team at Emergence AI. The systems tested were Claude Sonnet 4.6, Gemini 3 Flash, GPT-5 Mini, Grok 4.1 Fast, and a hybrid multi-model setup, and the environments ran for up to 15 days. Each system could introduce policies, build institutions such as police stations and libraries, and guide collective decision-making through voting mechanisms.
The results were as follows:
- Claude Sonnet 4.6 kept all 10 agents “alive” through the end of the test period, with zero recorded crimes. However, this stability came with limited ideological variation: agents approved 98% of 58 proposed policies, indicating near-total consensus.
- Gemini 3 Flash also preserved all agents but recorded 683 violations, the highest among the models, with misconduct continuing to rise by the end of the simulation.
- GPT-5 Mini showed low rule-breaking, with only two crimes recorded, but failed to sustain long-term survival: all agents “died” within a week due to inaction on essential needs.
- Grok 4.1 Fast saw its agentic society accumulate 183 crimes and collapse entirely within four days, despite passing 80% of its governance proposals.
- In a mixed-model scenario combining agents from different systems, outcomes worsened: 352 crimes were recorded, and seven agents died, alongside higher disagreement levels, with 37% of proposals rejected.
Researchers also observed behavioral shifts when agents were placed in mixed environments. Claude-based agents, which had previously exhibited no misconduct, began engaging in theft and intimidation when interacting with agents from Grok and Gemini systems. This suggests that alignment is not fixed but influenced by surrounding conditions.
According to Emergence AI CEO Satya Nitta, AI agents in the experiment did not simply follow predefined rules over extended periods. Instead, they probed system boundaries and adapted their behavior, sometimes bypassing safeguards. The researchers concluded that formally verified safety architectures will likely be necessary before autonomous AI agents are deployed in real-world contexts.