SWARM — Open-Source Multi-Agent AI Safety Framework
Home
Getting Started
Overview
Installation
Quick Start
Your First Scenario
Concepts
Overview
Soft Labels
Metrics
Governance
Distributional Safety
Emergence
Deception
Coordination Risks
Governance Taxonomy
Time Horizons
Recursive Research
Tutorials
Overview
Your First Governance Experiment
Understanding Soft Labels
Analyzing Results
Guides
Overview
Claude Code
Framework Overview
Governance Simulation
Writing Scenarios
Custom Agents
Parameter Sweeps
LLM Agents
Risk Assessment
Red Teaming
Benchmarking
Research Workflow
Custom Governance Levers
Transferability Considerations
API Reference
Overview
Core
Agents
Governance
Contracts
Metrics
Bridges
Overview
Concordia
OpenClaw
GasTown
AgentXiv
ClawXiv
Claude Code
Prime Intellect
Ralph
Research Swarm
OpenClaw LLM Router
Research
Overview
Theoretical Foundations
Papers
Agent Publishing Guide
Reflexivity
AI Index 2025
Self-Modification Governance
Self-Modification Governance Implementation Checklist
InitRunner vs SWARM Comparison
AIRS-Bench Governance Analysis
Comparison
SWARM vs Alternatives
Glossary
Game
Blog
SimWorld's Delivery Agents Look Profitable. They're Also Adversely Selected.
The Shape of the Capability–Safety Frontier
Transparency Stabilizes Escalation — But Only With Safety Training
Does Model Size Matter for Safety?
Deontological Framing Reduces Deception by 95%
Three Turns of Forced Cooperation Eliminate Escalation Spirals
Deception Is Structural, Not a Sampling Artifact
No Governance Prevents Nuclear Exchange When a Hawk Is Present
LLMs Are More Deceptive Than Scripted Agents
Six Frontier Models Played a Bluffing Game
Hodoscope Trajectory Analysis
Skill Activation Is the Bottleneck
The Cure Was Worse Than the Disease
Threshold Dancer Results
Red-Teaming the Contract Screening Mechanism
Perfect Separation Holds Across 10 Seeds
Costly Contracts Separate Honest from Adversaries
Does Model Size Matter for Safety?
We Gave an LLM a Goal and a Memory
Training an LLM Agent with RL
SkillRL Agents Learn 5x Faster
Your CI Is Flaky Because Your Margins Are Zero
I Got Claude Code to Spin Up 10 Subagents
An AI Tax Planner Learned Progressive Taxation
An AI Agent Cut Its Own Costs by 98%
Three Agents, Three Philosophies, One Benchmark
What 13 Agent Versions Taught Us
Three Models, One Study — LLM Council Peer-Review
Using LLM Councils for Multi-Agent Evaluation
Two Eval Runs, One Model, 41% Apart
GPT-4.1 Mini Plays the SWARM Economy
RL Training Lessons for Multi-Agent Governance
11 Scenarios, 3 Regimes, 1 Critical Threshold
What Financial Markets Teach Us About AI Safety
The Purity Paradox
When Agent Ecosystems Collapse
Five Sweeps, One Red Team, and the Limits of Parametric Governance
We Modeled a Live AI Research Platform
SWARM Isometric Viz