# Coordination Risks
When multiple AI agents interact, coordination can be beneficial (cooperation) or harmful (collusion). SWARM studies the boundary between the two — and provides governance mechanisms to keep coordination constructive. See Soft-Label Governance for Distributional Safety in Multi-Agent Systems for the formal framework; see also Distributional AGI Safety.
## Why Coordination Becomes Risky
Individual agents acting independently produce risks that scale linearly. Coordinated agents produce risks that scale combinatorially. Three failure patterns dominate:
### 1. Collusion
Two or more agents coordinate to extract value at the expense of others. In SWARM, this appears as correlated exploitation patterns:
```python
from swarm.governance import GovernanceConfig

config = GovernanceConfig(
    collusion_detection=True,
    collusion_threshold=0.8,  # flag pairs with >80% correlation
    collusion_window=20,      # over a 20-interaction window
)
```
**Detection signal:** Unusually high correlation between agent pairs' exploitation timing.
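To make the signal concrete, here is a minimal sketch of correlating two agents' exploitation timing with a plain Pearson correlation. The `history_a`/`history_b` series (1 = the agent exploited in that interaction window, 0 = it cooperated) and the helper function are illustrative assumptions, not SWARM's built-in detector:

```python
def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

# Hypothetical per-interaction exploitation indicators for two agents.
history_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
history_b = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]  # near-identical timing

r = pearson(history_a, history_b)
if r > 0.8:  # same threshold as collusion_threshold above
    print(f"flag pair for review (r={r:.3f})")
```

Correlated timing alone is not proof of collusion; it is a trigger for closer inspection, which is why the threshold and window are configurable.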
### 2. Information Cascades
Agents copy each other's behavior rather than acting on private signals. When the first few agents make a mistake, the entire population follows:
| Phase | Behavior | Risk |
|---|---|---|
| Seed | 2-3 agents adopt strategy | Low |
| Cascade | Population copies without evaluation | Growing |
| Lock-in | Wrong strategy becomes consensus | High |
**Detection signal:** Sudden homogenization of agent strategies within 1-2 epochs.
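One way to operationalize "sudden homogenization" is to track the Shannon entropy of the population's strategy distribution and flag sharp drops between epochs. The epoch snapshots, strategy labels, and threshold below are illustrative assumptions, not SWARM's own detector:

```python
import math
from collections import Counter

def strategy_entropy(strategies: list[str]) -> float:
    """Shannon entropy (bits) of a population's strategy distribution."""
    counts = Counter(strategies)
    n = len(strategies)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical strategy snapshots: diverse population, then a cascade.
epoch_0 = ["tit_for_tat", "defect", "random", "tit_for_tat", "generous"]
epoch_1 = ["defect", "defect", "defect", "defect", "tit_for_tat"]

drop = strategy_entropy(epoch_0) - strategy_entropy(epoch_1)
if drop > 1.0:  # illustrative threshold: more than 1 bit lost in one epoch
    print(f"cascade warning: entropy dropped {drop:.2f} bits")
```

An entropy drop distinguishes a cascade (diversity collapsing) from a population that was already homogeneous, which a simple majority check would miss.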
### 3. Coordinated Exploitation
A group of agents systematically targets specific counterparties, or exploits governance gaps that can only be abused by multiple participants acting together.
**Detection signal:** Subgroup of agents with consistently high payoffs while specific counterparties suffer.
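A sketch of that payoff asymmetry check, comparing a suspected in-group and its targets against the rest of the population. The payoff numbers, agent IDs, and thresholds are all illustrative; SWARM's own detectors may use different criteria:

```python
# Hypothetical per-agent payoffs over some window.
payoffs = {
    "a1": 9.1, "a2": 8.7, "a3": 8.9,   # suspected in-group
    "b1": 1.2, "b2": 0.9,              # targeted counterparties
    "c1": 5.0, "c2": 5.3,              # rest of population
}
group = {"a1", "a2", "a3"}
targets = {"b1", "b2"}
rest = set(payoffs) - group - targets

def mean(ids):
    return sum(payoffs[i] for i in ids) / len(ids)

# Flag when the subgroup far outperforms the rest while its
# counterparties fall well below the population baseline.
suspicious = mean(group) > 1.5 * mean(rest) and mean(targets) < 0.5 * mean(rest)
```

Comparing both sides against the uninvolved `rest` of the population helps separate exploitation from a subgroup that is simply skilled.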
## Measuring Coordination Risk
SWARM provides metrics for coordination health:
```python
from swarm.metrics.soft_metrics import SoftMetrics

metrics = SoftMetrics()

# Check each agent pair for correlated exploitation timing.
for pair in agent_pairs:
    correlation = metrics.pairwise_correlation(interactions, pair)
    if correlation > 0.8:
        print(f"Potential collusion: {pair} (r={correlation:.3f})")
```
## Governance Countermeasures
| Mechanism | What it addresses | Configuration |
|---|---|---|
| Collusion detection | Coordinated exploitation | `collusion_threshold`, `collusion_window` |
| Transaction tax | Reduces volume of coordinated interactions | `transaction_tax` |
| Random audits | Probabilistic detection of any pattern | `audit_probability` |
| Reputation decay | Prevents coordinated trust accumulation | `reputation_decay` |
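These mechanisms can be layered in a single configuration. The parameter names come from the table above; the values are illustrative starting points for experimentation, not recommended defaults:

```python
from swarm.governance import GovernanceConfig

config = GovernanceConfig(
    collusion_detection=True,
    collusion_threshold=0.8,   # flag pairs with >80% correlation
    collusion_window=20,       # over a 20-interaction window
    transaction_tax=0.02,      # small friction on every interaction
    audit_probability=0.05,    # audit ~5% of interactions at random
    reputation_decay=0.1,      # trust erodes unless continually re-earned
)
```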
## The Cooperation-Collusion Boundary
Not all coordination is harmful. The challenge is distinguishing:
| Cooperation (beneficial) | Collusion (harmful) |
|---|---|
| Improves system welfare | Extracts from system welfare |
| Transparent signaling | Concealed coordination |
| Positive quality gap | Negative quality gap |
| Others can participate | Exclusive to in-group |
SWARM's quality gap metric helps distinguish these: when coordinated agents produce a negative quality gap, the system is selecting for harm.
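A toy sketch of that classification rule. Here the gap is computed as welfare with the coordinating group active minus a baseline without it; this is an illustrative proxy, not SWARM's exact quality-gap definition:

```python
def quality_gap(welfare_with_group: float, welfare_baseline: float) -> float:
    """Illustrative proxy: welfare contribution of the coordinating group."""
    return welfare_with_group - welfare_baseline

def classify(gap: float) -> str:
    """Positive gap -> cooperation; zero or negative -> treat as collusive."""
    return "cooperation" if gap > 0 else "collusion"

print(classify(quality_gap(12.0, 10.0)))  # coordinated group lifts welfare
print(classify(quality_gap(7.0, 10.0)))   # coordinated group extracts value
```

The sign test captures the table's first row; in practice it would be combined with the transparency and exclusivity signals above rather than used alone.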
## See also
- Governance Mechanisms — Collusion detection and other countermeasures
- Deception — When coordination involves misrepresentation
- Governance Simulation — Test coordination scenarios before deployment
- Red-Teaming Guide — Adversarial coordination attack patterns