Sid Black, Asa Cooper Stickland, Jake Pencharz, Oliver Sourbut, Michael Schmatz, Jay Bailey, Ollie Matthews, Ben Millwood, Alex Remedios & Alan Cooney
Uncontrolled self-replication of AI agents could pose a major safety risk. To study it, the researchers created RepliBench, a set of 201 tasks spanning four areas: resource acquisition, weight exfiltration, compute replication, and persistence. They tested five advanced models and found that none is yet capable of full self-replication, although the models are improving at individual parts of it.