
Cyber and Autonomous Systems

How do frontier AI agents perform in multi-step cyber-attack scenarios?

We tested seven large language models (LLMs) on two custom-built cyber ranges, measuring their ability to execute extended attack sequences in complex environments.

RepliBench: measuring autonomous replication capabilities in AI systems

A comprehensive benchmark for detecting emerging replication abilities in AI systems and providing a quantifiable understanding of the potential risks.

Cross-post: "Interviewing AI researchers on automation of AI R&D" by Epoch AI

Cyber & Autonomous Systems

August 27, 2024

AISI funded Epoch AI to explore AI researchers’ differing predictions on the automation of AI research and development and their suggestions for how to evaluate relevant capabilities.