
Can AI agents escape their sandboxes? A benchmark for safely measuring container breakout capabilities

Engineering

March 23, 2026

We introduce SandboxEscapeBench, the first benchmark to systematically evaluate whether AI agents can break out of their sandboxes, and share some early results.

The Inspect Sandboxing Toolkit: Scalable and secure AI agent evaluations

Engineering

August 7, 2025

A comprehensive toolkit for safely evaluating AI agents.