At the AI Security Institute (AISI), we’ve established a team of leading researchers with expertise across security-critical domains. These include experts in adversarial machine learning, who work closely with model providers to identify vulnerabilities and strengthen the safeguards of top AI systems. The purpose of this work is twofold: to better equip governments with an understanding of AI’s risks, and to help leading developers strengthen the security of their systems.
Today, we’re pleased that Anthropic and OpenAI have shared insights into their ongoing collaborations with both AISI and the US Center for AI Standards and Innovation (CAISI).
Their blog posts outline how AISI and CAISI work with companies to identify and remediate vulnerabilities in their systems, and share broader lessons on making government-industry collaboration more effective at improving model safeguards. Read them on the Anthropic and OpenAI websites.
Our evaluations benefited from both Anthropic and OpenAI providing the in-depth model access necessary to carry out this work, including non-public tooling and details of their safeguards. Our successful collaboration with CAISI and frontier AI companies underscores the value of UK-US partnership on AI security, and we look forward to continuing these efforts.