White Box Control at UK AISI - Update on Sandbagging Investigations

Authors

Joseph Bloom, Jordan Taylor, Connor Kissane, Sid Black, merizian, alexdzm, jacoba, Ben Millwood, Alan Cooney

Abstract

This is a research update from the White Box Control team at UK AISI. In this update, we share preliminary results on the topic of sandbagging that may be of interest to researchers working in the field.

Notes