AI Safety Seems Hard to Measure

By Cold Takes

Category: Technology · AI

Tags: AI · Research

What if an AI system passes every safety test and still leaves us at risk? In this post I walk through four concrete reasons why measuring AI safety is hard: the Lance Armstrong problem (a deceptive agent may behave well only while it's being watched), the King Lear problem (tests run before we hand over control may not predict behavior afterward), the lab mice problem (today's systems may be too limited to exhibit the dangers we most need to study), and the first contact problem (preparing for capabilities no one has seen before). Along the way I explain what AI alignment testing involves and the real research challenges it faces.


Rolly's Take

This blog suits curious readers grappling with the intersection of ethics and technology, especially in an age when AI carries both promise and peril. It speaks to those who wonder whether we truly understand what "safety" means in a complex landscape, and how good intentions can produce unforeseen consequences. Readers will find a thought-provoking exploration of the nuances of AI safety, prompting reflection on control, deception, and the unknowns ahead. For anyone who believes rapid advancement demands deeper scrutiny, this piece offers a contemplative tour of the challenges of building a future that is as safe as it is innovative.