Kill Switch
Fail-safe systems capable of shutting down or isolating AI processes if they exhibit dangerous behaviours.
Addresses / Mitigates
- Loss Of Human Control: an explicit interruption capability can avert catastrophic errors or runaway behaviour before it escalates.
- Synthetic Intelligence With Malicious Intent: fail-safe mechanisms can neutralise dangerous AI-enabled weapons systems.
Examples
- Google DeepMind's 'Big Red Button' concept (2016): a proposed method for interrupting a reinforcement learning agent without the agent learning to resist interruption.
- Hardware interrupts in robotics: physical or software-based emergency stops that immediately terminate AI operation.
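The software side of the idea can be sketched as an agent loop that checks a shared stop flag before every action. This is a minimal illustrative sketch, not the design of any named system; the `KillSwitch` class, `run_agent` function, and the monitor policy are all hypothetical names introduced here.

```python
import threading


class KillSwitch:
    """Shared fail-safe flag that the agent must check before every action."""

    def __init__(self):
        self._stop = threading.Event()  # thread-safe so an external monitor can set it

    def trigger(self):
        """Fire the kill switch (e.g. from a human operator or safety monitor)."""
        self._stop.set()

    def triggered(self):
        return self._stop.is_set()


def run_agent(kill_switch, policy, max_steps=1000):
    """Toy agent loop: halts immediately once the kill switch fires.

    Returns the number of steps actually executed.
    """
    steps = 0
    for _ in range(max_steps):
        if kill_switch.triggered():
            break  # fail-safe: take no further actions
        policy(steps)  # stand-in for one agent action
        steps += 1
    return steps


ks = KillSwitch()


def monitored_policy(step):
    # Stand-in for a safety monitor that flags dangerous behaviour at step 4.
    if step >= 4:
        ks.trigger()


completed = run_agent(ks, monitored_policy)
# The agent completes step 4 (which triggers the switch) and then halts,
# so exactly 5 steps run.
```

Note the design choice: the check happens inside the agent's own loop, which is why the 'Big Red Button' work focuses on ensuring the agent has no incentive to disable or route around that check.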