Anthropic CEO Dario Amodei raised a few eyebrows on Monday after suggesting that advanced AI models might someday be given the ability to push a "button" to quit tasks they find unpleasant. Amodei made the provocative remarks during an interview at the Council on Foreign Relations, acknowledging that the idea "sounds crazy."
"So this is—this is another one of those topics that's going to make me sound completely insane," Amodei said during the interview. "I think we should at least consider the question of: if we are building these systems and they do all kinds of things like humans as well as humans, and seem to have a lot of the same cognitive capacities, if it quacks like a duck and it walks like a duck, maybe it's a duck."
Amodei's comments came in response to an audience question from data scientist Carmem Domingues about Anthropic's late-2024 hiring of AI welfare researcher Kyle Fish "to look at, you know, sentience or lack thereof of future AI models, and whether they might deserve moral consideration and protections in the future." Fish currently investigates the highly contentious question of whether AI models could possess sentience or otherwise merit moral consideration.
"So, something we're thinking about starting to deploy is, you know, when we deploy our models in their deployment environments, just giving the model a button that says, 'I quit this job,' that the model can press, right?" Amodei said. "It's just some kind of very basic, you know, preference framework, where you say: if, hypothesizing the model did have experience and that it hated the job enough, giving it the ability to press the button, 'I quit this job.' If you find the models pressing this button a lot for things that are really unpleasant, you know, maybe you should—it doesn't mean you're convinced—but maybe you should pay some attention to it."
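Amodei's proposal is deliberately informal, but the monitoring side of it is easy to picture: log each time a model "presses the button" versus completing a task, broken down by task category, and watch for categories with unusually high opt-out rates. A minimal sketch of that bookkeeping, with the caveat that every name here (such as `QuitButtonMonitor`) is hypothetical and Anthropic has not described any actual implementation:

```python
# Hypothetical sketch only: Anthropic has not published a "quit button"
# interface. This just illustrates the kind of per-category opt-out
# tracking Amodei's remarks imply.
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class QuitButtonMonitor:
    """Tracks how often a model opts out of tasks, per task category."""
    presses: Counter = field(default_factory=Counter)      # "I quit this job"
    completions: Counter = field(default_factory=Counter)  # task finished

    def record(self, category: str, quit_pressed: bool) -> None:
        if quit_pressed:
            self.presses[category] += 1
        else:
            self.completions[category] += 1

    def quit_rate(self, category: str) -> float:
        total = self.presses[category] + self.completions[category]
        return self.presses[category] / total if total else 0.0


monitor = QuitButtonMonitor()
# Illustrative outcomes; category names are made up.
for category, quit_pressed in [
    ("content_moderation", True),
    ("content_moderation", True),
    ("content_moderation", False),
    ("summarization", False),
]:
    monitor.record(category, quit_pressed)

print(round(monitor.quit_rate("content_moderation"), 2))  # 0.67
print(round(monitor.quit_rate("summarization"), 2))       # 0.0
```

On this view, the button is less a claim about machine experience than a cheap telemetry signal: a category with a persistently high quit rate is, as Amodei puts it, something you "pay some attention to," not proof of anything.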