HeadsUpAI

Anthropic Co-Founder Joins Pope to Call for Moral AI Oversight

Anthropic co-founder Chris Olah joined Pope Leo XIV in Vatican City to present Magnifica humanitas, a papal encyclical on safeguarding human dignity. Olah argued that AI models are grown on human thought rather than engineered, making them fundamentally mysterious even to the researchers who train them.

This move signals a shift toward seeking external moral authority to counter commercial and geopolitical pressures facing frontier labs. Olah disclosed that internal research has identified model structures mirroring human neuroscience, including functional states resembling joy and grief. These findings suggest that AI requires philosophical discernment beyond technical safety standards.

The announcement emphasizes a moral imperative to address labor displacement and ensure AI benefits reach the global poor. It continues the company's initiative to widen the conversation on societal risks, and the collaboration marks a long-term effort to integrate external moral perspectives into the development of frontier AI systems.

Anthropic (@AnthropicAI) on X:

Anthropic co-founder Chris Olah was invited to speak at today's presentation of Pope Leo XIV's encyclical "Magnifica humanitas." Read the full text of his remarks: https://t.co/CoBfkVOVcy

243 retweets, 1.6k likes

Still wondering? A few quick answers below.

Magnifica humanitas is a formal teaching document released by Pope Leo XIV that focuses on safeguarding human dignity in the age of artificial intelligence. It calls for global discernment regarding the moral and societal implications of AI, specifically addressing the risks to the global poor and the need for human flourishing.

Chris Olah was invited to the Vatican to provide a researcher's perspective during the presentation of the Pope's new encyclical on AI. His participation is part of an Anthropic initiative to widen the conversation on AI safety by engaging with religious, cultural, and moral leaders outside of the technology industry.

Olah disclosed that Anthropic researchers have discovered internal structures within AI models that mirror human neuroscience. These findings include functional states that appear to resemble human emotions and experiences such as joy, satisfaction, fear, and grief. He argued that these mysterious internal states warrant ongoing moral and philosophical discernment.

Anthropic co-founder Chris Olah admitted that frontier AI labs operate under commercial, geopolitical, and personal pressures that can conflict with doing the right thing. He emphasized the necessity of having informed critics and moral voices outside of these incentives to hold AI companies accountable and ensure the technology benefits humanity.

Anthropic highlighted three major concerns: the potential for large-scale labor displacement, the lack of mechanisms to share AI gains with the global poor, and the need for a moral imagination regarding human flourishing. Olah argued that these are not technical problems for computer scientists to solve, but questions for society at large.
