Is Power-Seeking AI an Existential Risk?

June 16, 2022

Joseph Carlsmith

Computer Science

Computers and Society

Artificial Intelligence

Machine Learning

This report examines what I see as the core argument for concern about existential risk from misaligned artificial intelligence. I proceed in two stages. First, I lay out a backdrop picture that informs such concern. On this picture, intelligent agency is an extremely powerful force, and creating agents much more intelligent than us is playing with fire -- especially given that if their objectives are problematic, such agents would plausibly have instrumental incentives to seek power over humans. Second, I formulate and evaluate a more specific six-premise argument that creating agents of this kind will lead to existential catastrophe by 2070. On this argument, by 2070: (1) it will become possible and financially feasible to build relevantly powerful and agentic AI systems; (2) there will be strong incentives to do so; (3) it will be much harder to build aligned (and relevantly powerful/agentic) AI systems than to build misaligned (and relevantly powerful/agentic) AI systems that are still superficially attractive to deploy; (4) some such misaligned systems will seek power over humans in high-impact ways; (5) this problem will scale to the full disempowerment of humanity; and (6) such disempowerment will constitute an existential catastrophe. I assign rough subjective credences to the premises in this argument, and I end up with an overall estimate of ~5% that an existential catastrophe of this kind will occur by 2070. (May 2022 update: since making this report public in April 2021, my estimate here has gone up, and is now at >10%.)
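The overall estimate above comes from chaining the six premises, each treated as conditional on the ones before it. The sketch below shows that arithmetic under stated assumptions. The abstract lists no per-premise numbers, so the credence values here are hypothetical placeholders chosen only so that the product lands near the reported ~5%. The dictionary keys and the function name `conjunction_probability` are likewise illustrative.

```python
from math import prod

# Hypothetical placeholder credences, one per premise, each read as
# conditional on the premises before it. These are NOT taken from the
# abstract, which reports only the overall ~5% figure.
credences = {
    "1_possible_and_feasible": 0.65,
    "2_strong_incentives": 0.80,
    "3_alignment_much_harder": 0.40,
    "4_high_impact_power_seeking": 0.65,
    "5_scales_to_full_disempowerment": 0.40,
    "6_existential_catastrophe": 0.95,
}


def conjunction_probability(conditional_credences):
    """Multiply a chain of conditional credences into one joint probability."""
    values = list(conditional_credences)
    if not all(0.0 <= v <= 1.0 for v in values):
        raise ValueError("each credence must lie in [0, 1]")
    return prod(values)


overall = conjunction_probability(credences.values())
print(f"Overall estimate: {overall:.1%}")
```

Because the estimate is a product, it is sensitive to every factor: raising any single placeholder credence raises the overall figure proportionally, which is one way an updated estimate can move from ~5% to above 10%.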

A Review of the Evidence for Existential Risk from AI via Misaligned Power-Seeking

October 27, 2023

Rose Hadshar

Computers and Society

Artificial Intelligence

Rapid advancements in artificial intelligence (AI) have sparked growing concerns among experts, policymakers, and world leaders regarding the potential for increasingly advanced AI systems to pose existential risks. This paper reviews the evidence for existential risks from AI via misalignment, where AI systems develop goals misaligned with human values, and power-seeking, where misaligned AIs actively seek power. The review examines empirical findings, conceptual arguments a...

Current and Near-Term AI as a Potential Existential Risk Factor

September 21, 2022

Benjamin S. Bucknall, Shiri Dori-Hacohen

Computers and Society

Artificial Intelligence

There is a substantial and ever-growing corpus of evidence and literature exploring the impacts of artificial intelligence (AI) technologies on society, politics, and humanity as a whole. A separate, parallel body of work has explored existential risks to humanity, including but not limited to that stemming from unaligned Artificial General Intelligence (AGI). In this paper, we problematise the notion that current and near-term artificial intelligence technologies have the po...

Artificial Intelligence: Arguments for Catastrophic Risk

January 27, 2024

Adam Bales, William D'Alessandro, Cameron Domenico Kirk-Giannini

Computers and Society

Artificial Intelligence

Recent progress in artificial intelligence (AI) has drawn attention to the technology's transformative potential, including what some see as its prospects for causing large-scale harm. We review two influential arguments purporting to show how AI could pose catastrophic risks. The first argument -- the Problem of Power-Seeking -- claims that, under certain assumptions, advanced AI systems are likely to engage in dangerous power-seeking behavior in pursuit of their goals. We r...

Gradual Disempowerment: Systemic Existential Risks from Incremental AI Development

January 28, 2025

Jan Kulveit, Raymond Douglas, Nora Ammann, Deger Turan, ... , David Duvenaud

Computers and Society

This paper examines the systemic risks posed by incremental advancements in artificial intelligence, developing the concept of 'gradual disempowerment', in contrast to the abrupt takeover scenarios commonly discussed in AI safety. We analyze how even incremental improvements in AI capabilities can undermine human influence over large-scale systems that society depends on, including the economy, culture, and nation-states. As AI increasingly replaces human labor and cognition ...

AI Research Considerations for Human Existential Safety (ARCHES)

May 30, 2020

Andrew Critch, David Krueger

Computers and Society

Artificial Intelligence

Machine Learning

Framed in positive terms, this report examines how technical AI research might be steered in a manner that is more attentive to humanity's long-term prospects for survival as a species. In negative terms, we ask what existential risks humanity might face from AI development in the next century, and by what principles contemporary technical research might be directed to address those risks. A key property of hypothetical AI technologies is introduced, called prepotence...

Artificial General Intelligence, Existential Risk, and Human Risk Perception

November 15, 2023

David R. Mandel

Computers and Society

Artificial Intelligence

Artificial general intelligence (AGI) does not yet exist, but given the pace of technological development in artificial intelligence, it is projected to reach human-level intelligence within roughly the next two decades. After that, many experts expect it to far surpass human intelligence and to do so rapidly. The prospect of superintelligent AGI poses an existential risk to humans because there is no reliable method for ensuring that AGI goals stay aligned with human goals. ...

AI Risk Skepticism, A Comprehensive Survey

February 16, 2023

Vemir Michael Ambartsoumean, Roman V. Yampolskiy

Computers and Society

Artificial Intelligence

In this thorough study, we took a closer look at the skepticism that has arisen with respect to potential dangers associated with artificial intelligence, denoted as AI Risk Skepticism. Our study takes into account different points of view on the topic and draws parallels with other forms of skepticism that have shown up in science. We categorize the various skepticisms regarding the dangers of AI by the type of mistaken thinking involved. We hope this will be of interest and...

Multi-Agent Risks from Advanced AI

February 19, 2025

Lewis Hammond, Alan Chan, Jesse Clifton, Jason Hoelscher-Obermaier, Akbir Khan, Euan McLean, Chandler Smith, Wolfram Barfuss, Jakob Foerster, Tomáš Gavenčiak, The Anh Han, Edward Hughes, Vojtěch Kovařík, Jan Kulveit, Joel Z. Leibo, Caspar Oesterheld, Christian Schroeder de Witt, Nisarg Shah, Michael Wellman, Paolo Bova, Theodor Cimpeanu, Carson Ezell, Quentin Feuillade-Montixi, Matija Franklin, Esben Kran, Igor Krawczuk, Max Lamparth, Niklas Lauffer, Alexander Meinke, Sumeet Motwani, Anka Reuel, Vincent Conitzer, Michael Dennis, Iason Gabriel, Adam Gleave, Gillian Hadfield, Nika Haghtalab, Atoosa Kasirzadeh, Sébastien Krier, Kate Larson, Joel Lehman, David C. Parkes, ... , Iyad Rahwan

Multiagent Systems

Artificial Intelligence

Computers and Society

Emerging Technologies

Machine Learning

The rapid development of advanced AI agents and the imminent deployment of many instances of these agents will give rise to multi-agent systems of unprecedented complexity. These systems pose novel and under-explored risks. In this report, we provide a structured taxonomy of these risks by identifying three key failure modes (miscoordination, conflict, and collusion) based on agents' incentives, as well as seven key risk factors (information asymmetries, network effects, sele...

An Overview of Catastrophic AI Risks

June 21, 2023

Dan Hendrycks, Mantas Mazeika, Thomas Woodside

Computers and Society

Artificial Intelligence

Machine Learning

Rapid advancements in artificial intelligence (AI) have sparked growing concerns among experts, policymakers, and world leaders regarding the potential for increasingly advanced AI systems to pose catastrophic risks. Although numerous risks have been detailed separately, there is a pressing need for a systematic discussion and illustration of the potential dangers to better inform efforts to mitigate them. This paper provides an overview of the main sources of catastrophic AI...

Examining Popular Arguments Against AI Existential Risk: A Philosophical Analysis

January 7, 2025

Torben Swoboda, Risto Uuk, Lode Lauwaert, Andrew P. Rebera, Ann-Katrien Oimann, ... , Carina Prunkl

Computers and Society

Concerns about artificial intelligence (AI) and its potential existential risks have garnered significant attention, with figures like Geoffrey Hinton and Demis Hassabis advocating for robust safeguards against catastrophic outcomes. Prominent scholars, such as Nick Bostrom and Max Tegmark, have further advanced the discourse by exploring the long-term impacts of superintelligent AI. However, this existential risk narrative faces criticism, particularly in popular media, whe...