Existing assessment techniques fail to capture the scope of the risks we face from transformative general-purpose AI. We research and devise frameworks, methodologies, and tools to empower assessors of AI systems to scan the threat surface in a top-down manner and model the propagation of resulting risks in society.
Transformative general-purpose AI poses new challenges to the safety, rights, and well-being of everyone. We model upcoming threats to understand them better, and we research potential preventatives, mitigations, and remediations so these threats can be addressed through proper channels, both now and in preparation for what comes next.
As general-purpose AI gradually, and sometimes not-so-gradually, transforms the world, the increased leverage it confers can be expected to intensify conflict dynamics. We research the dynamics around globally shared problems and explore potential tools to mitigate them, with the aim of reducing AI-induced conflict.
When a new AI capability is introduced to the world, it will be used in a variety of ways, some constructive and some destructive. We research ways of characterizing the factors that shape these dynamics, and the effects they may have on the world, both in the general case and within salient domains.
Powerful but poorly alignable general-purpose AI agents will need to be curtailed, controlled, or somehow made safer. Through cognitive science, complex systems theory, and dynamical systems theory, we research better ways to do this while better understanding and respecting stakeholders, context, and tradeoffs.
Center for AI Risk Management & Alignment
Copyright © 2024 Center for AI Risk Management & Alignment - All Rights Reserved.