On Wednesday, Google DeepMind released a comprehensive paper detailing its approach to the safety of Artificial General Intelligence (AGI), broadly described as AI capable of performing any task a human can. The subject of AGI remains contentious within the AI community. While some skeptics regard it as a distant fantasy, major AI labs such as Anthropic believe it could soon become a reality and could pose catastrophic risks if not properly managed.
The 145-page document from DeepMind, co-authored by company co-founder Shane Legg, forecasts the advent of AGI by 2030. It suggests that AGI might bring what the authors term “severe harm,” a phrase the paper never defines precisely, and cites “existential risks” that could threaten human existence as one possibility.
The authors expect that an “Exceptional AGI” could be developed before the end of the decade. Such a system would match at least the 99th percentile of skilled adults across a wide range of non-physical tasks, including metacognitive tasks such as learning new skills.
The document contrasts DeepMind’s approach to mitigating AGI risks with those of Anthropic and OpenAI. According to the paper, Anthropic is said to place less focus on “robust training, monitoring, and security,” whereas OpenAI is considered overly optimistic about automating a type of AI safety research called alignment research.
The paper also questions the near-term feasibility of superintelligent AI, meaning systems that could outperform humans at all tasks. Although OpenAI has recently shifted its focus from AGI to superintelligence, the DeepMind authors remain skeptical that such systems will emerge soon without considerable architectural innovation.
However, the report does deem plausible “recursive AI improvement,” a process in which AI systems conduct their own AI research to create more capable successors, and the authors warn that this could prove dangerous.
At a strategic level, the paper advocates developing techniques to prevent the malicious use of a hypothetical AGI, improving the understanding of how AI systems behave, and hardening the environments in which AI can act. It acknowledges that many of these techniques are nascent and face open research problems, but cautions against dismissing the safety challenges on the horizon.
The authors stress the transformational prospects of AGI, noting its potential for substantial benefits alongside significant risks. Consequently, they urge leading AI developers to take proactive measures to mitigate these risks responsibly.
Some professionals in the field, however, challenge the assumptions in the paper. Heidy Khlaaf, chief AI scientist at the AI Now Institute, expressed skepticism about the feasibility of scientifically evaluating the ill-defined concept of AGI. Meanwhile, Matthew Guzdial, an assistant professor at the University of Alberta, questioned the practicality of recursive AI improvement, doubting its current viability.
Sandra Wachter, a researcher at Oxford studying technology and regulation, highlighted a more immediate concern: AI systems reinforcing their own inaccurate outputs. As generative AI output floods the internet and gradually displaces authentic data, she noted, models may increasingly learn from their own errors, producing self-reinforcing inaccuracies.
Despite its thoroughness, DeepMind’s paper appears unlikely to settle ongoing debates regarding the realism of AGI and the most pressing AI safety issues.