Google DeepMind Plans to Track AGI Progress With These 10 Traits of General Intelligence

Few terms are as closely associated with AI hype as artificial general intelligence, or AGI. But Google DeepMind researchers have now proposed a framework that could more concretely measure how close models are to this tech industry holy grail.

Artificial general intelligence refers to a hypothetical AI system that would match the general and highly adaptable type of intelligence found in humans. As the variety of tasks that large language models can tackle has rocketed in recent years, a growing chorus of voices has suggested the technology is creeping ever closer to this threshold.

But so far, there’s been no clear way to assess progress toward AGI, leaving plenty of room for speculation and exaggeration. To address this gap, a team from Google DeepMind has introduced a new cognitively inspired framework that deconstructs general intelligence into 10 key faculties. More importantly, they propose a way to evaluate AI systems across these faculties and compare their performance to that of humans.

“Despite widespread discussion of AGI, there is no clear framework for measuring progress toward it. This ambiguity fuels subjective claims, makes it difficult to track progress, and risks hindering responsible governance,” the researchers write in a paper outlining their new approach. “We hope this framework will provide a practical roadmap and an initial step toward more rigorous, empirical evaluation of AGI.”

This is not DeepMind’s first attempt to clarify the term. In 2023, the company proposed separating AI systems into different levels of capability, in much the same way self-driving systems are categorized.

But that approach didn’t propose a way to measure what level AI systems have reached. The new framework goes further, building a firmer conceptual footing for the key faculties underpinning model performance and offering a practical way to evaluate and compare systems.

Digging through decades of research in psychology, neuroscience, and cognitive science, the researchers identify eight core cognitive building blocks that they say make up general intelligence.

These include the perception of sensory inputs and the generation of outputs like text, speech, or actions. Add to those learning, memory, reasoning, and the ability to focus attention on specific information or tasks. Rounding out the list are metacognition (the ability to reason about and control your own mental processes) and so-called executive functions, like planning and the inhibition of impulses.

The researchers also outline two “composite faculties” that require several building blocks to be applied together. These are problem solving and social cognition, the ability to understand and react appropriately to social context.

To evaluate how well AI systems perform on each measure, the researchers suggest subjecting them to a broad suite of cognitive tests that target each specific faculty. They also propose collecting human baselines for each task by asking a demographically representative sample of adults with at least a high school education to complete the same tests under identical conditions.

The results of these tests can then be combined to create “cognitive profiles” that give a sense of a model’s strengths and weaknesses. And by comparing the results against the human baselines, it should be possible to determine when a system matches or surpasses the general intelligence of an average person.
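As a rough illustration of how such a comparison could work in practice, here is a minimal sketch in Python. The faculty names follow the paper’s taxonomy, but the normalization against human baselines and the pass criterion are illustrative assumptions, not the method the researchers specify.

```python
# Illustrative sketch only: the faculty list follows the DeepMind taxonomy,
# but the scoring, normalization, and pass criterion are assumptions.

FACULTIES = [
    # Eight core building blocks:
    "perception", "output generation", "learning", "memory",
    "reasoning", "attention", "metacognition", "executive functions",
    # Two composite faculties:
    "problem solving", "social cognition",
]

def cognitive_profile(model_scores, human_baselines):
    """Normalize each faculty score against the human baseline.

    A value of 1.0 means performance on par with the average human on
    that faculty; below 1.0 indicates a gap, above it a surplus.
    """
    return {f: model_scores[f] / human_baselines[f] for f in FACULTIES}

def matches_general_intelligence(profile):
    """One possible criterion: match or exceed the human baseline on
    every faculty, so strength in one area cannot mask weakness in
    another."""
    return all(score >= 1.0 for score in profile.values())

# Toy usage with made-up numbers: strong on reasoning, weak elsewhere.
human = {f: 1.0 for f in FACULTIES}
model = {f: 0.9 for f in FACULTIES}
model.update({"reasoning": 1.4, "metacognition": 0.5})

profile = cognitive_profile(model, human)
print(matches_general_intelligence(profile))  # False: several faculties lag
```

The per-faculty criterion here is just one possible design choice; the point is that a profile exposes uneven capabilities that a single aggregate score would hide.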

Crucially, the framework focuses on what a system can do rather than how it does it, which means the evaluation is agnostic about the underlying technology. However, the researchers concede that there is currently no good way to measure many of the core cognitive capabilities they identify.

While there are already well-established benchmarks for faculties like problem solving and perception, there are no reliable tests for things like metacognition, attention, learning, and social cognition. In addition, many of the best benchmarks are public, which means the testing criteria are easily accessible and may have already been included in model training data. So the authors say they’re working with academics to build more robust, private evaluations to fill the gaps.

How useful the new framework will be depends on several factors. First, it remains to be seen whether the faculties identified by the DeepMind team truly capture the essence of human general intelligence. Second, they will need to show that acing these evaluations actually translates into better performance on practical problems compared to narrower, specialist AI systems.

But considering the hand-waving nature of the debate around AGI so far, any framework grounded in well-established cognitive theory and rigorous evaluation represents a significant step forward.
