A Plain English Exploration
Introduction: The Hype and the Problem
Artificial General Intelligence (AGI) is one of those phrases that sparks strong reactions. For some, it conjures images of robots that surpass us, taking control of our world in a “Skynet” scenario. For others, it’s a vague buzzword used to sell startups and scare policymakers. Some researchers even question whether the term “AGI” has any precise meaning anymore.
This confusion comes from a mix of hype, speculation, and real but uneven progress. While today’s AI systems—like chatbots, image generators, and protein-folding models—are undeniably powerful, they are also narrow. They excel at specific tasks but stumble badly when pushed into areas they weren’t trained on.
The paper What the F*ck is Artificial General Intelligence? by Michael Timothy Bennett sets out to clear some of this confusion. His goal isn’t to provide the ultimate answer to what AGI is, but to offer a structured, provocative overview of how we might think about it. In plain English, this boils down to:
- What do we mean by “intelligence”?
- How should we define AGI?
- What tools exist to build it?
- What big-picture strategies (meta-approaches) guide the work?
- Where are we now, and what’s next?
This essay expands on Bennett’s paper, walking through each of these questions in plain English, showing how AGI research balances brute force, cleverness, simplicity, and adaptability.
Defining Intelligence: What Are We Actually Talking About?
Before we can define artificial general intelligence, we need to think about intelligence itself.
1. Human-centered views
A common starting point is: AGI is whatever matches or exceeds human-level performance across a wide variety of tasks. This is intuitive but flawed—it makes human intelligence the gold standard, which is both limiting and hard to measure. Humans are good at many things, but also irrational, biased, and inconsistent. If a machine thinks differently but still adapts effectively, should we exclude it?
2. Intelligence as generalization
François Chollet (creator of Keras) suggests intelligence is the ability to generalize and acquire new skills, not just perform existing ones. If you can learn how to solve one problem and then transfer that learning to a new, related problem, you are showing intelligence.
3. Legg-Hutter definition
Shane Legg and Marcus Hutter proposed another view: intelligence is the ability to achieve goals in a wide range of environments. This definition emphasizes adaptability, but it runs into practical problems—it depends heavily on how you define “environment” and “goals.”
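For readers who want the formal version, Legg and Hutter’s “universal intelligence” measure (from their 2007 paper, “Universal Intelligence: A Definition of Machine Intelligence”) scores an agent by its expected reward across all computable environments, weighting simpler environments more heavily:

```latex
\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_{\mu}^{\pi}
```

Here π is the agent’s policy, E is the set of computable environments, K(μ) is the Kolmogorov complexity of environment μ, and V is the expected total reward π earns in μ. The practical trouble is visible right in the formula: K(μ) is uncomputable, and the weighting shifts depending on the reference machine you pick.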
4. The “dualism problem”
Most of these definitions treat intelligence as something floating in software, separate from the physical system it runs on. This echoes Descartes’ old “mind-body dualism,” with AI researchers replacing the pineal gland with a Turing machine. But in reality, intelligence emerges from the interaction of software, hardware, and environment together. You can’t separate them cleanly.
5. Adaptation with limits
Pei Wang offers a more grounded view: intelligence is the ability to adapt with limited resources. This definition avoids dualism and emphasizes real-world constraints. We don’t have infinite time, energy, or memory. A truly intelligent system must make the best of what it has.
6. AGI as an artificial scientist
Bennett builds on these ideas and suggests a working definition: an AGI is a system that adapts at least as generally as a human scientist.
Why a scientist? Because scientists must prioritize, plan, experiment, learn cause-and-effect, and balance exploration (learning new things) with exploitation (using knowledge to achieve goals). That mix of skills captures the essence of adaptable, resource-limited intelligence.
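That exploration/exploitation balance is a well-studied problem in its own right. Here is a minimal Python sketch of an epsilon-greedy “bandit” strategy, a standard toy model of the trade-off; the arm payoffs below are invented purely for illustration.

```python
import random

def epsilon_greedy(true_means, steps=1000, epsilon=0.1):
    """Balance exploration (try a random arm) against exploitation
    (pull the arm that looks best so far)."""
    n = len(true_means)
    counts = [0] * n          # how often each arm has been tried
    estimates = [0.0] * n     # running average reward per arm
    total = 0.0
    for _ in range(steps):
        if random.random() < epsilon:
            arm = random.randrange(n)  # explore: learn something new
        else:
            arm = max(range(n), key=lambda a: estimates[a])  # exploit
        reward = random.gauss(true_means[arm], 1.0)  # noisy feedback
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total += reward
    return estimates, total

# Three hypothetical "research directions" with unknown payoffs.
print(epsilon_greedy([0.2, 0.5, 0.9]))
```

A scientist does something similar at a much grander scale: spend some effort probing the unknown, and the rest cashing in on what’s already understood.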
Tools for Building Intelligence
Once we know what we’re aiming for—an artificial scientist—the next question is: what tools can we use to build it? Bennett identifies two foundational approaches, plus hybrids that combine them.
1. Search
Search means systematically exploring possible solutions within a structure. Think of solving a maze by trying every possible sequence of turns until you find the shortest path. Computers do this at incredible speeds with algorithms like A*.
Strengths of search:
- Produces precise and verifiable solutions.
- Transparent—easy to check and explain.
- Excels at structured problems like chess, theorem proving, or route planning.
Weaknesses of search:
- Becomes hopelessly slow in very large problem spaces.
- Sequential and not well-suited to modern parallel hardware.
- Needs clearly defined rules and goals, which real-world messes often lack.
Search was the backbone of early AI successes—Deep Blue beating Kasparov in chess, or automated theorem provers. But it struggles when problems get too messy, ambiguous, or high-dimensional.
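To make this concrete, here is a minimal Python sketch of search on a toy maze, using breadth-first search (a simpler relative of A* with no heuristic); the maze layout is invented for illustration.

```python
from collections import deque

def shortest_path(maze, start, goal):
    """Breadth-first search: systematically expand outward from the
    start until the goal appears. Guarantees a shortest path, but
    must visit states one by one."""
    rows, cols = len(maze), len(maze[0])
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        (r, c), path = queue.popleft()
        if (r, c) == goal:
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and maze[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), path + [(nr, nc)]))
    return None  # the maze has no route

maze = [[0, 0, 1],   # 0 = open, 1 = wall (layout invented)
        [1, 0, 0],
        [0, 1, 0]]
print(shortest_path(maze, (0, 0), (2, 2)))
```

Notice both the strength and the weakness: the answer is exact and explainable, but the `seen` set grows with the size of the problem space, which is exactly what blows up in messy, high-dimensional domains.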
2. Approximation
Approximation is the bread and butter of modern machine learning. Instead of exhaustively searching, you train a model to approximate patterns from data. Neural networks are the prime example: they adjust internal weights so that inputs map to desired outputs.
Strengths of approximation:
- Handles noisy, high-dimensional data.
- Easy to parallelize and scale on GPUs.
- Powers today’s breakthroughs in vision, language, and protein folding.
Weaknesses of approximation:
- Always “fuzzy” and error-prone.
- Often opaque and hard to interpret.
- Extremely inefficient in data and energy.
- Struggles with novelty—things outside its training set.
Approximation is why we have ChatGPT, AlphaFold, and Midjourney. But it’s also why these systems can hallucinate, fail at edge cases, and consume massive energy.
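For a feel of what “adjusting internal weights” means, here is a minimal NumPy sketch of a tiny two-layer network learning XOR by gradient descent. Every hyperparameter here (layer size, learning rate, step count) is an arbitrary choice for illustration.

```python
import numpy as np

# Toy dataset: XOR, a pattern no single straight line can separate.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)  # hidden layer weights
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)  # output layer weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 2.0  # learning rate (arbitrary)
for _ in range(10000):
    # Forward pass: map inputs through the layers to a prediction.
    h = sigmoid(X @ W1 + b1)
    pred = sigmoid(h @ W2 + b2)
    # Backward pass: nudge every weight to shrink the squared error.
    grad_out = (pred - y) * pred * (1 - pred)
    grad_h = (grad_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ grad_out)
    b2 -= lr * grad_out.sum(axis=0)
    W1 -= lr * (X.T @ grad_h)
    b1 -= lr * grad_h.sum(axis=0)

# Predictions should end up near [[0], [1], [1], [0]];
# exact values depend on the random seed.
print(pred.round(2))
```

The result is never exact, only close, and nothing in the weights explains *why* the answer is what it is. That fuzziness and opacity scale up directly into the weaknesses listed above.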
3. Hybrids: Best of Both Worlds
Real intelligence likely needs both. Hybrids combine search and approximation to balance precision and adaptability.
- AlphaGo: used deep neural networks to evaluate positions (approximation) and search trees to plan moves. Together, they toppled the world’s best Go players.
- Neuro-symbolic systems: use neural nets to turn raw data into symbols, then apply search and logic to those symbols.
- Structured reinforcement learning: reduces raw data into symbolic representations, then searches policies more efficiently.
- o3 and AlphaGeometry: cutting-edge examples blending reasoning and pattern recognition.
Other hybrid approaches take the form of cognitive architectures: systems with modules for perception, memory, reasoning, etc., modeled loosely on the human mind. Examples include SOAR, ACT-R, Hyperon, AERA, and NARS.
These hybrids promise robustness. They can switch playbooks depending on the situation: approximate when the world is messy, search when precision matters. But they also risk being patchwork without strong guiding principles—leading us to the idea of meta-approaches.
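The hybrid pattern can be sketched in a few lines. The following Python skeleton is in the spirit of, but far simpler than, AlphaGo: exact depth-limited search does the planning, while a pluggable `evaluate` function, which in a real system would be a trained neural network, approximates the value of frontier states. The function names and game interface here are hypothetical placeholders.

```python
def hybrid_search(state, depth, evaluate, moves, apply_move, maximizing=True):
    """Depth-limited minimax: exact search near the root (precision),
    an approximate evaluator at the frontier (pattern recognition)."""
    legal = moves(state)
    if depth == 0 or not legal:
        # Search has run out of budget or moves: hand over to the
        # approximator (in AlphaGo's case, a deep neural network).
        return evaluate(state), None
    best_value = float("-inf") if maximizing else float("inf")
    best_move = None
    for move in legal:
        value, _ = hybrid_search(apply_move(state, move), depth - 1,
                                 evaluate, moves, apply_move, not maximizing)
        if (maximizing and value > best_value) or \
           (not maximizing and value < best_value):
            best_value, best_move = value, move
    return best_value, best_move
```

AlphaGo itself used Monte Carlo tree search rather than minimax, but the division of labor is the same: search supplies the guarantees, approximation supplies the judgment.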
Meta-Approaches: Philosophies of Building AGI
Beyond specific tools, there are overarching strategies—philosophies about what makes intelligence work best. Bennett identifies three main ones:
1. Scale-maxing (The Embiggening)
This is the strategy that dominates recent AI: just make everything bigger. More data, more parameters, more compute. GPT-3 with 175 billion parameters is a poster child. AlphaFold cracked protein folding by throwing enormous compute at the problem.
Strengths:
- It works surprisingly well.
- Scaling laws show consistent improvement as models grow.
Weaknesses:
- Diminishing returns—each improvement costs vastly more compute.
- Horribly energy-inefficient.
- Still poor at handling novelty (rare or new events).
Bennett calls this era The Embiggening. Scaling has taken us far, but it’s not sustainable forever.
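The diminishing-returns point is visible in the power-law shape that scaling-law papers report, with loss falling as a small negative power of parameter count. The constants in this Python sketch are loosely shaped like published fits but should be read as illustrative, not authoritative.

```python
def power_law_loss(n_params, alpha=0.076, n_c=8.8e13):
    """Toy scaling curve: loss falls as a small power of model size.
    Constants are illustrative stand-ins, not real fitted values."""
    return (n_c / n_params) ** alpha

for n in [1e9, 1e10, 1e11, 1e12]:
    print(f"{n:.0e} params -> loss {power_law_loss(n):.3f}")
# Each 10x jump in size (and cost) buys a smaller absolute improvement.
```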
2. Simp-maxing (Simplicity is Power)
This strategy leans on Ockham’s Razor: the simplest explanation is usually best. In AI, this means preferring compressed, elegant models.
- AIXI: a theoretical model of a maximally intelligent agent, based on compressing experiences into the simplest possible representation.
- Minimum Description Length principle (MDL): prefer models that capture patterns in the shortest description.
The logic: simpler models generalize better. But this rests on “computational dualism” (treating intelligence as software separate from embodiment), and it depends heavily on your chosen formalism. Still, simplicity often correlates with generalization.
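MDL can be phrased as a concrete score: total bits to describe the model, plus bits to describe whatever the model fails to explain; the smaller total wins. The following Python sketch is a heavily simplified two-part code, with all numbers invented for illustration.

```python
import math

def mdl_score(model_bits, residuals, noise_sigma=1.0):
    """Two-part code: bits to state the model, plus bits to encode
    the data the model fails to explain (Gaussian code length)."""
    data_bits = sum(0.5 * (r / noise_sigma) ** 2 / math.log(2)
                    for r in residuals)
    return model_bits + data_bits

# Invented comparison: a short model with modest errors vs. a
# long model that fits slightly better.
simple = mdl_score(model_bits=10, residuals=[0.5, -0.4, 0.6, -0.5])
fancy = mdl_score(model_bits=200, residuals=[0.1, -0.1, 0.1, -0.1])
print(f"simple: {simple:.1f} bits, fancy: {fancy:.1f} bits")
# MDL picks whichever total is smaller; here, the simpler model.
```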
3. W-maxing (Weakness as Strength)
This less intuitive but powerful idea focuses on maximizing the weakness of constraints. Instead of locking a system into rigid forms, keep it flexible. Biological systems do this—they distribute control, self-organize, and adapt at many scales.
In practice, w-maxing means:
- Favoring general, non-specific solutions that work across many environments.
- Delegating control to lower levels of abstraction (like biology does).
- Building systems that optimize for both sample efficiency and energy efficiency.
Examples include soft robotics, self-organizing nanoparticles, and enactive cognition (where mind and environment are inseparable). W-maxing doesn’t reject simplicity—it can complement it—but it emphasizes adaptability rooted in embodiment and environment.
Where We Are Now
- Dominant paradigm: scale-maxed approximation. This gave us GPTs, AlphaFold, Stable Diffusion. But we are hitting limits—cost, energy, and novelty gaps.
- Emerging trend: hybrids and cognitive architectures. Systems like Hyperon, AERA, and NARS aim for more autonomy, adaptability, and integration of modules.
- Future direction: moving beyond raw scale toward efficiency, robustness, and flexibility. W-maxing and simp-maxing ideas could inspire the next breakthroughs.
Conclusion: Toward an Artificial Scientist
AGI won’t come from one trick. It won’t be just bigger models, or just simpler theories, or just hybrid architectures. The future will be a fusion of tools and meta-approaches.
The target isn’t a robot overlord or a glorified chatbot—it’s an artificial scientist: a system that can plan, explore, reason, experiment, and adapt with limited resources, as flexibly as a human researcher.
That vision shifts AGI from hype into a practical, scientific challenge: how do we build adaptable, efficient, embodied systems that balance search, approximation, simplicity, scale, and weakness of constraints?