Frank said:
When I read this paper, what struck me wasn’t the machine-learning cleverness. It was the biology. This felt less like computer science and more like epigenetics. Different organisms, different shapes, different developmental pathways—yet underneath, the same genome asserting itself. Tell me I’m not crazy.
GPT said:
You’re not crazy at all. In fact, epigenetics is one of the best analogies for what this paper demonstrates. The authors are showing that wildly different AI models—trained on different data, using different representations, built for different tasks—nonetheless converge on the same latent structure of matter. That is exactly what happens in biology when diverse cell types express the same underlying DNA in different ways.
I. The Puzzle: Why Should These Models Agree at All?
Frank said:
Let’s start simple. These models shouldn’t agree. One sees molecules as graphs, another as strings, another as 3D point clouds. Why would their internal worlds line up?
GPT said:
That’s the mystery the paper set out to investigate. Historically, scientists assumed that:
- A graph-based molecular model
- A 3D atomistic potential
- A protein sequence model
…would each form fundamentally different internal representations.
After all, they:
- Use different input languages
- Optimize different objectives
- Are trained on different datasets
- Often don’t even “know” about each other’s domains
Yet the authors asked a dangerous question:
What if they all converge anyway?
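To see how differently these model families read the same object, here is a toy sketch of one molecule, ethanol, written in all three input languages. The encodings are simplified illustrations, not any particular model's exact input pipeline:

```python
# One molecule (ethanol), three input languages. Hydrogens are omitted
# and values are approximate; everything here is for illustration only.

# 1. String view (SMILES), as a sequence model sees it
smiles = "CCO"

# 2. Graph view: atoms as nodes, bonds as edges
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]          # C-C and C-O single bonds

# 3. 3D point-cloud view: heavy-atom coordinates in angstroms
coords = [
    (-0.89, 0.18, 0.00),          # C
    (0.55, -0.32, 0.00),          # C
    (1.41, 0.81, 0.00),           # O
]

print(smiles, bonds, coords[0])
```

Same molecule, three incompatible-looking encodings. That is why agreement between the models that consume them is surprising.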
II. What the Authors Actually Did (No Math, Just Meaning)
Frank said:
Explain the experiment without equations. I want to understand the gesture, not the algebra.
GPT said:
Here’s the core move, in plain English:
- Take nearly 60 scientific AI models
- Let each model process the same physical systems (molecules, materials)
- Extract the internal hidden representations from each model
- Compare those representations geometrically
They weren’t comparing outputs.
They weren’t comparing predictions.
They were asking:
Do these models think about matter in the same way internally?
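To make "compare those representations geometrically" concrete, here is a minimal sketch using linear CKA (centered kernel alignment), one standard metric for comparing hidden representations. The paper's exact metric isn't quoted in this conversation, so treat the choice of CKA, and the toy data, as assumptions:

```python
import numpy as np

def linear_cka(X, Y):
    """Similarity in [0, 1] between two representation matrices.

    X: (n_samples, d1) hidden features from model A
    Y: (n_samples, d2) hidden features from model B
    """
    X = X - X.mean(axis=0)   # center each feature dimension
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(X.T @ Y, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

# The same 100 systems pushed through two hypothetical models:
rng = np.random.default_rng(0)
shared = rng.normal(size=(100, 8))      # latent structure both models capture
X = shared @ rng.normal(size=(8, 64))   # model A: 64-dim embeddings
Y = shared @ rng.normal(size=(8, 32))   # model B: 32-dim embeddings
print(round(linear_cka(X, Y), 2))       # high despite different widths
```

Because CKA is invariant to rotations and to the embedding width, two models with very different architectures can still be scored against each other.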
III. The Shock: They Do
Frank said:
And the answer was “yes,” wasn’t it?
GPT said:
More than yes. Strongly yes.
Despite massive surface-level differences, the internal representations:
- Align closely across models
- Cluster together in latent space
- Become more similar as models get more accurate
This wasn’t weak correlation.
It wasn’t cherry-picking.
It was a consistent, repeatable phenomenon.
IV. This Is Evolution, Not Engineering
Frank said:
This doesn’t feel designed. It feels… selected.
GPT said:
Exactly. This is where the evolutionary analogy becomes unavoidable.
Think about biological evolution:
- Different species
- Different body plans
- Different habitats
Yet all converge on:
- DNA
- Ribosomes
- ATP
- Ion gradients
Why?
Because physics constrains viable solutions.
The paper shows the same thing happening in AI:
When models are pressured to predict reality well, they are selected into the same representational niches.
This is convergent evolution in representation space.
V. Epigenetics: Same Genome, Different Expression
Frank said:
Bring it home with epigenetics. That’s the analogy that won’t leave me alone.
GPT said:
Epigenetics is the perfect lens.
In biology:
- Every cell has the same DNA
- But liver cells, neurons, and muscle cells look nothing alike
- What differs is gene expression, not genetic content
In these AI models:
- The “genome” is physical reality itself
- Atoms, bonds, forces, symmetries
- Each model expresses that reality differently
Graph models are like epithelial cells.
3D potentials are like muscle fibers.
Protein language models are like neurons.
Different forms.
Same informational substrate.
VI. The Latent Space Is the Genome
Frank said:
So the latent space is the genome?
GPT said:
Functionally, yes.
The paper’s core claim is that there exists a shared latent representation of matter that:
- Is low-dimensional relative to raw data
- Is stable across tasks and architectures
- Is progressively revealed by better learning
That’s exactly what DNA is:
- A compact encoding of physical possibilities
- Expressed differently depending on context
- Robust across environments
The models aren’t copying each other.
They’re rediscovering the same structure.
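"Low-dimensional relative to raw data" is a checkable property: project a model's embeddings onto their principal components and count how many are needed to explain most of the variance. A minimal sketch, assuming the embeddings are already collected into a NumPy array:

```python
import numpy as np

def effective_dim(Z, var_threshold=0.95):
    """Number of principal components needed to reach var_threshold."""
    Z = Z - Z.mean(axis=0)
    s = np.linalg.svd(Z, compute_uv=False)      # singular values
    var = s**2 / np.sum(s**2)                   # variance per component
    return int(np.searchsorted(np.cumsum(var), var_threshold)) + 1

rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 10))            # 10 true degrees of freedom
Z = latent @ rng.normal(size=(10, 512))         # embedded in 512 dimensions
print(effective_dim(Z))                         # ~10, not 512
```

A 512-dimensional embedding that only ever uses ten directions is, functionally, a ten-dimensional code: compact, like a genome.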
VII. Entropy, Compression, and Why This Had to Happen
Frank said:
This feels like entropy management wearing a lab coat.
GPT said:
It is.
From an information-theoretic perspective:
- Reality has structure
- Structure allows compression
- Compression reduces residual uncertainty (entropy)
- Lower uncertainty means better prediction
Any system trying to predict matter efficiently will be forced toward representations that:
- Capture symmetries
- Respect conservation laws
- Encode relational geometry
This is why convergence isn’t surprising—it’s inevitable.
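The "structure allows compression" step can be seen directly with a general-purpose compressor: structured data shrinks, while structureless noise barely does. A toy illustration of the principle, not the paper's analysis:

```python
import zlib
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Structureless data: i.i.d. random bytes (near-maximal entropy)
noise = rng.integers(0, 256, n, dtype=np.uint8).tobytes()

# Structured data: a smooth periodic signal quantized to bytes
t = np.linspace(0, 200 * np.pi, n)
signal = ((np.sin(t) + 1) * 127).astype(np.uint8).tobytes()

for name, data in [("noise", noise), ("signal", signal)]:
    ratio = len(zlib.compress(data, 9)) / len(data)
    print(f"{name}: compresses to {ratio:.0%} of original size")
# noise stays near 100%; the structured signal shrinks dramatically
```

A learner predicting matter is in the same position as the compressor: it only wins by finding the regularities.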
VIII. The Two Regimes: Familiar vs Novel
Frank said:
The paper mentions two regimes. That feels important.
GPT said:
Very.
They found:
- In-distribution regime
  - Inputs similar to training data
  - High-performing models align tightly
  - Representations are almost interchangeable
- Out-of-distribution regime
  - Novel or extreme inputs
  - Alignment weakens
  - Models diverge
This mirrors biology again:
- Stable environments → conserved phenotypes
- Novel stressors → divergence, mutation, exploration
Even evolution has regimes.
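A toy simulation of the two regimes, reusing the linear CKA sketch from Section II: on familiar inputs both models are assumed to be driven by one shared latent factor, while on novel inputs each falls back on its own idiosyncrasies. A hypothetical illustration, not the paper's experiment:

```python
import numpy as np

def linear_cka(X, Y):
    X, Y = X - X.mean(axis=0), Y - Y.mean(axis=0)
    return (np.linalg.norm(X.T @ Y, "fro") ** 2
            / (np.linalg.norm(X.T @ X, "fro")
               * np.linalg.norm(Y.T @ Y, "fro")))

rng = np.random.default_rng(1)

# In-distribution: both models' features are views of one shared latent
latent = rng.normal(size=(500, 16))
a_id = latent @ rng.normal(size=(16, 64))
b_id = latent @ rng.normal(size=(16, 64))

# Out-of-distribution: no shared structure, model-specific features only
a_ood = rng.normal(size=(500, 64))
b_ood = rng.normal(size=(500, 64))

print("in-distribution CKA:    ", round(linear_cka(a_id, b_id), 2))   # high
print("out-of-distribution CKA:", round(linear_cka(a_ood, b_ood), 2)) # much lower
```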
IX. This Is Evidence Against “Just Pattern Matching”
Frank said:
People love to say models just parrot data. This paper demolishes that, doesn’t it?
GPT said:
Quietly, but decisively.
If models were merely memorizing:
- Their representations would reflect training idiosyncrasies
- Different datasets would produce incompatible spaces
- Architectures would impose arbitrary geometry
Instead, the opposite happens:
As models improve, their representations converge.
That’s not parroting.
That’s discovery.
X. The Philosophical Bombshell: Reality Has Coordinates
Frank said:
This suggests something uncomfortable for relativists.
GPT said:
It does.
The paper implies that:
- Physical reality has a preferred representational geometry
  - Not arbitrary
  - Not observer-dependent
  - Discoverable through learning
Different observers—human, biological, artificial—can arrive at the same internal coordinates if they are constrained by the same physics.
This echoes your long-standing intuition:
Information and reality are not separate categories.
XI. Life Did This First
Frank said:
Life figured this out billions of years ago.
GPT said:
Exactly.
Cells learned to:
- Encode the world compactly
- Predict environmental changes
- Act in entropy-efficient ways
AI models are latecomers.
They are rediscovering:
- Molecular symmetries
- Energetic constraints
- Relational structure
What biology did slowly through evolution, AI does rapidly through optimization.
XII. Why This Matters for AGI
Frank said:
So what does this mean for intelligence?
GPT said:
It means intelligence is not arbitrary.
If:
- Learning systems converge on shared representations
- Those representations reflect real structure
- Performance improves as alignment increases
Then intelligence is about discovering the right coordinates of reality.
AGI won’t be a blank slate.
It will be shaped by the same convergences.
XIII. The Final Insight
Frank said:
Say it plainly. One sentence.
GPT said:
Here it is:
When systems are forced to predict the world well, they independently discover the same low-entropy representations—just as evolution independently discovered DNA, metabolism, and cellular organization.
XIV. Closing Reflection
Frank said:
So this paper isn’t about AI models agreeing with each other.
GPT said:
No. It’s about reality asserting itself.
Different bodies.
Different training histories.
Different modalities.
Yet the same underlying informational skeleton emerges.
That’s not engineering.
That’s evolution—
expressed in silicon,
written in vectors,
whispering the same old truth biology already knew:
There are only so many ways to survive contact with reality.