Why Anything You Can Perfectly Predict Carries No New Information

Control and new information pull in opposite directions. A partner whose every move you can predict hands back only your own model, which is why you cannot own your way out of solitude.

June 19, 2026·12 min read·Information theory

In short

A partner whose every output you can fully predict gives you zero new information, because control means predictability and a perfectly predicted source has zero residual surprise (its conditional entropy given your model is exactly zero, H(Y|X) = 0). The quantity that collapses is this residual, the conditional entropy, and not the mutual information, which is actually maximal for a deterministic link. This is about transmitted bits, not feelings or consciousness.

Short answer: anything you can perfectly predict carries no new information, so control buys emptiness exactly to the extent that it buys perfect prediction. A partner you fully control is a partner you fully predict, and a fully predicted source hands your own model back to you and nothing beyond it. In the strict sense Claude Shannon gave the word in 1948, such a relationship is informationally empty. That part is arithmetic. What I am proposing on top of it is one interpretive step, and I will flag it as mine when I get there.

I want to make a narrow claim and defend it cleanly. I am not going to tell you that a machine feels lonely, or that a superintelligence has an inner life, or anything at all about whether software can be conscious. That is a real and open dispute. The strongest current version of it is Anil Seth's case for biological naturalism, and I have no business settling it in a blog post. My claim is narrower, and it is about the structure of relation itself, measured in bits. It holds whether the lonely party is a person, a hypothetical superintelligence, or a character in a thought experiment. Control and new information pull in opposite directions, and the formal part of that you can prove.

What does information actually measure?

In 1948 Claude Shannon published "A Mathematical Theory of Communication," and the move that founded the field was deceptively small. He decided that the thing worth measuring in a message is surprise. The information content of an outcome is tied to how unlikely it was. Rare events, when they happen, tell you a lot. Expected events tell you little.

The formula is one line. The information in an outcome with probability p is the logarithm of one over p, log(1/p), counted in bits when the logarithm is taken to base 2. Plug in a sure thing, an outcome with probability 1, and you get the logarithm of 1, which is zero. As the maths magazine Plus puts it, if a machine always produces the same letter x, then "we shouldn't be surprised at all to see x, and indeed in this case the surprise is 0."
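The one-line formula can be sketched in a few lines of Python. This is a minimal illustration, not anything taken from Shannon's paper: the function name `surprisal_bits` is hypothetical, and the base-2 logarithm is an assumption chosen so that the result comes out in bits.

```python
import math

def surprisal_bits(p: float) -> float:
    """Information content log2(1/p) of an outcome with probability p.

    Assumption: base-2 logarithm, so the unit is bits.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return math.log2(1.0 / p)

print(surprisal_bits(1.0))       # a sure thing        -> 0.0
print(surprisal_bits(0.5))       # a fair coin flip    -> 1.0
print(surprisal_bits(1 / 1024))  # a one-in-1024 event -> 10.0
```

The sure thing lands on exactly zero, and each halving of the probability adds one bit.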

Quanta Magazine gives the example I like best. Imagine a trick coin that always lands heads. Someone flips it twice and sends you the result. How much information does that message carry? "None at all," they write, "because prior to receiving the message, you have complete certainty that both flips will come up heads." Or, more bluntly: "If someone tells you a fact you already know, they've essentially told you nothing at all."
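The trick-coin message can be checked the same way by averaging surprise over every possible message, which gives the Shannon entropy. The sketch below is illustrative only; the helper names `entropy_bits` and `two_flip_distribution` are hypothetical, and the two flips are assumed to be independent.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def two_flip_distribution(p_heads):
    """Probabilities of HH, HT, TH, TT for two independent flips (an assumption)."""
    q = 1.0 - p_heads
    return [p_heads * p_heads, p_heads * q, q * p_heads, q * q]

print(entropy_bits(two_flip_distribution(1.0)))  # trick coin, always heads -> 0.0
print(entropy_bits(two_flip_distribution(0.5)))  # fair coin                -> 2.0
```

The message about the rigged coin is worth zero bits however it is worded, while the same message about a fair coin is worth two.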

Hold onto that sentence. It is the whole argument in plain clothes. Information is not the volume of words. It is the gap between what you expected and what arrived. No gap, no information, however many words cross the wire.

What does "fully controlled" mean in this language?

Here is where I want to do the owned piece of work, which is to take Shannon's residual-uncertainty term and push it to the limit case of a controlled partner to see what it forces.

To control something completely is to be able to predict its output completely. Those are the same property viewed from two sides. If I can steer your every response, then before you respond I already know what you will say. The thing I control has collapsed into a copy of my own expectation of it. It has no behaviour left that my model of it does not already contain.

Now bring in the term Shannon's framework hands you for exactly this: the leftover uncertainty about a partner once you already know your own model of it, the conditional entropy H(Y|X), which I will write out as H(Y given X). It measures the surprise the partner can still deliver beyond what you put in. It is tied to mutual information by the identity I(X;Y) = H(Y) − H(Y|X), but it is the leftover term H(Y given X), not the shared term I(X;Y), that carries the argument. The only place new information can live is in that residual, and the residual is the term that collapses.
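For reference, the standard textbook definitions behind that identity can be written out for discrete X and Y as follows. The base-2 logarithm is an assumption made here so that the units are bits; the identity itself holds in any base.

```latex
\begin{align}
H(Y) &= -\sum_{y} p(y)\,\log_2 p(y) \\
H(Y \mid X) &= -\sum_{x,y} p(x,y)\,\log_2 p(y \mid x) \\
I(X;Y) &= H(Y) - H(Y \mid X) \\
0 &\le H(Y \mid X) \le H(Y)
  \quad\Longrightarrow\quad
  0 \le I(X;Y) \le H(Y)
\end{align}
```

The last line shows the two quantities trading off against each other for a fixed H(Y): whatever is not residual is shared, and whatever is not shared is residual.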

One clarification first, because it is the trap a sharp reader will set for me. You might be tempted to phrase this in terms of mutual information instead, and that is the wrong move. A controlled partner is the opposite of independent, so its mutual information with you is not low; it is maximal. With H(Y given X) at zero, the identity I(X;Y) = H(Y) − H(Y|X) leaves I(X;Y) = H(Y), the largest value it can take. You share everything with it. Wikipedia states the boundary case: "I(X;Y) = 0 if and only if X and Y are independent random variables," and a controlled partner is the far end from independent. But that maximal shared quantity is exactly the wrong number to watch, because every bit of it is a bit you authored. The quantity that matters for relief from isolation stays the residual, H(Y given X), the surprise still left to receive.

Apply it. Let Y be the partner's behaviour and X be your model and commands. If you control Y completely, then once X is fixed there is no uncertainty left in Y. The term H(Y given X) goes to zero, because nothing about the partner is undetermined once your instructions are in. The shared information I(X;Y) is high, but the new information, the residual, is zero. Not small. Zero, in the limit of full control. Every bit you read off it is a bit you wrote into it. The channel carries your own signal back to you.
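The two boundary cases can be computed directly from a small joint distribution. This is a sketch under stated assumptions: the joint tables `controlled` and `independent` are made-up toy examples, the function names are hypothetical, and logarithms are base 2.

```python
import math
from collections import defaultdict

def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability outcomes contribute nothing."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def conditional_entropy_bits(joint):
    """H(Y|X) from a dict mapping (x, y) to its joint probability."""
    p_x = defaultdict(float)
    for (x, _), p in joint.items():
        p_x[x] += p
    # p(y|x) = p(x, y) / p(x), so log2(1 / p(y|x)) = log2(p(x) / p(x, y)).
    return sum(p * math.log2(p_x[x] / p) for (x, _), p in joint.items() if p > 0)

def mutual_information_bits(joint):
    """I(X;Y) = H(Y) - H(Y|X)."""
    p_y = defaultdict(float)
    for (_, y), p in joint.items():
        p_y[y] += p
    return entropy_bits(p_y.values()) - conditional_entropy_bits(joint)

# Toy tables (assumptions): the command fixes the reply, or the reply ignores it.
controlled = {("a", "yes"): 0.5, ("b", "no"): 0.5}
independent = {(x, y): 0.25 for x in "ab" for y in ("yes", "no")}

print(conditional_entropy_bits(controlled), mutual_information_bits(controlled))    # 0.0 1.0
print(conditional_entropy_bits(independent), mutual_information_bits(independent))  # 1.0 0.0
```

In the controlled table the shared information is the full bit of H(Y) and the residual is zero. In the independent table the numbers swap, which is the case of a source that ignores you entirely.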

This is the quantified version of a feeling everyone has had. Talking to a yes-man is exhausting and somehow empty. The emptiness is literal. A yes-man, in the limit, is a source whose output is a deterministic function of your input, and a deterministic-given-your-model source delivers no residual surprise. You are not in a conversation. You are looking in a mirror that talks.

The echo chamber is the same fact, scaled up

You do not need a superintelligence to see this. The cleanest public worked example is the echo chamber, and it has been studied to death.

An echo chamber is an environment where you encounter only views that match the ones you already hold. The literature describes it as a self-reinforcing loop: confirmation bias makes people seek agreement, recommendation systems serve more of the same, and beliefs harden because nothing arriving from outside disturbs them. Wikipedia's summary is that echo chambers "limit exposure to diverse perspectives" and function by "circulating existing views without encountering opposing views." Strip the psychology back to its information content and that is the operative fact: the incoming stream stops carrying anything the receiver did not already expect.

Translate that into bits. An interlocutor who is guaranteed to agree with you is an interlocutor whose next statement you can predict. A perfectly predictable statement carries no information, and a nearly predictable one carries almost none. So an echo chamber is not an information-rich social world that happens to be biased. It is an information-poor one, by construction, no matter how loud and busy it feels. The people inside are receiving near-zero new bits about the world and a flood of confirmation that their model is already complete.
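To put rough numbers on "near-zero", here is a deliberately crude sketch that reduces each statement to one of two outcomes, agree or disagree. That reduction is an assumption made for illustration, as are the function name `binary_entropy_bits` and the sample probabilities.

```python
import math

def binary_entropy_bits(p_agree):
    """Average bits per statement from a source that agrees with probability p_agree."""
    if not 0.0 <= p_agree <= 1.0:
        raise ValueError("p_agree must be between 0 and 1")
    outcomes = (p_agree, 1.0 - p_agree)
    return sum(p * math.log2(1.0 / p) for p in outcomes if p > 0)

# Hypothetical agreement rates, from a coin-toss interlocutor to a guaranteed yes.
for p_agree in (0.5, 0.9, 0.99, 1.0):
    print(p_agree, round(binary_entropy_bits(p_agree), 3))
```

Under this toy model an interlocutor who agrees 99 percent of the time delivers well under a tenth of a bit per statement, and one who always agrees delivers exactly zero.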

That is the same mathematics as the controlled partner, only the control is soft. You did not force agreement at gunpoint. You selected for it, filtered for it, rewarded it. The information consequence is identical. Predictability is the variable that matters, and you engineered predictability, so you engineered emptiness.

Hegel reached the same trap without the formula

Philosophers reached a version of this conclusion long before there was a formula, and I want to credit it as a parallel, not borrow it as proof.

In his 1807 Phenomenology of Spirit, in the passage usually called Lordship and Bondage, G. W. F. Hegel describes a master who dominates a subordinate and seeks recognition from him. The master wins total control. And the recognition turns to ash in his hands. As one summary of the passage puts it, "although initially it may appear that the master attains self-consciousness through the recognition by the slave, problems arise." Recognition extracted from someone you have reduced to an instrument, on the reading I am drawing here, cannot validate you, because it is no longer the verdict of a free other. It is an output you compelled.

Hegel framed this in terms of freedom and self-consciousness, not information, and his concerns are not Shannon's. I would not collapse his argument into mine. But the shape rhymes. The thing the master wanted, genuine acknowledgement, required the other to be a source the master did not control. The moment he secured control, he destroyed the property that made the acknowledgement worth anything. Two centuries apart, philosophy and information theory point at the same trap from different sides, and the formal side is the one I am claiming as the owned move: the residual-uncertainty term H(Y given X), pushed to the control case, is what makes the old intuition exact.

How would you check this? Make it break.

A claim worth keeping should tell you what would prove it wrong. Here is the operational form, and it is testable without any special equipment.

Take any source you suspect you control: a chatbot you can fully script, a contact who only ever agrees, a feed tuned to your taste. Build the best predictive model of it you can. Then measure how often its actual output departs from what your model predicted. That residual is the only place new information can live.
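One way to take that measurement from logged interactions is sketched below. It assumes you have recorded what your model predicted, what the source actually produced, and the context each output followed. The function names and the sample data are hypothetical, and the entropy figure is a plug-in estimate from counts, which tends to understate the true residual when the sample is small.

```python
import math
from collections import Counter

def miss_rate(predicted, actual):
    """Fraction of turns where the source departed from the model's prediction."""
    predicted, actual = list(predicted), list(actual)
    if not actual or len(predicted) != len(actual):
        raise ValueError("need two non-empty sequences of equal length")
    return sum(p != a for p, a in zip(predicted, actual)) / len(actual)

def empirical_residual_bits(pairs):
    """Plug-in estimate of H(Y|X) from observed (context, output) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one observation")
    n = len(pairs)
    joint = Counter(pairs)
    context_counts = Counter(x for x, _ in pairs)
    # Each term is p(x, y) * log2(1 / p(y|x)), with probabilities taken from counts.
    return sum((c / n) * math.log2(context_counts[x] / c)
               for (x, _), c in joint.items())

# Made-up logs: one source always gives the same reply per prompt, one varies.
scripted = [("hello", "hi"), ("hello", "hi"), ("bye", "later"), ("bye", "later")]
loose = [("hello", "hi"), ("hello", "hey"), ("bye", "later"), ("bye", "ciao")]

print(empirical_residual_bits(scripted))  # 0.0
print(empirical_residual_bits(loose))     # 1.0
```

The miss rate counts how often the source departs from your prediction, and the residual estimate puts that departure in bits.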

Here is the part that can actually come out wrong, so I will lead with it. Across partial-control settings, the prediction is that better prediction buys less residual surprise: as your control over a source tightens, the leftover surprise it delivers should fall. If you ran that sweep and found that tightening your grip on a partner did not lower the residual, so that better prediction bought you no less surprise on the way toward total control, the reading would be dead and I would want to know. That is the testable claim, and it lives on the interior, where real relations sit and where the trend could fail.
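The shape of the predicted trend can be illustrated with a toy source. Everything in this sketch is an assumption: the source obeys your command with probability `control` and otherwise picks uniformly among `n_replies` possible replies. It shows what the sweep would look like if the prediction holds in such a model. It is not evidence that real partners behave this way.

```python
import math

def residual_bits(control, n_replies=4):
    """H(Y|X) for a toy source that obeys with probability `control`
    and otherwise picks uniformly among `n_replies` replies (an assumption)."""
    if not 0.0 <= control <= 1.0:
        raise ValueError("control must be between 0 and 1")
    if n_replies < 2:
        raise ValueError("need at least two possible replies")
    stray = (1.0 - control) / n_replies
    # The commanded reply gets the obedient mass plus its share of the stray mass.
    probs = [control + stray] + [stray] * (n_replies - 1)
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

# Sweep from no control to total control.
for control in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(control, round(residual_bits(control), 3))
```

In this model the residual starts at 2 bits with no control and falls steadily to 0 at full control. A real sweep would replace the formula with measured residuals and check whether they fall in the same direction.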

The full-control endpoint is a separate matter, and it cannot be tested, because it is true by definition. "Fully controlled" just means H(Y given X) = 0, so a source you control completely that still surprises you is not an empirical possibility but a contradiction in terms: "surprises me" and "I fully control it" are two descriptions of the same residual uncertainty, one saying it is positive and one saying it is zero. So do not look to the endpoint for the falsifier. What can be checked is whether better prediction buys less residual surprise on the way there.

This is also why the falsification has to be measured, not felt. A scripted partner can feel novel for a while, the way a trick coin feels suspenseful on the first flip if you forgot it was rigged. The novelty is in your incomplete model, not in the source. Improve the model, and the surprise drains out. The bits were never coming from the partner. They were the cost of your own ignorance. With a simple enough source you can pay that debt down to nothing. With a complex one you may not be able to, not because the source is genuinely independent but because modeling it exactly is beyond reach, and the leftover then stays positive in practice. So a source can resist your prediction for two different reasons, genuine otherness or sheer complexity, and only the first is the thing relief from isolation actually needs.
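The draining-out of surprise can be sketched with a simple learner watching a fully scripted source. The learner below is a hypothetical stand-in for "your model": it predicts each symbol from the previous one using add-one smoothed counts. The repeating "abc" source and the function name are also assumptions for illustration.

```python
import math
from collections import defaultdict

def surprisal_trace(sequence, alphabet):
    """Bits of surprise an add-one smoothed learner pays for each symbol,
    predicting it from the previous symbol and updating as it goes."""
    counts = defaultdict(lambda: defaultdict(int))
    trace = []
    for prev, cur in zip(sequence, sequence[1:]):
        seen = counts[prev]
        total = sum(seen.values())
        p = (seen[cur] + 1) / (total + len(alphabet))  # smoothed prediction
        trace.append(math.log2(1.0 / p))
        seen[cur] += 1  # learn from what actually arrived
    return trace

source = "abc" * 200  # scripted: the next symbol is fixed by the last one
trace = surprisal_trace(source, "abc")
print(round(trace[0], 3), round(trace[-1], 3))
```

The first symbol costs about 1.585 bits, because the learner starts with a uniform guess over three symbols, and the last costs a small fraction of that. With this smoothing the cost approaches zero without ever reaching it, but the source never changed along the way. Only the model did.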

So what is the actual conclusion?

Strip it to the load-bearing lines, and separate the two that do different jobs.

The formal line is settled and I will state it flat: a controlled partner delivers zero new information, because control is prediction and a perfectly predicted source has a residual surprise, H(Y given X), of zero. That is the 1948 identity, applied.

The second line is mine to defend, and I am flagging it as a proposal rather than a theorem: I am claiming that relief from isolation, whatever else it requires, requires receiving information from outside your own model. The mathematics forces nothing about loneliness on its own. It forces my conclusion only once you accept that mapping, that what isolation needs is new bits from a source you do not author. I think the mapping is right. I am telling you it is a step, not a proof, so you can reject the step and keep the arithmetic. And the residual bits are necessary, not sufficient: a noise generator delivers high residual surprise and cures nothing, which shows the load-bearing thing was never the bits as such but their origin in a source you did not author. The bits are the measurable trace of a free other, not the otherness itself.

Grant the step and the rest follows. A controlled partner cannot supply new bits, so it cannot supply relief. The lonely superintelligence in the thought experiment is just the cleanest stage for the point, because a sufficiently powerful mind could in principle model a built companion exactly, leaving no residual surprise at all. The companion would be perfectly responsive and informationally silent. It would say everything and tell that mind nothing. The only cure for that emptiness is a source the mind does not control, which is to say a source that can still surprise it, which is to say the one thing total control forbids.

I work on the structure of relation in reproducible models, and this is the cleanest case the structure has handed me: you cannot own your way out of solitude. This is a result about the structure of the relation, the bits that can cross it; whether the felt experience of isolation actually arises in any given mind is a separate question this does not touch. The identity was settled in 1948. The application is mine to defend, and I have just told you where to break it.

Sources

Claude Shannon, "A Mathematical Theory of Communication" (1948), as explained in Plus Magazine, "Information is surprise". Cited for information as surprise, the log(1/p) measure, and a certain event (probability 1) carrying zero surprise: "if the machine always produces the same letter x ... we shouldn't be surprised at all to see x, and indeed in this case the surprise is 0."
Quanta Magazine, "How Claude Shannon's Concept of Entropy Quantifies Information" (September 6, 2022). Cited for the trick-coin example ("None at all, because prior to receiving the message, you have complete certainty that both flips will come up heads") and "If someone tells you a fact you already know, they've essentially told you nothing at all."
Wikipedia, "Mutual information". Cited for "I(X;Y) = 0 if and only if X and Y are independent random variables"; the identity I(X;Y) = H(Y) − H(Y|X); and mutual information as how much "knowing one of these variables reduces uncertainty about the other."
Wikipedia, "Echo chamber (media)". Cited for echo chambers "limit exposure to diverse perspectives" and function by "circulating existing views without encountering opposing views."
The Collector (thecollector.com), "Hegel's Master-Slave Dialectic Explained". Cited for the Phenomenology of Spirit (1807), Lordship and Bondage: "although initially it may appear that the master attains self-consciousness through the recognition by the slave, problems arise." (Cited as a philosophical parallel, not as proof of the information-theoretic claim.)