# Information Theory and Entropy

I think of Shannon’s information-theoretic concept of entropy as “uncertainty.” Put another way, Shannon’s entropy (there are many types of entropy) measures how resistant “information content” is to compression: more entropy, or uncertainty, means less redundancy and consequently less compression. So static on a TV, which we think of as rather meaningless and not communicating much information, actually has very high Shannon entropy. If you took a screenshot of static and saved it as a GIF file, the file would be quite large. An image of a flower, by contrast, has lower Shannon entropy, even though it feels like more is being communicated.
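The compression claim can be checked with a rough sketch. It uses `zlib` (DEFLATE) rather than GIF’s LZW compression, and the two byte strings are made-up stand-ins for the images, not real screenshots: random bytes for the static, and a few long runs of identical values for the flat regions of a flower.

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


# Hypothetical stand-ins for the two images, 64 KiB each.
# "static": every byte is independently random.
static = os.urandom(64 * 1024)
# "flower": three long runs of a single value, like large patches of one color.
flower = bytes([200]) * 40_000 + bytes([90]) * 20_000 + bytes([30]) * 5_536

print(f"static: {compressed_ratio(static):.3f}")  # roughly 1.0: essentially incompressible
print(f"flower: {compressed_ratio(flower):.3f}")  # a tiny fraction of the original size
```

The random bytes barely shrink at all, while the runs collapse to almost nothing, which is the sense in which high entropy resists compression.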

Shannon’s conception of “information content” and “uncertainty” has nothing to do with the meaning of the symbols, only their statistical character. As Shannon wrote, “These semantic aspects of communication are irrelevant to the engineering problem.” In the image of the flower there is redundancy: whole chunks of pixels representing the petals are the same shade of yellow. Even though the flower is more meaningful to us, its image has much less entropy/uncertainty: it doesn’t have to represent a different color for every single pixel, as the image of static does.
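To make “statistical character” concrete, here is a minimal sketch that estimates Shannon entropy in bits per symbol from symbol frequencies alone. The pixel lists are invented for illustration, and the estimate assumes each pixel is drawn independently, so it ignores where the pixels sit in the image and, if anything, understates the redundancy of contiguous chunks of petals.

```python
import math
import random
from collections import Counter


def shannon_entropy(symbols) -> float:
    """Empirical Shannon entropy in bits per symbol, from symbol frequencies only."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


rng = random.Random(0)  # arbitrary fixed seed so the run is repeatable

# Hypothetical "static": 10,000 pixels, each one of 256 gray levels at random.
static_pixels = [rng.randrange(256) for _ in range(10_000)]
# Hypothetical "flower": mostly one shade of yellow, some green, a little brown.
flower_pixels = ["yellow"] * 7_000 + ["green"] * 2_500 + ["brown"] * 500

print(shannon_entropy(static_pixels))  # close to the 8-bit maximum for 256 symbols
print(shannon_entropy(flower_pixels))  # about 1.08 bits
```

The function never looks at what a symbol stands for: renaming “yellow” to anything else leaves the result unchanged, which is the point of Shannon’s remark about semantics.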

Other conceptualizations of entropy (including those from physics) tend to confuse an understanding of Shannon entropy, and the metaphor of moving objects around can make it worse. This confusion tends to happen for a number of reasons:

1. Shannon didn’t know what to call his measure of uncertainty, and John von Neumann didn’t help things when he coyly suggested, “You should call it entropy … [since] … no one knows what entropy really is, so in a debate you will always have the advantage.”
2. In physics, entropy is often considered to be a measurement of “disorder” in a closed system, and static on a TV certainly seems disordered. However, physical entropy is best thought of in terms of irreversible change: a physical or chemical change that will not spontaneously reverse itself without some external influence. The Sun, for example, is shedding energy, heading towards “disorder” and a “heat death.” Of course, the Earth picks up a lot of that energy and uses it for “ordered” things like photosynthesis and evolution. But if you look at the Sun and Earth together (and everything that influences them), the sum total tends towards disorder. In any case, it’s best to avoid this conflation altogether.
3. Shannon was tackling a different problem from the one most people have in mind when they try to understand the concept. Shannon asked: how can we achieve reliable communication of information in the presence of noise? His concept of entropy as uncertainty could be applied to the statistical character of the signal or of the noise. As mentioned, it had nothing to do with the meaning of the messages being sent.

The framework that most people who try to make use of Shannon’s information theory should really be using is Kolmogorov/Chaitin (KC) complexity theory. KC attempts to quantify the amount of information in a string with respect to some interpreter. Shannon entropy doesn’t distinguish whether a message with really high entropy is simply random noise, an alien message, or a very efficient encoding of information; KC’s conception of complexity includes the interpreter as a variable, which is quite handy when one needs to consider the context that is brought to bear in understanding a message.
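One way to see the difference is a message that looks like noise by Shannon’s measure yet has a very short description for the right interpreter. The sketch below is only an illustration: KC complexity itself is uncomputable, so all the code can show is an upper bound, here relative to a Python interpreter. The `PROGRAM` string and its seed are arbitrary choices made for this example.

```python
import math
import zlib
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Empirical Shannon entropy of a byte string, in bits per byte."""
    counts = Counter(data)
    return -sum((c / len(data)) * math.log2(c / len(data)) for c in counts.values())


# A short description of the message: a tiny program for the Python interpreter.
# The seed (42) and the length (100,000 bytes) are arbitrary.
PROGRAM = (
    "import random\n"
    "rng = random.Random(42)\n"
    "message = bytes(rng.randrange(256) for _ in range(100_000))\n"
)

# Run the description through the interpreter to recover the full message.
namespace = {}
exec(PROGRAM, namespace)
message = namespace["message"]

print(byte_entropy(message))             # close to 8 bits per byte: looks like noise
print(len(zlib.compress(message, 9)))    # a general-purpose compressor gains nothing
print(len(PROGRAM.encode()))             # yet the description is under 200 bytes
```

By byte frequencies the message is indistinguishable from static, and `zlib` cannot shrink it, but an interpreter that knows Python’s pseudo-random generator can rebuild it from a description a few dozen bytes long. That dependence on the interpreter is the variable that Shannon’s measure leaves out.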