[CST-2] Information Theory

Timothy Hospedales tmh31@cam.ac.uk
Sun, 26 May 2002 11:59:58 +0100


Ant,
 Good point! I'm still slightly confused though. You're right that
looking only at the frequency of symbol occurrence from a source is
not how you want to work out its entropy. (An obvious case being
010101..., where a frequency count gives H = 1 bit/symbol even though
the sequence is completely deterministic and its real entropy is 0.)
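
To make sure I've got the naive calculation straight, here's roughly
what I mean in Python (just a sketch of mine, not anything from the
notes):

    import math
    from collections import Counter

    def frequency_entropy(s):
        # Naive per-symbol entropy from symbol frequencies alone.
        counts = Counter(s)
        n = len(s)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    print(frequency_entropy("01" * 1000))   # ~1.0 bit/symbol, even though
                                            # the sequence is deterministic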

 But the notes say that English has an entropy of 4 bits - which it
does, based on the distribution of letters - but actually it's much
less than that if you know about the spelling / grammar constraints
of the source. (Obviously so, or you would only ever be able to get
8:4 compression for ASCII, since H=K.) So is (a) JDG using "entropy"
loosely there, or (b) do you get different measures of entropy
depending on how much you let yourself know about the source - like
with English or pi - or (c) something else entirely?
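
(A quick sanity check on the compression argument, assuming you have a
decent chunk of plain English text lying around - "sample.txt" is just
a made-up name here - is that an ordinary general-purpose compressor
already beats the 2:1 that 4 bits/letter would imply for 8-bit ASCII:

    import zlib

    text = open("sample.txt", "rb").read()   # any chunk of English text
    compressed = zlib.compress(text, 9)
    print(len(text) / len(compressed))       # usually comfortably above 2,
                                             # i.e. under 4 bits per character

and zlib only manages that by exploiting exactly the spelling / word
repetition structure you're talking about.)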

 And what about the exercise questions that give you alphabets and
the probabilities of symbols and ask you to calculate the entropy
from that? If those alphabets were actually being used to communicate
something, wouldn't there be grammatical constraints that reduce the
actual entropy to below the value you calculate?
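
Here's roughly what I mean, as a sketch (again with a made-up
"sample.txt" standing in for a source with grammatical constraints),
comparing the exercise-style calculation against a conditional entropy
estimated from bigrams:

    import math
    from collections import Counter

    def entropy(probs):
        # The exercise-style calculation: H = -sum p * log2(p).
        return -sum(p * math.log2(p) for p in probs if p > 0)

    def conditional_entropy(s):
        # H(X_n | X_{n-1}) from bigram counts -- a crude stand-in for
        # "knowing something about the grammar" of the source.
        pairs = Counter(zip(s, s[1:]))
        prev = Counter(s[:-1])
        n = len(s) - 1
        return -sum((c / n) * math.log2(c / prev[a])
                    for (a, _), c in pairs.items())

    text = open("sample.txt").read().lower()   # hypothetical English text
    counts = Counter(text)
    marginal = entropy(c / len(text) for c in counts.values())
    print(marginal, conditional_entropy(text))

On real text the conditional figure should come out noticeably below
the marginal one, which I think is exactly the effect I'm asking about.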

 It looks like entropy suffers from the same impossibility of exact
determination that Kolmogorov complexity suffers from. Any source you
try to calculate the entropy of could later be discovered to be
deterministic (like pi), thereby making your earlier entropy
calculation wrong.

Thoughts?
Tim

> The digits aren't random at all, they just look random. They are actually
> completely deterministic, so for each digit P(digit = some number) is equal
> to either 1 or 0, and therefore p * log(p) is always zero, and the digits
> have zero entropy.  Which is approximately equal to the length of the short
> program that calculates them.
> 
> Ant
> 
> 
> _______________________________________________
> CST-2 mailing list
> CST-2@srcf.ucam.org
> http://www.srcf.ucam.org/mailman/listinfo/cst-2