Originally Posted By: Orac
I would leave a little open exercise for ImagingGeek if he wants to take up the challenge.

The nasty problem I am left with here is how we measure and define complexity, or if you prefer, "information left to learn". It is not going to be as simple as counting genomes or anything like that, because the fitness criteria imposed were complex.

That is very much an open question in biology. People have tried to apply classical information theory (Shannon entropy, etc.) to the problem, and have generally failed. The issue is that not all genetic material is information: some of it is structural, some of it has functions other than carrying information, some of it (most, in humans) is junk, and in many cases even functional elements like genes are disposable, so the value of the information in them is hard to quantify.
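To make that failure mode concrete, here is a minimal sketch of per-symbol Shannon entropy applied to DNA. The sequences are invented for illustration; the point is that entropy measures compositional randomness, so a junk region and a functional region with similar base composition score nearly identically:

```python
import math
from collections import Counter

def shannon_entropy(seq):
    """Shannon entropy in bits per symbol of a nucleotide sequence."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Both a hypothetical coding region and a hypothetical junk region with
# uniform base usage max out at 2 bits/symbol - entropy alone cannot
# tell us which one carries biologically meaningful information.
print(shannon_entropy("ACGTACGTACGTACGT"))  # -> 2.0
```

The score depends only on base frequencies, not on whether the sequence does anything in the organism, which is exactly why this approach has not yielded a useful complexity measure.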

Restricting the analysis to regions which encode functional elements - proteins, RNAs, etc. - also doesn't seem to provide a meaningful measure. Often, the number of these elements present seems to have nothing to do with the complexity or degree of adaptation of the organism, and instead reflects a combination of genetic chance and the strength of selection against increasing genome size.

IMO, the best measure of 'information' is one not currently in reach: a bioinformatic description of all the protein (and other functional element) interactions required for the organism's survival. This avoids the complexities of genome architecture, junk DNA, etc., by quantifying only those interactions which impact the biology of the organism. We are, however, likely decades away from that capacity.
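As a toy sketch of what such a measure might look like (all names and interactions here are invented for illustration, not real data), one could represent the functional wiring of an organism as a set of pairwise interactions and score complexity by counting them, ignoring genome size and junk DNA entirely:

```python
# Hypothetical interaction set: each pair is one functional interaction
# (protein-protein, protein-RNA, protein-metabolite, etc.) required for
# survival. The entries are made up for the sake of the example.
interactions = {
    ("proteinA", "proteinB"),
    ("proteinA", "rRNA_complex"),
    ("proteinB", "metabolite_X"),
}

# One crude 'information' score: the number of distinct functional
# interactions. Junk DNA and genome architecture never enter the count.
complexity_score = len(interactions)
print(complexity_score)  # -> 3
```

The real difficulty, of course, is not the counting but experimentally mapping the full interaction set, which is why this remains out of reach.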