The adaptive arrow of time tells you that you go
from probable states to an improbable state,
from disordered states to an ordered state.
It is the reverse of the thermodynamic arrow of time.
There have been several efforts to mathematize that fact,
most famously by Ronald A. Fisher
in the so-called 'Fundamental Theorem of Natural Selection'.
Fisher wanted a theory as general as the second law of thermodynamics.
And here it is, captured in mathematical terms:
It says, you move through a space of possible solutions
in such a way that you minimize the variability in the population.
That is like minimizing the uncertainty.
And at a certain point you reach the maximum
and then there is only one solution that you observe
and that is the one we typically would call 'best adapted'.
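That picture can be sketched in code. The following is a minimal illustration, not from the talk: a discrete-time replicator update on an assumed fixed fitness landscape with three competing types. The fittest type takes over, and the population's variance in fitness shrinks toward zero, which is the Fisher-style endpoint described above.

```python
# Minimal sketch (hypothetical values): replicator dynamics on a
# fixed fitness landscape. One type is best adapted; selection drives
# the population toward it and the fitness variance toward zero.

fitness = [1.0, 1.2, 1.5]     # assumed constant fitnesses of three types
freqs = [1/3, 1/3, 1/3]       # start with a uniform population

def variance(freqs, fitness):
    """Population variance in fitness under the current frequencies."""
    mean = sum(f * w for f, w in zip(freqs, fitness))
    return sum(f * (w - mean) ** 2 for f, w in zip(freqs, fitness))

history = [variance(freqs, fitness)]
for _ in range(200):
    mean = sum(f * w for f, w in zip(freqs, fitness))
    freqs = [f * w / mean for f, w in zip(freqs, fitness)]  # replicator update
    history.append(variance(freqs, fitness))

# The fittest type (index 2) fixes, and the variance collapses:
# only one solution remains, the 'best adapted' one.
```

Note that because the fitnesses here never depend on the frequencies, there is always a single winner; that assumption is exactly what the rock-paper-scissors example below breaks.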
There is a problem with the theory,
and that is: it is not completely general.
If you think about situations like this,
the rock, paper, scissors game
and ask: what is the maximally adapted solution?
Well, imagine that Fisher's fundamental theorem were right:
you would say that, at the end, you end up with only one of those strategies,
because you have minimized the variability.
So I always play 'rock'.
But, of course, someone can come along, play paper, and beat me.
So, in this particular instance of frequency dependent selection
Fisher's fundamental theorem cannot be right,
because you are not minimizing the variance
you are actually maximizing it.
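A quick sketch of why, again not from the talk: the same replicator update, but with frequency-dependent fitness given by the rock-paper-scissors payoffs. Starting from a population that almost always plays rock, no strategy ever fixes; the lead cycles through all three.

```python
# Minimal sketch: replicator dynamics under frequency-dependent selection.
# Payoff of a strategy depends on what the rest of the population plays
# (win = 1, lose = -1, tie = 0).
payoff = [
    [0, -1,  1],   # rock:     loses to paper, beats scissors
    [1,  0, -1],   # paper:    beats rock, loses to scissors
    [-1, 1,  0],   # scissors: loses to rock, beats paper
]

freqs = [0.9, 0.05, 0.05]   # population heavily biased toward 'rock'
leaders = set()             # which strategy is most common, over time
for _ in range(200):
    fitness = [sum(payoff[i][j] * freqs[j] for j in range(3)) for i in range(3)]
    weights = [f * (w + 2) for f, w in zip(freqs, fitness)]  # shift keeps weights positive
    total = sum(weights)
    freqs = [w / total for w in weights]
    leaders.add(max(range(3), key=lambda i: freqs[i]))

# Paper invades rock, scissors invades paper, rock invades scissors:
# every strategy leads at some point, and variability is never driven to zero.
```

The contrast with the previous sketch is the point: once fitness depends on the frequencies themselves, 'minimize the variance' is no longer what adaptation does.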
Over the last decade or so
a number of us have been working on generalizing the adaptive arrow of time, mathematically,
to ask: what is actually being minimized?
And here is a little bit of mathe'magics',
but we can jump straight to the text.
What we now know is that all adaptive processes are minimizing
the uncertainty of an agent about the state of the world.
Or, put differently,
each agent maximizes the amount of information it possesses about the world in which it lives.
And when you express the adaptive arrow of time in these terms,
you realize that a whole range of apparently distinct phenomena
- evolution, Bayesian inference, reinforcement learning -
are all examples of the same fundamental dynamic.
In other words: a functionalist perspective allows you to see
that the particular mechanical implementation does not matter so much.
All of them achieve the desired goal of maximizing the information in the agent about the world.
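One of the listed examples, Bayesian inference, makes the uncertainty-minimization reading concrete. The sketch below is illustrative, with hypothetical numbers: an agent unsure which of two possible worlds it lives in (a tails-heavy or a heads-heavy coin) updates its belief from observations, and the Shannon entropy of that belief, its uncertainty about the world, shrinks.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a discrete belief distribution."""
    return -sum(q * math.log2(q) for q in p if q > 0)

biases = [0.2, 0.8]           # two candidate worlds: coin bias 0.2 or 0.8
belief = [0.5, 0.5]           # uniform prior: one full bit of uncertainty
flips = [1] * 40 + [0] * 10   # an assumed observation sequence (40 heads, 10 tails)

h_before = entropy(belief)
for flip in flips:
    # Bayes' rule: weight each world by how well it predicts the observation.
    likelihood = [b if flip else 1 - b for b in biases]
    belief = [l * p for l, p in zip(likelihood, belief)]
    z = sum(belief)
    belief = [p / z for p in belief]
h_after = entropy(belief)

# Belief concentrates on the heads-heavy world, and entropy drops
# from 1 bit toward 0: the agent has gained information about its world.
```

Under the functionalist view sketched in the talk, a selection process or a reinforcement learner run through the same kind of loop: different mechanisms, same reduction of uncertainty about the world.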