lundi 2 février 2009

Evolutionary metaheuristics: what is a suitable fitness function?

Evolutionary metaheuristic methods such as Genetic Algorithms, or more recently Estimation of Distribution Algorithms are useful when the optimization problem to solve presents no apparent structure.

In other words, when you have to find an $x_0$ living in a domain $D$ such as

$f(x_0)=\max_{x \in D} f(x)$

you have either:
  1. A messy domain $D$, for example a huge discret domain
  2. A messy function $f$, for example a non-convex or discontinuous one
you may reasonably invoke these metaheuristics. In the evolutionary metaheuristics, $f$ is called the fitness function, and a population $P \subset D$ is evolved towards the high values of $f$. Finally, you can pick your $x_0$ in $P$ if all went well.

However, as exciting as evolutionary computing may sound, this tool will perform poorly, and even fail to evolve any population when applied on ill designed fitness functions. The existence of such unfriendly functions is given by the famous no free lunch theorem(s). It states that for static optimization tasks such as ours, when benchmarking any algorithm over all the possible $f$ functions, it performs as well as a random search over $D$. Or in other words, any given algorithm can't be better than any other one on all possible fitness functions. Again: for each and every optimization algorithm, there is a disastrous class of $f$ functions on which it fails computing a good solution. This result holds independently of the type of algorithm. They are all included, even the ones we don't know yet! There is no static optimization algorithm performing well on every $f$, just like there is no hotel serving free lunches everyday.

This is a theoretical result: it may seem the space where the $f$ functions are living is so huge, we won't even be capable of exhibiting such disastrous functions for evolutionary algorithms. As a matter of fact, we can!

Disclaimer: the following part is solely based on my previous experience with evolutionary algorithms. It doesn't rely on a truly scientific evidence, just on my own intuition.

Genetic Algorithms and Estimation of Distribution Algorithms rely implicitly on a minimal local stability of the fitness function.
Let $\delta$ be a distance over $D$. For convenience, we assume that $f(D) \subset \mathbb{R}$. Minimal local stability means that for almost any $x$ in $D$, slightly changing $x$ will only reasonably affect $f(x)$. This is (a little bit more) formalized by:

$\exists e, C > 0; e \sim \min_{x \neq y} {\delta(x,y)}; C \ll \max{f} - \min{f}$
for almost any $x,y \in D$
$\delta(x,y) \leq e \Rightarrow \mid f(x) - f(y) \mid < C$

The local discontinuities of the fitness don't matter, because they are soon assimilated by the population in GA or the distribution model for EDA. Everywhere else, the behavior of the fitness is kind of predictable, and we can rely on this stability to carry out the evolution process. In EDA, local stability plays a major role, because it ultimately legitimates the approximations done during the learning of the model parameters.

In GA, when we are crossing-over (notation $\otimes$) $x$ and $y$, we hope that, most of the time, $f(x \otimes y)$ will be comparable to $\max(f(x),f(y))$. On the early stages of evolving $P$, we can say that the algorithm is searching for a stability zone in $D$, where this assumption can be made. The existence of this stability zone in $D$ is to my mind related to the propriety of local stability of $f$. One heuristic argument in favor of this assumption, is that GA will fail when $f$ is discontinuous everywhere, because of the induced randomness. There is no direction to go for the algorithm with such a messy fitness function.

The moral of the story is to always check if the fitness function has some kind of regularity. If $f$ is locally stable, we can hope that the algorithms will perform honorably. However, this is probably only a necessary condition for good results...

Aucun commentaire:

Enregistrer un commentaire