Next: Algorithm Description
Up: Silicon Cluster Optimization Using
Previous: Parameter Encoding
Extended Compact Genetic Algorithm (ECGA)
ECGA, proposed by Harik [7] is based on a key idea
that the choice of a good probability distribution is equivalent
to linkage learning. The measure of a good distribution is
quantified based on minimum description length(MDL) models.
The key concept behind MDL models is that given all things are
equal, simpler distributions are better than the complex ones.
The MDL restriction penalizes both inaccurate and complex models,
thereby leading to an optimal probability distribution. Thus, MDL
restriction reformulates the problem of finding a good distribution
as an optimization problem that minimizes both the probability model
as well as population representation. The probability distribution
used in ECGA is a class of probability models known as marginal
product models (MPMs). MPMs are formed as a product of marginal
distributions on a partition of the genes and are similar to those
of CGA [31] and PBIL [32]. Unlike the models used
in CGA and PBIL, MPMs can represent probability distributions for
more than one gene at a time. MPMs also facilitate a direct
linkage map with each partition separating tightly linked genes.
Hence, in the current study each gene partition would refer to a BB.
The MPM concept is illustrated as follows: consider 4 bit problem and chose the following partition [0,2], [1], [3]. This partition implies that 0
and 2
bit are jointly distributed and the 1
and the 3
bits are independently distributed. Therefore [0,2] can take the following four values, 00, 01, 10, and 11. The probability distribution for this partition is simply the frequency of individuals with those bit values. Similarly [1] and [3] can take either 0 or 1 and the proportion of individuals have on 1 and 0 in the 1
(and similarly in the 3
bit) is the probability distribution for that partition.
Figure 3:
ECGA procedure
|
A flowchart of ECGA procedure is given in fig. 3. Two things need further explanation, one is the identification of MPM using MDL and the other is the creation of a new population based on MPM. The identification of MPM in every generation is formulated as a constrained optimization problem,
where is the model complexity which represents the cost of a complex model and is given by
|
(7) |
and is the compressed population complexity which represents the cost of using a simple model as against a complex one and is evaluated as
|
(8) |
in the equations represent the number of BBs, is the length of BB
, is the number of chromosomes in the current population possessing bit-sequence
1 for BB . The constraint (eqn. 6) arises due to finite population size.
A new population is generated based on the optimal MPM as follows, population of size where is the crossover probability, is filled by the best individuals in the current population. This differs from the original ECGA procedure [33] in which individuals were taken to be the last individuals of the current generation irrespective of their fitness. The rest individuals are generated by randomly choosing subsets from the current individuals. These subsets are the gene groups identified by the current MPM. This procedure is also different from the original ECGA procedure in which the subsets were chosen by probabilistic polling. The original procedure leads to a stochastic method were as the one used in the current study is a deterministic method in which the frequencies of the bit sequence of a particular subset remains constant. However, it has been observed that the outcome of both the methods are similar and in fact the deterministic method is computationally much faster.
Next: Algorithm Description
Up: Silicon Cluster Optimization Using
Previous: Parameter Encoding
Kumara Sastry
2001-04-02