GA rules in Lisp
Author Message
GA rules in Lisp

Hi folks,

I'm working on a GA homework that deals with learning rules from real-valued
data, and I'm writing the program in Lisp. I'd like to be able to represent
my entire gene as a string of bits. This means that I need to convert from a
Lisp float, probably using something like one's complement.

1) Is this the right way to solve this problem?
2) If it is, are there existing functions/libraries that I should know about
that would help to do this? (Seems like an awfully silly thing to code up by
hand.)

Thanks,
JT

Sat, 09 Aug 2003 10:17:23 GMT
GA rules in Lisp
JT,

Early GA implementations used bit string representation, which makes
crossover and mutation simple, but there's little else to recommend it.
Use a native float representation for your gene values and you'll avoid
the issues of conversion, how many bits to use, Gray codes, etc. It's
also far more efficient.

Crossover and mutation operators are a bit more complex for native
representation. For crossover, pick a random cut point within the gene,
then blend the two parent gene values to produce the offspring value. If
p1 and p2 are two float values from the parents, then the child value
will be:

c = f * p1 + (1-f) * p2

where f is a random fraction [0, 1) representing the cut point. This
basically averages the parent values, weighted according to the position
of the cut point.

For mutation, calculate a random float in the range [-m, m], where m is
the maximum mutation value, and add it to the gene value. An m value of
one tenth the overall gene range usually works well. Ensure that the
resulting value falls within the gene range.

Bob

Quote:

> Hi folks,

> I'm working on a GA homework that deals with learning rules from real-valued
> data, and I'm writing the program in Lisp. I'd like to be able to represent
> my entire gene as a string of bits. This means that I need to convert from a
> Lisp float, probably using something like one's complement.

> 1) Is this the right way to solve this problem?
> 2) If it is, are there existing functions/libraries that I should know about
> that would help to do this? (Seems like an awfully silly thing to code up by
> hand.)

> Thanks,
> JT

Sat, 09 Aug 2003 11:58:23 GMT
GA rules in Lisp

You're using Lisp, not C, so why represent the gene as numeric values at
all - you can directly manipulate code.  Obviously this won't work in
every scenario, but it's a neat approach (IMHO).

For example, I looked at genetic algorithms for generating rhythms (as
in music).  I implemented a tiny language in Lisp that could generate
rhythms.  I then treated randomly generated programs in this language as
genes and "bred" them by intermingling them.  This sounds complex, but
in Lisp all you are doing is combining two lists to make a third - when
a language has no clear separation between code and data there's no
longer a clear separation between genetic algorithms and genetic
programming.

For more info see
http://www.andrewcooke.free-online.co.uk/jara/rytmo/index.html

Andrew

PS Happy to answer questions, but away on holiday soon! :-)

Quote:

> Hi folks,

> I'm working on a GA homework that deals with learning rules from real-valued
> data, and I'm writing the program in Lisp. I'd like to be able to represent
> my entire gene as a string of bits. This means that I need to convert from a
> Lisp float, probably using something like one's complement.

> 1) Is this the right way to solve this problem?
> 2) If it is, are there existing functions/libraries that I should know about
> that would help to do this? (Seems like an awfully silly thing to code up by
> hand.)

> Thanks,
> JT

Sat, 09 Aug 2003 17:06:17 GMT
GA rules in Lisp
On Mon, 19 Feb 2001 22:58:23 -0500, Bob Whitefield

[snip]

Quote:
>Crossover and mutation operators are a bit more complex for native
>representation. For crossover, pick a random cut point within the gene,
>then blend the two parent gene values to produce the offspring value. If
>p1 and p2 are two float values from the parents, then the child value
>will be:

>c = f * p1 + (1-f) * p2

>where f is a random fraction [0, 1) representing the cut point. This
>basically averages the parent values, weighted according to the position
>of the cut point.

>For mutation, calculate a random float in the range [-m, m], where m is
>the maximum mutation value, and add it to the gene value. An m value of
>one tenth the overall gene range usually works well. Ensure that the
>resulting value falls within the gene range.

There are other approaches as well with their own advantages (and, I
suppose, disadvantages).  Generator, for example, does not do
recombination *inside* a gene; it picks crossover points at the
boundaries of genes so only whole genes are exchanged in
recombination/crossover.  Gene values change only via mutation.
Generator also uses a mutation operator that is quasi-Gaussian: it is
most likely to choose new gene values that are close to the current
value, but the width of the Gaussian peak can be selected by the user.

Steve

Sat, 09 Aug 2003 22:16:42 GMT
GA rules in Lisp

JT>  I'd like to be able to represent my entire gene as a string of
JT>  bits. This means that I need to convert from a Lisp float,
JT>  probably using something like one's complement.

Storing bits in float is not very good idea.

(make-array 16 :element-type 'bit) will give you bit-vector of size
16, which is object to all standard sequence and array manipulations.

If you still need for some reasons to pack bits into numbers, use
integers. There are some standard functions and accessors defined over
integers which are useful for bitwise manipulations (LDB, DPB and
friends).

For further details consult the chapters 12, 15 and 17 of the HyperSpec.

--
Eugene

Sun, 10 Aug 2003 00:38:05 GMT
GA rules in Lisp

Quote:

> Generator, for example, does not do
> recombination *inside* a gene; it picks crossover points at the
> boundaries of genes so only whole genes are exchanged in
> recombination/crossover.  Gene values change only via mutation.

Let me give an example where using a blending crossover within genes can
make a significant difference. Find the minimum of this function:

f(x,y) = x sin(4x) + 1.1 y sin(2y)
where 0 <= x <= 10, 0 <= y <= 10

This is a very bumpy function with many local minima. To solve it, I
created a chromosome with two genes encoded as floats, with values
constrained between 0 and 10. Fitness is the function above; the
objective is to minimize it. Population size is 500.

First, I solved it without a blending algorithm, i.e., crossover cuts
were allowed only between gene values. Here are the results averaged
over ten runs:

Evaluations     770642
Crossovers      731649
Mutations       38493
Improvements    18.7
Elapsed time    62.7 sec

Evaluations is the number of times the fitness function was executed.
Improvements is the number of times a new best member was created via
crossover or mutation during the run.

The overall best fitness was -18.5547210729247, found on the second run.
These results are fairly good, accurate to 8-10 significant digits
compared to the correct value. Average error was 1.947E-07.

Next I enabled the blending algorithm, so crossover cuts can now occur
within a gene. Average results over ten runs:

Evaluations     17267
Crossovers      15918
Mutations       849
Improvements    37.2
Elapsed time    1.35 sec

Best fitness for all ten runs were identical, correct to 15 significant
digits (-18.5547210773827). Blending crossover improved accuracy by 5-7
significant figures. It also executed 45 times faster on average.

The reason blending crossover works so much better is because it
produces offspring with new gene values. Without blending, crossover
only recombines alleles already present in the population. Without new
alleles, the population quickly converges, leaving only blind mutation
to introduce new values, a very slow process.

Some people mistakenly assume binary encoding works better than
continuous parameters, because they don't use an appropriate crossover
operator. Binary crossover does at least provide intra-gene blending,
but conversion and Gray coding overhead can kill performance.

Of course this was an extreme example, a chromosome with only two genes.
For problems with many genes, and/or genes with a limited alphabet, the
advantages of blending crossover are less dramatic.

I would be interested in results others might get for this problem.

Bob

Quote:

> On Mon, 19 Feb 2001 22:58:23 -0500, Bob Whitefield

> [snip]
> >Crossover and mutation operators are a bit more complex for native
> >representation. For crossover, pick a random cut point within the gene,
> >then blend the two parent gene values to produce the offspring value. If
> >p1 and p2 are two float values from the parents, then the child value
> >will be:

> >c = f * p1 + (1-f) * p2

> >where f is a random fraction [0, 1) representing the cut point. This
> >basically averages the parent values, weighted according to the position
> >of the cut point.

> >For mutation, calculate a random float in the range [-m, m], where m is
> >the maximum mutation value, and add it to the gene value. An m value of
> >one tenth the overall gene range usually works well. Ensure that the
> >resulting value falls within the gene range.

> There are other approaches as well with their own advantages (and, I
> suppose, disadvantages).  Generator, for example, does not do
> recombination *inside* a gene; it picks crossover points at the
> boundaries of genes so only whole genes are exchanged in
> recombination/crossover.  Gene values change only via mutation.
> Generator also uses a mutation operator that is quasi-Gaussian: it is
> most likely to choose new gene values that are close to the current
> value, but the width of the Gaussian peak can be selected by the user.

> Steve

Sun, 10 Aug 2003 05:47:28 GMT
GA rules in Lisp
Bob,
Thank you for including enough information about your test
problem and results to do a direct comparison!

I ran your problem on Generator and got very fast results:

population 100

directional hillclimb mutation 5 steps max

mutation size 100% of range, with range broken into cells of size
.01 x .01

crossover activated

After 20 generations (less than 25000 evaluations) the best
fitness achieved was  -18.55458459,    with x = 9.04 and y = 8.67

I don't think this proves much, even though this result was
slightly better than yours.  All it shows is that there are many paths
to the peak of the mountain, and the view is pretty much the same once
you get there.

Steve

On Tue, 20 Feb 2001 16:47:28 -0500, Bob Whitefield

Quote:

>> Generator, for example, does not do
>> recombination *inside* a gene; it picks crossover points at the
>> boundaries of genes so only whole genes are exchanged in
>> recombination/crossover.  Gene values change only via mutation.

>Let me give an example where using a blending crossover within genes can
>make a significant difference. Find the minimum of this function:

>f(x,y) = x sin(4x) + 1.1 y sin(2y)
>where 0 <= x <= 10, 0 <= y <= 10

>This is a very bumpy function with many local minima. To solve it, I
>created a chromosome with two genes encoded as floats, with values
>constrained between 0 and 10. Fitness is the function above; the
>objective is to minimize it. Population size is 500.

>First, I solved it without a blending algorithm, i.e., crossover cuts
>were allowed only between gene values. Here are the results averaged
>over ten runs:

>Evaluations 770642
>Crossovers  731649
>Mutations   38493
>Improvements        18.7
>Elapsed time        62.7 sec

>Evaluations is the number of times the fitness function was executed.
>Improvements is the number of times a new best member was created via
>crossover or mutation during the run.

>The overall best fitness was -18.5547210729247, found on the second run.
>These results are fairly good, accurate to 8-10 significant digits
>compared to the correct value. Average error was 1.947E-07.

>Next I enabled the blending algorithm, so crossover cuts can now occur
>within a gene. Average results over ten runs:

>Evaluations 17267
>Crossovers  15918
>Mutations   849
>Improvements        37.2
>Elapsed time        1.35 sec

>Best fitness for all ten runs were identical, correct to 15 significant
>digits (-18.5547210773827). Blending crossover improved accuracy by 5-7
>significant figures. It also executed 45 times faster on average.

>The reason blending crossover works so much better is because it
>produces offspring with new gene values. Without blending, crossover
>only recombines alleles already present in the population. Without new
>alleles, the population quickly converges, leaving only blind mutation
>to introduce new values, a very slow process.

>Some people mistakenly assume binary encoding works better than
>continuous parameters, because they don't use an appropriate crossover
>operator. Binary crossover does at least provide intra-gene blending,
>but conversion and Gray coding overhead can kill performance.

>Of course this was an extreme example, a chromosome with only two genes.
>For problems with many genes, and/or genes with a limited alphabet, the
>advantages of blending crossover are less dramatic.

>I would be interested in results others might get for this problem.

Sun, 10 Aug 2003 22:15:23 GMT
GA rules in Lisp

Quote:

>         After 20 generations (less than 25000 evaluations) the best
> fitness achieved was  -18.55458459,    with x = 9.04 and y = 8.67

>         I don't think this proves much, even though this result was
> slightly better than yours.

Steve,

I think you'll see my result is actually more accurate. The goal is to
find the minimum:

-18.55458459       (your result, x=9.04 y=8.67)
-18.5547210773827  (my result, x=9.0389916 y=8.66818896)
0.0001364873827  (difference)

I'm not trying to criticize Generator--it's a fine product--I just
wanted to show how powerful a real-valued GA can be when you use a
blending crossover. Also note that my implementation doesn't use
hill-climbing techniques, only selection, crossover and mutation.

Bob

p.s. You used a population of 100, while I used 500. Out of curiousity,
I just ran the problem again with 100 members. Here are the averages
over 20 runs:

Evaluations    167746
Mutations      8850
Improvements   74.05
Elapsed time   13.6 sec

In all runs, the best result was still accurate to 15 digits, but the
smaller population increased execution time by a factor of ten, on
average. I imagine Generator would also show better performance with a
larger population.

Quote:

> Bob,
>         Thank you for including enough information about your test
> problem and results to do a direct comparison!

>         I ran your problem on Generator and got very fast results:

> population 100

> directional hillclimb mutation 5 steps max

> mutation size 100% of range, with range broken into cells of size
>  .01 x .01

> crossover activated

>         After 20 generations (less than 25000 evaluations) the best
> fitness achieved was  -18.55458459,    with x = 9.04 and y = 8.67

>         I don't think this proves much, even though this result was
> slightly better than yours.  All it shows is that there are many paths
> to the peak of the mountain, and the view is pretty much the same once
> you get there.

> Steve

> On Tue, 20 Feb 2001 16:47:28 -0500, Bob Whitefield

> >> Generator, for example, does not do
> >> recombination *inside* a gene; it picks crossover points at the
> >> boundaries of genes so only whole genes are exchanged in
> >> recombination/crossover.  Gene values change only via mutation.

> >Let me give an example where using a blending crossover within genes can
> >make a significant difference. Find the minimum of this function:

> >f(x,y) = x sin(4x) + 1.1 y sin(2y)
> >where 0 <= x <= 10, 0 <= y <= 10

> >This is a very bumpy function with many local minima. To solve it, I
> >created a chromosome with two genes encoded as floats, with values
> >constrained between 0 and 10. Fitness is the function above; the
> >objective is to minimize it. Population size is 500.

> >First, I solved it without a blending algorithm, i.e., crossover cuts
> >were allowed only between gene values. Here are the results averaged
> >over ten runs:

> >Evaluations    770642
> >Crossovers     731649
> >Mutations      38493
> >Improvements   18.7
> >Elapsed time   62.7 sec

> >Evaluations is the number of times the fitness function was executed.
> >Improvements is the number of times a new best member was created via
> >crossover or mutation during the run.

> >The overall best fitness was -18.5547210729247, found on the second run.
> >These results are fairly good, accurate to 8-10 significant digits
> >compared to the correct value. Average error was 1.947E-07.

> >Next I enabled the blending algorithm, so crossover cuts can now occur
> >within a gene. Average results over ten runs:

> >Evaluations    17267
> >Crossovers     15918
> >Mutations      849
> >Improvements   37.2
> >Elapsed time   1.35 sec

> >Best fitness for all ten runs were identical, correct to 15 significant
> >digits (-18.5547210773827). Blending crossover improved accuracy by 5-7
> >significant figures. It also executed 45 times faster on average.

> >The reason blending crossover works so much better is because it
> >produces offspring with new gene values. Without blending, crossover
> >only recombines alleles already present in the population. Without new
> >alleles, the population quickly converges, leaving only blind mutation
> >to introduce new values, a very slow process.

> >Some people mistakenly assume binary encoding works better than
> >continuous parameters, because they don't use an appropriate crossover
> >operator. Binary crossover does at least provide intra-gene blending,
> >but conversion and Gray coding overhead can kill performance.

> >Of course this was an extreme example, a chromosome with only two genes.
> >For problems with many genes, and/or genes with a limited alphabet, the
> >advantages of blending crossover are less dramatic.

> >I would be interested in results others might get for this problem.

Mon, 11 Aug 2003 01:03:32 GMT
GA rules in Lisp
On Wed, 21 Feb 2001 12:03:32 -0500, Bob Whitefield

Quote:

>I think you'll see my result is actually more accurate. The goal is to
>find the minimum:

>-18.55458459       (your result, x=9.04 y=8.67)
>-18.5547210773827  (my result, x=9.0389916 y=8.66818896)
>  0.0001364873827  (difference)

Oops.  You're right.   I think your program is set up to use
higher-precision gene values than Generator is.

Quote:
>I'm not trying to criticize Generator--it's a fine product--I just
>wanted to show how powerful a real-valued GA can be when you use a
>blending crossover. Also note that my implementation doesn't use
>hill-climbing techniques, only selection, crossover and mutation.
>p.s. You used a population of 100, while I used 500. Out of curiousity,
>I just ran the problem again with 100 members. Here are the averages
>over 20 runs:

>Evaluations    167746
>Mutations      8850
>Improvements   74.05
>Elapsed time   13.6 sec

>In all runs, the best result was still accurate to 15 digits, but the
>smaller population increased execution time by a factor of ten, on
>average. I imagine Generator would also show better performance with a
>larger population.

Actually, Generator runs a lot faster with a smaller population but is
more likely to get hung up on a local optimum.  So I usually start
with a large population (100) and gradually shrink it down to about 5
or 10.  This tends to filter out the most likely sections of search
space and then focus on them.

Steve

Mon, 11 Aug 2003 02:25:49 GMT

 Page 1 of 1 [ 9 post ]

Relevant Pages

Powered by phpBB® Forum Software