Back to the Future: A Science of Genetic Algorithms

From the preface to my dissertation:

The foundations of most computer engineering disciplines are almost entirely mathematical. There is, for instance, almost no question about the soundness of the foundations of such engineering disciplines as graphics, machine learning, programming languages, and databases. An exception to this general rule is the field of genetic algorithmics, whose foundation includes a significant scientific component.

The existence of a science at the heart of this computer engineering discipline is regarded with nervousness. Science traffics in provisional truth; it requires one to adopt a form of skepticism that is more nuanced, and hence more difficult to master, than the radical sort of skepticism that suffices in mathematics and theoretical computer science. Many, therefore, would be happy to see science excised from the foundations of genetic algorithmics. Indeed, over the past decade and a half, much effort seems to have been devoted to turning genetic algorithmics into just another field of computer engineering, one with an entirely mathematical foundation.

Broadening one’s perspective beyond computer engineering, however, one cannot help wondering if much of this effort is not a little misplaced. Continue reading “Back to the Future: A Science of Genetic Algorithms”


Red Dots, Blue Dots

In this blog entry I’d like to showcase just one of a number of remarkable findings that form the basis for the generative fixation hypothesis—a new explanation for the adaptive capacity of recombinative genetic algorithms.

Consider the following stochastic function, which takes a bitstring of length \ell as input and returns a real value as output.

fitness(bitstring)
  accum = 0
  for i = 1 to 4
    accum = accum + bitstring[pivotalLoci[i]]
  end
  if accum is odd
    return a random value drawn from the normal distribution N(+0.25, 1)
  else
    return a random value drawn from the normal distribution N(-0.25, 1)
  end
end

The variable pivotalLoci is an array of four distinct integers between 1 and \ell which specifies the locations of four loci—let’s call them A, B, C, D—of an input bitstring that matter in the determination of the bitstring’s fitness. These four loci are said to be pivotal. Continue reading “Red Dots, Blue Dots”
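As a concrete sketch, here is one way the pseudocode above might be realized in Python. The bitstring length and the positions in PIVOTAL_LOCI are arbitrary illustrative choices, and the pseudocode’s 1-based loci become 0-based indices here:

```python
import random

ELL = 20  # illustrative bitstring length (any ell >= 18 works with the loci below)
PIVOTAL_LOCI = [2, 5, 11, 17]  # four distinct loci, chosen arbitrarily (0-based)

def fitness(bitstring, pivotal_loci=PIVOTAL_LOCI, rng=random):
    # Sum the bits at the four pivotal loci.
    accum = sum(bitstring[i] for i in pivotal_loci)
    # Odd sum: draw from N(+0.25, 1); even sum: draw from N(-0.25, 1).
    mean = 0.25 if accum % 2 == 1 else -0.25
    return rng.gauss(mean, 1.0)
```

Note that the function is stochastic: two evaluations of the same bitstring will generally differ; only the mean of a bitstring’s fitness distribution depends on the parity of its pivotal bits.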


Bit Dynamics Visualization

I’ve found the bit dynamics visualizer included in speedyGA very useful for understanding the dynamics of SGAs with bitstring genomes. In each generation the visualizer plots/updates the frequency of the bit 1 at each locus (the frequency of the bit 0 is straightforwardly deducible).
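The core computation behind each frame of the visualization is simple to state. Here is a minimal sketch, assuming the population is stored as a 0/1 matrix with one row per genome (an assumption for illustration, not necessarily speedyGA’s internal layout):

```python
import numpy as np

def one_frequencies(population):
    # population: 2-D 0/1 array, one row per genome, one column per locus.
    # Returns, for each locus, the fraction of genomes carrying the bit 1.
    return population.mean(axis=0)
```

Plotting these vectors generation by generation, one dot per locus, reproduces the kind of picture described here; the 0-frequencies are just one minus these values.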

Here’s a visualization of the bit dynamics of an SGA with 1pt crossover when applied to the Royal Roads fitness function. Going by the building block hypothesis, one expects to see the dots marching in orderly fashion to the top of the plot in groups of eight or more.

That’s not what happens. Instead, one gets to see hitchhiking in action—look for the swift downward movement of certain dots in tandem with the swift upward movement of other dots at nearby loci.
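For readers who haven’t met it, the Royal Roads function (in its classic R1 form, due to Mitchell, Forrest and Holland) can be sketched as follows; each contiguous block composed entirely of ones contributes its length to the fitness:

```python
def royal_roads(bitstring, block_size=8):
    # Classic Royal Road R1: partition the bitstring into contiguous blocks;
    # every block consisting entirely of ones adds its length to the fitness.
    total = 0
    for start in range(0, len(bitstring), block_size):
        block = bitstring[start:start + block_size]
        if all(b == 1 for b in block):
            total += len(block)
    return total
```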

The maximum and average fitness in each generation of this run are shown below.

The MATLAB code used to generate these and other figures in this blog post can be found here.

Let’s visualize the bit dynamics of a population when an SGA with uniform crossover is applied to the Royal Roads function.
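For contrast with one-point crossover, here is a minimal sketch of the uniform crossover operator, assuming the usual per-locus swap probability of 0.5:

```python
import random

def uniform_crossover(parent_a, parent_b, rng=random):
    # Each locus independently inherits from one parent or the other,
    # so the two children partition the parents' bits locus by locus.
    child1, child2 = [], []
    for a, b in zip(parent_a, parent_b):
        if rng.random() < 0.5:
            child1.append(a)
            child2.append(b)
        else:
            child1.append(b)
            child2.append(a)
    return child1, child2
```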

The maximum and average fitness in each generation of this run are shown below.

Continue reading “Bit Dynamics Visualization”


The Fundamental Problem with the Building Block Hypothesis (new manuscript)

Abstract: Skepticism of the building block hypothesis has previously been expressed on account of the weak theoretical foundations of this hypothesis and anomalies in the empirical record of the simple genetic algorithm. In this paper we focus on a more fundamental cause for skepticism—the extraordinary strength of some of the assumptions undergirding the building block hypothesis. As many of these assumptions have been embraced by the designers of so-called “competent” genetic algorithms, our critique is relevant to an appraisal of such algorithms. We argue that these assumptions are too strong to be acceptable without additional evidence. We then point out weaknesses in the arguments that have been provided in lieu of such evidence.

Download manuscript


What Are GAs Good For?

Researchers studying the foundations of genetic algorithms have not, to the best of my knowledge, identified a non-trivial computational problem that a simple GA can solve robustly and scalably (I’ve previously raised this issue here). In my opinion, this singular fact is the clearest evidence for the inadequacy of the current paradigm within which we understand and study the adaptive capacity of GAs—the question of what GAs are good for is, after all, intimately related to the question of how GAs work.

In a draft of one of my dissertation chapters I identify a hard computational problem and show that a GA can solve it robustly and scalably. Remarkably, this problem is closely related to a hairy statistical problem in computational biology. How might a GA leverage this kind of computational ability to perform adaptation? I’ll be presenting my theory about this in future chapters. The idea behind this theory is delightfully simple. Presenting it formally, however, is another story. Stay tuned.



The Need for a Sound Theory of Adaptation for the Simple Genetic Algorithm

The conclusion of a manuscript that I recently submitted for review

The biosphere is replete with organisms that are exquisitely well adapted to the environmental niches they inhabit. Natural sexual evolution has been crucial to the generation of what are arguably the most highly adapted of these organisms — cheetahs, owls, humans, etc. A deeply intriguing idea is that we can build adaptation algorithms which, at an abstract level, mimic the behavior of natural sexual evolution, and in doing so, “harness” something of the adaptive power of this incredibly effective process. But what is the abstract level at which natural sexual evolution should be mimicked? In other words Continue reading “The Need for a Sound Theory of Adaptation for the Simple Genetic Algorithm”


The Dubious History of the Building Block Hypothesis

From the introduction of a manuscript that I recently submitted for review

Perceptions of the abilities and limitations of the SGA (and hence the kinds of problems that it can and cannot solve) have been heavily influenced by a theory of adaptation called the building block hypothesis (Goldberg, 1989; Mitchell, 1996; Holland, 1975, 2000). This theory of adaptation has its genesis in the following idea: maybe small groups of closely located co-adaptive alleles propagate within an evolving population of genomes in much the same way that single adaptive alleles do in Fisher’s theories of sexual evolution (Fisher, 1958). Holland called such groups of alleles building blocks. This idea can be taken one step further: maybe small groups of co-adaptive building blocks propagate within an evolving population of genomes in much the same way that single building blocks do. Such groups can be thought of as higher-level building blocks. Pursuing this idea to the fullest extent, maybe Continue reading “The Dubious History of the Building Block Hypothesis”


Optimization, Adaptation, Machine Learning and Evolutionary Computation

From the introduction of a manuscript that I recently submitted for review

The practice of Machine Learning research can be characterized as the effective semi-principled reduction of learning problems to problems for which robust and efficient solution techniques exist – ideally ones with provable bounds on their use of time and space. In a recent paper Bennett and Parrado-Hernández (2006) describe the synergistic relationship between the fields of machine learning (ML) and mathematical programming (MP). They remark:

“Optimization lies at the heart of machine learning. Most machine learning problems reduce to optimization problems. Consider the machine learning analyst in action solving a problem for some set of data. The modeler formulates the problem by selecting an appropriate family of models and massages the data into a format amenable to modeling. Then the model is typically trained by solving a core optimization problem that Continue reading “Optimization, Adaptation, Machine Learning and Evolutionary Computation”


Critique of the Compositional Paradigm

Adaptation in selecto-recombinative genetic systems is widely believed to occur by the recombination of pre-adapted genetic material. This belief is at the core of the paradigm under which most GA and all EDA research currently occurs. It underlies the construction of several new varieties of genetic algorithms that purportedly work by combining pre-adapted genetic material in some sophisticated way (e.g. cohort GAs, messy GAs, LLGA, ECGA, BOA, hBOA, etc.).

In this paradigm, each post-selection population is thought to harbor “good” genetic material. Recombination operators, and estimation of distribution procedures, are thought to drive adaptation by composing this material to produce good or better individuals in the next generation. When adaptation stalls it is thought to be because “good” genetic material is unavailable, or because recombination of this material was not performed effectively.
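To make the compositional picture concrete, the prototypical recombination operator it appeals to is one-point crossover; a minimal sketch:

```python
import random

def one_point_crossover(parent_a, parent_b, rng=random):
    # Pick a cut point and swap tails -- the basic operation the compositional
    # view credits with splicing together "good" material from two parents.
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]
```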

Let us call this general set of beliefs the Compositional Paradigm. This paradigm draws its support from Holland’s Building Block Hypothesis. Its widespread acceptance in the GA community signals Continue reading “Critique of the Compositional Paradigm”
