The Dubious History of the Building Block Hypothesis

From the introduction of a manuscript that I recently submitted for review

Perceptions of the abilities and limitations of the simple genetic algorithm (SGA), and hence of the kinds of problems that it can and cannot solve, have been heavily influenced by a theory of adaptation called the building block hypothesis (Goldberg, 1989; Mitchell, 1996; Holland, 1975, 2000). This theory has its genesis in the following idea: maybe small groups of closely located co-adaptive alleles propagate within an evolving population of genomes in much the same way that single adaptive alleles do in Fisher’s theories of sexual evolution (Fisher, 1958). Holland called such groups of alleles building blocks. This idea can be taken one step further: maybe small groups of co-adaptive building blocks propagate within an evolving population of genomes in much the same way that single building blocks do. Such groups can be thought of as higher-level building blocks. Pursuing this idea to the fullest extent, maybe co-adaptive groups of higher-level building blocks propagate in much the same way as ordinary building blocks do to yield building blocks of an even higher level, and so on in hierarchical fashion, with the building blocks of each level being composed of co-adaptive groups of lower-level building blocks. Let us call this idea hierarchical building block assembly.

Holland (1975) saw in hierarchical building block assembly a way out of the problem that epistasis (Wolf et al., 2000) poses for Fisher’s theory of sexual evolution. He also believed that hierarchical building block assembly, if implemented efficiently, could serve as a useful problem-solving technique. He argued that a genetic algorithm that he called a genetic plan can implement hierarchical building block assembly, and moreover does so efficiently. He offered the genetic plan as a model of natural sexual evolution and also as a useful technique for finding solutions to adaptation problems with non-convex objective functions. The main theoretical tool that he used in his argument has come to be called the schema theorem (Goldberg, 1989; Mitchell, 1996). However, neither the schema theorem nor any of Holland’s other theoretical analyses fully supports his claim that simple genetic algorithms are capable of efficiently implementing hierarchical building block assembly. Given the boldness of his claim and the large leaps of intuition that Holland makes in order to support it, the absence of experimental support in Holland (1975) is rather conspicuous (even more so given that simple, computationally inexpensive, proof-of-concept experiments are not difficult to conceive of; see Mitchell et al., 1992, and Forrest and Mitchell, 1993). It would not have been surprising, therefore, if the genetic plan had been relegated to the history books as an algorithm that did not fulfill its raison d’être: to support its inventor’s hunch about the utility of hierarchical building block assembly as a theory of adaptation for natural sexual evolutionary systems, and to support its inventor’s hunch that hierarchical building block assembly can be efficiently implemented. What seems to have saved the SGA from this fate is the curious matter of its utility.

In the years following the publication of Holland’s seminal work (Holland, 1975), the SGA was successfully used to adapt high-quality solutions to different sorts of real-world and toy problems with non-convex objective functions. In an unfortunate twist of reasoning, hierarchical building block assembly became the de facto explanation for the success of the SGA. This explanation came to be called the building block hypothesis. Despite its name, the building block hypothesis was treated more as an assumption than as a hypothesis. Hierarchical building block assembly had aesthetic appeal, and the building block hypothesis had Holland’s unqualified endorsement (Holland, 1992). The building block hypothesis was therefore readily accepted by most within the GA community. Some even went so far as to tout the success of SGAs as evidence of the veracity of the building block hypothesis, or as evidence that hierarchical building block assembly is a useful search technique for a wide variety of search problems. Consider the following confused passage from one of the first textbooks on genetic algorithms:

“…the building block hypothesis has held up in many different problem domains. Smooth, unimodal problems, noisy multimodal problems, and combinatorial optimization problems have all been attacked successfully using virtually the same reproduction-crossover-mutation [S]GA.” (Goldberg, 1989)

The early support that the building block hypothesis enjoyed accounts for the deep impact it has had, and continues to have, on the course of research in genetic algorithms as well as in other fields of evolutionary computation such as genetic programming. Recently the building block hypothesis has been sharply criticized for lacking adequate theoretical support. The most forceful criticism that we are aware of has been leveled by Wright et al. (2003): “The various claims about [S]GAs that are traditionally made under the name of the building block hypothesis have, to date, no basis in theory, and, in some cases, are simply incoherent”. On the empirical side, experimental results have been obtained which straightforwardly cast doubt upon the ability of a simple genetic algorithm to efficiently implement hierarchical building block assembly (Mitchell et al., 1992; Forrest and Mitchell, 1993). In response to these experimental results a silent transition has occurred within the field of genetic algorithms: hierarchical building block assembly has gone from being thought of as the abstract process that SGAs implement to being thought of as a normative process that SGAs mis-implement. Even though this transition between intellectual positions is completely specious, it is now widely assumed that SGAs work because they manage to “fudge” hierarchical building block assembly. Many new genetic algorithms have been constructed to compensate for the perceived shortcomings of the GA, e.g. messy GA (Goldberg et al., 1989; Goldberg, 1989, 2002), LLGA (Harik and Goldberg, 1997; Goldberg, 2002), CGA (Harik et al., 1999), ECGA (Harik, 1999), cohort GA (Holland, 2000), FDA (Mühlenbein and Mahnig, 1999), LFDA (Mühlenbein and Mahnig, 2001), BOA (Pelikan et al., 1999; Goldberg, 2002), hBOA (Pelikan and Goldberg, 2001), SEAM (Watson, 2002, 2006), etc. The inventors of these algorithms claim, or at least imply, that their algorithms are better than the SGA at its own game: hierarchical building block assembly. In many circles within the GA community the curious matter of the frequent utility of SGAs is now considered closed.

For a case in point of the kind of sleight of hand that we are discussing, consider the following: conceding that there is little evidence that SGAs can efficiently and robustly implement hierarchical building block assembly, Holland (2000) remarks, “Are [S]GA’s, then, a robust approach to all problems in which building blocks play a key role? By no means! After years of investigation we still have only limited information about the [S]GA’s capabilities for exploiting building blocks”. Later he asserts that “the very essence of good GA design is retention of diversity, furthering exploration, while exploiting building blocks already discovered”, presents a new genetic algorithm, the Cohort Genetic Algorithm, and argues that it implements this essence (see Pei and Goodman, 2001, for evidence that it does not).

The field of genetic algorithms is both a scientific field and an engineering domain. Heedful science and meticulous engineering can often work synergistically. However, when the boundary between science and engineering begins to blur, dogma and misplaced faith can beleaguer the practice of both. To wit, a system that is useful in practice but does not implement a hypothesized mechanism may receive reduced attention, whereas the mechanism, far from being dismissed according to the basic norms of science, may become the holy grail of the engineering goals of the field.

A theory that explains why a system exhibits a particular behavior can influence perceptions of how the system can behave, and also of how it cannot. Of the two kinds of perceptions, the latter is often judged in retrospect to be the greater impediment to the discovery of a new theory that can explain and predict the behavior of the system with greater accuracy. This is because, by influencing perceptions of how the system cannot behave, a theory implicitly determines the “domain of the impossible”, and in doing so it steers researchers away from considering certain possibilities. Yet it is precisely amongst these “impossibilities” that the seeds of a new, more accurate theory often lie.

One of the two goals of this paper is to challenge the widespread belief that the SGA cannot increase the frequency of a low-order schema with above-average fitness when the defining length of that schema is high (i.e. when the defining bits of that schema are widely dispersed). This belief can be traced back to Holland’s original treatise on genetic algorithms (Holland, 1975) and goes hand in hand with belief in the building block hypothesis (and variations thereof). In Section 11 we provide an argument based on experimental evidence that this belief is misplaced. We believe that this errant belief will be judged in retrospect to have been a significant impediment to the discovery of a sound theory of adaptation for the SGA.
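
For readers unfamiliar with the terminology, the distinction between the order and the defining length of a schema can be illustrated with a minimal sketch. It assumes the standard textbook conventions: a schema is a string over 0, 1 and the wildcard *, its order is the number of non-wildcard positions, and its defining length is the distance between its first and last non-wildcard positions. The function names and the two example schemata are hypothetical and are not taken from the manuscript.

```python
def schema_order(schema: str) -> int:
    """Order: the number of defined (non-wildcard) positions."""
    return sum(1 for symbol in schema if symbol != "*")


def defining_length(schema: str) -> int:
    """Defining length: distance between the first and last defined positions."""
    defined = [i for i, symbol in enumerate(schema) if symbol != "*"]
    return defined[-1] - defined[0] if defined else 0


# Two illustrative schemata over 20-bit genomes, both of order 2.
compact = "11" + "*" * 18          # defining bits adjacent
dispersed = "1" + "*" * 18 + "1"   # defining bits at opposite ends

print(schema_order(compact), defining_length(compact))      # 2 1
print(schema_order(dispersed), defining_length(dispersed))  # 2 19
```

Both schemata have the same low order, but the second has a much higher defining length. It is schemata of this second kind that the belief in question holds the SGA cannot propagate.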
