learning parities

To appear in the proceedings of the Foundations of Genetic Algorithms Conference, 2015

Hypomixability Elimination in Evolutionary Systems

Abstract: Hypomixability Elimination is an intriguing form of computation thought to underlie general-purpose, non-local, noise-tolerant adaptation in recombinative evolutionary systems. We demonstrate that hypomixability elimination in recombinative evolutionary systems can be efficient by using it to obtain optimal bounds on the time and queries required to solve a subclass $(k=7, \eta=1/5)$ of a familiar computational learning problem: PAC-learning parities with noisy membership queries; where $k$ is the number of relevant attributes and $\eta$ is the oracle’s noise rate. Specifically, we show that a simple genetic algorithm with uniform crossover (free recombination) that treats the noisy membership query oracle as a fitness function can be rigged to PAC-learn the relevant variables in $O(\log (n/\delta))$ queries and $O(n \log (n/\delta))$ time, where $n$ is the total number of attributes and $\delta$ is the probability of error. To the best of our knowledge, this is the first time optimally efficient computation has been shown to occur in an evolutionary algorithm on a non-trivial problem.

The optimality result and indeed the implicit implementation of hypomixability elimination by a simple genetic algorithm depends crucially on recombination. This dependence yields a fresh, unified explanation for sex, adaptation, speciation, and the emergence of modularity in evolutionary systems. Compared to other explanations, Hypomixability Theory is exceedingly parsimonious. For example, it does not assume deleterious mutation, a changing fitness landscape, or the existence of building blocks.

A commonly asked question since the early days of evolutionary computation is “What are Genetic Algorithms good for?” What, in other words, is the thing that genetic algorithms (GAs) do well? The prevailing answer for almost two decades was “Hierarchical Building Block Assembly”. This answer, which goes by the name of the building block hypothesis, generated tremendous excitement in its day and helped jumpstart the field of genetic algorithmics. However it fell into disrepute during the ’90s because of a lack of theoretical support and (more damningly) anomalies in the empirical record.

Since then, The Question (“What are GAs good for?”) has gone unanswered—or, to be precise, remains poorly answered—The Building Block Hypothesis continues on as a placeholder in textbooks. More recently, The Question is increasingly going unasked by evolutionary computation theorists, in large part because EC theorists are now wary of entertaining claims about genetic algorithms that cannot be formally deduced, and because Genetic algorithms used in practice (ones with finite but non-unitary recombinative populations) have proven frustratingly resistant to conventional analytic approaches.

The following paper squares off with Das Question, and for genetic algorithms with uniform crossover (UGAs), ventures an answer: Implicit Concurrent Multivariate Effect Evaluation—implicit concurrency for short—is what UGAs are good for. Implicit concurrency is compared with implicit parallelism (the “engine of optimization” according to the building block hypothesis), and the former is revealed to be more powerful.

Empirical evidence for implicit concurrency can be found here and here. And additional empirical evidence is easily obtained for one-off problem instances—just run a genetic algorithm on the appropriately chosen instance. The paper below takes a different tack. It establishes theoretical support (across an infinite set of problem instances) for the claim that implicit concurrency is a form of efficient computational learning. It achieves this end by demonstrating that implicit concurrency can be used to obtain low bounds on the time and query complexity of a UGA based algorithm that solves a constrained version of a well recognized problem from the computational learning literature: Learning parities with a noisy membership query oracle.

The difficulty of formally analyzing genetic algorithms is circumvented with an empirico-symmetry-analytic proof comprised of

A formal part (no knowledge of genetic algorithms required)
An accessible symmetry argument
A simple statistical hypothesis testing based rejection of a null hypothesis with very high significance $p<10^{-100}$

The paper is here:
The Fundamental Learning Problem that GAs with UX Solve Efficiently and Repeatedly as Evolution Proceeds

The code used to conduct the experiment is here:
https://github.com/kburjorj/speedyGApy/blob/master/soda14.py

evorithmics

Understanding and Harnessing the Computational Power of Evolution

Explaining Sex, Adaptation, Speciation, and the Emergence of Genetic Modularity

Theoretical Bonafides For Implicit Concurrency