learning juntas

To appear in the proceedings of the Foundations of Genetic Algorithms Conference, 2015

Hypomixability Elimination in Evolutionary Systems

Abstract: Hypomixability Elimination is an intriguing form of computation thought to underlie general-purpose, non-local, noise-tolerant adaptation in recombinative evolutionary systems. We demonstrate that hypomixability elimination in recombinative evolutionary systems can be efficient by using it to obtain optimal bounds on the time and queries required to solve a subclass $(k=7, \eta=1/5)$ of a familiar computational learning problem: PAC-learning parities with noisy membership queries; where $k$ is the number of relevant attributes and $\eta$ is the oracle’s noise rate. Specifically, we show that a simple genetic algorithm with uniform crossover (free recombination) that treats the noisy membership query oracle as a fitness function can be rigged to PAC-learn the relevant variables in $O(\log (n/\delta))$ queries and $O(n \log (n/\delta))$ time, where $n$ is the total number of attributes and $\delta$ is the probability of error. To the best of our knowledge, this is the first time optimally efficient computation has been shown to occur in an evolutionary algorithm on a non-trivial problem.

The optimality result and indeed the implicit implementation of hypomixability elimination by a simple genetic algorithm depends crucially on recombination. This dependence yields a fresh, unified explanation for sex, adaptation, speciation, and the emergence of modularity in evolutionary systems. Compared to other explanations, Hypomixability Theory is exceedingly parsimonious. For example, it does not assume deleterious mutation, a changing fitness landscape, or the existence of building blocks.

Recently, I showed that a simple genetic algorithm with uniform crossover can solve a noisy learning 7-parity problem efficiently—in $O(n \textrm{ polylog}(n, \frac{1}{\epsilon}))$ time and $O(\textrm{polylog}(n, \frac{1}{\epsilon}))$ membership queries. Learning parities is a subclass of a more general problem—learning juntas—that is of considerable interest to computational learning theorists because it neatly abstracts the general problem of learning in the presence of large amounts of irrelevant information. This is a very common problem in practice. A good example comes, coincidentally, from the field of genetic epidemiology—learning the genetic loci participating in the penetrance of a particular disease. (More about this “coincidence” in a bit)

The implicit concurrency hypothesis posits that the efficient learning of relevant variables in the presence of large numbers of irrelevant variables is precisely what gives evolution its power; so, understandably, when I read about the learning juntas problem a few weeks ago (on this blog), I sensed the possibility of a connection. Formally, a $k$ -junta is a function $f$ over $n$ boolean variables, only $j\,\,(\leq k\leq n)$ of which matter (i.e. are relevant) in the determination of the output of $f$ . The $j$ relevant variables are said to be the juntas of $f$ , and the function $f$ is completely characterized by its juntas and by a hidden boolean function $h$ over $j$ inputs. The output of $f$ is just the output of $h$ on the values of the $j$ relevant variables of $f$ (the values of the irrelevant variables are simply ignored). The problem of identifying the relevant variables is called the learning juntas problem.

Observe how this problem abstracts the genetic epidemiology problem mentioned above. The junta are the genetic loci participating in the presence/absence of the disease. The hidden function specifies which configurations of alleles at these loci cause the disease, and which ones do not. The problem of learning the relevant loci is the problem of learning juntas. Also note that the learning juntas problem generalizes the learning parities problem straightforwardly. Whereas the hidden function in the case of learning parities must be the parity function, the hidden function in the case of learning juntas can be any boolean function over the juntas.

Here is an evolution based algorithm, written in Python, that learns juntas given a membership query oracle. And here is a tutorial on its usage. For any value of $k$ , the theory of implicit concurrency suggests that the $k$ -junta for which it is hardest for a genetic algorithm to learn one or more relevant attributes (Mossel et al observed in Learning Juntas that to crack the problem, asymptotically speaking, one only needs to find a single relevant attribute) is the $k$ -junta with a hidden parity function over $k$ variables. This insight and my previous work on learning parities suggests that the evolutionary algorithm presented can learn the relevant attributes of $k$ -juntas with small $k$ in $O(n\textrm{ polylog}(n, \frac{1}{\epsilon}))$ time and $O(\textrm{polylog}(n, \frac{1}{\epsilon}))$ queries.

Implications for the Neo-Darwinian Synthesis

It seems like more than a coincidence that

The Learning Juntas problem abstracts the genetic epidemiology problem mentioned earlier
Evolutionary computation can efficiently solve the Learning Juntas problem for small numbers of juntas.

One is hard pressed not to hypothesize that evolution takes the form it takes precisely because this form allows it to solve a problem that is similar to the genetic epidemiology problem in all significant ways, save two :

The problem is to learn the genetic loci participating in fitness/hotness not disease penetrance (often the same loci, but by no means always)
Unlike computational geneticists, evolution has access to an active oracle.

Given the straightforwardness of this idea it may seem odd that it has not been forthcoming from Evolutionary Biology itself; that is until one realizes that a deep-rooted practice in Population Genetics actively steers theorists away. A vestige from the days before computers, when pen, paper, and mathematics was all one had, the conflation of the unit of inheritance with the unit of selection casts a long shadow. More about this in Sections 2.4.1 and 5.1 of my dissertation.

What now?

Despite the routine use of evolutionary algorithms in science and engineering, computer scientists have been unable to give a straight answer to a simple question: “Is there anything computationally efficient about evolution?” This question can now be answered in the affirmative. Recombinative evolutionary algorithms are capable of efficiently solving a broad, non-trivial problem: Learning $k$ -Juntas with membership queries when $k$ is small.

While fascinating from a biological perspective, this answer is not entirely satisfying to the engineer in me because there are conventional algorithms that solve the problem under consideration as efficiently; for example, the binary search algorithm due to Blum and Langley described in Section 5 of the paper by Mossel et al. A more involved algorithm due to Feldman solves the problem almost as efficiently, and does so non-adaptively and in the presence of random persistent classification error.

Have modern ML techniques eclipsed evolution at the very thing it is good at?

I have a hunch that the answer is “no”. It would be fantastic to find generalizations of the learning juntas with membership queries problem that are more difficult for conventional algorithms, but remain efficiently solvable by evolution.

Relevant Computational Learning Theory papers
R.J. Lipton
[web] The Learning Juntas Problem
(An excellent introduction to the Learning Juntas problem)

E. Mossel, R. O’donnell, and R. Servedio
[pdf] Learning Juntas,
(Section 3.1 is especially relevant if you want to reverse engineering the posted Python algorithm)

A. Blum, L. Hellerstein, and N. Littlestone
[pdf] Learning in the Presence of Finitely or Infitely Many Irrelevant Attributes

V. Feldman
[pdf] Attribute-Efﬁcient and Non-adaptive Learning of Parities and DNF Expressions

Relevant papers on Implicit Concurrency
K.M. Burjorjee
[pdf] The Fundamental Learning Problem that Genetic Algorithms with Uniform Crossover Solve Efficiently and Repeatedly as Evolution Proceeds

K.M. Burjorjee
[pdf] Explaining Optimization in Genetic Algorithms with Uniform Crossover

Relevant papers on genetic epidemiology
J.H. Moore, L.W. Hahn, M.D. Ritchie, T.A. Thornton, and B.C. White
[web] Routine Discovery of Complex Genetic Models using Genetic Algorithms

J.H. Moore
[web] The ubiquitous nature of epistasis in determining susceptibility to common human diseases

Last Updated: September 18, 2013

evorithmics

Understanding and Harnessing the Computational Power of Evolution

Explaining Sex, Adaptation, Speciation, and the Emergence of Genetic Modularity

Efficiently Evo-learning Juntas With Membership Queries

Implications for the Neo-Darwinian Synthesis

What now?