Keep it Pithy

William L. Weaver
Published in TL;DR Innovation
Feb 20, 2018 · 5 min read

Parallel Probabilistic Model Building Genetic Algorithms

Everyone needs a one-sentence resume (OSR). Those of you who are routinely visited by Senior Vice Presidents, Directors of Research, or very important clients constantly have your OSR at the ready. If you have yet to wield an OSR, craft one now so your response is accurate and succinct when the big boss asks, “What do you do?” Throngs of employees are performing equally valuable and varied functions for your organization, and the visiting dignitary has neither the time nor the requisite jargon to appreciate a 20-minute dissertation on the progress you made over the previous 20 minutes.

Image by PublicDomainPictures on Pixabay

Research and Development centers have individuals exploring multiple components of a large problem simultaneously, and it is management’s task to coordinate the investigations through an analysis of intermediate results. Genetic algorithms (GAs) utilize the same large-scale problem-solving paradigm when searching for an optimal solution to computationally complex problems. Akin to other error-space searching algorithms, including simplex, steepest descent, and simulated annealing, GAs propose candidates for the optimal solution, evaluate them to find the best solution yet, and generate additional candidates in the region of the error space around that minimum.

The “genetic” in GA is affixed to the algorithm for its emulation of biological reproduction. Proposed solutions are cast as fixed-length strings of variables or “available genes,” and the entire collection of strings resembles a population of chromosomes. Each chromosome is evaluated for its “fitness” in the error space, the most-fit solution having the lowest value of error. Traditional GAs generate new candidate solutions or “child chromosomes” by “mating” the most-fit solutions using the crossover and mutation processes of biological genetics.
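To make those mechanics concrete, here is a minimal sketch of a traditional GA in Python. The 8-bit target and toy error function are illustrative assumptions, not anything from the article:

```python
import random

# Toy error function: number of bits that miss an arbitrary 8-bit target.
# Lower error means fitter, matching the article's convention.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0]

def fitness(chromosome):
    return sum(a != b for a, b in zip(chromosome, TARGET))

def evolve(pop_size=100, length=8, generations=50, mutation_rate=0.01):
    population = [[random.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)           # most-fit (lowest error) first
        parents = population[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            mom, dad = random.sample(parents, 2)
            cut = random.randrange(1, length)  # single-point crossover
            child = mom[:cut] + dad[cut:]
            # flip each bit with a small mutation probability
            child = [bit ^ (random.random() < mutation_rate) for bit in child]
            children.append(child)
        population = children
    return min(population, key=fitness)

if __name__ == "__main__":
    print(evolve())
```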

Problems addressed by GAs have error spaces that are too large for an exhaustive search for the true global minimum; GAs instead provide “best available” solutions. Large chromosome populations help to avoid the traps of local minima and often lead to faster convergence on the best available solution, but they do so at the expense of requiring greater computing power. The rate-limiting step in GAs is most often the time needed to evaluate the fitness of each solution, not the production of new generations to explore.

One method used to alleviate the bottleneck is to have a master processor generate one large population and submit small groups of chromosomes to several other processors for fitness evaluation; those processors then report their scores back to the master. This method addresses the evaluation problem but creates a problem of communication bandwidth. For example, a parallel GA with a population of 1 million 1000-bit chromosomes must transmit over 1 Gigabit of data between its master and parallel slave processors for each generation, in a search that may require thousands of generations to converge on a solution.
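A rough sketch of that master-slave arrangement, with a hypothetical chunk_fitness() evaluator standing in for the real one, shows both the idea and the arithmetic behind the 1-Gigabit figure:

```python
import random
from concurrent.futures import ProcessPoolExecutor

def chunk_fitness(chunk):
    # hypothetical per-worker evaluator; placeholder scoring only
    return [sum(bits) for bits in chunk]

def evaluate_population(population, workers=4):
    # the master ships whole chromosome chunks out for scoring...
    size = max(1, len(population) // workers)
    chunks = [population[i:i + size] for i in range(0, len(population), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return [score for part in pool.map(chunk_fitness, chunks)
                for score in part]

if __name__ == "__main__":
    tiny = [[random.randint(0, 1) for _ in range(8)] for _ in range(100)]
    print(evaluate_population(tiny)[:5])
    # ...so every generation moves every bit of every chromosome at least once
    pop_size, chrom_bits = 1_000_000, 1_000
    print(pop_size * chrom_bits / 1e9, "Gigabits per generation")  # 1.0
```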

Fernando G. Lobo and colleagues at the University of Algarve in Portugal demonstrated a nifty solution to the bandwidth problem in a paper presented at the Genetic and Evolutionary Computation Conference (GECCO-2004) held this past June. They replace the traditional crossover and mutation GA with a relatively new animal known as a Probabilistic Model Building Genetic Algorithm or PMBGA. One such strain of PMBGA is the “compact GA” (cGA) co-developed by Lobo while a student at the Illinois Genetic Algorithms Laboratory (IlliGAL) of the University of Illinois at Urbana-Champaign.

IlliGAL showed that the child generation of possible solutions produced by multiple crossovers is statistically identical to a generation based solely on a probability model of the parent chromosomes. For example, given the following four 4-bit chromosomes to be mated, 1011, 1000, 1001, 1010, a probability model describing all four members is generated by calculating the probability that each bit will have the value of “1,” resulting in the four-value probability model 1.0, 0.0, 0.5, 0.5. Granted, there are only 16 possible solutions to this 4-bit example, but 1000-bit chromosomes have a tad more. The next generation, of any population size, can then be created based on the probability of each bit having a value of 1. The cGA model is initialized with a probability of 0.5 for each bit and converges when the model contains only probability values of 1.0 and 0.0.
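Here is that model building sketched in Python using the article’s own four-chromosome example; the sample() helper is an illustrative name, not something from the paper:

```python
import random

# The four parent chromosomes 1011, 1000, 1001, 1010 collapse into a
# single per-bit probability vector.
parents = [[1, 0, 1, 1], [1, 0, 0, 0], [1, 0, 0, 1], [1, 0, 1, 0]]

# probability that each bit position holds a 1 across the parents
model = [sum(column) / len(parents) for column in zip(*parents)]
print(model)  # [1.0, 0.0, 0.5, 0.5]

def sample(model, pop_size):
    # draw each bit independently with its modeled probability of being 1
    return [[int(random.random() < p) for p in model] for _ in range(pop_size)]

# a child generation of any size can be sampled from the model alone
children = sample(model, 8)

# the cGA starts every position at 0.5 and is converged once every
# probability has been driven to exactly 0.0 or 1.0
initial_model = [0.5] * len(model)
converged = all(p in (0.0, 1.0) for p in model)
```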

So what about that population of 1 million 1000-bit chromosomes that must be evaluated for fitness? Lobo suggests designating a single processor as the manager that keeps a single copy of the latest 1000-value model of the best population. If we use 20 bits to represent each floating-point probability value, the model consumes 20 kilobits of memory, down from the 1 Gigabit needed by the traditional GA. Each parallel worker processor requests a copy of the model, generates a large population of possible solutions, evaluates them for fitness, calculates the new probability model, and sends the difference between the new and original models back to the manager. The manager adjusts the master probability model using the difference and farms the updated model out at the request of the workers.
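A sketch of that exchange, with hypothetical Manager and worker_step() names and stubbed-in sample() and fitness() helpers, might look like this; only the 1000-value model, never the million-chromosome population, crosses the network:

```python
import random

CHROM_BITS = 1_000

def sample(model, pop_size):
    # draw each bit with its modeled probability of being 1, as sketched above
    return [[int(random.random() < p) for p in model] for _ in range(pop_size)]

def fitness(chromosome):
    # hypothetical stand-in evaluator; lower error = fitter, per the article
    return -sum(chromosome)

class Manager:
    """Keeps the single master copy of the probability model."""
    def __init__(self):
        self.model = [0.5] * CHROM_BITS

    def get_model(self):
        return list(self.model)              # ~20 kilobits at 20 bits/value

    def apply_difference(self, delta):
        # fold one worker's update into the master model, clamped to [0, 1]
        self.model = [min(1.0, max(0.0, p + d))
                      for p, d in zip(self.model, delta)]

def worker_step(manager, pop_size=10_000):
    model = manager.get_model()              # fetch the latest model...
    population = sample(model, pop_size)     # ...generate and score locally
    population.sort(key=fitness)
    best = population[: pop_size // 2]
    new_model = [sum(col) / len(best) for col in zip(*best)]
    delta = [new - old for new, old in zip(new_model, model)]
    manager.apply_difference(delta)          # only the difference goes back
```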

The ease of scalability is apparent: any number of workers can request the latest copy of the master model, calculate an update, and communicate it back to the manager asynchronously. Additional workers can join at any time, the system is fault tolerant to workers leaving or never returning an update, and no communication between workers is required. Additional testing of the parallel cGA will include a distributed Internet cluster analogous to the SETI@home project.

While I was working for the Air Force Research Laboratory in the early 1990s, a General from the Base Realignment and Closure Commission toured our fuels and combustion laboratory, looked at the millions of dollars in lasers, computers, optics, and scopes, and asked, “What does this lab do?” My answer: “We make planes fly faster and further.” Pithy works every time.

This material originally appeared as a Contributed Editorial in Scientific Computing and Instrumentation 21:11 October 2004, pg. 10.

William L. Weaver is an Associate Professor in the Department of Integrated Science, Business, and Technology at La Salle University in Philadelphia, PA USA. He holds a B.S. Degree with Double Majors in Chemistry and Physics and earned his Ph.D. in Analytical Chemistry with expertise in Ultrafast LASER Spectroscopy. He teaches, writes, and speaks on the application of Systems Thinking to the development of New Products and Innovation.
