Computing Reviews
Classification and learning using genetic algorithms: applications in bioinformatics and Web intelligence (Natural Computing Series)
Bandyopadhyay S., Pal S., Springer-Verlag New York, Inc., Secaucus, NJ, 2007. 311 pp. Type: Book (9783540496069)
Date Reviewed: Feb 8 2008

This is an ambitious book, a “treatise and a unified framework.” It tries to simultaneously be an introduction, a theoretical approach, and an exemplification. In my view, it also serves as a review of the growing literature on genetic algorithms (GAs).

GAs emulate biological principles to solve demanding optimization and search problems. Each encoded candidate solution of the problem is called a “chromosome.” A GA approaches the optimum solution via sampling, by applying stochastic operations to chromosomes (such as selection, mutation, and crossover) and by using one or more fitness functions as optimization criteria.
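
The loop described above can be sketched in a few lines. This is a minimal illustration, not the book's implementation: the bit-string encoding, binary tournament selection, single-point crossover, elitism, the toy `one_max` fitness, and all parameter values are assumptions chosen for brevity.

```python
import random

def run_ga(fitness, n_bits=20, pop_size=30, generations=60,
           p_crossover=0.8, p_mutation=0.02, seed=0):
    """Minimal generational GA over fixed-length bit strings (illustrative only)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def select():
        # Binary tournament selection: the fitter of two random chromosomes wins.
        a, b = rng.choice(pop), rng.choice(pop)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        next_pop = [best[:]]  # elitism: carry the best chromosome forward
        while len(next_pop) < pop_size:
            p1, p2 = select(), select()
            child = p1[:]
            if rng.random() < p_crossover:
                cut = rng.randint(1, n_bits - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation, applied independently to each gene.
            child = [1 - g if rng.random() < p_mutation else g for g in child]
            next_pop.append(child)
        pop = next_pop
        best = max(pop, key=fitness)
    return best

def one_max(chromosome):
    """Toy fitness (an assumption for this sketch): the number of 1 bits."""
    return sum(chromosome)

best = run_ga(one_max)
print(one_max(best), best)
```

Because the best chromosome is copied into each new generation, the best fitness never decreases from one generation to the next; different seeds may still give different final results.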

The book is well organized; each chapter begins with an introduction and ends with a summary.

The first chapter introduces the context in which GAs are applied: pattern recognition, classification, and clustering. Basic concepts described here include nearest-neighbor rules, the Bayesian maximum likelihood classifier, the multilayer perceptron (MLP), and fuzzy sets.

The second chapter, “Genetic Algorithms,” describes the different steps involved in GAs (definition of chromosomes and of genetic operators on chromosomes), and sketches the proof of the schema theorem. A schema is a subset of strings (chromosomes) with similarities at certain string positions. The theorem is fundamental: it proves that GAs are convergent to the optimal solution under certain conditions.
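
For readers who have not met it, the schema theorem is commonly quoted in the following textbook form. The notation is an assumption and may differ from the book's: $m(H,t)$ is the number of chromosomes matching schema $H$ at generation $t$, $f(H,t)$ their average fitness, $\bar{f}(t)$ the population average fitness, $\delta(H)$ the defining length of $H$, $o(H)$ its order, $l$ the string length, and $p_c$, $p_m$ the crossover and mutation probabilities.

```latex
E\bigl[m(H,t+1)\bigr] \;\ge\; m(H,t)\,\frac{f(H,t)}{\bar{f}(t)}
\left[\,1 - p_c\,\frac{\delta(H)}{l-1} - o(H)\,p_m \right]
```

Read this way, short, low-order schemata with above-average fitness are expected to receive an increasing number of copies in successive generations.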

The third chapter is called “Supervised Classification Using GAs.” The basic classification principle used in GA classifiers is the approximation of the class boundaries using a fixed or a variable number of hyperplanes (or higher-order surfaces), such that one or more objective criterion functions are optimized. In this chapter, a fixed number of hyperplanes and fixed-length chromosomes are used. The chapter ends with clear experimental results: GAs outperform nearest-neighbor and MLP classifiers.
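
To make the hyperplane idea concrete, here is a minimal sketch of one possible fitness computation. It assumes each chromosome decodes to a list of hyperplanes given as (weights, bias) pairs and that each region cut out by the hyperplanes is labeled with the majority class of the training points inside it; the book's actual encoding and labeling rule may differ, and the toy data are made up.

```python
from collections import Counter, defaultdict

def region_of(point, hyperplanes):
    """Sign pattern of a point with respect to each hyperplane (w, b)."""
    return tuple(
        sum(wi * xi for wi, xi in zip(w, point)) + b >= 0
        for w, b in hyperplanes
    )

def misclassified(hyperplanes, points, labels):
    """Count training points that disagree with the majority label of their region."""
    regions = defaultdict(Counter)
    for x, y in zip(points, labels):
        regions[region_of(x, hyperplanes)][y] += 1
    return sum(sum(c.values()) - max(c.values()) for c in regions.values())

# Hypothetical 2-D data: class 0 on the left, class 1 on the right.
points = [(0.0, 0.0), (0.2, 1.0), (1.0, 0.1), (1.2, 0.9)]
labels = [0, 0, 1, 1]

good = [((1.0, 0.0), -0.6)]  # vertical line x = 0.6 separates the classes
bad = [((0.0, 1.0), -0.5)]   # horizontal line y = 0.5 mixes them

print(misclassified(good, points, labels))  # 0
print(misclassified(bad, points, labels))   # 2
```

A GA would then search over the hyperplane parameters to minimize this count.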

“Theoretical Analysis of the GA-Classifier” is the fourth chapter. It establishes, with mathematical proofs, the relationship between the GA classifier and the Bayesian classifier. The theoretical findings are abundantly supported by empirical results obtained from the application of GAs to four data sets.

The next chapter, “Variable String Lengths in GA-Classifier,” explores the case of a variable number of hyperplanes. The fitness function in this case also includes the minimization of the number of separating hyperplanes. A short proof shows that, under certain conditions, the variable string length GA (VGA) provides the minimum number of hyperplanes as the number of iterations tends to infinity. Interestingly, a VGA is also used to design the MLP. The goal is to eliminate the need to train the MLP, deriving the MLP architecture and the connection weights from the VGA classification.

Chapter 6 is called “Chromosome Differentiation in VGA-Classifier.” Chromosome differentiation involves using male and female classes of chromosomes, and changing the evolution rules to enforce genetic modifications from sets of chromosomes as far apart as possible, similar to sexual differentiation in nature. This apparently reduces the computation time and improves performance. The schema theorem from chapter 2 is adapted for chromosome differentiation, and the approach is exemplified using satellite image data of Calcutta.

Chapter 7, “Multiobjective VGA-Classifier and Quantitative Indices,” considers the following objectives: the number of misclassified samples, the number of hyperplanes, and the product of the class-wise correct recognition rates. Different multiobjective optimization techniques can be applied, and some real-life data examples justify which one should be favored.
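
The trade-off between these three objectives can be illustrated with a plain Pareto-dominance check. This is a generic sketch, not any of the specific multiobjective techniques compared in the book; the candidate classifiers and their scores are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    # Put all objectives in minimization form by negating the rate product.
    ka = (a["miss"], a["planes"], -a["rate_product"])
    kb = (b["miss"], b["planes"], -b["rate_product"])
    return (all(x <= y for x, y in zip(ka, kb))
            and any(x < y for x, y in zip(ka, kb)))

def pareto_front(candidates):
    """Candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates)]

# Hypothetical classifiers: misclassified samples, hyperplanes used,
# and product of class-wise correct recognition rates.
candidates = [
    {"name": "A", "miss": 5, "planes": 3, "rate_product": 0.80},
    {"name": "B", "miss": 3, "planes": 6, "rate_product": 0.85},
    {"name": "C", "miss": 6, "planes": 4, "rate_product": 0.70},
    {"name": "D", "miss": 3, "planes": 7, "rate_product": 0.85},
]

print([c["name"] for c in pareto_front(candidates)])  # ['A', 'B']
```

Here A and B are both retained because neither is better on all three objectives, which is why a multiobjective method returns a set of trade-off solutions rather than a single winner.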

Chapter 8, “Genetic Algorithms in Clustering,” deals with unsupervised classification, which is suitable when data is not labeled. It starts with examples of the classical k-means, single-linkage, and fuzzy c-means algorithms. The authors consider GAs and VGAs for both crisp and fuzzy data, with both a fixed and a variable number of clusters. The examples are, again, very helpful.
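
As a rough illustration of how a GA can be pointed at a crisp clustering problem, the sketch below assumes a chromosome is simply a list of cluster-center coordinates and scores it by the total distance from each point to its nearest center. Both the encoding and the cost function are assumptions for this example, and the data are made up.

```python
import math

def clustering_cost(centers, points):
    """Sum of distances from each point to its nearest encoded center (lower is better)."""
    return sum(min(math.dist(p, c) for c in centers) for p in points)

# Hypothetical 2-D data with two obvious groups.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]

chromosome_a = [(0.0, 0.5), (5.0, 5.5)]  # one center per group
chromosome_b = [(2.5, 3.0), (2.5, 3.0)]  # both centers between the groups

print(clustering_cost(chromosome_a, points))
print(clustering_cost(chromosome_b, points))
```

The GA would minimize this cost (or maximize its reciprocal as fitness); a variable-length chromosome would let the number of centers change as well.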

In chapter 9, “Genetic Learning in Bioinformatics,” beginners might get a glimpse into the vast area of bioinformatics. The topics covered, each with specific algorithms (some GA based and some not), include: multiple sequence alignment; gene mapping on chromosomes; promoter identification in DNA sequences; gene regulatory network identification; construction of phylogenetic trees; DNA/RNA/protein structure prediction; and molecular design and docking.

A short last chapter, “Genetic Algorithms and Web Intelligence,” contains an introduction to Web intelligence and mining problems (huge, distributed, semi-structured, time varying, and high dimensional). The review of the literature reveals the potential of GAs for search and retrieval, query optimization, and reformulation.

The authors recognize the limits of GAs and the research areas that need improvement. First, GAs are not efficient enough for the scale of the problems (the design of problem-specific mutation, selection, and crossover operators is necessary). Second, they require extensive experimentation to specify several parameters (such as chromosome length and mutation probabilities). Third, they involve a large degree of randomness, and different runs may produce different results (it is necessary to incorporate additional problem-specific domain knowledge to reduce the randomness and the computational time).

Overall, this is a good introductory book in an area that seems to have passed its infancy.

Reviewer:  Adrian Pasculescu Review #: CR135247 (0811-1061)
 
Categories: Pattern Recognition (I.5); Applications (I.5.4); Clustering (I.5.3); Learning (I.2.6); Miscellaneous (I.5.m); Models (I.5.1)

Reproduction in whole or in part without permission is prohibited.   Copyright 1999-2024 ThinkLoud®