Classification and Learning Using Genetic Algorithms: Applications in Bioinformatics and Web Intelligence


by Sanghamitra Bandyopadhyay and Sankar K. Pal

Springer, June 2007


Reviewed by: Zheng Liu (Canada)


A genetic algorithm (GA) is a search technique that can be applied to large, complex, and multimodal search spaces. It emulates biological principles, such as inheritance, mutation, selection, and crossover, to solve complex optimization problems. This 10-chapter book provides a framework describing how GAs can be applied to pattern recognition (PR) and learning systems. Each chapter consists of an introduction with background information, theoretical details, experimental results, and a summary. The chapters are well connected through their introductions, so readers can clearly see how the book as a whole develops. From the experimental results, readers can also learn how the GA-based methods compare with traditional PR methods.
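For readers who have not met a GA before, the loop of selection, crossover, and mutation can be sketched in a few lines. The following is only a minimal illustration under assumed settings, not code from the book: the function names, the bit-string encoding, the tournament selection, the parameter values, and the toy "count the ones" fitness are all chosen here for brevity.

```python
import random

def tournament(pop, fitness, rng):
    """Selection: pick two individuals at random and keep the fitter one."""
    a, b = rng.sample(pop, 2)
    return a if fitness(a) >= fitness(b) else b

def run_ga(fitness, n_bits=16, pop_size=30, generations=60,
           p_cross=0.8, p_mut=0.02, seed=0):
    """Maximize `fitness` over bit strings with a very small GA (illustrative)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        children = [best[:]]  # elitism: the best string always survives
        while len(children) < pop_size:
            p1 = tournament(pop, fitness, rng)
            p2 = tournament(pop, fitness, rng)
            if rng.random() < p_cross:
                # Single-point crossover: the child inherits from both parents.
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
    return best

def one_max(bits):
    """Toy fitness used only for this sketch: the number of 1 bits."""
    return sum(bits)

if __name__ == "__main__":
    best = run_ga(one_max)
    print(best, one_max(best))
```

Any other fitness function that scores a bit string can be passed in place of `one_max`; the rest of the loop stays the same.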

The first chapter gives a brief introduction to pattern recognition, which is helpful for readers without such a background. Chapters 3 to 8 describe the details of GAs and their use for classification, clustering, and multi-objective optimization. Two specific applications, learning in bioinformatics and web intelligence, are presented in the last two chapters.

The basic principles of GAs are described in Chapter 2. Chapter 3 discusses how a GA can be applied to one of the "classic problems" in pattern recognition, i.e., supervised classification. A GA can be used to facilitate fuzzy rule-based classification and to optimize the decision tree method. Moreover, a GA classifier can be built from the training samples by searching for a number of linear segments that form the boundaries between different classes and minimize the misclassification rate. It is interesting to see the performance comparison with some classical methods, such as the Bayes maximum likelihood classifier, the k-NN (k-nearest neighbor) classifier, and the MLP (multilayer perceptron). It is no surprise that the GA classifier is comparable to, or even better than, those methods. However, the string length H is a crucial parameter for good performance and so needs to be selected carefully.
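To make the idea of a boundary built from linear segments concrete, the sketch below scores one candidate set of hyperplanes by the number of training samples it misclassifies. It is an illustration under assumptions made here, not the authors' implementation: each hyperplane is a hypothetical `(weights, bias)` pair, each region cut out by the hyperplanes is labeled with the majority class of the training points inside it, and the four-point dataset is invented for the example. A GA would search over such hyperplane sets, using this count as the quantity to minimize.

```python
from collections import Counter, defaultdict

def region_of(point, hyperplanes):
    """Identify the region containing `point` by the side of each hyperplane it lies on."""
    return tuple(
        sum(w_i * x_i for w_i, x_i in zip(weights, point)) + bias >= 0
        for weights, bias in hyperplanes
    )

def misclassified(points, labels, hyperplanes):
    """Count samples that disagree with the majority class of their region."""
    by_region = defaultdict(Counter)
    for point, label in zip(points, labels):
        by_region[region_of(point, hyperplanes)][label] += 1
    return sum(sum(c.values()) - max(c.values()) for c in by_region.values())

# Invented XOR-like data: no single line separates the two classes.
points = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
labels = [0, 0, 1, 1]

one_plane = [((1.0, 0.0), -0.5)]                       # the line x = 0.5
two_planes = [((1.0, 0.0), -0.5), ((0.0, 1.0), -0.5)]  # x = 0.5 and y = 0.5

print(misclassified(points, labels, one_plane))   # 2
print(misclassified(points, labels, two_planes))  # 0
```

The toy data also hints at why the number of hyperplanes matters: one line is too few for this problem, while two are enough.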

Chapter 4 presents a theoretical analysis of the GA classifier in comparison with the Bayes classifier. For a Bayes classifier, the a priori probabilities and the class-conditional densities need to be known. In a practical application, however, these may not be available, and this is the gap that a GA classifier can fill.
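The dependence of the Bayes classifier on known priors and densities can be seen in a small sketch. The example below assumes, purely for illustration, one-dimensional Gaussian class-conditional densities with known means and standard deviations; the class names and numbers are made up. It picks the class with the largest product of prior and density.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def bayes_classify(x, priors, params):
    """Return the class maximizing prior * class-conditional density.

    `priors` maps class -> P(class); `params` maps class -> (mean, std).
    Both must be known in advance, which is the practical difficulty.
    """
    scores = {c: priors[c] * gaussian_pdf(x, *params[c]) for c in priors}
    return max(scores, key=scores.get)

# Assumed, illustrative class models.
params = {"a": (0.0, 1.0), "b": (4.0, 1.0)}

print(bayes_classify(1.0, {"a": 0.5, "b": 0.5}, params))  # a
print(bayes_classify(3.0, {"a": 0.5, "b": 0.5}, params))  # b
print(bayes_classify(2.0, {"a": 0.9, "b": 0.1}, params))  # a (prior breaks the tie)
```

If `priors` or `params` are unknown, nothing in this rule can be evaluated, whereas a boundary-searching classifier needs only the labeled training samples.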

The discussion of the importance of the string length H is continued in Chapter 5. An empirically chosen value of H may degrade the performance of a GA classifier, so this chapter describes an automatic evolutionary process for generating a value of H. With this value, both the number of misclassified samples and the number of hyperplanes are minimized. The concept of variable length strings in a GA is introduced, i.e., the length of the string is not fixed, from which the name “variable string length genetic algorithm” (VGA) is derived.

Chapter 6 describes the integration of variable length chromosomes and GA with chromosome differentiation (GACD), which results in a nonparametric VGACD classifier. The test results of classifying the SPOT image of Calcutta demonstrate the superiority of the VGACD classifier over the VGA classifier, Bayes classifier, and k-NN classifier.

In Chapter 7, a multi-objective GA-based classifier is described. Three optimization techniques, based on constrained elitist multi-objective GA (CEMOGA), the Pareto archived evolution strategy (PAES), and the non-dominated sorting GA (NSGA-II), are used to develop the multi-objective classifiers. The validation and test results presented in the experiments indicate that the GA-based multi-objective classifier outperforms the other multi-objective optimization techniques.
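Multi-objective methods of this kind rest on the notion of Pareto dominance: one solution dominates another if it is no worse in every objective and strictly better in at least one. The sketch below filters a list of candidates down to the non-dominated ones. It assumes, for illustration only, that each candidate is scored by a pair (misclassified samples, number of hyperplanes) with both values to be minimized; the scores are invented, and this is not the CEMOGA, PAES, or NSGA-II procedure itself.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in all objectives and better in one (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(solutions):
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]

# Invented scores: (misclassified samples, number of hyperplanes).
candidates = [(5, 1), (2, 3), (2, 4), (0, 6), (3, 3)]

print(pareto_front(candidates))  # [(5, 1), (2, 3), (0, 6)]
```

The three survivors represent different trade-offs between accuracy and classifier size; none of them is better than the others on both counts.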

Chapter 8 deals with another classical problem in pattern recognition, i.e., clustering or unsupervised classification. As in Chapter 3, the authors start with traditional methods such as K-means and fuzzy c-means clustering. Then the use of a GA to search for appropriate cluster centers is described. The details of GA-based approaches for crisp clustering and fuzzy clustering into a fixed or variable number of clusters are presented.
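A GA search for cluster centers can be sketched as follows. This is a simplified illustration for crisp clustering with a fixed number of clusters, written under assumptions made here rather than taken from the book: each chromosome is a list of `k` candidate centers, the cost is the sum of distances from each point to its nearest center, and the operators, parameter values, and toy data are all hypothetical choices.

```python
import math
import random

def clustering_cost(centers, points):
    """Sum of Euclidean distances from each point to its nearest center."""
    return sum(min(math.dist(p, c) for c in centers) for p in points)

def ga_cluster(points, k=2, pop_size=20, generations=50,
               p_mut=0.2, sigma=0.3, seed=0):
    """Search for k cluster centers with a small GA (illustrative only)."""
    rng = random.Random(seed)

    def cost(chromosome):
        return clustering_cost(chromosome, points)

    # Each chromosome starts as k distinct data points used as centers.
    pop = [rng.sample(points, k) for _ in range(pop_size)]
    best = min(pop, key=cost)
    for _ in range(generations):
        children = [best]  # elitism: never lose the best set of centers
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            p1 = min(rng.sample(pop, 2), key=cost)
            p2 = min(rng.sample(pop, 2), key=cost)
            # Crossover: take the first `cut` centers from one parent.
            cut = rng.randint(0, k)
            child = p1[:cut] + p2[cut:]
            # Mutation: jitter a center with Gaussian noise.
            child = [
                tuple(x + rng.gauss(0.0, sigma) for x in c)
                if rng.random() < p_mut else c
                for c in child
            ]
            children.append(child)
        pop = children
        best = min(pop, key=cost)
    return best, cost(best)

# Invented data: two well-separated square blobs.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]

if __name__ == "__main__":
    centers, total = ga_cluster(points)
    print(centers, total)
```

Unlike K-means, nothing here requires the cost to be differentiable or the update to have a closed form, so the same loop could in principle be pointed at a different clustering criterion.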

Two applications of GA-based methods are given in Chapters 9 and 10: one in bioinformatics and the other in web intelligence. Although few implementation details are provided, readers can learn how to solve a practical problem with GA-based approaches.

A flaw in an otherwise perfect book may be some of the figures, which are not uniformly formatted because of differing aspect ratios or the limits of the page size. However, this does not detract from the excellent content of the book. The book tries to balance theory, algorithms, and applications, and is a good reference for people who want to solve complex optimization problems in their own fields. As a reader, I am more curious about how to implement the GAs and how they perform on the datasets provided and used in this book, even without going through the equations. If the theories were demonstrated with code, using either commercial or open source software, this would be helpful to a novice. Some information or links about GA software, or the authors' own implementation in an appendix, would add value and be especially helpful to students. Overall, this book is well organized and well written. There is no doubt that this is another good pattern recognition reference to have on one’s bookshelf.




Book Reviews Published in

the IAPR Newsletter


Close Range Photogrammetry:  Principles, Methods, and Applications

by Luhmann, Robson, Kyle, and Harley

             (see review in this issue)


Learning Theory:

An Approximation Theory Viewpoint

by Cucker and Zhou

             (see review in this issue)


Character Recognition Systems—A Guide for Students and Practitioners

by Cheriet, Kharma, Liu, and Suen

             (see review in this issue)


Geometry of Locally Finite Spaces

by Kovalevsky

             (see review in this issue)


Machine Learning in Document Analysis and Recognition

by Marinai and  Fujisawa (Editors)

             (see review in this issue)


From Gestalt Theory to Image Analysis—A Probabilistic Approach

by Desolneux, Moisan, and Morel

             (see review in this issue)


Numerical Recipes:  The art of scientific computing, 3rd ed.

by Press, Teukolsky, Vetterling and Flannery

             Jul ‘08    [html]     [pdf]


Feature Extraction and Image Processing, 2nd ed.

by Nixon and Aguado

             Jul ‘08    [html]     [pdf]


Digital Watermarking and Steganography:

Fundamentals and Techniques

by Shih

             Jul ‘08    [html]     [pdf]


Springer Handbook of Speech Processing

by Benesty, Sondhi, and Huang, eds.

             Jul ‘08    [html]     [pdf]


Digital Image Processing: An Algorithmic Introduction Using Java

by Burger and Burge

             Jul ‘08    [html]     [pdf]


Bézier and Splines in Image Processing and Machine Vision

by Biswas and Lovell

             Jul ‘08    [html]     [pdf]


Practical Algorithms for Image Analysis, 2nd ed.

by  O’Gorman, Sammon and Seul

             Apr ‘08   [html]     [pdf]


The Dissimilarity Representation for Pattern Recognition:  Foundations and Applications

by Pekalska and Duin

             Apr ‘08   [html]     [pdf]


Handbook of Biometrics

by Jain, Flynn, and Ross (Editors)

             Apr ‘08   [html]     [pdf]


Advances in Biometrics –

Sensors, Algorithms, and Systems

by Ratha and Govindaraju, (Editors)

             Apr ‘08   [html]     [pdf]


Dynamic Vision for Perception and Control of Motion

by Dickmanns

             Jan ‘08   [html]     [pdf]



by Polanski and Kimmel

             Jan ‘08   [html]     [pdf]


Introduction to clustering large and high-dimensional data

by Kogan

             Jan ‘08   [html]     [pdf]


The Text Mining Handbook

by Feldman and Sanger

             Jan ‘08   [html]     [pdf]


Information Theory, Inference,

and Learning Algorithms

by MacKay

             Jan ‘08   [html]     [pdf]


Geometric Tomography

by Gardner

           Oct ‘07   [html]     [pdf]


“Foundations and Trends in Computer Graphics and Vision”

Curless, Van Gool, and Szeliski, Editors

           Oct ‘07   [html]     [pdf]


Applied Combinatorics on Words

by M. Lothaire

           Jul ‘07    [html]     [pdf]



Human Identification Based on Gait

by Nixon, Tan and Chellappa

             Apr ‘07   [html]     [pdf]


Mathematics of Digital Images

by Stuart Hoggar

             Apr ‘07   [html]     [pdf]


Advances in Image and Video Segmentation

Zhang, Editor

             Jan ‘07 [html]      [pdf]


Graph-Theoretic Techniques for Web Content Mining

by Schenker, Bunke, Last and Kandel

             Jan ‘07 [html]      [pdf]


Handbook of Mathematical Models in Computer Vision

by Paragios, Chen, and Faugeras (Editors)

           Oct ‘06     [html]     [pdf]


The Geometry of Information Retrieval

by van Rijsbergen

           Oct ‘06     [html]     [pdf]


Biometric Inverse Problems

by Yanushkevich, Stoica, Shmerko and Popel

           Oct ‘06     [html]     [pdf]


Correlation Pattern Recognition

by Kumar, Mahalanobis, and Juday

           Jul. ‘06     [html]     [pdf]


Pattern Recognition 3rd Edition

by Theodoridis and Koutroumbas

           Apr. ‘06    [html]     [pdf]


Dictionary of Computer Vision and

Image Processing

by R.B. Fisher, et al.

           Jan. ‘06    [html]     [pdf]


Kernel Methods for Pattern Analysis

by Shawe-Taylor and Cristianini

           Oct. ‘05    [html]     [pdf]


Machine Vision Books

           Jul. ‘05     [html]     [pdf]


CVonline:  an overview

           Apr. ‘05    [html]     [pdf]


The Guide to Biometrics by Bolle, et al

           Jan. ‘05    [html]     [pdf]


Pattern Recognition Books

           Jul. ‘04                  [pdf]