The past decade has witnessed an explosion in the amount and complexity of life sciences data, including DNA and protein sequences, gene and protein expression data, structures, pathways, genetic information, biomedical text, and molecular images. Although the analysis of these data involves pattern recognition and data mining, novel and efficient data analysis techniques have not yet realized their true potential. Bioinformatics can be viewed as the field of discovering knowledge from life sciences data with the aid of information technology, in order to answer unresolved problems in biology. One example of the benefits of bioinformatics research is the discovery of new drugs. A grand challenge in the post-genome era is to understand how the information stored in the genome, the blueprint of life, influences the intrinsic functions of living organisms.

 

The information stored in DNA, a chain of four nucleotides (A, T, G, and C), is first converted to mRNA through transcription and then converted to proteins, the functional form of life, through translation. The initiation of transcription or translation depends on the presence of specific patterns, or motifs, in DNA and RNA. Research on detecting specific patterns in DNA sequences, such as genes, protein-coding regions, and promoters, leads to a better understanding of the molecular-level function of the cell. Comparative genomics focuses on comparisons across genomes to find patterns conserved over the course of evolution, which are likely to have functional significance. Constructing evolutionary trees is useful for understanding how genomes and proteomes have evolved across species, by way of a complete library of motifs and genes.
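As a rough illustration of the flow from DNA to mRNA to protein, the following minimal Python sketch transcribes a DNA coding strand and translates the first open reading frame it finds. It assumes the standard genetic code and that the input is the coding (sense) strand, so transcription reduces to replacing T with U. The function names `transcribe` and `translate` and the example sequences are hypothetical and chosen only for illustration; real gene finding involves far more than locating the first start codon.

```python
# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES),
        AMINO_ACIDS,
    )
}


def transcribe(dna):
    """Return the mRNA for a DNA coding strand (assumption: sense strand given)."""
    return dna.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG start codon up to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy sequence, invented for this example.
mrna = transcribe("ATGGCCTAA")
protein = translate(mrna)
```

Here the start codon AUG plays the role of a very simple sequence pattern that triggers translation; the promoter and motif patterns discussed above are considerably more variable and generally call for statistical pattern recognition rather than exact string matching.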

 

A protein’s function, and its interaction with other proteins, is mainly determined by its 3-D structure. Predicting a protein’s 3-D structure from its 1-D amino-acid sequence remains an open problem in structural genomics. Protein-protein interactions, in turn, determine essential functions in living cells. Computational tools for modeling and visualizing the 3-D structures of proteins help biologists infer cellular activities.

 

The challenge in functional genomics is to analyze gene expression data accumulated by microarray techniques in order to discover clusters of co-regulated genes and, from them, gene regulatory networks, leading to an understanding of the regulatory mechanisms of genes and pathways. Molecular imaging provides techniques for in vivo sensing and imaging of molecular events, measuring biological processes in living organisms at the molecular and cellular level. Techniques for fusing and integrating the different kinds of information derived from different life sciences data have yet to realize their full potential.
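One simple way to picture the search for clusters of co-regulated genes is to group genes whose expression profiles are highly correlated across experiments. The sketch below is only an illustration under stated assumptions: it uses Pearson correlation with a fixed threshold and treats clusters as connected components (single linkage), which is one of many possible choices and not a method prescribed by this report. The gene names, expression values, and the 0.9 threshold are all made up for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)  # a constant profile would divide by zero


def cluster_by_correlation(profiles, threshold=0.9):
    """Link genes with correlation >= threshold; return connected components."""
    genes = list(profiles)
    parent = {g: g for g in genes}  # union-find forest

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for g in genes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical expression levels over four conditions (invented values).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 3.9, 6.2, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.1, 6.0, 3.9, 2.0],
}
clusters = cluster_by_correlation(profiles)
```

On this toy input, the two rising profiles end up in one group and the two falling profiles in another. Real microarray data are noisy and high-dimensional, so in practice normalization and more robust clustering methods would likely be needed before any regulatory network could be inferred.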

 

The ever-expanding body of biomedical and phenotype data, combined with genotype data, is becoming difficult to analyze with traditional text-based methods. Advanced data mining techniques, including the use of ontologies to construct precise descriptors of medical concepts and procedures, are required in the field of medical informatics. The vast amount of biological literature also poses new challenges for text mining. Text mining techniques, together with information fusion methods, could help find pathways and interaction networks.

 

The goal of TC20 is to bring together pattern recognition scientists and life scientists to find solutions to problems in bioinformatics and to foster multidisciplinary research in the pattern recognition community.  A workshop on Pattern Recognition Techniques for Bioinformatics is currently being planned.

 

Please visit the TC20 web site (http://www.cse.psu.edu/~acharya/IAPR/iapr.htm) for more information.

Technical Committee Report

TC20 Pattern Recognition for Bioinformatics

Raj Acharya

Chair, TC20

Jagath Rajapakse

Vice-Chair, TC20