Springer Handbook of Speech Processing


by J. Benesty, M. M. Sondhi, Y. Huang, eds.

Springer, 2008


Reviewed by:  Lawrence O’Gorman



This major textbook befits the mature status of the field of speech processing about 40 years into its digital lifetime. The book is “major” in many ways. It is 1176 pages long. There are 53 chapters, organized into 9 parts. The list of 83 contributing authors includes many who were originally involved in the field and many who are now at the forefront of research in the particular topic on which they contributed.

It is evident that books in the Springer Handbook series are meticulously formatted to function as efficiently as possible as a source that can be quickly accessed in many ways. Starting from the front inside flap, there are part summaries, along with each chapter title. The table of contents includes part title, chapter title, and titles of subsections. This table of contents runs 13 pages. Next there is an alphabetically ordered list of abbreviations, numbering over 500 entries. Following the book chapters is another table of contents, this one “detailed”, which includes subsections as well as sections. Following this is the subject index. Although authors of the chapters are listed with their biographies, noticeably absent is an author index that would include not only contributors but also entries for referenced papers. Finally, the back flap has a DVD-ROM containing the full, searchable contents of the book.

Part A begins with the basics of human speech production. It is entitled “Production, Perception, and Modeling of Speech”. The four chapters cover the physiological basics of speech production, cochlear speech and sound perception, and subjective and objective speech quality assessment methods.

Part B introduces speech signal processing methods. These include: Wiener and adaptive filters, linear prediction, Kalman filter, homomorphic and cepstral processing, pitch determination, formant estimation, Fourier transform, and multichannel identification.

Part C describes speech coding. The chapters cover: principles, voice-over-IP, low bit rate coding, analysis by synthesis coding, and perceptual audio coding.

Part D covers text-to-speech synthesis. The chapters cover: principles, rule-based synthesis, corpus-based synthesis, linguistic processing, prosodic processing, voice transformation, and expressive/affective synthesis.

Part E is a large section covering speech recognition. Chapters here are: history of automatic speech recognition and natural language understanding, hidden Markov models, weighted finite-state transducers, machine learning, toward superhuman recognition, natural language understanding, spontaneous speech recognition, environmental robustness, the business of speech technologies, and spoken dialogue systems.

Part F is on speaker recognition. Chapters are: overview, text-dependent recognition, and text-independent recognition.

Part G moves up from signals to language recognition. The chapters cover: principles, spoken language characterization, spectral- and token-based approaches, and vector-based classification.

Part H describes speech enhancement techniques. The chapters here are: fundamentals of noise reduction, spectral enhancement, adaptive echo cancellation, dereverberation, adaptive beamforming and postfiltering, feedback control in hearing aids, and active noise control.

Finally, Part I covers multichannel speech processing. The chapters are: microphone arrays, time delay estimation and source localization, convolutive blind source separation methods, and sound field reproduction.

This handbook provides a convenient way for anyone (graduate student, researcher, or practitioner) to own a single comprehensive reference book with which to begin in this field. Because of its breadth and depth, the book also offers non-beginners and speech experts deeper coverage of many topics. It well befits the large, technologically and commercially successful, and still progressing field of speech processing.


Book Reviews Published in the IAPR Newsletter


Numerical Recipes: The Art of Scientific Computing, 3rd ed.

by Press, Teukolsky, Vetterling and Flannery

             (see review in this issue)


Feature Extraction and Image Processing, 2nd ed.

by Nixon and Aguado

             (see review in this issue)


Digital Watermarking and Steganography: Fundamentals and Techniques

by Shih

             (see review in this issue)


Digital Image Processing: An Algorithmic Introduction Using Java

by Burger and Burge

             (see review in this issue)


Bézier and Splines in Image Processing and Machine Vision

by Biswas and Lovell

             (see review in this issue)


Practical Algorithms for Image Analysis, 2nd ed.

by  O’Gorman, Sammon and Seul

             Apr ‘08   [html]     [pdf]


The Dissimilarity Representation for Pattern Recognition:  Foundations and Applications

by Pekalska and Duin

             Apr ‘08   [html]     [pdf]


Handbook of Biometrics

by Jain, Flynn, and Ross (Editors)

             Apr ‘08   [html]     [pdf]


Advances in Biometrics: Sensors, Algorithms, and Systems

by Ratha and Govindaraju (Editors)

             Apr ‘08   [html]     [pdf]


Dynamic Vision for Perception and Control of Motion

by Dickmanns

             Jan ‘08   [html]     [pdf]



by Polanski and Kimmel

             Jan ‘08   [html]     [pdf]


Introduction to Clustering Large and High-Dimensional Data

by Kogan

             Jan ‘08   [html]     [pdf]


The Text Mining Handbook

by Feldman and Sanger

             Jan ‘08   [html]     [pdf]


Information Theory, Inference, and Learning Algorithms

by MacKay

             Jan ‘08   [html]     [pdf]


Geometric Tomography

by Gardner

           Oct ‘07   [html]     [pdf]


“Foundations and Trends in Computer Graphics and Vision”

Curless, Van Gool, and Szeliski, Editors

           Oct ‘07   [html]     [pdf]


Applied Combinatorics on Words

by M. Lothaire

           Jul ‘07    [html]     [pdf]



Human Identification Based on Gait

by Nixon, Tan and Chellappa

             Apr ‘07   [html]     [pdf]


Mathematics of Digital Images

by Stuart Hoggar

             Apr ‘07   [html]     [pdf]


Advances in Image and Video Segmentation

Zhang, Editor

             Jan ‘07 [html]      [pdf]


Graph-Theoretic Techniques for Web Content Mining

by Schenker, Bunke, Last and Kandel

             Jan ‘07 [html]      [pdf]


Handbook of Mathematical Models in Computer Vision

by Paragios, Chen, and Faugeras (Editors)

           Oct ‘06     [html]     [pdf]


The Geometry of Information Retrieval

by van Rijsbergen

           Oct ‘06     [html]     [pdf]


Biometric Inverse Problems

by Yanushkevich, Stoica, Shmerko and Popel

           Oct ‘06     [html]     [pdf]


Correlation Pattern Recognition

by Kumar, Mahalanobis, and Juday

           Jul. ‘06     [html]     [pdf]


Pattern Recognition 3rd Edition

by Theodoridis and Koutroumbas

           Apr. ‘06    [html]     [pdf]


Dictionary of Computer Vision and Image Processing

by R.B. Fisher, et al.

           Jan. ‘06    [html]     [pdf]


Kernel Methods for Pattern Analysis

by Shawe-Taylor and Cristianini

           Oct. ‘05    [html]     [pdf]


Machine Vision Books

           Jul. ‘05     [html]     [pdf]


CVonline:  an overview

           Apr. ‘05    [html]     [pdf]


The Guide to Biometrics by Bolle, et al

           Jan. ‘05    [html]     [pdf]


Pattern Recognition Books

           Jul. ‘04                  [pdf]