I was born in Greece in 1934 to parents who had been born in Turkey and found themselves in Greece as a result of the Greek/Turkish wars and the infamous “population exchange” of 1924. I grew up listening to stories of the “lost fatherlands,” and I developed a keen interest in history. I might have become a historian if it were not for the poverty of my parents, a result of their disastrous relocation. Instead, I went into Engineering, the usual pathway to middle class status. I graduated in 1957, and I won a scholarship for graduate work at MIT. However, the Greek authorities would not give me a deferment for my military service, and I had, instead, to serve for two years in the Greek army. Fortunately, in 1961 I was able to start graduate work at the University of California at Berkeley. In 1964 I received my Ph.D. with a thesis in Control Theory. From there I accepted a position at Princeton University.
I had become interested in Pattern Recognition in the spring of 1964 when I took the course Learning Machines taught by Nils Nilsson. However, given the realities of academic life, I could not pursue my new interest right away. I continued to work on Control Theory, focusing on its applications in Biology. My main project had to do with mathematical models of Circadian Rhythms (biological clocks). This research eventually led to a book, Biological Oscillators: Their Mathematical Analysis (Academic Press, 1973). After I received tenure at Princeton in 1968, Pattern Recognition became my focus. I have narrated my transition into this field in the K. S. Fu Prize lecture that I gave in Barcelona in 2000. I will now cover my later years.
In 1980 I joined Bell Labs, and there I was able to focus on research without having to worry about writing proposals. I could ignore the prevailing wisdom, and I realized that there was something wrong with machine vision.
I was invited to give a lecture at the 1986 ICPR in Paris, and I chose to speak on "Why Progress in Machine Vision is so Slow." Most of the researchers in Machine Vision thought that I was too pessimistic, but my real views were even more negative. My pessimism was rooted in the realization that human vision is neither purely bottom-up nor purely top-down but, instead, cycles between bottom-up and top-down processes. The general idea goes back to Helmholtz, and there is a significant literature on this issue. I found (years later) the best exposition of this point of view in the book Phantoms in the Brain by V.S. Ramachandran where (on page 56) he states that "Perceptions emerge as a result of reverberations of signals between different levels of the sensory hierarchy, indeed across different senses." He then goes on to criticize the view that "sensory processing involves a one-way cascade of information (processing)."
Many machine vision researchers have tried to deal with this issue by imposing mathematical smoothness constraints on the results of bottom-up vision, but such constraints are at once too severe and too lax: too severe because they do not allow for discontinuities, and too lax because they may not capture all the constraints expected in the physical world. I think that until we understand such “middle vision” the general machine vision problem cannot be solved. As a result, the most promising research areas are in specific applications where there is a fairly complete model of the images under consideration. Research should also aim for high-performance solutions that, as my former colleague Henry Baird has pointed out, are needed if we are going to make credible claims that we have solved a problem.
I decided to focus on Document Image Analysis (mainly OCR) because the problems there are well constrained while still quite challenging. Ironically, there are claims that OCR is a solved problem.
I will agree with that view when I see recognition methods that, like humans, can be trained using only a handful of samples rather than the several thousand that are currently needed. After I left Bell Labs to join Stony Brook University I continued research in that field (supported by NSF, the U.S. Postal Service, and Ricoh) but also in other application areas. I had two projects on aerial image analysis from the aerospace industry: one from Lockheed on building detection and the other from Grumman on road detection. Since the two sponsors were competitors, one of the challenges was to keep the projects separate.
I also spent considerable time on problems related to bar codes in a collaborative effort with Symbol Technologies. I supervised three Ph.D. theses by people who were employed by the company while they were enrolled as graduate students at Stony Brook. The best known result of these collaborations is the development of the two-dimensional bar code PDF417 shown on the left. My student Y. P. Wang not only did the theoretical work leading to the development of the code but also led an engineering team that made the new code a product. However, the idea with the most widespread impact was to decode bar codes directly from gray scale without binarization. The power of that approach was demonstrated in the Ph.D. thesis of E. Joseph, but his method was not practical for the low-end processors used in scanners. S. J. Shellhammer and D. P. Goren developed another way to implement the concept, and that method is used in most bar code scanners today.
The reading of bar codes may appear to be a trivial problem until one realizes the demands of the application. Misreads (substitution errors) must not occur more than once in a million scans, while the rejection rate should be under 1%. In addition, there are severe constraints on the cost of the scanners, so the captured signal is quite blurred.
There is a vast distance between solving a problem in the laboratory and producing a solution that is viable for a commercial application. The success of bar code reading without binarization encouraged me to pursue text recognition without binarization. I supervised two Ph.D. theses on this methodology (and we published several papers), but I am not aware of the methodology being used in any commercial products.
For the last few years I have been able to find the time to satisfy my early interest in history, and I teach a two-semester course on Middle East History in the Osher Lifelong Learning Institute at Stony Brook. My engineering background is helping me to look for the deep currents of history. You can find out what I am currently up to from my web site theopavlidis.com.
Getting to know… Theo Pavlidis, IAPR Fellow
Other articles in the Getting to Know... Series:
How IAPR helped me to become Rector of the Belarusian National University by Sergey Ablameyko, IAPR Fellow
Getting to know...Herbert Freeman, IAPR Fellow
From Document Analysis to Anti-Phishing by Wenyin Liu, IAPR Fellow
In Memoriam…Piero Mussio, IAPR Fellow by Paolo Bottoni and Stefano Levialdi
Image Analysis with Discrete Tools by Gabriella Sanniti di Baja
Has the time for telepresence finally come? by Larry O’Gorman
Biometrics: The key to the gates of a secure and modern paradise by Nalini K. Ratha
Recognition of Human Activities: A Grand Challenge by J.K. Aggarwal
Why Applications are Important
By Theo Pavlidis, IAPR Fellow (USA)
Professor Theo Pavlidis, IAPR Fellow
ICPR 1994, Jerusalem, Israel
For contributions to computer graphics and image processing and service to IAPR
Theo Pavlidis received a Ph.D. in Electrical Engineering from the University of California at Berkeley in 1964. He was on the faculty of Princeton University during 1964-80, a member of the technical staff at AT&T Bell Labs during 1980-86, and on the faculty of Stony Brook University as a Leading Professor during 1986-1995 and as a Distinguished Professor during 1995-2001. During 2001-2002 he was chief computer scientist of Symbol Technologies. He is now Distinguished Professor Emeritus at Stony Brook University. He has consulted for numerous companies, including Symbol Technologies, Ricoh of Japan, AT&T Bell Labs, Datacopy, Exxon, and RCA. He became a fellow of IEEE (Institute of Electrical and Electronics Engineers) in 1979 and of IAPR (International Association for Pattern Recognition) in 1994. In 1999 he became a Life Fellow of IEEE. In 2000 IAPR awarded him the King-Sun Fu Prize for "fundamental contributions to the theory and methodology of structural pattern recognition." He has authored more than 150 technical papers. He also authored five books, co-edited three books, and received fifteen patents on various aspects of bar coding and document analysis. He is the co-inventor of the two-dimensional bar code PDF417. He was the editor-in-chief of the IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) from 1982 to 1986, and has been a member of the editorial board of many other journals, including the IEEE Proceedings. He was the program chairman of ICDAR'93 (Intern. Conf. on Document Analysis and Recognition) and has served as general chairman of the Fifth International Conference on Pattern Recognition (1980) and the 1988 IEEE Conference on Robotics and Automation. From 1993 to 1996 he was on the board of governors of the IEEE Computer Society.