Google, Inc., USA
Title: Advice to a Promising OCR Researcher
Document Analysis and Recognition remains a vibrant and challenging field that spans or touches several domains, including pattern recognition, computer vision, linguistics, digital humanities, and augmented reality. Probably most of the best work in this field remains to be done. That work will build on what came before -- on the techniques and understanding already achieved, but also on the best practices of our colleagues and predecessors. In this talk I'll try to reflect, as an OCR researcher, on some of the advice I've received from mentors, colleagues, and others in various places, including MIT, Xerox PARC, and Google. I'll present the ideas in the context of developing an Optical Character Recognition system at Google.
Ashok C. Popat received the SB and SM degrees from the Massachusetts Institute of Technology in Electrical Engineering in 1986 and 1990, and a PhD from the MIT Media Lab in 1997. He is a Research Scientist at Google in Mountain View, California. At Google he has worked on several projects, including Books, Translate, and (most recently) Optical Character Recognition (OCR). He is part of a team that has developed an OCR system that can handle more than 200 languages, many of which are currently supported through the Cloud Vision web-based API. Prior to joining Google in 2005 he worked at Xerox PARC with Gary Kopec and Henry Baird, on Document Image Decoding. Between 2002 and 2005 he was also a consulting assistant professor of Electrical Engineering at Stanford, where he co-taught (with Dan Bloomberg) a course "Electronic documents: paper to digital." He has also worked at Motorola, Hewlett Packard, PictureTel, and the EPFL in Switzerland. His areas of interest include signal processing, data compression, and pattern recognition. He enjoys running, skiing, sailing, hiking, and spending time with his wife and two daughters.