Feature Article
In the course of recalling the evolution of her work toward applying pattern recognition to astronomy, the realization hit Tin Kam Ho like a bolt of lightning. “I find one continuing theme in my different application areas - they all have something to do with light! I started with optical character recognition, then continued with optical spectra, optical fiber design, optical network simulation, and eventually optical astronomy.” It is often the case that technologists must adapt and learn new areas to keep up with the fast pace of technology. It is a tribute to Tin’s knowledge in the field of pattern recognition and a testimonial to pattern recognition’s versatility that such diverse problems as these can be attacked with the same tools.
Tin joined Bell Labs in 1992. Her early work was on multiple classifiers, and this was applied to optical character recognition systems. This was valued work at Bell Labs in the early 1990s because it was applied to one of the parent company’s divisions, NCR, which built systems to manage paper documents for the financial and retail industries. In the mid-1990s, Tin became interested in another area, spectral analysis. She was able to apply this new interest to another NCR project, analyzing the spectra of different fruits presented to self-service checkout devices at retail stores. These two projects ended in 1997 with Lucent’s divestiture of NCR.
With her main customer gone and Lucent re-focusing on its telecommunications core competency, Tin was faced with the challenge of how to apply her expertise to aid the company. As Tin tried her hand at various telecommunication projects, a colleague, Lawrence Cowsar, was intrigued by her talk on clustering. Lawrence was working on a simulation tool for designing optical fibers at the time. He felt that the clustering work could help to sort groups in the design space and to understand their implications. Tin proceeded to build a graphical tool to help Lawrence see the design clusters and explore their correlations with computed fiber properties. The tool shortened the design time of the fiber considerably and led to a successful prototype. This started their collaboration, which has continued for several years. They worked on simulation of power control dynamics in sophisticated optical transport systems, and helped design and engineer a major optical network now being deployed in the US. But photonics is only part of the story we wish to tell here. While Tin was looking for projects that involved spectra, she became interested in light of a different nature.
IRAS (Infrared Astronomical Satellite) was a collaboration of the space agencies of the United States, the Netherlands, and the United Kingdom in the early 1980s to create an all-sky atlas of objects measurable by an infrared telescope carried by a satellite. Over 300,000 point sources of light were detected, and many were observed with a low-resolution spectrometer (LRS). A version of the LRS data went into the UC Irvine machine learning data repository, and attracted Tin’s attention. She tried her classifiers on the data, but she was unhappy with the results. Tin wanted to know more about the classes, so she pursued the story behind this dataset. She eventually found an expert, Kevin Volk in Canada, and the Netherlands team who created the instrument. Kevin and the team told Tin a lot more about the data, including a need to reprocess and recalibrate the spectra from the raw scans. This required significant learning in professional astronomy, and Tin needed a local mentor.
Tony Tyson was a scientist at Bell Labs (now at UC Davis) who learned of Tin’s interest and areas of knowledge and enthusiastically supported Tin’s application to his own area, astronomy. Tony was (and still is) a principal investigator on a project called the Deep Lens Survey, the purpose of which is to survey mass structures in the distant universe. An unknown form of matter, called dark matter, is not directly observable because it emits no light. Because it cannot be seen, one way to detect it is to measure how light from more distant galaxies is bent around the dark matter during its travel to our own Milky Way galaxy. So they captured images of far-away galaxies by taking very long and multiple exposures of locations in space, and looked for systematic shape distortions in the sample. The analysis first requires separating those galaxies from the foreground stars. Tony recommended that Tin work on this classification problem and suggested that she read Physical Universe – An Introduction to Astronomy in order to become familiar with this field.
What she learned intrigued her. Not just about the science, but about the astronomy community as well. She was astonished to find a worldwide group that was united in a pursuit to see literally beyond the stars. For instance, astronomical journal articles more than two years old were available online and free in a single source, NASA’s ADS (Astrophysics Data System) digital library. Recent articles could be obtained from the arXiv e-print service. Tin was amazed to see the entire literature available at her desktop. She was also amazed at the cohesiveness and singularity of effort of the astronomy community. An International Virtual Observatory Alliance was being built. The scientists were working as one to reveal the secrets of the universe. She leapt at the opportunity to participate in this community.
As Tin started, she found an immediate difficulty: there was no training data, as the objects most difficult to separate are the faintest and thus have never been seen before. Discrimination is nevertheless possible for a few reasons. Galaxies have shapes and spectral features that are different from point-function stars. In addition, there is an expectation of how many objects of each type should appear in each grade of brightness, and how the objects should be distributed in color space according to stellar evolution theory. In attempts to relate all the supporting evidence, Tin found the tools she was using were clumsy and inconvenient. This brings us back to the photonics part of the story.
Concurrent with this need for a better tool for astronomical image analysis came an urgent call for a tool to analyze the results of large photonics simulations. Tin decided to extend the pattern recognition functionality of her fiber design tool, and delivered Mirage 0.0 to her optical engineer colleagues. Mirage was designed for the needs of classification. It provides ways to view high-dimensional feature vectors and their classes in histograms, scatter plots, connecting structures of clusters, and minimum spanning trees, along with tools for navigating between clusters. This enables the optical engineers, as well as astronomers, to perform interactive pattern discovery by manipulating the multidimensional data and visualizing its interrelationships. Tin took the tool to the ADASS (Astronomical Data Analysis Software and Systems) conference in 2002. It became popular instantly and has had over 700 downloads to date.
Tin has found productive areas of research in both astronomical science and telecommunications technology. There is precedent for this combination at Bell Labs, where scientists won the Nobel Prize in Physics in 1978 for evidence supporting the Big Bang theory, work that resulted from investigating how background microwave radiation caused noise in communication systems. With this business environment, collaborators throughout the world, and a pattern recognition skill set, who knows what further problems Tin will be shedding light on?
Feature Articles from the IAPR Newsletter
Pattern Recognition in Origami, Jan. ‘05
ICPR2004 Invited Talks, Oct. ‘04
Pattern Recognition in Defense Applications, Jan. ‘04
Pattern Recognition in Maps, Sep. ‘03
Pattern Recognition in Security and Entertainment, Jun. ‘03
Pattern Recognition in Sports, Apr. ‘03
What is Pattern Recognition?, Winter 2003