Dependency and Structure in Pattern Recognition
Professor Robert Haralick, Distinguished Professor, Computer Science Department, Graduate Center, City University of New York.
Abstract: Dependency has multiple forms, each of which in its own way restricts possibilities. If x is a discretely valued random N-tuple (an N×1 vector) governed by a probability distribution P, the uncertainty about a random value of x is the entropy associated with P. If the uncertainty is less than the maximal possible uncertainty for x, that decrease in uncertainty induces a stochastic dependency among the components of x and/or makes certain values of certain components more or less probable.
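As a small illustration of the entropy argument above, the following sketch compares three distributions over a pair of bits. The helper names (`entropy`, `marginal`, `total_correlation`) are chosen for this example only and are not taken from the talk. The uniform pair attains the maximal entropy of 2 bits; the other two fall below it, one because the components are dependent and one because certain component values are simply more probable.

```python
import math
from itertools import product

def entropy(p):
    """Shannon entropy, in bits, of a dict mapping outcomes to probabilities."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

def marginal(p, i):
    """Marginal distribution of component i of a joint distribution over tuples."""
    m = {}
    for x, q in p.items():
        m[x[i]] = m.get(x[i], 0.0) + q
    return m

def total_correlation(p, n):
    """Sum of marginal entropies minus joint entropy; zero iff components are independent."""
    return sum(entropy(marginal(p, i)) for i in range(n)) - entropy(p)

# Maximal uncertainty: two independent fair bits.
uniform = {x: 0.25 for x in product((0, 1), repeat=2)}
# Reduced uncertainty through dependency: the second bit always equals the first.
coupled = {(0, 0): 0.5, (1, 1): 0.5}
# Reduced uncertainty without dependency: independent but biased bits.
bias = {0: 0.9, 1: 0.1}
skewed = {(a, b): bias[a] * bias[b] for a, b in product((0, 1), repeat=2)}

for name, dist in [("uniform", uniform), ("coupled", coupled), ("skewed", skewed)]:
    print(name, round(entropy(dist), 4), round(total_correlation(dist, 2), 4))
```

The total correlation used here is one common way to measure how much of the entropy deficit is due to dependency among components; it is an assumption of this sketch, not necessarily the measure used in the talk.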
The most visual case of stochastic dependency occurs in the image domain. Any patch of an image that shows a texture is a region having a stochastic dependency among the pixel values of the patch. The gray-level co-occurrence matrix captures a second-order dependency among the values of neighboring pixels. Functionals of the co-occurrence matrix can be used as features for distinguishing one texture from another. We show how the commonly used functionals can be improved by putting the co-occurrence matrix into the setting of graphical models and how it is possible to produce a texture transform image. Each pixel in a texture transform image gives the joint occurrence probability of the neighborhood values centered around that pixel. The size of the neighborhood can be arbitrary.
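For readers unfamiliar with the co-occurrence matrix, a minimal pure-Python sketch follows. It builds a symmetric, normalized co-occurrence matrix for one pixel offset and evaluates two of the classic functionals (contrast and energy). The function names and the toy striped image are assumptions made for this example; the graphical-model improvements and the texture transform image described in the talk are not attempted here.

```python
def cooccurrence(image, dr, dc, levels):
    """Symmetric, normalized gray-level co-occurrence matrix for offset (dr, dc).

    `image` is a list of rows of integer gray levels in range(levels).
    """
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1.0
                counts[b][a] += 1.0  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def contrast(p):
    """Expected squared gray-level difference between the paired pixels."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))

def energy(p):
    """Angular second moment: sum of squared co-occurrence probabilities."""
    return sum(v * v for row in p for v in row)

# Toy texture: vertical stripes of gray levels 0 and 1.
stripes = [[0, 1, 0, 1] for _ in range(4)]
across = cooccurrence(stripes, 0, 1, levels=2)  # horizontal neighbor, crosses stripes
along = cooccurrence(stripes, 1, 0, levels=2)   # vertical neighbor, stays in a stripe
print(contrast(across), contrast(along))
```

On this toy image the contrast is high for the offset that crosses the stripes and zero for the offset that runs along them, which is the kind of directional dependency such features are meant to expose.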
Another class of dependencies is driven by relationships, for example spatial relationships. A consistent labeling approach permits recognizing objects that are always in a legal spatial relationship. We review how a consistent labeling problem can be posed and then define a relation join-decomposition generalization. Solving the relation join-decomposition is equivalent to learning relationships in data. It is interesting that learning relationships necessarily involves learning relationships of no-influence. No-influence obeys the same semi-graphoid properties that conditional independence obeys.
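To make the consistent labeling idea concrete, here is a hedged backtracking sketch for the simplest case: a set of units, a set of labels, and a binary relation listing which label pairs are legal for each constrained pair of units. All names and the three-region "above" example are hypothetical, and the sketch covers only basic consistent labeling, not the relation join-decomposition generalization discussed in the talk.

```python
def consistent_labelings(units, labels, constrained_pairs, legal):
    """Yield every labeling in which each constrained unit pair has a legal label pair.

    constrained_pairs: iterable of (unit_a, unit_b) tuples.
    legal: set of (label_a, label_b) tuples allowed for any constrained pair.
    """
    assignment = {}

    def extend(i):
        if i == len(units):
            yield dict(assignment)
            return
        unit = units[i]
        for label in labels:
            assignment[unit] = label
            # Check only the constraints whose two units are both labeled so far.
            ok = all(
                (assignment[a], assignment[b]) in legal
                for a, b in constrained_pairs
                if a in assignment and b in assignment
            )
            if ok:
                yield from extend(i + 1)
            del assignment[unit]  # backtrack

    yield from extend(0)

# Hypothetical scene: three stacked image regions and an "is above" relation.
units = ["top", "middle", "bottom"]
labels = ["sky", "tree", "grass"]
constrained_pairs = [("top", "middle"), ("middle", "bottom")]
legal_above = {("sky", "tree"), ("sky", "grass"), ("tree", "grass")}

solutions = list(consistent_labelings(units, labels, constrained_pairs, legal_above))
print(solutions)
```

In this toy scene the spatial relation alone pins down a single interpretation, since the middle region must be something that can sit both below one region and above another.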
Structure specifies the organization of data. One level of organization is the partitioning of data into clusters, within each of which the points are similar to one another. The classic K-means algorithm uses a structure of zero-dimensional manifolds: similarity is specified by the distance of a point to the cluster center. Clustering can also be done where the ideal of each cluster is a k-dimensional manifold, where k can vary from cluster to cluster. Here similarity can be specified by the distance of the point to the manifold. We give an algorithm for doing such clustering and show how it can be used prior to doing regression to get more meaningful regression relationships.
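The contrast between distance to a center and distance to a manifold can be sketched with a deliberately simplified case: 2-D points, every cluster modeled as a line (k = 1 for all clusters), and a K-means-style alternation between fitting and reassignment. This is an illustrative stand-in, not the algorithm presented in the talk; the function names, the initial labeling, and the toy data are all assumptions. Like K-means, the alternation only finds a local optimum and depends on the starting labels.

```python
import math

def fit_line(points):
    """Total-least-squares line through 2-D points: (centroid, unit direction)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    syy = sum((p[1] - my) ** 2 for p in points)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points)
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)  # angle of the principal axis
    return (mx, my), (math.cos(theta), math.sin(theta))

def distance_to_line(p, line):
    """Perpendicular distance from point p to a (centroid, direction) line."""
    (cx, cy), (dx, dy) = line
    return abs((p[0] - cx) * dy - (p[1] - cy) * dx)

def cluster_to_lines(points, labels, n_clusters, max_iter=50):
    """Alternate line fitting and nearest-line reassignment until labels stop changing."""
    labels = list(labels)
    lines = [None] * n_clusters
    for _ in range(max_iter):
        for j in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an emptied cluster keeps its previous line
                lines[j] = fit_line(members)
        fitted = [j for j in range(n_clusters) if lines[j] is not None]
        new_labels = [
            min(fitted, key=lambda j: distance_to_line(p, lines[j])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, lines

# Toy data: two parallel line segments, y = 0 and y = 3.
points = [(t, 0) for t in range(10)] + [(t, 3) for t in range(10)]
truth = [0] * 10 + [1] * 10
rough = list(truth)
rough[9], rough[10], rough[11] = 1, 0, 0  # a rough initial labeling with three errors

labels, lines = cluster_to_lines(points, rough, n_clusters=2)
print(labels == truth, lines)
```

A center-based method would judge the two ends of one segment as far apart; measuring distance to the fitted line instead lets every point on a segment count as similar, which is the notion of similarity the abstract describes.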
Biography: Haralick received a B.A. degree in mathematics from the University of Kansas in 1964, a B.S. degree in electrical engineering in 1966, and an M.S. degree in electrical engineering in 1967. In 1969, after completing his Ph.D. at the University of Kansas, he joined the faculty of the electrical engineering department, serving as professor from 1975 to 1978. In 1979 Haralick joined the electrical engineering department at Virginia Polytechnic Institute and State University, where he was a professor and director of the spatial data analysis laboratory.
From 1984 to 1986 Haralick served as vice president of research at Machine Vision International, Ann Arbor, MI. Haralick occupied the Boeing Clairmont Egtvedt Professorship in the department of electrical engineering at the University of Washington from 1986 through 2000. At UW, Haralick was an adjunct professor in the computer science department and the bioengineering department.
In 2000 Haralick accepted a Distinguished Professorship in the computer science department of the Graduate Center, City University of New York.