Talk 3

Content-based image retrieval (CBIR) remains a hard problem after more than 20 years of research. Dr. Pavlidis’ talk started with a thought-provoking overview of the state of the art in the field. Despite the numerous publications in this area, very few systems allow on-line testing of the authors’ methodologies, and the results obtained from on-line testing are generally poorer than the published results.

There is a distinction between general and application-specific CBIR. In general CBIR, a query image needs to be matched against an arbitrary collection of images, such as images found on the Internet. General CBIR is closely related to the object recognition problem, since the goal of the query is to obtain images containing the same object as the query image. Application-specific CBIR matches the query image against collections of images of a specific type (such as fingerprints, X-rays of a specific organ, etc.). General CBIR problems can be much harder than application-specific ones, which take advantage of domain-specific knowledge.

Dr. Pavlidis identified some methodological issues in general CBIR. First, some papers propose solutions in search of a problem, so the queries they use have little practical significance. Second, classifier-based methods cannot deal with open collections of images, since the retrieval system is limited to a finite set of object classes. Third, there is a scaling issue in methods that are developed over small sets of samples; it is unclear how well these methods will generalize to an ever-expanding amount of image data. Perhaps the most critical shortcoming of existing CBIR methods is their reliance upon pixel-wise similarity, which is very different from perceptual similarity. Dr. Pavlidis referred to this issue as “a semantic abyss” that needs to be addressed by new methodologies.
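The gap between pixel-wise and perceptual similarity can be illustrated with a small toy example. This is not taken from the talk; it is a minimal sketch that assumes mean squared error as the pixel-wise measure, and the 8x8 striped test image and the helper names (`mean_squared_error`, `shift_right`) are hypothetical. A striped pattern shifted by a single pixel would look essentially unchanged to a viewer, yet it scores as far more different from the original than a featureless gray image does.

```python
def mean_squared_error(a, b):
    """Pixel-wise mean squared error between two equally sized grayscale images."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    return total / count


def shift_right(image, k=1):
    """Cyclically shift every row of the image k pixels to the right."""
    return [row[-k:] + row[:-k] for row in image]


# Hypothetical 8x8 test image: alternating white (255) and black (0) columns.
stripes = [[255 if x % 2 == 0 else 0 for x in range(8)] for _ in range(8)]

# The same pattern moved by one pixel: perceptually almost identical.
shifted = shift_right(stripes, 1)

# A uniform gray image: perceptually unrelated to the striped pattern.
flat_gray = [[128] * 8 for _ in range(8)]

mse_shifted = mean_squared_error(stripes, shifted)
mse_gray = mean_squared_error(stripes, flat_gray)

# The pixel-wise measure ranks the blank image as the closer match.
print(mse_shifted, mse_gray)  # prints "65025.0 16256.5"
```

Under these assumptions, a retrieval system that ranks candidates by this measure alone would return the blank gray image ahead of the shifted copy of the query, which is the kind of mismatch with human judgment described above.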

Why is general CBIR so difficult? Text retrieval is much easier than image retrieval because the primitives of text (i.e., characters) carry semantic meaning, whereas the semantic meaning of an image is not distributed on a per-pixel basis. The main challenge is to find measures of perceptual and conceptual similarity between images.

Dr. Pavlidis recommended pursuing research on specific CBIR problems that satisfy certain feasibility criteria. In his view, “until there is sufficient progress in object recognition or in application specific CBIR, general CBIR research is unlikely to be fruitful, especially when constraints on real time performance are added”.