Semantic Labeling for Image Classification
The rapid growth of digital imaging technology and the accumulation of large collections of digital images have created the need for efficient and intelligent schemes for content-based image retrieval. Our goal is to organize image content semantically, according to meaningful categories. We present a new approach for semantic classification that utilizes a recently proposed color-texture segmentation algorithm (by Chen et al.), which combines knowledge of human perception and signal characteristics to segment natural scenes into perceptually uniform regions. The features of these regions are then used as medium-level descriptors that can effectively bridge the "semantic gap" between low-level primitives and high-level semantics. The goal is to extract semantic labels, first at the segment level and then at the scene level. The focus of this thesis is on region classification. We develop segment features that consist of spatial texture orientation information and color composition in terms of a limited number of locally adapted dominant colors. We also consider segment size and position. We use a hierarchical vocabulary of segment labels that is consistent with subjective experiments and with the labels used in the NIST TRECVID 2003 development set. We have gathered a database of 13,000 automatically segmented and manually labeled segments obtained from 3,300 photographs of natural scenes. This database is used for training and testing. For training and classification we use the Linear Discriminant Analysis (LDA) technique. We examine the performance of the algorithm (precision and recall rates) when different sets of features (e.g., one or two most dominant colors versus four quantized dominant colors) are used. We also consider the performance of other techniques, such as Gaussian Mixture Models (GMMs) and Support Vector Machines (SVMs). Our results indicate that the proposed approach offers significant performance improvements over existing approaches.
We also compare with human performance. For this comparison, we perform feature extraction and segment classification on human segmentations instead of automatic ones. We show that both the segment statistics and the algorithm performance remain approximately the same when the automatic segmentations are replaced with human segmentations.
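As an illustration of the evaluation measure mentioned above, the following minimal sketch computes per-label precision and recall from a list of true and predicted segment labels. It is not the thesis implementation: the function name `precision_recall` and the example labels ("sky", "water", "vegetation") are assumptions made up for this example, and the predictions are invented rather than produced by an LDA classifier.

```python
from collections import Counter


def precision_recall(true_labels, predicted_labels):
    """Return {label: (precision, recall)} for every label that appears.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")

    true_count = Counter(true_labels)            # segments that truly carry each label
    predicted_count = Counter(predicted_labels)  # segments assigned each label
    correct = Counter(                           # segments labeled correctly
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )

    scores = {}
    for label in sorted(set(true_labels) | set(predicted_labels)):
        # Precision: fraction of segments assigned this label that are correct.
        precision = (
            correct[label] / predicted_count[label] if predicted_count[label] else 0.0
        )
        # Recall: fraction of segments with this true label that were found.
        recall = correct[label] / true_count[label] if true_count[label] else 0.0
        scores[label] = (precision, recall)
    return scores


# Made-up labels for six segments (assumption, for illustration only).
true_labels = ["sky", "sky", "water", "vegetation", "water", "sky"]
predicted_labels = ["sky", "water", "water", "vegetation", "sky", "sky"]

scores = precision_recall(true_labels, predicted_labels)
for label, (precision, recall) in scores.items():
    print(f"{label}: precision={precision:.2f}, recall={recall:.2f}")
```

Reporting the two rates per label, rather than a single overall accuracy, makes it easier to see which segment categories are confused with one another when the feature set changes.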