In recent years, the indexing and retrieval of digital content has become an urgent issue and has attracted a large amount of research, driven by the rapid growth of digital content. A new international standard, MPEG-7, has recently been created to address this issue.


In the initial stage of MPEG-7, digital content is described using low-level content features such as color, texture, shape, and motion. However, low-level content description does not match how humans understand digital content: humans tend to understand content semantically, in terms of concepts such as water, tree, rain, or sheep. In this project, we attempt to interpret digital images semantically, the same way humans do.
In contrast to conventional image retrieval approaches, which interpret images holistically, we treat images in an object-oriented way. Specifically, we interpret an image as a collection of local regions or objects.
In the indexing stage, images representing a large number of categories are collected and segmented into regions. Low-level features are extracted from each region, and the regions are classified into semantic categories using machine learning techniques. Images in the database are then indexed using the learned semantic categories. During the retrieval stage, given a semantic query from the user, the images in the database whose regions best match the query concept are returned to the user. With this approach, searching for images in a database is the same as searching for textual documents on the Internet. However, unlike existing Internet image search engines, which index and retrieve images based on human annotations, our approach is based on image content, which is more efficient and objective.
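The indexing and retrieval stages described above can be illustrated with a minimal sketch. It is not the project's implementation: segmentation is assumed to have already produced regions, the three-value feature vectors are hypothetical stand-ins for real color, texture, or shape descriptors, and a simple nearest-centroid rule stands in for whatever machine learning technique is actually used. All names (Region, train_centroids, build_index, search) are illustrative assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass
from math import dist


@dataclass
class Region:
    image_id: str
    features: tuple  # hypothetical low-level features, e.g. mean (R, G, B)


def train_centroids(labelled):
    """Average the feature vectors of each semantic category.

    A nearest-centroid classifier is an assumed stand-in for the
    unspecified machine learning technique.
    """
    sums, counts = {}, defaultdict(int)
    for label, feats in labelled:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] += 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}


def classify(features, centroids):
    """Return (label, score); the score grows as the region nears the centroid."""
    label = min(centroids, key=lambda name: dist(features, centroids[name]))
    return label, 1.0 / (1.0 + dist(features, centroids[label]))


def build_index(regions, centroids):
    """Indexing stage: map each concept to {image_id: best region score}."""
    index = defaultdict(dict)
    for region in regions:
        label, score = classify(region.features, centroids)
        best = index[label].get(region.image_id, 0.0)
        index[label][region.image_id] = max(best, score)
    return index


def search(index, concept, top_k=5):
    """Retrieval stage: rank images by their best-matching region."""
    hits = index.get(concept, {})
    return sorted(hits, key=hits.get, reverse=True)[:top_k]


# Toy data: labelled training regions and pre-segmented database regions.
training = [
    ("water", (0.1, 0.3, 0.9)), ("water", (0.2, 0.4, 0.8)),
    ("tree", (0.1, 0.7, 0.2)), ("tree", (0.2, 0.6, 0.1)),
]
centroids = train_centroids(training)
regions = [
    Region("img1", (0.15, 0.35, 0.85)),  # water-like region
    Region("img1", (0.15, 0.65, 0.15)),  # tree-like region in the same image
    Region("img2", (0.30, 0.30, 0.70)),
    Region("img3", (0.10, 0.80, 0.20)),
]
index = build_index(regions, centroids)
water_hits = search(index, "water")
tree_hits = search(index, "tree")
```

Because the index maps a concept to images, a query such as "water" is answered by a lookup and a sort, much like a keyword lookup in a text search engine. An image containing several kinds of regions (img1 here) is reachable under each of its concepts, which is the practical benefit of indexing regions rather than whole images.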