Latent semantic indexing for image retrieval systems

Journal article published in 2003 by Pavel Praks, Jiří Dvorský, Václav Snášel
This paper is available in a repository.

Preprint: policy unknown
Postprint: policy unknown
Published version: policy unknown

Abstract

Matrix computation is used as a basis for information retrieval in the retrieval strategy called Latent Semantic Indexing (LSI) [1]. The premise is that more conventional retrieval strategies (i.e. vector space, probabilistic and extended Boolean) all have problems because they match directly on keywords. Since the same concept can be described using many different keywords, this type of matching is prone to failure. The authors cite a study in which two people used the same word for the same concept only twenty percent of the time. LSI instead searches for something closer to the underlying semantics of a document. The search relies on matrix computation, in particular the Singular Value Decomposition (SVD). The SVD filters out the noise found in a document, so that two documents with the same semantics are located close to one another in a multi-dimensional space.

The majority of real-world images are stored as raster images. An image can be viewed as a vector of pixels, where every pixel is described by its color. The pixels of this vector play the role of keywords in the image. A human observer, however, extracts from an image the important features that define its semantics for that observer: people think not about pixels but about the persons or objects in the image. We therefore need a technique that can extract these features and that is resistant to minor changes in the images (e.g. amount of light, contrast, and movement of objects in the images). Direct use of keyword-based systems leads to results that are sensitive to a small change in any keyword (a pixel in the query).
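
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_lsi_index` and `rank_images`, the grayscale input, the rank `k`, and the use of cosine similarity are all assumptions made for illustration. Each image is flattened into a pixel vector and stored as a column of a pixel-by-image matrix, a truncated SVD gives a low-dimensional space, and a query image is projected into that space and compared with the stored images.

```python
import numpy as np


def build_lsi_index(images, k):
    """Build a rank-k LSI index (hypothetical helper, not from the paper).

    images: array of shape (n_images, height, width), grayscale values.
    Returns (U_k, coords): the k left singular vectors and the
    coordinates of every image in the k-dimensional latent space.
    """
    n_images = images.shape[0]
    # Pixel-by-image matrix: each column is one flattened image.
    A = images.reshape(n_images, -1).T.astype(float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :k]
    # Column j of diag(s_k) @ Vt_k equals U_k.T @ A[:, j].
    coords = (s[:k, None] * Vt[:k, :]).T
    return U_k, coords


def rank_images(U_k, coords, query_image):
    """Cosine similarity between a query image and every indexed image."""
    q = U_k.T @ query_image.reshape(-1).astype(float)
    norms = np.linalg.norm(coords, axis=1) * np.linalg.norm(q)
    norms = np.where(norms == 0.0, 1.0, norms)  # avoid division by zero
    return (coords @ q) / norms


# Small demonstration on synthetic data (assumed, not from the paper).
rng = np.random.default_rng(0)
demo_images = rng.random((6, 8, 8))
U_k, coords = build_lsi_index(demo_images, k=3)
# A uniformly brighter copy of image 2 stands in for a change in lighting.
similarities = rank_images(U_k, coords, 1.2 * demo_images[2])
print(np.argsort(similarities)[::-1])  # indices from most to least similar
```

Cosine similarity in the reduced space ignores a uniform scaling of the query, which is one simple way to obtain the robustness to the amount of light that the abstract asks for. Changes such as object movement are not covered by this argument and would need to be evaluated separately.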