Web Image Classification for Information Extraction

Journal article published in 2005 by Martin Labský, Miroslav Vacura, Pavel Praks
This paper is available in a repository.

Preprint: policy unknown
Postprint: policy unknown
Published version: policy unknown

Abstract

We describe an approach to classifying images found on the WWW for the purpose of information extraction (IE). Among features used for classification are image sizes, colour histograms, and the similarity of the classified image's content to images in a training collection. Our content similarity metric is based on the latent semantic index. Results are presented on a collection of 1624 image occurrences found on bicycle shop websites, and the task is to distinguish bicycle images from the rest.
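
To make the features named in the abstract concrete, the following is a minimal sketch of a colour histogram feature and a similarity score computed in a latent space obtained by truncated SVD, which is the core idea behind latent semantic indexing. It is not the authors' implementation: the function names, the number of histogram bins, the rank `k`, and the choice of histogram vectors (rather than raw pixels) as the input to the SVD are all assumptions made for illustration.

```python
import numpy as np


def colour_histogram(image, bins=4):
    """Concatenated per-channel histogram of an H x W x 3 uint8 image.

    The bin count is an illustrative assumption. The result sums to 1.
    """
    channels = [
        np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]
    hist = np.concatenate(channels).astype(float)
    return hist / hist.sum()


def fit_lsi_basis(train_vectors, k):
    """Return the top-k left singular vectors of the (features x images) matrix."""
    A = np.asarray(train_vectors, dtype=float).T  # one column per training image
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return U[:, :k]


def lsi_similarity(basis, train_vectors, query_vector):
    """Highest cosine similarity between the query and any training image,
    measured after projecting both onto the latent basis."""
    train = np.asarray(train_vectors, dtype=float) @ basis  # shape (n_images, k)
    query = np.asarray(query_vector, dtype=float) @ basis   # shape (k,)
    norms = np.linalg.norm(train, axis=1) * np.linalg.norm(query)
    sims = (train @ query) / np.maximum(norms, 1e-12)  # guard against zero norms
    return float(sims.max())


# Synthetic stand-ins for a training collection; real use would load images.
rng = np.random.default_rng(0)
train_images = rng.integers(0, 256, size=(5, 8, 8, 3), dtype=np.uint8)
train_vectors = [colour_histogram(img) for img in train_images]
basis = fit_lsi_basis(train_vectors, k=3)
score = lsi_similarity(basis, train_vectors, train_vectors[0])
```

In a classifier along the lines the abstract describes, a score like this could be used as one feature alongside image width, height, and the histogram itself. How the features are combined is not stated in the abstract.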