Springer Verlag, Machine Learning, 106(3), pp. 359-386
DOI: 10.1007/s10994-016-5610-8
Feature vectors used in the paper "Nearest neighbors distance ratio open-set classifier", published in the Springer Machine Learning journal.

In the 15-Scenes (15scenes.dat) dataset, which has 15 classes, each of the 4,485 images is represented by a bag-of-visual-words vector created with soft assignment and max pooling, based on a codebook of 1,000 Scale Invariant Feature Transform (SIFT) codewords.

The 26 classes of the letter (letter.dat) dataset represent the letters of the English alphabet (black-and-white rectangular pixel displays). Each of the 20,000 samples has 16 attributes.

The Auslan (auslan.dat) dataset contains 95 classes of Australian Sign Language (Auslan) signs collected from a volunteer native Auslan signer. The data were acquired using two Fifth Dimension Technologies (5DT) gloves and two Ascension Flock-of-Birds magnetic position trackers. There are 146,949 samples, each represented with 22 features (x, y, z positions, bend measures, etc.).

The Caltech-256 (caltech256.dat) dataset comprises 256 object classes. The feature vectors follow a bag-of-visual-words characterization approach and contain 1,000 features, obtained with dense sampling, the SIFT descriptor for the points of interest, hard assignment, and average pooling. In total, there are 29,780 samples.

The ALOI (aloi.dat) dataset has 1,000 classes and 108 samples for each class (108,000 in total). The features were extracted with the Border/Interior Pixel Classification (BIC) descriptor and have 128 dimensions.

The ukbench (ukbench.dat) dataset comprises 2,550 classes of four images each (10,200 in total). In our work, the images were represented with the BIC descriptor (128 dimensions).
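The figures above can be collected into a small lookup table for sanity-checking a feature matrix after loading. The following is a minimal sketch, not part of the released data: the names `DatasetInfo`, `DATASETS`, and `check_shape` are hypothetical, and the on-disk layout of the .dat files is not described here, so the check assumes you have already parsed a file and separated any label columns from the features. The 1,000-feature entry for 15-Scenes is inferred from the codebook size.

```python
from typing import NamedTuple


class DatasetInfo(NamedTuple):
    """Counts taken from the dataset description (hypothetical helper type)."""
    classes: int
    samples: int
    features: int


# Values copied from the description. The totals for ALOI and ukbench are
# written as products so the per-class counts stay visible.
DATASETS = {
    # Feature count assumed equal to the 1,000-codeword codebook size.
    "15scenes.dat": DatasetInfo(classes=15, samples=4485, features=1000),
    "letter.dat": DatasetInfo(classes=26, samples=20000, features=16),
    "auslan.dat": DatasetInfo(classes=95, samples=146949, features=22),
    "caltech256.dat": DatasetInfo(classes=256, samples=29780, features=1000),
    "aloi.dat": DatasetInfo(classes=1000, samples=1000 * 108, features=128),
    "ukbench.dat": DatasetInfo(classes=2550, samples=2550 * 4, features=128),
}


def check_shape(filename: str, n_rows: int, n_cols: int) -> bool:
    """Return True if a parsed feature matrix matches the described size.

    n_rows and n_cols describe the feature matrix only; label columns,
    if the file contains any, must be removed by the caller first.
    """
    info = DATASETS[filename]
    return n_rows == info.samples and n_cols == info.features
```

For example, `check_shape("letter.dat", 20000, 16)` returns `True`, while a matrix that still carries an extra column would fail the check.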