Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains random objects from home. The objects are taken mostly from kitchen, bathroom and living-room environments. Each directory ("Train" and "Test") contains a subdirectory named 'Gtruth/', containing ground truth .mat files. The .mat files contain a variable 'outline'. For training images, 'outline' contains the corners of a bounding box defining the model in the training image. For test images, 'outline' contains the names of the models present in the image along with a bounding box for each of them. If the same training object is defined by several training images, 'outline' references all of these training images with the same bounding box. You can also find a tar archive of the whole dataset in "HomeObjects06.tar".
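The ground truth .mat files can also be inspected outside MATLAB. Below is a minimal sketch using SciPy, assuming standard MATLAB-style .mat files; the file name is hypothetical and the exact layout of 'outline' should be checked against the files in the archive.

    from scipy.io import loadmat

    # Hypothetical path; substitute any file from Train/Gtruth/ or Test/Gtruth/.
    gt = loadmat("Train/Gtruth/example_object.mat")
    outline = gt["outline"]

    # For a training image, 'outline' holds the corners of the model's bounding box;
    # for a test image, it also lists the names of the models present in the image.
    print(type(outline), getattr(outline, "shape", None))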
Citation: Moreels, P., & Perona, P. (2022). Caltech Home Objects 2006 (1.0) [Data set]. CaltechDATA. https://doi.org/10.22002/D1.20089
Li, F.-F., Andreeto, M., Ranzato, M., & Perona, P. (2022). Caltech 101 (1.0) [Data set]. CaltechDATA. https://doi.org/10.22002/D1.20086
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset does not contain the source data, only the train/validation split description.
We introduce a challenging set of 256 object categories containing a total of 30607 images. The original Caltech-101 was collected by choosing a set of object categories, downloading examples from Google Images and then manually screening out all images that did not fit the category. Caltech-256 is collected in a similar manner with several improvements: a) the number of categories is more than doubled, b) the minimum number of images in any category is increased from 31 to 80, c) artifacts due to image rotation are avoided and d) a new and larger clutter category is introduced for testing background rejection. We suggest several testing paradigms to measure classification performance, then benchmark the dataset using two simple metrics as well as a state-of-the-art spatial pyramid matching algorithm. Finally we use the clutter category to train an interest detector which rejects uninformative background regions.
Griffin, G., Holub, A., & Perona, P. (2022). Caltech 256 (1.0) [Data set]. CaltechDATA. https://doi.org/10.22002/D1.20087
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The dataset contains images of people collected from the web by typing common given names into Google Image Search. The coordinates of the eyes, the nose and the center of the mouth for each frontal face are provided in a ground truth file. This information can be used to align and crop the human faces or as ground truth for a face detection algorithm. The dataset contains a total of 10,524 human faces in 7,092 images, at various resolutions and in different settings, e.g. portrait images and groups of people. Profile faces and very low-resolution faces are not labeled. The average image resolution is 304x312 pixels.
The ground truth contains the following information: image-name Leye-x Leye-y Reye-x Reye-y nose-x nose-y mouth-x mouth-y
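As a rough illustration, the ground truth file can be parsed line by line. The sketch below assumes one whitespace-separated record per face in the field order above; the file name is hypothetical.

    def parse_ground_truth(path):
        """Return a list of face records parsed from the ground truth file."""
        faces = []
        with open(path) as fh:
            for line in fh:
                fields = line.split()
                if len(fields) != 9:
                    continue  # skip blank or malformed lines
                name, *coords = fields
                lx, ly, rx, ry, nx, ny, mx, my = map(float, coords)
                faces.append({
                    "image": name,
                    "left_eye": (lx, ly),
                    "right_eye": (rx, ry),
                    "nose": (nx, ny),
                    "mouth": (mx, my),
                })
        return faces

    faces = parse_ground_truth("WebFaces_GroundTruth.txt")  # hypothetical file name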
Citation: Fink, M., & Perona, P. (2022). Caltech 10k Web Faces (1.0) [Data set]. CaltechDATA. https://doi.org/10.22002/D1.20132
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Caltech 101 dataset contains around 9,000 labeled images belonging to 102 object categories (despite its name, the dataset has 102 classes, not 101).
Carefully clicked outlines of each object in these pictures are included in 'Annotations.tar'. A MATLAB script, 'show_annotations.m', is also provided to view the annotations.
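For readers who prefer Python over MATLAB, the annotations can be inspected with SciPy and matplotlib. The sketch below approximates what show_annotations.m does, assuming each annotation .mat file stores a 'box_coord' bounding box and an 'obj_contour' outline relative to that box; these variable names and the paths are assumptions, not stated in the record above.

    from scipy.io import loadmat
    import matplotlib.pyplot as plt
    from PIL import Image

    # Hypothetical paths into the extracted archives.
    ann = loadmat("Annotations/airplanes/annotation_0001.mat")
    img = Image.open("101_ObjectCategories/airplanes/image_0001.jpg")

    box = ann["box_coord"][0]      # assumed layout: [top, bottom, left, right]
    contour = ann["obj_contour"]   # assumed 2 x N outline, relative to the box

    plt.imshow(img)
    # Shift the outline from box-relative to image coordinates before plotting.
    plt.plot(contour[0] + box[2], contour[1] + box[0], "r-")
    plt.show()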
Related publication: Li, Fei-Fei, Fergus, Rob, & Perona, Pietro. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006. https://doi.org/10.1109/TPAMI.2006.79
Data license: https://tccon-wiki.caltech.edu/Main/DataLicense
The Total Carbon Column Observing Network (TCCON) is a network of ground-based Fourier Transform Spectrometers that record direct solar absorption spectra of the atmosphere in the near-infrared. From these spectra, accurate and precise column-averaged abundances of atmospheric constituents, including CO2, CH4, N2O, HF, CO, H2O, and HDO, are retrieved. This is the GGG2020 data release of observations from the TCCON station at Sodankylä, Finland.
Kivi, R., Heikkinen, P., & Kyrö, E. (2022). TCCON data from Sodankylä (FI), Release GGG2020.R0 (Version R0) [Data set]. CaltechDATA. https://doi.org/10.14291/tccon.ggg2020.sodankyla01.R0 Please review the TCCON data use policy before downloading TCCON data.
Caltech-101 Webdataset (Test set only)
Original paper: One-shot learning of object categories
Homepage: https://data.caltech.edu/records/mzrjq-6wc02
Bibtex: @misc{li_andreeto_ranzato_perona_2022, title={Caltech 101}, DOI={10.22002/D1.20086}, publisher={CaltechDATA}, author={Li, Fei-Fei and Andreeto, Marco and Ranzato, Marc'Aurelio and Perona, Pietro}, year={2022}, month={Apr}}
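A webdataset-packaged test set is typically streamed shard by shard. The snippet below is a minimal sketch using the webdataset library; the shard pattern and the 'jpg'/'cls' key names are assumptions and need to be adjusted to the actual archive contents.

    import webdataset as wds

    # Hypothetical shard pattern; replace with the real shard names or URLs.
    url = "caltech101-test-{000000..000009}.tar"

    dataset = (
        wds.WebDataset(url)
        .decode("pil")              # decode stored images to PIL images
        .to_tuple("jpg", "cls")     # (image, label) pairs, assuming these key names
    )

    for image, label in dataset:
        print(image.size, label)
        break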
Data license: https://tccon-wiki.caltech.edu/Main/DataLicense
The Total Carbon Column Observing Network (TCCON) is a network of ground-based Fourier Transform Spectrometers that record direct solar absorption spectra of the atmosphere in the near-infrared. From these spectra, accurate and precise column-averaged abundances of atmospheric constituents, including CO2, CH4, N2O, HF, CO, H2O, and HDO, are retrieved. This is the GGG2020 data release of observations from the TCCON station at Ny-Ålesund, Svalbard, Norway.
Buschmann, M., Petri, C., Palm, M., Warneke, T., & Notholt, J. (2022). TCCON data from Ny-Ålesund, Svalbard (NO), Release GGG2020.R0 (Version R0) [Data set]. CaltechDATA. https://doi.org/10.14291/tccon.ggg2020.nyalesund01.R0 Please review the TCCON data use policy before downloading TCCON data.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Motorbikes (Side) dataset, collected from the web by undergraduates at Caltech (February 2001). 826 images of motorbikes viewed from the side, in JPEG format. ImageData.mat is a MATLAB file containing the variable SubDir_Data, an 8 x 826 matrix. Each column of this matrix holds the coordinates of the bounding box of the bike within the corresponding image, in the form [x_bot_left y_bot_left x_top_left y_top_left x_top_right y_top_right x_bot_right y_bot_right].
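The bounding boxes can be read without MATLAB; the sketch below uses SciPy and follows the column layout quoted above (run from the directory that contains ImageData.mat).

    from scipy.io import loadmat

    data = loadmat("ImageData.mat")
    boxes = data["SubDir_Data"]   # 8 x 826 matrix, one column per image

    # Corner coordinates of the first image's bounding box, in the documented order.
    x_bl, y_bl, x_tl, y_tl, x_tr, y_tr, x_br, y_br = boxes[:, 0]
    print(x_bl, y_bl, x_tr, y_tr)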
Citation: Perona, P. (2022). Caltech Motorcycles 2001 (1.0) [Data set]. CaltechDATA. https://doi.org/10.22002/D1.20088