The PEdesTrian Attribute dataset (PETA) is a dataset for recognizing pedestrian attributes, such as gender and clothing style, at a far distance. It is of interest in video surveillance scenarios where face and body close-shots are hardly available. It consists of 19,000 pedestrian images with 65 attributes (61 binary and 4 multi-class). Those images contain 8,705 persons.
https://paperswithcode.com/dataset/sun-attribute
The SUN Attribute dataset consists of 14,340 images from 717 scene categories, and each category is annotated with a taxonomy of 102 discriminate attributes. The dataset can be used for high-level scene understanding and fine-grained scene recognition.
We introduce the Clothing Attribute Dataset for promoting research in learning visual attributes for objects. The dataset contains 1856 images, with 26 ground truth clothing attributes such as "long-sleeves", "has collar", and "striped pattern". The labels were collected using Amazon Mechanical Turk.
https://creativecommons.org/publicdomain/zero/1.0/
from: https://archive.ics.uci.edu/ml/datasets/car+evaluation
Title: Car Evaluation Database
Sources: (a) Creator: Marko Bohanec (b) Donors: Marko Bohanec (marko.bohanec@ijs.si) Blaz Zupan (blaz.zupan@ijs.si) (c) Date: June, 1997
Past Usage:
The hierarchical decision model, from which this dataset is derived, was first presented in
M. Bohanec and V. Rajkovic: Knowledge acquisition and explanation for multi-attribute decision making. In 8th Intl Workshop on Expert Systems and their Applications, Avignon, France. pages 59-78, 1988.
Within machine-learning, this dataset was used for the evaluation of HINT (Hierarchy INduction Tool), which was proved to be able to completely reconstruct the original hierarchical model. This, together with a comparison with C4.5, is presented in
B. Zupan, M. Bohanec, I. Bratko, J. Demsar: Machine learning by function decomposition. ICML-97, Nashville, TN. 1997 (to appear)
Relevant Information Paragraph:
Car Evaluation Database was derived from a simple hierarchical decision model originally developed for the demonstration of DEX (M. Bohanec, V. Rajkovic: Expert system for decision making. Sistemica 1(1), pp. 145-157, 1990.). The model evaluates cars according to the following concept structure:
CAR                car acceptability
. PRICE            overall price
. . buying         buying price
. . maint          price of the maintenance
. TECH             technical characteristics
. . COMFORT        comfort
. . . doors        number of doors
. . . persons      capacity in terms of persons to carry
. . . lug_boot     the size of luggage boot
. . safety         estimated safety of the car
Input attributes are printed in lowercase. Besides the target concept (CAR), the model includes three intermediate concepts: PRICE, TECH, COMFORT. In the original model, every concept is related to its lower-level descendants by a set of examples (for these example sets see http://www-ai.ijs.si/BlazZupan/car.html).
The Car Evaluation Database contains examples with the structural information removed, i.e., directly relates CAR to the six input attributes: buying, maint, doors, persons, lug_boot, safety.
Because of known underlying concept structure, this database may be particularly useful for testing constructive induction and structure discovery methods.
Number of Instances: 1728 (instances completely cover the attribute space)
Number of Attributes: 6
Attribute Values:
buying     v-high, high, med, low
maint      v-high, high, med, low
doors      2, 3, 4, 5-more
persons    2, 4, more
lug_boot   small, med, big
safety     low, med, high
Missing Attribute Values: none
Class Distribution (number of instances per class)
unacc    1210   (70.023 %)
acc       384   (22.222 %)
good       69   ( 3.993 %)
v-good     65   ( 3.762 %)
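For readers who want to load the file directly, a minimal pandas sketch is shown below; the column names follow the attribute list above, and the UCI mirror path for car.data is an assumption rather than guaranteed.

import pandas as pd

# Column names follow the attribute list above; car.data itself has no header row.
columns = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]

# Assumed UCI mirror path for the raw data file.
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/car/car.data"
cars = pd.read_csv(url, header=None, names=columns)

print(cars.shape)                    # expected: (1728, 7)
print(cars["class"].value_counts())  # compare against the class distribution above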
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Rapidly acquiring three-dimensional (3D) building data, including geometric attributes like rooftop, height, and orientation, as well as indicative attributes like function, quality, and age, is essential for accurate urban analysis, simulations, and policy updates. Current building datasets suffer from incomplete coverage of building multi-attributes. This paper presents the first national-scale Multi-Attribute Building dataset (CMAB) built with artificial intelligence, covering 3,667 spatial cities, 31 million buildings, and 23.6 billion m² of rooftops extracted by OCRNet with an F1-score of 89.93%, totaling 363 billion m³ of building stock. We trained bootstrap-aggregated XGBoost models with city administrative classifications, incorporating morphology, location, and function features. Using multi-source data, including billions of remote sensing images and 60 million street view images (SVIs), we generated rooftop, height, structure, function, style, age, and quality attributes for each building with machine learning and large multimodal models. Accuracy was validated through model benchmarks, comparison with existing similar products, and manual SVI validation, mostly above 80%. Our dataset and results are crucial for global SDGs and urban planning.

Data records: A building dataset with a total rooftop area of 23.6 billion square meters in 3,667 natural cities in China, including the attributes of building rooftop, height, structure, function, age, style, and quality, as well as the code files used to calculate these data. The deep learning models used are OCRNet, XGBoost, fine-tuned CLIP, and YOLOv8.

Supplementary note: The architectural structure, style, and quality attributes are affected by the temporal and spatial distribution of street views in China. Regarding the recognition of building colors, we found that the existing CLIP-series models cannot accurately judge the composition and proportion of building colors, so colors are instead calculated by semantic segmentation and image processing. Please contact zhangyec23@mails.tsinghua.edu.cn or ylong@tsinghua.edu.cn if you have any technical problems.

Reference format: Zhang, Y., Zhao, H. & Long, Y. CMAB: A Multi-Attribute Building Dataset of China. Sci Data 12, 430 (2025). https://doi.org/10.1038/s41597-025-04730-5.
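As a rough illustration of the bootstrap-aggregated XGBoost idea mentioned above (not the authors' actual pipeline), the following sketch bags XGBoost regressors over placeholder morphology/location/function features; all feature and target names are invented for the example, and it assumes scikit-learn >= 1.2 for the estimator keyword.

import numpy as np
from sklearn.ensemble import BaggingRegressor
from xgboost import XGBRegressor

# Placeholder features: morphology, location and function descriptors per building.
# Shapes, column meanings and the height target are illustrative, not the CMAB schema.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))              # e.g. footprint area, perimeter, lon/lat, land-use codes
y = 3.0 * X[:, 0] + rng.normal(size=1000)   # e.g. building height in metres

# Bootstrap-aggregated XGBoost: each base learner is fit on a bootstrap resample.
model = BaggingRegressor(
    estimator=XGBRegressor(n_estimators=200, max_depth=4, learning_rate=0.05),
    n_estimators=10,
    bootstrap=True,
    random_state=0,
)
model.fit(X, y)
print(model.predict(X[:5]))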
https://academictorrents.com/nolicensespecified
This dataset provides a platform to benchmark transfer-learning algorithms, in particular attribute-based classification and zero-shot learning [1]. It can act as a drop-in replacement for the original Animals with Attributes (AwA) dataset [2,3], as it has the same class structure and almost the same characteristics. It consists of 37,322 images of 50 animal classes with pre-extracted feature representations for each image. The classes are aligned with Osherson's classical class/attribute matrix [3,4], thereby providing 85 numeric attribute values for each class. Using the shared attributes, it is possible to transfer information between different classes. The image data was collected from public sources, such as Flickr, in 2016. In the process we made sure to only include images that are licensed for free use and redistribution; please see the archive for the individual license files.
Residential Property Attribute data provides the most current building attributes available for residential properties as captured within Landgate's Valuation Database. Attribute information is captured as part of the valuation process and is maintained via a range of sources, including building and subdivision approval notifications. This data set should not be confused with Sales Evidence data, which is based on property attributes as at the time of last sale. This dataset has been spatially enabled by linking cadastral land parcel polygons, sourced from Landgate's Spatial Cadastral Database (SCDB), to the Residential Property Attribute data sourced from the Valuation database. Customers wishing to access this data set should contact Landgate on +61 (0)8 9273 7683 or email businesssolutions@landgate.wa.gov.au. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions. Changes will be applied to this dataset resulting from the implementation of the Community Titles Act 2018; please refer to the Data Dictionary below.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This dataset is the "additional training dataset" for the DCASE 2024 Challenge Task 2.
The data consists of the normal/anomalous operating sounds of nine types of real/toy machines. Each recording is a single-channel audio that includes both a machine's operating sound and environmental noise. The duration of recordings varies from 6 to 10 seconds. The following nine types of real/toy machines are used in this task:
3DPrinter
AirCompressor
BrushlessMotor
HairDryer
HoveringDrone
RoboticArm
Scanner
ToothBrush
ToyCircuit
Overview of the task
Anomalous sound detection (ASD) is the task of identifying whether the sound emitted from a target machine is normal or anomalous. Automatic detection of mechanical failure is an essential technology in the fourth industrial revolution, which involves artificial-intelligence-based factory automation. Prompt detection of machine anomalies by observing sounds is useful for monitoring the condition of machines.
This task is the follow-up from DCASE 2020 Task 2 to DCASE 2023 Task 2. The task this year is to develop an ASD system that meets the following five requirements.
1. Train a model using only normal sound (unsupervised learning scenario): Because anomalies rarely occur and are highly diverse in real-world factories, it can be difficult to collect exhaustive patterns of anomalous sounds. Therefore, the system must detect unknown types of anomalous sounds that are not provided in the training data. This is the same requirement as in the previous tasks.
2. Detect anomalies regardless of domain shifts (domain generalization task): In real-world cases, the operational states of a machine or the environmental noise can change, causing domain shifts. Domain-generalization techniques can be useful for handling domain shifts that occur frequently or are hard to notice. In this task, the system is required to use domain-generalization techniques for handling these domain shifts. This requirement is the same as in DCASE 2022 Task 2 and DCASE 2023 Task 2.
3. Train a model for a completely new machine type: For a completely new machine type, hyperparameters of the trained model cannot be tuned. Therefore, the system should have the ability to train models without additional hyperparameter tuning. This requirement is the same as in DCASE 2023 Task 2.
4. Train a model using a limited number of machines from its machine type: While sounds from multiple machines of the same machine type can be used to enhance the detection performance, it is often the case that only a limited number of machines are available for a machine type. In such a case, the system should be able to train models using a few machines of a machine type. This requirement is the same as in DCASE 2023 Task 2.
5. Train a model both with and without attribute information: While additional attribute information can help enhance the detection performance, we cannot always obtain such information. Therefore, the system must work well both when attribute information is available and when it is not.
The last requirement is newly introduced in DCASE 2024 Task 2.
Definition
We first define key terms in this task: "machine type," "section," "source domain," "target domain," and "attributes".
"Machine type" indicates the type of machine, which in the additional training dataset is one of nine: 3D-printer, air compressor, brushless motor, hair dryer, hovering drone, robotic arm, document scanner (scanner), toothbrush, and Toy circuit.
A section is defined as a subset of the dataset for calculating performance metrics.
The source domain is the domain under which most of the training data and some of the test data were recorded, and the target domain is a different set of domains under which some of the training data and some of the test data were recorded. There are differences between the source and target domains in terms of operating speed, machine load, viscosity, heating temperature, type of environmental noise, signal-to-noise ratio, etc.
Attributes are parameters that define states of machines or types of noise. For several machine types, the attributes are hidden.
Dataset
This dataset consists of nine machine types. For each machine type, one section is provided, and the section is a complete set of training data. A set of test data corresponding to this training data will be provided on a separate Zenodo page as an "evaluation dataset" for the DCASE 2024 Challenge Task 2. For each section, this dataset provides (i) 990 clips of normal sounds in the source domain for training and (ii) ten clips of normal sounds in the target domain for training. The source/target domain of each sample is provided. Additionally, the attributes of each sample in the training and test data are provided in the file names and attribute csv files.
File names and attribute csv files
File names and attribute csv files provide reference labels for each clip. The given reference labels for each training clip include machine type, section index, normal/anomaly information, and attributes regarding conditions other than normal/anomaly. The machine type is given by the directory name. The section index is given by the respective file names. For the datasets other than the evaluation dataset, the normal/anomaly information and the attributes are given by the respective file names. Note that for machine types that have their attribute information hidden, the attribute information in the file names is only labeled as "noAttributes". Attribute csv files are provided for easy access to attributes that cause domain shifts. In these files, the file names, the names of parameters that cause domain shifts (domain shift parameter, dp), and the values or types of these parameters (domain shift value, dv) are listed. Each row takes the following format:
[filename (string)], [d1p (string)], [d1v (int | float | string)], [d2p], [d2v]...
For machine types that have their attribute information hidden, all columns except the filename column are left blank for each row.
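As a rough illustration of the row format above, the sketch below parses an attribute csv into a per-clip dictionary of domain-shift parameters; the file name in the usage comment is hypothetical, and the exact layout should be checked against the released csv files.

import csv

def read_attribute_csv(path):
    """Parse rows of the form [filename, d1p, d1v, d2p, d2v, ...] into
    {filename: {parameter: value}}; blank parameter columns (hidden attributes)
    are skipped."""
    attributes = {}
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row:
                continue
            filename, rest = row[0], row[1:]
            # Walk the remaining columns as (parameter, value) pairs.
            params = {dp: dv for dp, dv in zip(rest[0::2], rest[1::2]) if dp}
            attributes[filename] = params
    return attributes

# Hypothetical usage; the actual csv file name depends on the machine type and section.
# attrs = read_attribute_csv("attributes_00.csv")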
Recording procedure
Normal/anomalous operating sounds of machines and their related equipment were recorded. Anomalous sounds were collected by deliberately damaging target machines. To simplify the task, we use only the first channel of multi-channel recordings; all recordings are regarded as single-channel recordings of a fixed microphone. We mixed a target machine sound with environmental noise, and only noisy recordings are provided as training/test data. The environmental noise samples were recorded in several real factory environments. We will publish papers on the dataset to explain the details of the recording procedure by the submission deadline.
Directory structure
/eval_data
Baseline system
The baseline system is available on the GitHub repository. The baseline systems provide a simple entry-level approach that gives reasonable performance on the dataset of Task 2. They are good starting points, especially for entry-level researchers who want to get familiar with the anomalous-sound-detection task.
Condition of use
This dataset was created jointly by Hitachi, Ltd., NTT Corporation and STMicroelectronics and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.
Citation
Contact
If there is any problem, please contact us:
Tomoya Nishida, tomoya.nishida.ax@hitachi.com
Keisuke Imoto, keisuke.imoto@ieee.org
Noboru Harada, noboru@ieee.org
Daisuke Niizumi, daisuke.niizumi.dt@hco.ntt.co.jp
Yohei Kawaguchi, yohei.kawaguchi.xk@hitachi.com
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset accompanies the research paper titled "Enhancing Personalized Learning in Online Education through Integrated Cross-Course Learning Path Planning." The dataset consists of MATLAB data files (.mat format).

The dataset includes data on seven types of learner attributes, named LearnerA.mat to LearnerG.mat. Each learner dataset contains two variables: L and LP. L is a 10x16 matrix that stores learner attributes, where each row represents a learner. The first column indicates the learner's ability level, the second column indicates the expected learning time, columns 3 to 6 represent normalized learning styles, and columns 7 to 16 represent learning objectives. LP is a structure that stores statistical information about this matrix.

The dataset also includes data on seven types of learning resource attributes, named DatasetA.mat, DatasetB.mat, DatasetC.mat, DatasetAB.mat, DatasetAC.mat, DatasetBC.mat, and DatasetABC.mat. Each resource dataset contains two variables: M and MP. M is a matrix that stores the attributes of learning materials, where each row represents a material. The first column indicates the material's difficulty level, the second column represents the learning time required for the material, columns 3 to 6 describe the type of material, columns 7 to 16 cover the knowledge points addressed by the material, and columns 17 to 26 list the prerequisite knowledge points required for the material. MP is a structure that stores statistical information about this matrix.

The dataset encompasses results from learning path planning involving seven types of learners across seven datasets, totaling 49 datasets, named in the format PathCost4_LSHADE_cnEpSin_D_X_L_Y.mat, where X represents the type of learning resource dataset (A, B, C, AB, AC, BC, ABC) and Y represents the type of learner (A to G). Each data file contains three variables: Gbest, Gtime, and S. Gbest is a 30x10 matrix, where each column stores the best cost function obtained from 30 runs of path planning for a learner on the corresponding dataset. Gtime is a 30x10 matrix, where each column stores the time spent on each run for a learner on the corresponding dataset. S is a 30x10 cell array storing the status information from each run.

Finally, the dataset includes a compilation of the best cost functions for all runs for all learners across all learning material datasets, named learnerBest.mat. The file contains a variable, learnerBest, which is a 7x7x10x30 four-dimensional array. The first dimension represents the type of learner, the second dimension represents the type of learning material, the third dimension represents the learner index, and the fourth dimension represents the run index.
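To inspect these files outside MATLAB, one option is scipy.io.loadmat; the sketch below is a minimal example assuming the variable names (L, M) and column layout described above, and files saved in the v7.3 .mat format would need h5py instead.

from scipy.io import loadmat

# Load one learner file; 'L' is the 10x16 learner-attribute matrix described above.
learner = loadmat("LearnerA.mat")
L = learner["L"]
ability = L[:, 0]          # column 1: ability level
expected_time = L[:, 1]    # column 2: expected learning time
styles = L[:, 2:6]         # columns 3-6: normalized learning styles
objectives = L[:, 6:16]    # columns 7-16: learning objectives

# Load one resource file; 'M' is the material-attribute matrix.
M = loadmat("DatasetA.mat")["M"]
difficulty = M[:, 0]       # column 1: difficulty level
print(L.shape, M.shape)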
@inproceedings{karkkainenfairface,
title={FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age for Bias Measurement and Mitigation},
author={Karkkainen, Kimmo and Joo, Jungseock},
booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
year={2021},
pages={1548--1558}
}
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Face Attribute is a dataset for classification tasks - it contains Face Attribute annotations for 7,233 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
This dataset should be read alongside other energy consumption datasets on the City of Melbourne open data platform as well as the following report:
The dataset outlines modelled energy consumption across the City of Melbourne municipality. It is not energy consumption data captured by a meter, but modelled data based on building attributes such as building age, floor area, etc. This data was provided by the CSIRO as a result of a study commissioned by IMAP Councils. The study was governed by a Grant Agreement between Councils and the CSIRO, which stated an intent for the data to be published. This specific dataset is presented at a property-level scale. It includes both commercial and residential buildings, and projections for energy consumption have been made for 2016 to 2026, based on a business-as-usual scenario. It does not include the industrial sector.
The data in this map service is updated every weekend. Note: This data includes all activities regardless of whether there is a spatial feature attached. Note: This is a large dataset. Metadata and downloads are available at: https://data.fs.usda.gov/geodata/edw/datasets.php?xmlKeyword=FACTS+common+attributes To download FACTS activities layers, search for the activity types you want, such as timber harvest or hazardous fuels treatments. The Forest Service's Natural Resource Manager (NRM) Forest Activity Tracking System (FACTS) is the agency standard for managing information about activities related to fire/fuels, silviculture, and invasive species. This feature class contains the FACTS attributes most commonly needed to describe FACTS activities.
The Market1501-Attributes dataset is built from the Market1501 dataset. Market1501-Attributes augments it with 28 hand-annotated attributes, such as gender, age, sleeve length, and flags for items carried, as well as upper-clothes and lower-clothes colors.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This model tracks changes in attributes (e.g. width, depth) of watercourses. It includes geospatial references and documents change detection methods. It is spatially abstracted to a Point. Entity type: Concept
This data set represents the average contact time, in units of days, compiled for every catchment of NHDPlus for the conterminous United States. Contact time, as described in Wolock and others (1989), is the baseflow residence time in the subsurface. The source data set was the U.S. Geological Survey's (USGS) 1-kilometer grid for the conterminous United States (D.M. Wolock, U.S. Geological Survey, written commun., 2008). The grid was created using a method described by Wolock and others (1997a; see equation 3). In the source data set, the contact time was estimated from 1-kilometer resolution elevation data (Verdin and Greenlee, 1996) and STATSGO soil characteristics (Wolock, 1997b).

The NHDPlus Version 1.1 is an integrated suite of application-ready geospatial datasets that incorporates many of the best features of the National Hydrography Dataset (NHD) and the National Elevation Dataset (NED). The NHDPlus includes a stream network (based on the 1:100,000-scale NHD), improved networking, naming, and value-added attributes (VAAs). NHDPlus also includes elevation-derived catchments (drainage areas) produced using a drainage enforcement technique first widely used in New England, and thus referred to as "the New England Method." This technique involves "burning in" the 1:100,000-scale NHD and, when available, building "walls" using the National Watershed Boundary Dataset (WBD). The resulting modified digital elevation model (HydroDEM) is used to produce hydrologic derivatives that agree with the NHD and WBD. Over the past two years, an interdisciplinary team from the U.S. Geological Survey (USGS), the U.S. Environmental Protection Agency (USEPA), and contractors found that this method produces the best quality NHD catchments using an automated process (USEPA, 2007).

The NHDPlus dataset is organized by 18 Production Units that cover the conterminous United States. The NHDPlus version 1.1 data are grouped by the U.S. Geological Survey's Major River Basins (MRBs, Crawford and others, 2006). MRB1, covering the New England and Mid-Atlantic River basins, contains NHDPlus Production Units 1 and 2. MRB2, covering the South Atlantic-Gulf and Tennessee River basins, contains NHDPlus Production Units 3 and 6. MRB3, covering the Great Lakes, Ohio, Upper Mississippi, and Souris-Red-Rainy River basins, contains NHDPlus Production Units 4, 5, 7 and 9. MRB4, covering the Missouri River basins, contains NHDPlus Production Units 10-lower and 10-upper. MRB5, covering the Lower Mississippi, Arkansas-White-Red, and Texas-Gulf River basins, contains NHDPlus Production Units 8, 11 and 12. MRB6, covering the Rio Grande, Colorado and Great Basin River basins, contains NHDPlus Production Units 13, 14, 15 and 16. MRB7, covering the Pacific Northwest River basins, contains NHDPlus Production Unit 17. MRB8, covering California River basins, contains NHDPlus Production Unit 18.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Entity Attribute Dataset 50k (GPT-4.0 Generated)
Dataset Summary
The Entity Attribute SFT Dataset (GPT-4.0 Generated) is a machine-generated dataset designed for instruction fine-tuning. It includes detailed product information generated based on the title of each product, aiming to create a structured catalog in JSON format. The dataset encompasses a variety of product categories such as food, home and kitchen, clothing, handicrafts, tools, automotive equipment, and… See the full description on the dataset page: https://huggingface.co/datasets/BaSalam/entity-attribute-sft-dataset-GPT-4.0-generated-v1.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is essentially the metadata from 164 datasets. Each of its lines concerns a dataset from which 22 features have been extracted, which are used to classify each dataset into one of the categories 0-Unmanaged, 2-INV, 3-SI, 4-NOA (DatasetType).
This dataset consists of 164 rows. Each row is the metadata of another dataset. The target column is datasetType, which has 4 values indicating the dataset type. These are:
2 - Invoice detail (INV): This dataset type is a special report (usually called a Detailed Sales Statement) produced by company accounting or Enterprise Resource Planning (ERP) software. Using an INV-type dataset directly for ARM is extremely convenient for users, as it relieves them from the tedious work of transforming data into another, more suitable form. INV-type data input typically includes a header, but only two of its attributes are essential for data mining. The first attribute serves as the grouping identifier creating a unique transaction (e.g., Invoice ID, Order Number), while the second attribute contains the items utilized for data mining (e.g., Product Code, Product Name, Product ID). A minimal conversion sketch appears after this list.
3 - Sparse Item (SI): This type is widespread in Association Rules Mining (ARM). It involves a header and a fixed number of columns. Each item corresponds to a column. Each row represents a transaction. The typical cell stores a value, usually one character in length, that depicts the presence or absence of the item in the corresponding transaction. The absence character must be identified or declared before the Association Rules Mining process takes place.
4 - Nominal Attributes (NOA): This type is commonly used in Machine Learning and Data Mining tasks. It involves a fixed number of columns. Each column registers nominal/categorical values. The presence of a header row is optional. However, in cases where no header is provided, there is a risk of extracting incorrect rules if similar values exist in different attributes of the dataset. The potential values for each attribute can vary.
0 - Unmanaged for ARM: On the other hand, not all datasets are suitable for extracting useful association rules or frequent item sets, for instance, datasets characterized predominantly by numerical features with arbitrary values, or datasets that involve fragmented or mixed data types. For such datasets, ARM processing becomes possible only by introducing a data discretization stage, which in turn introduces information loss. Such datasets are not considered in the present treatise and are termed (0) Unmanaged in the sequel.
The dataset type is crucial to determine for ARM, and the current dataset is used to classify the dataset's type using a Supervised Machine Learning Model.
There is also another dataset type, named 1 - Market Basket List (MBL), where each dataset row is a transaction. A transaction involves a variable number of items. However, due to this characteristic, these datasets can be easily categorized using procedural programming, and DoD does not include instances of them. For more details about dataset types please refer to the article "WebApriori: a web application for association rules mining". https://link.springer.com/chapter/10.1007/978-3-030-49663-0_44
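To make the INV-type description concrete, the sketch below groups a toy detailed-sales extract into per-invoice item sets ready for ARM; the column names "Invoice ID" and "Product Name" are only examples of the two essential attributes.

import pandas as pd

# Tiny illustrative INV-type extract: only the grouping identifier and the item
# attribute matter for ARM; other report columns can be ignored.
sales = pd.DataFrame({
    "Invoice ID":   [101, 101, 102, 102, 102, 103],
    "Product Name": ["milk", "bread", "milk", "eggs", "bread", "eggs"],
})

# Group rows by invoice to obtain one transaction (item set) per invoice.
transactions = sales.groupby("Invoice ID")["Product Name"].apply(list).tolist()
print(transactions)
# [['milk', 'bread'], ['milk', 'eggs', 'bread'], ['eggs']]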
CelebFaces Attributes Dataset (CelebA) is a large-scale face attributes dataset with more than 200K celebrity images, each with 40 attribute annotations. The images in this dataset cover large pose variations and background clutter. CelebA has large diversities, large quantities, and rich annotations, including:
- 10,177 identities,
- 202,599 face images, and
- 5 landmark locations and 40 binary attribute annotations per image.
The dataset can be employed as the training and test sets for the following computer vision tasks: face attribute recognition, face detection, and landmark (or facial part) localization.
Note: CelebA dataset may contain potential bias. The fairness indicators example goes into detail about several considerations to keep in mind while using the CelebA dataset.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('celeb_a', split='train')
for ex in ds.take(4):
print(ex)
See the guide for more information on tensorflow_datasets.
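To peek at the per-image annotations, the example can be extended as below; this assumes the feature names ('image', 'attributes') used by the TFDS celeb_a builder, so check the catalog entry if they differ.

import tensorflow_datasets as tfds

ds = tfds.load('celeb_a', split='train')
for ex in ds.take(1):
    # Each example carries the image plus its per-image binary attribute annotations.
    print(ex['image'].shape)
    print({name: bool(value) for name, value in ex['attributes'].items()})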
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/celeb_a-2.1.0.png
The public version of this Asset database can be accessed via the following dataset:
Asset database for the Cooper subregion on 27 August 2015 Public (526707e0-9d32-47de-a198-9c8f35761a7e)
The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.
The asset database for the Cooper subregion (v3) supersedes the previous version (v2) of the Cooper asset database (Asset database for the Cooper subregion on 14 August 2015, 5c3697e6-8077-4de7-b674-e0dfc33b570c). The M2_Reason in the Assetlist table and DecisionBrief in the AssetDecisions table have been updated with short descriptions (<255 characters) provided by the project team on 21/8, and the draft "water-dependent asset register and asset list" (BA-LEB-COO-130-WaterDependentAssetRegister-AssetList-V20150827) has also been updated accordingly. This change was made to avoid truncation in the brief-reasons fields of the database and asset register. There have been no changes to assets or asset numbers.
This dataset contains a combination of spatial and non-spatial (attribute) components of the Cooper subregion Asset List - an mdb file (readable as an MS Access database or as an ESRI personal geodatabase) holds the non-spatial tabular attribute data, and an ESRI file geodatabase contains the spatial data layers, which are attributed only with unique identifiers ("AID" for assets, and "ElementID" for elements). The dataset also contains an update of the draft "Water-dependent asset register and asset list" spreadsheet (BA-NIC-COO-130-WaterDependentAssetRegister-AssetList-V20150827.xlsx).
The tabular attribute data can be joined in a GIS to the "Assetlist" table in the mdb database using the "AID" field to view asset attributes (BA attribution). To view the more detailed attribution at the element-level, the intermediate table "Element_to_asset" can be joined to the assets spatial datasets using AID, and then joining the individual attribute tables from the Access database using the common "ElementID" fields. Alternatively, the spatial feature layers representing elements can be linked directly to the individual attribute tables in the Access database using "ElementID", but this arrangement will not provide the asset-level groupings.
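A rough sketch of the asset-level join described above, assuming the spatial layer is read from the file geodatabase with geopandas and the Assetlist table has been exported from the .mdb to CSV (the geodatabase, layer, and CSV names below are placeholders):

import geopandas as gpd
import pandas as pd

# Spatial features carry only unique identifiers ("AID" for assets, "ElementID" for elements).
# The geodatabase and layer names are placeholders; check the dataset contents.
assets_spatial = gpd.read_file("COO_asset_database.gdb", layer="Assets")

# Non-spatial BA attribution, assumed here to have been exported from the .mdb to CSV.
assetlist = pd.read_csv("Assetlist.csv")

# Join on AID to attach asset-level attributes to the spatial features.
assets = assets_spatial.merge(assetlist, on="AID", how="left")
print(assets.head())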
Further information is provided in the accompanying document, "COO_asset_database_doc20150827.doc" located within this dataset.
Version ID Date Notes
1.0 27/03/2015 Initial database
2.0 14/08/2015 "(1) Updated the database for M2 test results provided by the COO assessment team and created the draft BA-LEB-COO-130-WaterDependentAssetRegister-AssetList-V20150814.xlsx
(2) Updated the group, subgroup, class and depth for (up to) 2 NRM WAIT assets to incorporate the feedback to OWS from the relevant SA NRM regional office (whose staff missed the asset workshop). The AIDs and names of those assets are listed in table LUT_changed_asset_class_20150814 in COO_asset_database_20150814.mdb
(3) As a result of (2), added one new asset separated from one existing asset. This asset and its parent are listed in table LUT_ADD_1_asste_20150814 in COO_asset_database_20150814.mdb. The M2 test result for this asset is inherited from its parent in this version
(5) Added Appendix C in COO_asset_database_doc_201500814.doc, which summarizes the total elements/assets in the current group and subgroup
(6) Added four SQL queries (Find_All_Used_Assets, Find_All_WD_Assets, Find_Amount_Asset_in_Class and Find_Amount_Elements_in_Class) in COO_asset_database_20150814.mdb for total assets and total numbers
(7) The databases, especially the spatial database (COO_asset_database_20150814Only.gdb), were changed: duplicated attribute fields in the spatial data were removed and only the ID field is kept. The user needs to join the table Assetlist or Elementlist to the relevant spatial data"
3.0 27/08/2015 M2_Reason in the Assetlist table and DecisionBrief in the AssetDecisions table have been updated with short descriptions (<255 characters) provided by project team 21/8, and the draft "water-dependent asset register and asset list" (BA-LEB-COO-130-WaterDependentAssetRegister-AssetList-V20150827) also updated accordingly. No changes to asset numbers.
Bioregional Assessment Programme (2014) Asset database for the Cooper subregion on 27 August 2015. Bioregional Assessment Derived Dataset. Viewed 27 November 2017, http://data.bioregionalassessments.gov.au/dataset/0b122b2b-e5fe-4166-93d1-3b94fc440c82.
Derived From QLD Dept of Natural Resources and Mines, Groundwater Entitlements 20131204
Derived From Queensland QLD - Regional - NRM - Water Asset Information Tool - WAIT - databases
Derived From Matters of State environmental significance (version 4.1), Queensland
Derived From Geofabric Surface Network - V2.1
Derived From Communities of National Environmental Significance Database - RESTRICTED - Metadata only
Derived From South Australia SA - Regional - NRM Board - Water Asset Information Tool - WAIT - databases
Derived From National Groundwater Dependent Ecosystems (GDE) Atlas
Derived From National Groundwater Information System (NGIS) v1.1
Derived From Birds Australia - Important Bird Areas (IBA) 2009
Derived From Queensland QLD Regional CMA Water Asset Information WAIT tool databases RESTRICTED Includes ALL Reports
Derived From Queensland wetland data version 3 - wetland areas.
Derived From SA Department of Environment, Water and Natural Resources (DEWNR) Water Management Areas 141007
Derived From South Australian Wetlands - Groundwater Dependent Ecosystems (GDE) Classification
Derived From National Groundwater Dependent Ecosystems (GDE) Atlas (including WA)
Derived From Asset database for the Cooper subregion on 14 August 2015
Derived From QLD Dept of Natural Resources and Mines, Groundwater Entitlements linked to bores v3 03122014
Derived From Ramsar Wetlands of Australia
Derived From Permanent and Semi-Permanent Waterbodies of the Lake Eyre Basin (Queensland and South Australia) (DRAFT)
Derived From SA EconomicElements v1 20141201
Derived From QLD Dept of Natural Resources and Mines, Groundwater Entitlements linked to bores and NGIS v4 28072014
Derived From National Heritage List Spatial Database (NHL) (v2.1)
Derived From Great Artesian Basin and Laura Basin groundwater recharge areas
Derived From SA Department of Environment, Water and Natural Resources (DEWNR) Groundwater Licences 141007
Derived From Lake Eyre Basin (LEB) Aquatic Ecosystems Mapping and Classification
Derived From Australia - Species of National Environmental Significance Database
Derived From Asset database for the Cooper subregion on 27 March 2015
Derived From Australia, Register of the National Estate (RNE) - Spatial Database (RNESDB) Internal
Derived From Directory of Important Wetlands in Australia (DIWA) Spatial Database (Public)
Derived From Collaborative Australian Protected Areas Database (CAPAD) 2010 (Not current release)