Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Indian Traffic Annotated is a dataset for object detection tasks - it contains Traffic Signs annotations for 453 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains 30 images (224x224 pixels) of Bangla Sign Language (BSL) captured using a web camera. It includes three categories: সাহায্য (Help), হ্যাঁ (Yes), and না (No). Each image is accompanied by annotations in XML format, making it suitable for sign language recognition and gesture-based communication research.
Dataset Details:
- Total Files: 30 (Images + Annotations)
- Categories: 3 (সাহায্য, হ্যাঁ, না)
- Image Resolution: 224x224 pixels
- File Format: JPEG (Images), XML (Annotations)
- Dataset Size: 289 KB
Usage: This dataset is ideal for training deep learning models for gesture recognition, sign language translation, and assistive communication tools.
Acknowledgment: This dataset was manually collected for research and educational purposes.
- Secure Implementation: An NDA is signed to guarantee secure implementation, and Annotated Imagery Data is destroyed upon delivery.
- Quality: Multiple rounds of quality inspection ensure high-quality data output, certified with ISO 9001.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ODDS Smart Building Depth Dataset
The goal of this dataset is to facilitate research focusing on recognizing objects in smart buildings using the depth sensor mounted at the ceiling. This dataset contains annotations of depth images for eight frequently seen object classes. The classes are: person, backpack, laptop, gun, phone, umbrella, cup, and box.
We collected data in two settings. We had a Kinect mounted on a 9.3-foot ceiling near a 6-foot-wide door. We also used a tripod with a horizontal extender holding the Kinect at a similar height, looking downwards. We asked about 20 volunteers to enter and exit a number of times each in different directions (3 times walking straight, 3 times walking towards the left side, 3 times walking towards the right side), holding objects in many different ways and poses underneath the Kinect. Each subject used his/her own backpack, purse, laptop, etc. As a result, we covered variety within the same object class; e.g., for laptops we included MacBooks, HP laptops, and Lenovo laptops of different years and models, and for backpacks we included backpacks, side bags, and women's purses. We asked the subjects to walk while holding each object in many ways; e.g., the laptop was fully open, partially closed, and fully closed while carried, and people held laptops in front of and beside their bodies, and underneath their elbows. The subjects carried their backpacks on their backs and at their sides at different levels from foot to shoulder. We wanted to collect data with real guns; however, bringing real guns to the office is prohibited, so we obtained a few Nerf guns and the subjects carried them pointing to the front, side, up, and down while walking.
The annotated dataset is created following the structure of the Pascal VOC devkit, so that data preparation is simple and it can be used quickly with object detection libraries that are friendly to Pascal VOC-style annotations (e.g., Faster R-CNN, YOLO, SSD). The annotated data consists of a set of images; each image has an annotation file giving a bounding box and object class label for each object from one of the eight classes present in the image. Multiple objects from multiple classes may be present in the same image. The dataset has 3 main directories:
1) DepthImages: Contains all the images of the training and validation sets.
2) Annotations: Contains one XML file per image file (e.g., 1.xml for image file 1.png). The XML file includes the bounding box annotations for all objects in the corresponding image.
3) ImagesSets: Contains two text files, training_samples.txt and testing_samples.txt. The training_samples.txt file lists the names of the images used for training, and testing_samples.txt lists the names of the images used for testing. (We randomly chose an 80%/20% split.)
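If it helps to see how this VOC-style layout is typically consumed, here is a minimal Python sketch. The directory and file names come from the list above; the XML tag names (object, name, bndbox, xmin, ...) are the standard Pascal VOC layout the dataset says it follows, so treat them as assumptions and adjust if a file differs.

```python
import os
import xml.etree.ElementTree as ET

DATASET_ROOT = "ODDS"  # assumed local path to the extracted dataset

def read_split(split_file):
    # training_samples.txt / testing_samples.txt list one image name per line;
    # strip any extension so the name can be reused for both .png and .xml files.
    with open(os.path.join(DATASET_ROOT, "ImagesSets", split_file)) as f:
        return [os.path.splitext(line.strip())[0] for line in f if line.strip()]

def read_annotation(image_id):
    # Annotations/<image_id>.xml -> list of (class_name, xmin, ymin, xmax, ymax)
    path = os.path.join(DATASET_ROOT, "Annotations", image_id + ".xml")
    root = ET.parse(path).getroot()
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        boxes.append((obj.findtext("name"),
                      int(float(bb.findtext("xmin"))), int(float(bb.findtext("ymin"))),
                      int(float(bb.findtext("xmax"))), int(float(bb.findtext("ymax")))))
    return boxes

train_ids = read_split("training_samples.txt")
print(train_ids[0], read_annotation(train_ids[0]))
```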
The un-annotated data consists of several sets of depth images. No ground-truth annotation is available for these images yet. These un-annotated sets contain several challenging scenarios, and no data from this office was collected during annotated dataset construction. Hence, they provide a way to test the generalization performance of an algorithm.
If you use the ODDS Smart Building dataset in your work, please cite the following reference in any publications:
@inproceedings{mithun2018odds,
  title={ODDS: Real-Time Object Detection using Depth Sensors on Embedded GPUs},
  author={Niluthpol Chowdhury Mithun and Sirajum Munir and Karen Guo and Charles Shelton},
  booktitle={ACM/IEEE Conference on Information Processing in Sensor Networks (IPSN)},
  year={2018}
}
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Dataset Tn Annotated Exemples is a dataset for object detection tasks - it contains Bounding Boxes annotations for 1,000 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
eturok/zerobench-annotated dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
ELEXIS-WSD is a parallel sense-annotated corpus in which content words (nouns, adjectives, verbs, and adverbs) have been assigned senses. Version 1.1 contains sentences for 10 languages: Bulgarian, Danish, English, Spanish, Estonian, Hungarian, Italian, Dutch, Portuguese, and Slovene.
The corpus was compiled by automatically extracting a set of sentences from WikiMatrix (Schwenk et al., 2019), a large open-access collection of parallel sentences derived from Wikipedia, using an automatic approach based on multilingual sentence embeddings. The sentences were manually validated according to specific formal, lexical, and semantic criteria (e.g., by removing incorrect punctuation, morphological errors, notes in square brackets, and etymological information typically provided in Wikipedia pages). To obtain satisfying semantic coverage, sentences with fewer than 5 words or fewer than 2 polysemous words were filtered out. Subsequently, in order to obtain datasets in the other nine target languages, for each selected sentence in English, the corresponding WikiMatrix translation into each of the other languages was retrieved. If no translation was available, the English sentence was translated manually. The resulting corpus comprises 2,024 sentences for each language.
The sentences were tokenized, lemmatized, and tagged with POS tags using UDPipe v2.6 (https://lindat.mff.cuni.cz/services/udpipe/). Senses were annotated using LexTag (https://elexis.babelscape.com/): each content word (noun, verb, adjective, and adverb) was assigned a sense from among the available senses from the sense inventory selected for the language (see below) or BabelNet. Sense inventories were also updated with new senses during annotation.
List of sense inventories:
BG: Dictionary of Bulgarian
DA: DanNet – The Danish WordNet
EN: Open English WordNet
ES: Spanish Wiktionary
ET: The EKI Combined Dictionary of Estonian
HU: The Explanatory Dictionary of the Hungarian Language
IT: PSC + Italian WordNet
NL: Open Dutch WordNet
PT: Portuguese Academy Dictionary (DACL)
SL: Digital Dictionary Database of Slovene
The corpus is available in the CoNLL-U tab-separated format. In order, the columns contain the token ID, its form, its lemma, its UPOS-tag, five empty columns (reserved for e.g. dependency parsing, which is absent from this version), and the final MISC column containing the following: the token's whitespace information (whether the token is followed by a whitespace or not), the ID of the sense assigned to the token, and the index of the multiword expression (if the token is part of an annotated multiword expression).
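As a rough illustration of how these columns can be consumed, the sketch below reads a .conllu file and splits the MISC column into key/value pairs. The 10-column layout is standard CoNLL-U as described above, but the file name and the exact MISC keys (e.g., the key carrying the sense ID) are assumptions here; check 00README.txt for the keys actually used in the release.

```python
def read_sentences(path):
    # Yield sentences as lists of 10-column token rows from a CoNLL-U file.
    sentence = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:                      # blank line ends a sentence
                if sentence:
                    yield sentence
                    sentence = []
            elif not line.startswith("#"):    # skip sentence-level comment lines
                sentence.append(line.split("\t"))
    if sentence:
        yield sentence

def misc_to_dict(misc):
    # Turn e.g. "SpaceAfter=No|SomeSenseKey=xyz" into a dict; "_" means empty.
    if misc in ("_", ""):
        return {}
    return dict(item.split("=", 1) for item in misc.split("|") if "=" in item)

for tokens in read_sentences("elexis-wsd-en.conllu"):   # hypothetical file name
    for columns in tokens:
        form, upos, misc = columns[1], columns[3], misc_to_dict(columns[-1])
        print(form, upos, misc)
    break
```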
Each language has a separate sense inventory containing all the senses (and their definitions) used for annotation in the corpus. Not all the senses from the sense inventory are necessarily included in the corpus annotations: for instance, all occurrences of the English noun "bank" in the corpus might be annotated with the sense of "financial institution", but the sense inventory also contains the sense "edge of a river" as well as all other possible senses to disambiguate between.
For more information, please refer to 00README.txt.
Differences to version 1.0:
- Several minor errors were fixed (e.g., a typo in one of the Slovene sense IDs).
- The corpus was converted to the true CoNLL-U format (as opposed to the CoNLL-U-like format used in v1.0).
- An error was fixed that resulted in missing UPOS tags in version 1.0.
- The sentences in all corpora now follow the same order (from 1 to 2024).
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset contains over 1,000 high-quality images, each annotated in YOLO format for human face detection. It is designed for use in computer vision tasks such as face detection, recognition, and facial feature analysis.
Key Features:
- Number of Images: 1,000+
- Annotations: YOLO format (.txt files), with each file containing bounding box coordinates for detected human faces.
- Single Class: All images are labeled with a single class (class 0), which corresponds to a human face.
- Versatile Use: Can be used for training and evaluating face detection models, particularly with real-time detection systems like YOLOv8.
Applications:
- Real-time face detection in security and surveillance systems
- Augmented reality and face filters
- Emotion or expression recognition
- Facial feature extraction for biometric systems
Feel free to use this dataset to train your models or improve existing face detection systems!
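If it is useful as a starting point, here is a minimal sketch (file names are placeholders, not paths from the dataset) that reads one YOLO label file and converts its normalized rows back to pixel coordinates:

```python
from PIL import Image

def yolo_to_pixel_boxes(label_path, image_path):
    # Each YOLO row is: class x_center y_center width height (all normalized to [0, 1]).
    img_w, img_h = Image.open(image_path).size
    boxes = []
    with open(label_path) as f:
        for line in f:
            cls, xc, yc, w, h = line.split()
            xc, yc, w, h = float(xc) * img_w, float(yc) * img_h, float(w) * img_w, float(h) * img_h
            boxes.append((int(cls), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2))
    return boxes  # (class_id, x_min, y_min, x_max, y_max) in pixels

# Hypothetical file names; class_id will be 0 (human face) for every box in this dataset.
print(yolo_to_pixel_boxes("labels/face_0001.txt", "images/face_0001.jpg"))
```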
http://researchdatafinder.qut.edu.au/display/n47576
md5sum: 116aade568ccfeaefcdd07b5110b815a. QUT Research Data Repository: dataset resource available for download.
Overview: This dataset is a collection of 10,000+ high-quality images of supermarket and store display shelves that are ready to use for optimizing the accuracy of computer vision models. All of the content is sourced from PIXTA's stock library of 100M+ Asian-featured images and videos. PIXTA is the largest platform of visual materials in the Asia Pacific region, offering fully managed services, high-quality content and data, and powerful tools for businesses and organisations to enable their creative and machine learning projects.
Use case: The dataset could be used for various AI and computer vision models: Store Management, Stock Monitoring, Customer Experience, Sales Analysis, Cashierless Checkout, etc. Each dataset is supported by both an AI and a human review process to ensure labelling consistency and accuracy. Contact us for more custom datasets.
About PIXTA: PIXTASTOCK is the largest Asian-featured stock platform, providing data, content, tools, and services since 2005. PIXTA has 15 years of experience integrating advanced AI technology in managing, curating, and processing over 100M visual materials and serving global leading brands for their creative and data demands. Visit us at https://www.pixta.ai/ or contact us via email at admin.bi@pixta.co.jp.
https://www.gesis.org/en/institute/data-usage-terms
This repository contains an expert-annotated dataset of 1261 tweets and the corresponding annotation framework from the publication "SciTweets - A Dataset and Annotation Framework for Detecting Scientific Online Discourse" (https://arxiv.org/abs/2206.07360). The tweets are annotated with three different categories of science-relatedness:
(1) Scientific knowledge (scientifically verifiable claims): Tweets that include a claim or a question that could be scientifically verified, (2) Reference to scientific knowledge: Tweets that include at least one reference to scientific knowledge (references can either be direct, e.g., DOI, title of a paper or indirect, e.g., a link to an article that includes a direct reference), and (3) Related to scientific research in general: Tweets that mention a scientific research context (e.g., mention a scientist, scientific research efforts, research findings).
Further, the annotations include the annotators' confidence scores as well as labels for compound claims and ironic tweets.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains images of papaya leaf diseases. Images were collected from the Changao, Ashulia, Dhaka, Bangladesh area, with coordinates 23° 53' 2" N and 90° 19' 28" E. The images were taken from July 12 to August 2, 2023, approximately 22 days. The .rar archive has one parent folder; inside the parent folder there are three subdirectories: Original Images, Annotations, and Labels. The 'Original Images' folder has all the original JPG images classified into five classes. The 'Annotations' folder has annotation files in XML format. The 'Labels' folder has annotation files in TXT format.
Number of images: 2159
Number of classes: 5
Names of the classes: Anthracnose, Bacterial spot, Curl, Ring spot, and Healthy
Number of annotated images: 1050
Annotation format: XML, TXT
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
pathologies
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The dataset comprises annotated video frames from a camera positioned in a public space. The tracking of each individual in the camera's view was achieved using the rectangle tool in the Computer Vision Annotation Tool (CVAT).
The images directory houses the original video frames, serving as the primary source of raw data. The annotations.xml file provides the detailed annotation data for the images. The boxes directory contains frames that visually represent the bounding box annotations, showing the locations of the tracked individuals within each frame; these images can be used to understand how the tracking has been implemented and to visualize the marked areas for each individual. The annotations are represented as rectangular bounding boxes placed around each individual. Each bounding box annotation contains the position (xtl, ytl, xbr, ybr coordinates) of the respective box within the frame.
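A minimal parsing sketch is shown below. The xtl/ytl/xbr/ybr attribute names come from the description above, while the surrounding track/box element structure is CVAT's usual video-export layout and is assumed here rather than stated by the dataset card.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def boxes_per_frame(xml_path):
    # Collect {frame_number: [box, ...]} from a CVAT-style annotations.xml.
    frames = defaultdict(list)
    root = ET.parse(xml_path).getroot()
    for track in root.iter("track"):          # assumed: one <track> per tracked person
        person_id = track.get("id")
        for box in track.iter("box"):
            frames[int(box.get("frame"))].append({
                "person": person_id,
                "xtl": float(box.get("xtl")), "ytl": float(box.get("ytl")),
                "xbr": float(box.get("xbr")), "ybr": float(box.get("ybr")),
            })
    return frames

frames = boxes_per_frame("annotations.xml")
print(len(frames), "annotated frames")
```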
keywords: multiple people tracking, human detection dataset, object detection dataset, people tracking dataset, tracking human object interactions, human Identification tracking dataset, people detection annotations, detecting human in a crowd, human trafficking dataset, deep learning object tracking, multi-object tracking dataset, labeled web tracking dataset, large-scale object tracking dataset
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
A fully human-labelled dataset consisting of annotated objects from 5 waste categories. Dataset preparation steps are as follows:
1) First, we visited 10 different dumping stations in Dhaka City, captured the spotted trash using mobile phones, and accumulated more than 1200 images. We took images from different angles and verticals, trying to mimic a moving agent that will automate the process. All the images were captured in different lighting conditions (haze, cloudy, sunny, etc.) during daytime at an optimum resolution.
2) Then, we carefully investigated the images and identified the varieties and different orientations with which waste appears in an open environment.
3) After that, we wrote a web scraper in Python to download photos of a given category (refer to Table I) from Google Images automatically. We accumulated photos of more than 70 different sub-categories of waste, which belong to our five main categories.
4) From there, we carefully investigated the collection and picked, by hand, 1000 photos that match real scenarios.
5) We annotated the final images using the LabelImg tool in PASCAL VOC format. Later conversions were performed on the format, as per model requirements (a sketch of such a conversion appears after this list).
6) The annotated images were then further verified by two other individuals and finally used for model development. The primary objective has been to keep only contextual images, to guard against the data drift and concept drift problem.
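To illustrate the kind of format conversion mentioned in step 5, here is a small sketch that turns a LabelImg/Pascal VOC XML file into YOLO-style normalized rows. The class names and file path are placeholders, not the actual categories or paths from the paper.

```python
import xml.etree.ElementTree as ET

# Placeholder names for the 5 waste categories; replace with the real class list.
CLASSES = ["category_a", "category_b", "category_c", "category_d", "category_e"]

def voc_to_yolo(xml_path):
    # Convert every <object> in a Pascal VOC file to "cls xc yc w h" normalized rows.
    root = ET.parse(xml_path).getroot()
    img_w = float(root.findtext("size/width"))
    img_h = float(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        cls = CLASSES.index(obj.findtext("name"))
        b = obj.find("bndbox")
        xmin, ymin = float(b.findtext("xmin")), float(b.findtext("ymin"))
        xmax, ymax = float(b.findtext("xmax")), float(b.findtext("ymax"))
        rows.append(f"{cls} {(xmin + xmax) / 2 / img_w:.6f} {(ymin + ymax) / 2 / img_h:.6f} "
                    f"{(xmax - xmin) / img_w:.6f} {(ymax - ymin) / img_h:.6f}")
    return rows

print("\n".join(voc_to_yolo("annotations/sample_0001.xml")))  # hypothetical path
```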
Link to the paper: https://link.springer.com/chapter/10.1007/978-981-19-8032-9_28
https://www.datainsightsmarket.com/privacy-policy
The booming AI data annotation market, projected to reach $10 billion by 2033, is driven by increasing demand for high-quality training data in sectors like healthcare, autonomous driving, and content moderation. Learn about market trends, key players, and growth projections in this comprehensive analysis.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Annotated T2-weighted MR images of the Lower Spine
Chengwen Chu, Daniel Belavy, Gabriele Armbrecht, Martin Bansmann, Dieter Felsenberg, and Guoyan Zheng
Introduction
The Institute for Surgical Technology and Biomechanics, University of Bern, Switzerland, Charité - University Medicine Berlin, Centre of Muscle and Bone Research, Free University & Humboldt-University Berlin, Germany, Centre for Physical Activity and Nutrition Research, School of Exercise and Nutrition Sciences, Deakin University Burwood Campus, Australia and Institut für Diagnostische und Interventionelle Radiologie, Krankenhaus Porz Am Rhein gGmbH, Köln, Germany, are making this dataset available as a resource in the development of algorithms and tools for spinal image analysis.
Description
The database consists of T2-weighted turbo spin echo MR spine images of 23 anonymized patients, each containing at least 7 vertebral bodies (VBs) of the lower spine (T11 – L5). For each vertebral body, reference manual segmentation is provided in the form of a binary mask. All images and binary masks are stored in the Neuroimaging Informatics Technology Initiative (NIFTI) file format, see details at http://nifti.nimh.nih.gov/. Image files are stored as "Img_xx.nii" while the associated annotation files are stored as "Img_xx_Labels.nii", where "xx" is the internal case number for the patient.
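As a brief usage sketch: the Img_xx naming comes from the paragraph above, and using nibabel is simply one common way to read NIfTI volumes, not a requirement of the dataset.

```python
import nibabel as nib
import numpy as np

case = "01"  # hypothetical internal case number
image = nib.load(f"Img_{case}.nii").get_fdata()          # T2-weighted MR volume
labels = nib.load(f"Img_{case}_Labels.nii").get_fdata()  # vertebral body label mask

print("image shape:", image.shape)
print("label values present:", np.unique(labels))
```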
Image annotations were prepared by Mr. Chengwen Chu (who has no professional training in radiology).
Acknowledgements
Reference
C. Chu, D. Belavy, W. Yu, G. Armbrecht, M. Bansmann, D. Felsenberg, and G. Zheng, “Fully Automatic Localization and Segmentation of 3D Vertebral Bodies from CT/MR Images via A Learning-based Method”, PLoS One. 2015 Nov 23;10(11):e0143327. doi: 10.1371/journal.pone.0143327. eCollection 2015.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was recorded in the Virtual Annotated Cooking Environment (VACE), a new open-source virtual reality dataset (https://sites.google.com/view/vacedataset) and simulator (https://github.com/michaelkoller/vacesimulator) for object interaction tasks in a rich kitchen environment. We use the Unity-based VR simulator to create thoroughly annotated video sequences of a virtual human avatar performing food preparation activities. Based on the MPII Cooking 2 dataset, it enables the recreation of recipes for meals such as sandwiches, pizzas, and fruit salads, and smaller activity sequences such as cutting vegetables. For complex recipes, multiple samples are present, following different orderings of valid partially ordered plans. The dataset includes an RGB and depth camera view, bounding boxes, object segmentation masks, human joint poses and object poses, as well as ground-truth interaction data in the form of temporally labeled semantic predicates (holding, on, in, colliding, moving, cutting). In our effort to make the simulator accessible as an open-source tool, researchers are able to expand the setting and annotation to create additional data samples.
The research leading to these results has received funding from the Austrian Science Fund (FWF) under grant agreement No. I3969-N30 InDex and the project Doctorate College TrustRobots by TU Wien. Thanks go out to Simon Schreiberhuber for sharing his Unity expertise and to the colleagues at the TU Wien Center for Research Data Management for data hosting and support.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This dataset contains image annotations derived from "The Clinical Proteomic Tumor Analysis Consortium Uterine Corpus Endometrial Carcinoma Collection (CPTAC-UCEC)". This dataset was generated as part of a National Cancer Institute project to augment images from The Cancer Imaging Archive with annotations that will improve their value for cancer researchers and artificial intelligence experts.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
We introduce PLANesT-3D, a new annotated dataset of 3D color point clouds of plants. PLANesT-3D is composed of 34 point cloud models representing 34 real plants from three different plant species: Capsicum annuum, Rosa kordana, and Ribes rubrum.
PLANesT-3D Plant Point Clouds
The folders Pepper, Rose and Ribes contain 10 point clouds of Pepper, 10 point clouds of Rose, and 14 point clouds of Ribes plants, respectively. The point clouds were reconstructed from multiple images captured manually around each plant. For reconstruction of 3D color point clouds from 2D color images, Agisoft Metashape Professional (Agisoft LLC, St. Petersburg, Russia) was employed. The plant point cloud was separated from the background, pose normalized, and scaled to the correct size through a semi-automatic process.
Raw point clouds of the PLANesT-3D Dataset
The raw point clouds of the PLANesT-3D Dataset can be downloaded from:
Images of the PLANesT-3D dataset
2D camera images used to reconstruct the 3D point clouds of the PLANesT-3D dataset and camera information files can be downloaded from: