24 datasets found
  1. COCO-WholeBody Dataset

    • paperswithcode.com
    Updated Oct 9, 2022
    Cite
    Sheng Jin; Lumin Xu; Jin Xu; Can Wang; Wentao Liu; Chen Qian; Wanli Ouyang; Ping Luo (2022). COCO-WholeBody Dataset [Dataset]. https://paperswithcode.com/dataset/coco-wholebody
    Dataset updated
    Oct 9, 2022
    Authors
    Sheng Jin; Lumin Xu; Jin Xu; Can Wang; Wentao Liu; Chen Qian; Wanli Ouyang; Ping Luo
    Description

    COCO-WholeBody is an extension of the COCO dataset with whole-body annotations. For each person in an image there are 4 types of bounding boxes (person box, face box, left-hand box, and right-hand box) and 133 keypoints (17 for body, 6 for feet, 68 for face and 42 for hands).
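
    For orientation, a minimal sketch of splitting a flat 133-keypoint array into those parts; the body, feet, face, hands ordering follows the counts above, but the exact index layout should be verified against the official COCO-WholeBody annotations:

        import numpy as np

        kpts = np.zeros((133, 3))  # hypothetical (x, y, visibility) triples for one person

        body  = kpts[0:17]    # 17 body keypoints (the original COCO set)
        feet  = kpts[17:23]   # 6 foot keypoints
        face  = kpts[23:91]   # 68 face landmarks
        hands = kpts[91:133]  # 42 hand keypoints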

  2. COCO Dataset 2017

    • gts.ai
    Cite
    GTS, COCO Dataset 2017 [Dataset]. https://gts.ai/dataset-download/coco-dataset-2017/
    Available download formats: json
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset.
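
    As an aside, these annotations are commonly read with the pycocotools COCO API; a minimal sketch, assuming the 2017 person-keypoints annotation file has been downloaded locally (the path here is hypothetical):

        from pycocotools.coco import COCO

        coco = COCO('annotations/person_keypoints_val2017.json')  # hypothetical local path
        person_ids = coco.getCatIds(catNms=['person'])
        img_ids = coco.getImgIds(catIds=person_ids)
        anns = coco.loadAnns(coco.getAnnIds(imgIds=img_ids[0]))
        print(len(img_ids), 'person images;', len(anns), 'annotations in the first one')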

  3. COCO keypoint-2017 dataset

    • service.tib.eu
    Updated Dec 16, 2024
    Cite
    T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, C. L. Zitnick (2024). COCO keypoint-2017 dataset [Dataset]. https://doi.org/10.57702/j5uw1ss7. https://service.tib.eu/ldmservice/dataset/coco-keypoint-2017-dataset
    Dataset updated
    Dec 16, 2024
    Description

    The COCO keypoint-2017 dataset contains over 200,000 images and 250,000 human instances labeled with 17 keypoints.
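
    For reference, the 17 keypoints of the COCO convention, in their standard order:

        COCO_KEYPOINTS = [
            'nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear',
            'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow',
            'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
            'left_knee', 'right_knee', 'left_ankle', 'right_ankle',
        ]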

  4. Microsoft Coco Dataset

    • universe.roboflow.com
    Updated Apr 4, 2025
    Cite
    Microsoft (2025). Microsoft Coco Dataset [Dataset]. https://universe.roboflow.com/microsoft/coco/model/3
    Available download formats: zip
    Dataset updated
    Apr 4, 2025
    Dataset authored and provided by
    Microsoft
    Variables measured
    Object Bounding Boxes
    Description

    Microsoft Common Objects in Context (COCO) Dataset

    The Common Objects in Context (COCO) dataset is a widely recognized collection designed to spur object detection, segmentation, and captioning research. Created by Microsoft, COCO provides annotations including object categories, keypoints, and more, making it a valuable asset for machine learning practitioners and researchers. Today, many model architectures are benchmarked against COCO, which has established a standard system by which architectures can be compared.

    While COCO is often said to comprise over 300k images, that figure spans several annotation types, keypoints among them. The labeled dataset for object detection specifically stands at 123,272 images.

    The full labeled object detection dataset is made available here, giving researchers access to the most comprehensive data for their experiments. Note, however, that COCO has not released its test-set annotations, so the test images come without labels and are not included in this dataset.

    The Roboflow team has worked extensively with COCO. Here are a few links that may be helpful as you get started working with this dataset:

  5. coco_keypoints

    • huggingface.co
    Updated Jul 14, 2024
    Cite
    Huai-Yuan Wang (2024). coco_keypoints [Dataset]. https://huggingface.co/datasets/whyen-wang/coco_keypoints
    Dataset updated
    Jul 14, 2024
    Authors
    Huai-Yuan Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for "COCO Keypoints"

      Quick Start

      Usage

        from datasets import load_dataset

        dataset = load_dataset('whyen-wang/coco_keypoints')
        example = dataset['train'][0]
        print(example)
        # {'image': ...

  6. COCO

    • huggingface.co
    • datasets.activeloop.ai
    Updated Feb 6, 2023
    Cite
    HuggingFaceM4 (2023). COCO [Dataset]. https://huggingface.co/datasets/HuggingFaceM4/COCO
    Dataset updated
    Feb 6, 2023
    Dataset authored and provided by
    HuggingFaceM4
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MS COCO is a large-scale object detection, segmentation, and captioning dataset. COCO has several features: Object segmentation, Recognition in context, Superpixel stuff segmentation, 330K images (>200K labeled), 1.5 million object instances, 80 object categories, 91 stuff categories, 5 captions per image, 250,000 people with keypoints.

  7. CropCOCO

    • huggingface.co
    Updated Apr 10, 2025
    Cite
    Visual Recognition Group FEE CTU in Prague (2025). CropCOCO [Dataset]. https://huggingface.co/datasets/vrg-prague/CropCOCO
    Dataset updated
    Apr 10, 2025
    Dataset authored and provided by
    Visual Recognition Group FEE CTU in Prague
    License

    GPL 3.0: https://choosealicense.com/licenses/gpl-3.0/

    Description

    CropCOCO Dataset

    CropCOCO is a validation-only dataset of COCO val 2017 images cropped such that some keypoint annotations fall outside the image. It can be used for keypoint detection, out-of-image keypoint detection and localization, person detection and amodal person detection.

      📦 Dataset Details
    

    Total images: 4,114
    Annotations: COCO-style (bounding boxes, human keypoints, both in and out of image)
    Resolution: varies
    Format: JSON annotations + JPG images
    … See the full description on the dataset page: https://huggingface.co/datasets/vrg-prague/CropCOCO.

  8. COCO

    • scidm.nchc.org.tw
    Updated Oct 10, 2020
    Cite
    (2020). COCO [Dataset]. https://scidm.nchc.org.tw/dataset/coco
    Dataset updated
    Oct 10, 2020
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    COCO is a large-scale object detection, segmentation, and captioning dataset. http://cocodataset.org

    COCO has several features:

    • Object segmentation
    • Recognition in context
    • Superpixel stuff segmentation
    • 330K images (>200K labeled)
    • 1.5 million object instances
    • 80 object categories
    • 91 stuff categories
    • 5 captions per image
    • 250,000 people with keypoints

  9. Cow Pose Estimation Dataset

    • paperswithcode.com
    Updated Mar 5, 2025
    Cite
    (2025). Cow Pose Estimation Dataset [Dataset]. https://paperswithcode.com/dataset/cow-pose-estimation-dataset
    Dataset updated
    Mar 5, 2025
    Description


    This dataset has been specifically curated for cow pose estimation, designed to enhance animal behavior analysis and monitoring through computer vision techniques. The dataset is annotated with 12 keypoints on the cow’s body, enabling precise tracking of body movements and posture. It is structured in the COCO format, making it compatible with popular deep learning models like YOLOv8, OpenPose, and others designed for object detection and keypoint estimation tasks.

    Applications:

    This dataset is ideal for agricultural tech solutions, veterinary care, and animal behavior research. It can be used in various use cases such as health monitoring, activity tracking, and early disease detection in cattle. Accurate pose estimation can also assist in optimizing livestock management by understanding animal movement patterns and detecting anomalies in their gait or behavior.


    Keypoint Annotations:

    The dataset includes the following 12 keypoints, strategically marked to represent significant anatomical features of cows:

    Nose: Essential for head orientation and overall movement tracking.

    Right Eye: Helps in head pose estimation.

    Left Eye: Complements the right eye for accurate head direction.

    Neck (side): Marks the side of the neck, key for understanding head and body coordination.

    Left Front Hoof: Tracks the front left leg movement.

    Right Front Hoof: Tracks the front right leg movement.

    Left Back Hoof: Important for understanding rear leg motion.

    Right Back Hoof: Completes the leg movement tracking for both sides.

    Backbone (side): Vital for posture and overall body orientation analysis.

    Tail Root: Used for tracking tail movements and posture shifts.

    Backpose Center (near tail’s midpoint): Marks the midpoint of the back, crucial for body stability and movement analysis.

    Stomach (center of side pose): Helps in identifying body alignment and weight distribution.

    Dataset Format:

    The data is structured in the COCO format, with annotations that include image coordinates for each keypoint. This format is highly suitable for integration into popular deep learning frameworks. Additionally, the dataset includes metadata such as bounding boxes, image sizes, and segmentation masks to provide detailed context for each cow in an image.
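
    For concreteness, COCO-style keypoints are stored as flat [x1, y1, v1, x2, y2, v2, ...] arrays, where v is a visibility flag (0 = not labeled, 1 = labeled but not visible, 2 = labeled and visible); a minimal decoding sketch with illustrative values:

        # One hypothetical COCO-style annotation entry for a cow with 12 keypoints
        ann = {'num_keypoints': 12, 'keypoints': [512, 304, 2] * 12, 'bbox': [400, 250, 320, 180]}

        triples = [ann['keypoints'][i:i + 3] for i in range(0, len(ann['keypoints']), 3)]
        visible = [(x, y) for x, y, v in triples if v == 2]
        print(len(visible), 'visible keypoints')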

    Compatibility:

    This dataset is optimized for use with cutting-edge pose estimation models such as YOLOv8 and other keypoint detection models like DeepLabCut and HRNet, enabling efficient training and inference for cow pose tracking. It can be seamlessly integrated into existing machine learning pipelines for both real-time and post-processed analysis.

    This dataset is sourced from Kaggle.

  10. ActiveHuman Part 2

    • zenodo.org
    • data.niaid.nih.gov
    Updated Apr 24, 2025
    Cite
    Charalampos Georgiadis (2025). ActiveHuman Part 2 [Dataset]. http://doi.org/10.5281/zenodo.8361114
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Charalampos Georgiadis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is Part 2/2 of the ActiveHuman dataset! Part 1 can be found here.

    Dataset Description

    ActiveHuman was generated using Unity's Perception package.

    It consists of 175,428 RGB images and their semantic segmentation counterparts, taken in different environments, lighting conditions, camera distances and angles. In total, the dataset contains images for 8 environments, 33 humans, 4 lighting conditions, 7 camera distances (1 m-4 m) and 36 camera angles (0-360 degrees at 10-degree intervals).

    The dataset does not include images at every single combination of available camera distances and angles, since for some values the camera would collide with another object or go outside the confines of an environment. As a result, some combinations of camera distances and angles do not exist in the dataset.

    Alongside each image, 2D Bounding Box, 3D Bounding Box and Keypoint ground truth annotations are also generated via the use of Labelers and are stored as a JSON-based dataset. These Labelers are scripts that are responsible for capturing ground truth annotations for each captured image or frame. Keypoint annotations follow the COCO format defined by the COCO keypoint annotation template offered in the perception package.

    Folder configuration

    The dataset consists of 3 folders:

    • JSON Data: Contains all the generated JSON files.
    • RGB Images: Contains the generated RGB images.
    • Semantic Segmentation Images: Contains the generated semantic segmentation images.

    Essential Terminology

    • Annotation: Recorded data describing a single capture.
    • Capture: One completed rendering process of a Unity sensor that stores the rendered result to data files (e.g., PNG, JPG).
    • Ego: Object or person to which a collection of sensors is attached (e.g., if a drone has a camera attached to it, the drone is the ego and the camera is the sensor).
    • Ego coordinate system: Coordinates with respect to the ego.
    • Global coordinate system: Coordinates with respect to the global origin in Unity.
    • Sensor: Device that captures the dataset (in this instance the sensor is a camera).
    • Sensor coordinate system: Coordinates with respect to the sensor.
    • Sequence: Time-ordered series of captures. This is very useful for video capture where the time-order relationship of two captures is vital.
    • UUID: Universally Unique Identifier. It is a unique hexadecimal identifier that can represent an individual instance of a capture, ego, sensor, annotation, labeled object or keypoint, or keypoint template.

    Dataset Data

    The dataset includes 4 types of JSON annotation files:

    • annotation_definitions.json: Contains annotation definitions for all of the active Labelers of the simulation, stored in an array. Each entry is a collection of key-value pairs that describes a particular annotation type and how its data should be mapped back to labels or objects in the scene. Each entry contains the following key-value pairs:
      • id: Integer identifier of the annotation's definition.
      • name: Annotation name (e.g., keypoints, bounding box, bounding box 3D, semantic segmentation).
      • description: Description of the annotation's specifications.
      • format: Format of the file containing the annotation specifications (e.g., json, PNG).
      • spec: Format-specific specifications for the annotation values generated by each Labeler.

    Most Labelers generate different annotation specifications in the spec key-value pair:

    • BoundingBox2DLabeler/BoundingBox3DLabeler:
      • label_id: Integer identifier of a label.
      • label_name: String identifier of a label.
    • KeypointLabeler:
      • template_id: Keypoint template UUID.
      • template_name: Name of the keypoint template.
      • key_points: Array containing all the joints defined by the keypoint template. This array includes the key-value pairs:
        • label: Joint label.
        • index: Joint index.
        • color: RGBA values of the keypoint.
        • color_code: Hex color code of the keypoint.
      • skeleton: Array containing all the skeleton connections defined by the keypoint template. Each skeleton connection defines a connection between two different joints. This array includes the key-value pairs:
        • label1: Label of the first joint.
        • label2: Label of the second joint.
        • joint1: Index of the first joint.
        • joint2: Index of the second joint.
        • color: RGBA values of the connection.
        • color_code: Hex color code of the connection.
    • SemanticSegmentationLabeler:
      • label_name: String identifier of a label.
      • pixel_value: RGBA values of the label.
      • color_code: Hex color code of the label.

    • captures_xyz.json: Each of these files contains an array of ground truth annotations generated by each active Labeler for each capture separately, as well as extra metadata that describes the state of each active sensor present in the scene. Each array entry contains the following key-value pairs:
      • id: UUID of the capture.
      • sequence_id: UUID of the sequence.
      • step: Index of the capture within a sequence.
      • timestamp: Timestamp (in ms) since the beginning of a sequence.
      • sensor: Properties of the sensor. This entry contains a collection with the following key-value pairs:
        • sensor_id: Sensor UUID.
        • ego_id: Ego UUID.
        • modality: Modality of the sensor (e.g., camera, radar).
        • translation: 3D vector that describes the sensor's position (in meters) with respect to the global coordinate system.
        • rotation: Quaternion variable that describes the sensor's orientation with respect to the ego coordinate system.
        • camera_intrinsic: Matrix containing the camera's intrinsic calibration (if it exists).
        • projection: Projection type used by the camera (e.g., orthographic, perspective).
      • ego: Attributes of the ego. This entry contains a collection with the following key-value pairs:
        • ego_id: Ego UUID.
        • translation: 3D vector that describes the ego's position (in meters) with respect to the global coordinate system.
        • rotation: Quaternion variable containing the ego's orientation.
        • velocity: 3D vector containing the ego's velocity (in meters per second).
        • acceleration: 3D vector containing the ego's acceleration (in m/s²).
      • format: Format of the file captured by the sensor (e.g., PNG, JPG).
      • annotations: Key-value pair collections, one for each active Labeler. These key-value pairs are as follows:
        • id: Annotation UUID.
        • annotation_definition: Integer identifier of the annotation's definition.
        • filename: Name of the file generated by the Labeler. This entry is only present for Labelers that generate an image.
        • values: List of key-value pairs containing annotation data for the current Labeler.

    Each Labeler generates different annotation specifications in the values key-value pair:

    • BoundingBox2DLabeler:
      • label_id: Integer identifier of a label.
      • label_name: String identifier of a label.
      • instance_id: UUID of one instance of an object. Each object with the same label that is visible on the same capture has different instance_id values.
      • x: Position of the 2D bounding box on the X axis.
      • y: Position of the 2D bounding box on the Y axis.
      • width: Width of the 2D bounding box.
      • height: Height of the 2D bounding box.
    • BoundingBox3DLabeler:
      • label_id: Integer identifier of a label.
      • label_name: String identifier of a label.
      • instance_id: UUID of one instance of an object. Each object with the same label that is visible on the same capture has different instance_id values.
      • translation: 3D vector containing the location of the center of the 3D bounding box with respect to the sensor coordinate system (in meters).
      • size: 3D vector containing the size of the 3D bounding box (in meters).
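
    To make this structure concrete, a minimal sketch of iterating one captures file and counting annotation values per Labeler (the file name is hypothetical, and the top-level 'captures' key is assumed from Unity Perception's output convention):

        import json

        with open('JSON Data/captures_000.json') as f:  # hypothetical file name
            data = json.load(f)

        for capture in data['captures']:  # assumed top-level key
            for ann in capture['annotations']:
                n = len(ann.get('values', []))  # keypoint/bounding-box data lives in 'values'
                print(capture['id'], ann['annotation_definition'], n)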

  11. Parcel2D Real - A real-world image dataset of cuboid-shaped parcels with 2D and 3D annotations

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 13, 2023
    Cite
    Furmans, Kai (2023). Parcel2D Real - A real-world image dataset of cuboid-shaped parcels with 2D and 3D annotations [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8031970
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Naumann, Alexander
    Zhou, Benchun
    Dörr, Laura
    Hertlein, Felix
    Furmans, Kai
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    World
    Description

    Real-world dataset of ~400 images of cuboid-shaped parcels with full 2D and 3D annotations in the COCO format.

    Relevant computer vision tasks:

    • bounding box detection
    • instance segmentation
    • keypoint estimation
    • 3D bounding box estimation
    • 3D voxel reconstruction (.binvox files)
    • 3D reconstruction (.obj files)

    For details, see our paper and project page.

    If you use this resource for scientific research, please consider citing

    @inproceedings{naumannScrapeCutPasteLearn2022,
      title     = {Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics},
      author    = {Naumann, Alexander and Hertlein, Felix and Zhou, Benchun and D\"orr, Laura and Furmans, Kai},
      booktitle = {{{IEEE Conference}} on {{Machine Learning}} and Applications ({{ICMLA}})},
      date      = 2022
    }

  12. Parcel3D - A Synthetic Dataset of Damaged and Intact Parcel Images with 2D and 3D Annotations

    • zenodo.org
    • explore.openaire.eu
    Updated Jul 13, 2023
    Cite
    Alexander Naumann; Felix Hertlein; Laura Dörr; Kai Furmans (2023). Parcel3D - A Synthetic Dataset of Damaged and Intact Parcel Images with 2D and 3D Annotations [Dataset]. http://doi.org/10.5281/zenodo.8032204
    Available download formats: zip
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alexander Naumann; Felix Hertlein; Laura Dörr; Kai Furmans
    Description

    Synthetic dataset of over 13,000 images of damaged and intact parcels with full 2D and 3D annotations in the COCO format. For details see our paper and for visual samples our project page.


    Relevant computer vision tasks:

    • bounding box detection
    • classification
    • instance segmentation
    • keypoint estimation
    • 3D bounding box estimation
    • 3D voxel reconstruction
    • 3D reconstruction

    The dataset is for academic research use only, since it uses resources with restrictive licenses.
    For a detailed description of how the resources are used, we refer to our paper and project page.

    Licenses of the resources in detail:

    You can use our textureless models (i.e. the obj files) of damaged parcels under CC BY 4.0 (note that this does not apply to the textures).

    If you use this resource for scientific research, please consider citing

    @inproceedings{naumannParcel3DShapeReconstruction2023,
      author  = {Naumann, Alexander and Hertlein, Felix and D\"orr, Laura and Furmans, Kai},
      title   = {Parcel3D: Shape Reconstruction From Single RGB Images for Applications in Transportation Logistics},
      booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
      month   = {June},
      year   = {2023},
      pages   = {4402-4412}
    }
  13. OccludedPASCAL3D+ Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Feb 9, 2024
    Cite
    Angtian Wang; Yihong Sun; Adam Kortylewski; Alan Yuille (2024). OccludedPASCAL3D+ Dataset [Dataset]. https://paperswithcode.com/dataset/occludedpascal3d
    Dataset updated
    Feb 9, 2024
    Authors
    Angtian Wang; Yihong Sun; Adam Kortylewski; Alan Yuille
    Description

    OccludedPASCAL3D+ is a dataset designed to evaluate robustness to occlusion for a number of computer vision tasks, such as object detection, keypoint detection and pose estimation. It simulates partial occlusion by superimposing objects cropped from the MS-COCO dataset on top of objects from the PASCAL3D+ dataset. Only the ImageNet subset of PASCAL3D+ is used, which has 10,812 testing images.

  14. coco_music_keypoints

    • kaggle.com
    Updated Feb 18, 2024
    Cite
    Kwon-Young Choi (2024). coco_music_keypoints [Dataset]. https://www.kaggle.com/datasets/kwonyoungchoi/coco-music-keypoints/suggestions
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Feb 18, 2024
    Dataset provided by
    Kaggle
    Authors
    Kwon-Young Choi
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Kwon-Young Choi

    Released under MIT


  15. Annotations for ConfLab A Rich Multimodal Multisensor Dataset of Free-Standing Social Interactions In-the-Wild

    • data.4tu.nl
    Updated Jun 8, 2022
    Cite
    Chirag Raman; Jose Vargas Quiros; Stephanie Tan; Ashraful Islam; Ekin Gedik; Hayley Hung (2022). Annotations for ConfLab A Rich Multimodal Multisensor Dataset of Free-Standing Social Interactions In-the-Wild [Dataset]. http://doi.org/10.4121/20017664.v1
    Dataset updated
    Jun 8, 2022
    Dataset provided by
    4TU.ResearchData
    Authors
    Chirag Raman; Jose Vargas Quiros; Stephanie Tan; Ashraful Islam; Ekin Gedik; Hayley Hung
    License

    https://data.4tu.nl/info/fileadmin/user_upload/Documenten/4TU.ResearchData_Restricted_Data_2022.pdf

    Description

    This file contains the annotations for the ConfLab dataset, including actions (speaking status), pose, and F-formations.

    ------------------

    ./actions/speaking_status:

    ./processed: the processed speaking status files, aggregated into a single data frame per segment. Skipped rows in the raw data (see https://josedvq.github.io/covfee/docs/output for details) have been imputed using the code at: https://github.com/TUDelft-SPC-Lab/conflab/tree/master/preprocessing/speaking_status

    The processed annotations consist of:

    ./speaking: The first row contains person IDs matching the sensor IDs; the remaining rows contain binary speaking status annotations at 60 fps for the corresponding 2-min video segment (7,200 frames).

    ./confidence: Same layout as above; these annotations reflect the annotators' continuous-valued confidence rating in their speaking annotations.

    To load these files with pandas: pd.read_csv(p, index_col=False)


    ./raw.zip: the raw outputs from speaking status annotation for each of the eight annotated 2-min video segments. These were output by the covfee annotation tool (https://github.com/josedvq/covfee)

    Annotations were done at 60 fps.

    --------------------

    ./pose:

    ./coco: the processed pose files in coco JSON format, aggregated into a single data frame per video segment. These files have been generated from the raw files using the code at: https://github.com/TUDelft-SPC-Lab/conflab-keypoints

    To load in Python: f = json.load(open('/path/to/cam2_vid3_seg1_coco.json'))

    The skeleton structure (limbs) is contained within each file in:

    f['categories'][0]['skeleton']

    and keypoint names at:

    f['categories'][0]['keypoints']
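
    Putting these pieces together, a minimal loading sketch (the path is illustrative, following the camX_vidY_segZ naming above):

        import json

        f = json.load(open('/path/to/cam2_vid3_seg1_coco.json'))  # illustrative path

        limbs = f['categories'][0]['skeleton']    # skeleton structure (limbs)
        names = f['categories'][0]['keypoints']   # keypoint names
        print(names, limbs)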

    ./raw.zip: the raw outputs from continuous pose annotation. These were output by the covfee annotation tool (https://github.com/josedvq/covfee)

    Annotations were done at 60 fps.

    ---------------------

    ./f_formations:

    seg 2: 14:00 onwards, for videos of the form x2xxx.MP4 in /video/raw/ for the relevant cameras (2,4,6,8,10).

    seg 3: for videos of the form x3xxx.MP4 in /video/raw/ for the relevant cameras (2,4,6,8,10).

    Note that camera 10 doesn't include meaningful subject information/body parts that are not already covered in camera 8.

    First column: time stamp

    Second column: "()" delineates groups, "<>" delineates subjects, cam X indicates the best camera view for which a particular group exists.


    phone.csv: time stamp (pertaining to seg3), corresponding group, ID of person using the phone

  16. AIC Dataset

    • paperswithcode.com
    Updated Dec 3, 2023
    Cite
    Jiahong Wu; He Zheng; Bo Zhao; Yixin Li; Baoming Yan; Rui Liang; Wenjia Wang; Shipei Zhou; Guosen Lin; Yanwei Fu; Yizhou Wang; Yonggang Wang (2023). AIC Dataset [Dataset]. https://paperswithcode.com/dataset/aic
    Dataset updated
    Dec 3, 2023
    Authors
    Jiahong Wu; He Zheng; Bo Zhao; Yixin Li; Baoming Yan; Rui Liang; Wenjia Wang; Shipei Zhou; Guosen Lin; Yanwei Fu; Yizhou Wang; Yonggang Wang
    Description

    A large-scale dataset named AIC (AI Challenger) with three sub-datasets: human keypoint detection (HKD), large-scale attribute dataset (LAD) and image Chinese captioning (ICC).

  17. iRodent: a keypoint and segmentation dataset of rodents in the wild

    • zenodo.org
    • data.niaid.nih.gov
    Updated Aug 16, 2023
    Cite
    Shaokai Ye; Anastasiia Filippova; Jessy Lauer; Maxime Vidal; Steffen Schneider; Tian Qiu; Alexander Mathis; Mackenzie Weygandt Mathis (2023). iRodent: a keypoint and segmentation dataset of rodents in the wild [Dataset]. http://doi.org/10.5281/zenodo.8250392
    Available download formats: application/gzip
    Dataset updated
    Aug 16, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Shaokai Ye; Anastasiia Filippova; Jessy Lauer; Maxime Vidal; Steffen Schneider; Tian Qiu; Alexander Mathis; Mackenzie Weygandt Mathis
    License

    Apache License 2.0: http://www.apache.org/licenses/LICENSE-2.0

    Description

    Description: The "iRodent" dataset contains rodent species observations obtained using the iNaturalist API, with a focus on Suborder Myomorpha (Taxon ID: 16). The dataset features prominent rodent species like Muskrat, Brown Rat, House Mouse, Black Rat, Hispid Cotton Rat, Meadow Vole, Bank Vole, Deer Mouse, White-footed Mouse, and Striped Field Mouse. The dataset provides manually labeled keypoints for pose estimation and segmentation masks for a subset of images using a Mask R-CNN model.
    Creator: Adaptive Motor Control Lab
    Data Format: COCO format
    Number of Images: 443
    Species: Muskrat, Brown Rat, House Mouse, Black Rat, Hispid Cotton Rat, Meadow Vole, Bank Vole, Deer Mouse, White-footed Mouse, Striped Field Mouse
    Image Resolution: Varied (800x600 to 5184x3456 pixels)
    Annotations: Pose keypoints and generated segmentation masks by Tian Qiu and Mackenzie Mathis.
    License: Apache 2.0
    Keywords: animal pose estimation, behaviour analysis, keypoints, rodent
    Contact: Mackenzie Mathis
    Email: mackenzie.mathis@epfl.ch

  18. TAMPAR: Visual Tampering Detection for Parcels Logistics in Postal Supply Chains

    • zenodo.org
    • data.niaid.nih.gov
    Updated Nov 13, 2023
    Cite
    Alexander Naumann; Felix Hertlein; Laura Dörr; Kai Furmans (2023). TAMPAR: Visual Tampering Detection for Parcels Logistics in Postal Supply Chains [Dataset]. http://doi.org/10.5281/zenodo.10057090
    Available download formats: zip
    Dataset updated
    Nov 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alexander Naumann; Felix Hertlein; Laura Dörr; Kai Furmans
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    TAMPAR is a real-world dataset of parcel photos for tampering detection with annotations in COCO format. For details see our paper and for visual samples our project page. Features are:

    • >900 annotated real-world images with >2,700 visible parcel side surfaces
    • 6 different tampering types
    • 6 different distortion strengths

    Relevant computer vision tasks:

    • bounding box detection
    • classification
    • instance segmentation
    • keypoint estimation
    • tampering detection and classification

    If you use this resource for scientific research, please consider citing our WACV 2024 paper "TAMPAR: Visual Tampering Detection for Parcel Logistics in Postal Supply Chains".

  19. Bizarre Pose Dataset (Bizarre Pose Dataset of Illustrated Characters)

    • opendatalab.com
    Updated Sep 21, 2022
    Cite
    University of Maryland (2022). Bizarre Pose Dataset (Bizarre Pose Dataset of Illustrated Characters) [Dataset]. https://opendatalab.com/OpenDataLab/Bizarre_Pose_Dataset
    Available download formats: zip (6,289,330,405 bytes)
    Dataset updated
    Sep 21, 2022
    Dataset provided by
    University of Maryland
    License

    AGPL 3.0: https://choosealicense.com/licenses/agpl-3.0/

    Description

    Human keypoint dataset of anime/manga-style character illustrations. Extension of the AnimeDrawingsDataset, with additional features:

    • all 17 COCO-compliant human keypoints
    • character bounding boxes
    • 2,000 additional samples (4,000 total) from Danbooru with difficult tags

    Useful for pose estimation of illustrated characters, which allows downstream tasks such as pose-guided reference drawing retrieval (e.g. Hermit Purple).

  20. roboflow-garlic

    • huggingface.co
    Updated Dec 17, 2024
    Cite
    Thomas Lips (2024). roboflow-garlic [Dataset]. https://huggingface.co/datasets/tlpss/roboflow-garlic
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Dec 17, 2024
    Authors
    Thomas Lips
    Description

    Garlic Keypoint Detection dataset

    This dataset contains 1,000 images of a single garlic clove in a presumably industrial setting. The annotations are COCO-formatted and are composed of a bounding box and 2 keypoints: head and tail. The dataset was taken from https://universe.roboflow.com/gesture-recognition-dsn2n/garlic_keypoint/dataset/1. Refer to the original repo for licensing questions. The annotation JSON files were slightly modified (formatting, image base directory, ...)… See the full description on the dataset page: https://huggingface.co/datasets/tlpss/roboflow-garlic.
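
    A minimal sketch of loading it with the Hugging Face datasets library (assuming the repository loads directly via load_dataset and exposes a train split; check the dataset page for the actual split and feature names):

        from datasets import load_dataset

        ds = load_dataset('tlpss/roboflow-garlic', split='train')  # assumed split name
        print(ds[0].keys())  # inspect available features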
