100+ datasets found
  1. 25,581 Images - 88 Facial Landmarks Annotation Data

    • m.nexdata.ai
    • nexdata.ai
    Updated Apr 10, 2024
    Cite
    Nexdata (2024). 25,581 Images - 88 Facial Landmarks Annotation Data [Dataset]. https://m.nexdata.ai/datasets/computervision/1353?source=Github
    Explore at:
    Dataset updated
    Apr 10, 2024
    Dataset authored and provided by
    Nexdata
    Variables measured
    Data size, Data format, Accuracy rate, Data diversity, Age distribution, Race distribution, Annotation content, Gender distribution, Collecting environment
    Description

    25,581 Images - 88 Facial Landmarks Annotation Data. The dataset includes Asian, black race, Caucasian and brown race. In order to be more challenging, the data includes multiple scenes, multiple poses, different ages, light conditions and complicated expressions. For annotation, 88 facial landmarks and visible and invisible attributes of landmarks were annotated. This data can be used for tasks such as face detection and face recognition.

  2. 399 Asians - 35,112 Images Multi-pose Face Data with 21 Facial Landmarks...

    • nexdata.ai
    Updated Oct 16, 2023
    Cite
    Nexdata (2023). 399 Asians - 35,112 Images Multi-pose Face Data with 21 Facial Landmarks Annotation [Dataset]. https://www.nexdata.ai/datasets/computervision/173
    Explore at:
    Dataset updated
    Oct 16, 2023
    Dataset authored and provided by
    Nexdata
    Variables measured
    Device, Accuracy, Data size, Data format, Data diversity, Age distribution, Race distribution, Annotation content, Gender distribution, Collecting environment
    Description

    The 399 Asians - 35,112 Images Multi-pose Face Data with 21 Facial Landmarks Annotation data is collected from 399 people. The data diversity includes multiple poses, different ages, different light conditions and multiple scenes. This data can be used for tasks such as face detection and face recognition. The accuracy of the labels for gender, face pose, year of birth, light condition, scene and wearing glasses or not is more than 97%; the annotation accuracy of facial landmarks is more than 97%.

  3. face_synthetics_spiga

    • huggingface.co
    Updated Jun 22, 2023
    Cite
    Pedro Cuenca (2023). face_synthetics_spiga [Dataset]. https://huggingface.co/datasets/pcuenq/face_synthetics_spiga
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 22, 2023
    Authors
    Pedro Cuenca
    Description

    Dataset Card for "face_synthetics_spiga"

    This is a copy of Microsoft FaceSynthetics dataset with SPIGA landmark annotations. For a copy of the original FaceSynthetics dataset with no extra annotations, please refer to pcuenq/face_synthetics. Please, refer to the original license, which we replicate in this repo. The SPIGA annotations were created by Hugging Face Inc. and are distributed under the MIT license. This dataset was prepared using the code below. It iterates through the… See the full description on the dataset page: https://huggingface.co/datasets/pcuenq/face_synthetics_spiga.

  4. Google Landmarks Dataset v2 Dataset

    • paperswithcode.com
    • opendatalab.com
    • +1 more
    Updated Jul 10, 2023
    Cite
    Tobias Weyand; Andre Araujo; Bingyi Cao; Jack Sim (2023). Google Landmarks Dataset v2 Dataset [Dataset]. https://paperswithcode.com/dataset/google-landmarks-dataset-v2
    Explore at:
    Dataset updated
    Jul 10, 2023
    Authors
    Tobias Weyand; Andre Araujo; Bingyi Cao; Jack Sim
    Description

    This is the second version of the Google Landmarks dataset (GLDv2), which contains images annotated with labels representing human-made and natural landmarks. The dataset can be used for landmark recognition and retrieval experiments. This version of the dataset contains approximately 5 million images, split into 3 sets of images: train, index and test

  5. Data records for Clinical Quantification of Radiographic Annotation Accuracy...

    • figshare.com
    bin
    Updated Aug 3, 2023
    Cite
    Yuan Chai (2023). Data records for Clinical Quantification of Radiographic Annotation Accuracy [Dataset]. http://doi.org/10.6084/m9.figshare.23831553.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    Aug 3, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Yuan Chai
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This database contains three data records that are key components of the study titled "Clinical Quantification of Radiographic Landmark Annotations Using Probabilistic Method for AI Accuracy Analysis," which has been submitted to Scientific Data:

    1. Annotation Dataset: This is a .xlsx file where the first column corresponds to the file names in the 'Imaging dataset'. The remaining columns in each row represent the coordinates of the landmarks for the corresponding image file.
    2. Benchmark Dataset: This is a .xlsx file that includes the maximum length and angular disagreement of the parameters at different data probability thresholds.
    3. MATLAB Code: This is a .m file that encapsulates all the codes utilized to record the coordinates of the landmark annotations.
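    The row layout described for the Annotation Dataset (file name first, landmark coordinates after) could be unpacked roughly as follows. This is only a sketch: the listing does not say how the coordinate columns are ordered, so the interleaved x, y, x, y order is an assumption, and `row_to_landmarks` and the sample row are hypothetical names and values made up for illustration.

```python
from typing import List, Sequence, Tuple


def row_to_landmarks(row: Sequence) -> Tuple[str, List[Tuple[float, float]]]:
    """Split one annotation row into (file name, [(x, y), ...]).

    ASSUMPTION: coordinates are stored interleaved as x1, y1, x2, y2, ...
    The dataset description does not confirm this ordering.
    """
    file_name, *values = row
    # Spreadsheet rows often carry empty trailing cells; skip them.
    coords = [float(v) for v in values if v not in (None, "")]
    if len(coords) % 2 != 0:
        raise ValueError(f"odd number of coordinate values for {file_name!r}")
    # Pair every even-indexed value (x) with the following value (y).
    points = list(zip(coords[0::2], coords[1::2]))
    return str(file_name), points


# Hypothetical row, shaped like one spreadsheet line read as plain values.
example_row = ["image_001.png", 10.5, 20.0, 30.0, 40.25]
name, pts = row_to_landmarks(example_row)
print(name, pts)
```

    Rows in this shape can be obtained from a .xlsx file with a spreadsheet reader such as openpyxl (for example by iterating a worksheet with `values_only=True`); whether the file has a header row that needs skipping is not stated in the description.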

  6. 15 People - 22 Landmarks Annotation Data of 3D Human Body

    • m.nexdata.ai
    • nexdata.ai
    Updated Jun 21, 2024
    Cite
    Nexdata (2024). 15 People - 22 Landmarks Annotation Data of 3D Human Body [Dataset]. https://m.nexdata.ai/datasets/computervision/1235?source=Github
    Explore at:
    Dataset updated
    Jun 21, 2024
    Dataset provided by
    Nexdata
    nexdata technology inc
    Authors
    Nexdata
    Variables measured
    Device, Accuracy, Data size, Data format, Data diversity, Annotation content, Collecting content, Collecting environment, Population distribution
    Description

    15 People - 22 Landmarks Annotation Data of 3D Human Body. The dataset diversity includes multiple scenes, different ages, different costumes, different human body sitting postures. In terms of annotation, we annotate the 2D and 3D coordinates of the 22 landmarks of the human body, landmark attributes, the rectangular frame of the human body. The dataset can be used for tasks such as human body instance segmentation and human behavior recognition.

  7. AFLW Dataset

    • paperswithcode.com
    Updated Jan 10, 2012
    + more versions
    Cite
    Martin Köstinger; Paul Wohlhart; Peter M. Roth; Horst Bischof (2012). AFLW Dataset [Dataset]. https://paperswithcode.com/dataset/aflw
    Explore at:
    Dataset updated
    Jan 10, 2012
    Authors
    Martin Köstinger; Paul Wohlhart; Peter M. Roth; Horst Bischof
    Description

    The Annotated Facial Landmarks in the Wild (AFLW) is a large-scale collection of annotated face images gathered from Flickr, exhibiting a large variety in appearance (e.g., pose, expression, ethnicity, age, gender) as well as general imaging and environmental conditions. In total about 25K faces are annotated with up to 21 landmarks per image.

  8. 200K+ Landmark Images | AI Training Data | Annotated imagery data for AI |...

    • datarade.ai
    Updated Aug 22, 2019
    Cite
    Data Seeds (2019). 200K+ Landmark Images | AI Training Data | Annotated imagery data for AI | Object & Scene Detection | Global Coverage [Dataset]. https://datarade.ai/data-products/120k-landmark-images-ai-training-data-annotated-imagery-data-seeds
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Aug 22, 2019
    Dataset authored and provided by
    Data Seeds
    Area covered
    Germany, Bonaire, Singapore, Martinique, Guernsey, Belize, Guadeloupe, Greece, Åland Islands, Grenada
    Description

    This dataset features over 200,000 high-quality images of historical and cultural landmarks sourced from photographers worldwide. Designed to support AI and machine learning applications, it provides a diverse and richly annotated collection of landmark imagery.

    Key Features:

    1. Comprehensive Metadata: the dataset includes full EXIF data, detailing camera settings such as aperture, ISO, shutter speed, and focal length. Additionally, each image is pre-annotated with object and scene detection metadata, making it ideal for tasks like classification, detection, and segmentation. Popularity metrics, derived from engagement on our proprietary platform, are also included.

    2. Unique Sourcing Capabilities: the images are collected through a proprietary gamified platform for photographers. Competitions focused on landmark photography ensure fresh, relevant, and high-quality submissions. Custom datasets can be sourced on-demand within 72 hours, allowing for specific requirements such as particular landmarks or geographic regions to be met efficiently.

    3. Global Diversity: photographs have been sourced from contributors in over 100 countries, ensuring a vast array of landmark types and cultural contexts. The images feature varied settings, including historical monuments, iconic structures, natural landmarks, and urban architecture, providing an unparalleled level of diversity.

    4. High-Quality Imagery: the dataset includes images with resolutions ranging from standard to high-definition to meet the needs of various projects. Both professional and amateur photography styles are represented, offering a mix of artistic and practical perspectives suitable for a variety of applications.

    5. Popularity Scores: each image is assigned a popularity score based on its performance in GuruShots competitions. This unique metric reflects how well the image resonates with a global audience, offering an additional layer of insight for AI models focused on user preferences or engagement trends.

    6. AI-Ready Design: this dataset is optimized for AI applications, making it ideal for training models in tasks such as image recognition, classification, and segmentation. It is compatible with a wide range of machine learning frameworks and workflows, ensuring seamless integration into your projects.

    7. Licensing & Compliance: the dataset complies fully with data privacy regulations and offers transparent licensing for both commercial and academic use.

    Use Cases:

    1. Training AI systems for landmark recognition and geolocation.
    2. Enhancing navigation and travel AI applications.
    3. Building datasets for educational, tourism, and augmented reality tools.
    4. Supporting cultural heritage preservation and analysis through AI-powered solutions.

    This dataset offers a comprehensive, diverse, and high-quality resource for training AI and ML models, tailored to deliver exceptional performance for your projects. Customizations are available to suit specific project needs. Contact us to learn more!

  9. Landmark Dataset

    • universe.roboflow.com
    zip
    Updated Apr 23, 2025
    + more versions
    Cite
    itsp (2025). Landmark Dataset [Dataset]. https://universe.roboflow.com/itsp/landmark-evhnn/dataset/5
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 23, 2025
    Dataset authored and provided by
    itsp
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Landmark Bounding Boxes
    Description

    Landmark

    ## Overview
    
    Landmark is a dataset for object detection tasks - it contains Landmark annotations for 421 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  10. Annotated Imagery Data | AI Training Data| Face ID + 106 key points facial...

    • datarade.ai
    Updated Nov 3, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Pixta AI (2022). Annotated Imagery Data | AI Training Data| Face ID + 106 key points facial landmark images | 30,000 Stock Images [Dataset]. https://datarade.ai/data-products/unique-face-ids-with-facial-landmark-106-key-points-pixta-ai
    Explore at:
    Available download formats: .json, .xml, .csv, .txt
    Dataset updated
    Nov 3, 2022
    Dataset authored and provided by
    Pixta AI
    Area covered
    Vietnam, Spain, Belgium, Poland, Canada, Australia, Malaysia, New Zealand, Portugal, Korea (Republic of)
    Description
    1. Overview
    This dataset is a collection of 30,000+ images with Face ID and 106-key-point facial landmark annotations, ready to use for optimizing the accuracy of computer vision models. Images in the dataset show people and meet the following requirements:
    • Age: above 20
    • Race: various
    • Angle: no more than 90 degrees
    All of the content is sourced from PIXTA's stock library of 100M+ Asian-featured images and videos.

    2. Annotated Imagery Data of Face ID + 106 key points facial landmark
    This dataset contains 30,000+ images of Face ID + 106 key points facial landmark. The dataset has been annotated with face bounding boxes, attributes of race, gender, age and skin tone, and 106 facial landmark keypoints. Each dataset is supported by both AI and human review processes to ensure labelling consistency and accuracy.

    3. About PIXTA
    PIXTASTOCK is the largest Asian-featured stock platform, providing data, content, tools and services since 2005. PIXTA has 15 years of experience integrating advanced AI technology in managing, curating and processing over 100M visual materials and serving leading global brands for their creative and data demands.

  11. WFLW Dataset

    • paperswithcode.com
    Updated Feb 2, 2021
    Cite
    Wayne Wu; Chen Qian; Shuo Yang; Quan Wang; Yici Cai; Qiang Zhou (2021). WFLW Dataset [Dataset]. https://paperswithcode.com/dataset/wflw
    Explore at:
    Dataset updated
    Feb 2, 2021
    Authors
    Wayne Wu; Chen Qian; Shuo Yang; Quan Wang; Yici Cai; Qiang Zhou
    Description

    The Wider Facial Landmarks in the Wild or WFLW database contains 10000 faces (7500 for training and 2500 for testing) with 98 annotated landmarks. This database also features rich attribute annotations in terms of occlusion, head pose, make-up, illumination, blur and expressions.

  12. 200K+ Landmark Images | AI Training Data | Annotated imagery data for AI |...

    • data.dataseeds.ai
    Updated Aug 22, 2019
    Cite
    Data Seeds (2019). 200K+ Landmark Images | AI Training Data | Annotated imagery data for AI | Object & Scene Detection | Global Coverage [Dataset]. https://data.dataseeds.ai/products/120k-landmark-images-ai-training-data-annotated-imagery-data-seeds
    Explore at:
    Dataset updated
    Aug 22, 2019
    Dataset authored and provided by
    Data Seeds
    Area covered
    Montenegro, Serbia, Western Sahara, Dominica, Uganda, Poland, Uruguay, Malta, Canada, Papua New Guinea
    Description

    A comprehensive dataset of 200K+ landmark images sourced globally, featuring full EXIF data, including camera settings and photography details. Enriched with object and scene detection metadata, this dataset is ideal for AI model training in image recognition, classification & segmentation

  13. 28,972 Images - Driver Face Detection & Face 96 Landmarks Annotation Data

    • m.nexdata.ai
    Updated Apr 25, 2025
    Cite
    Nexdata (2025). 28,972 Images - Driver Face Detection & Face 96 Landmarks Annotation Data [Dataset]. https://m.nexdata.ai/datasets/computervision/1588
    Explore at:
    Dataset updated
    Apr 25, 2025
    Dataset authored and provided by
    Nexdata
    Variables measured
    Device, Data size, Data format, Vehicle Type, Accuracy rate, Data diversity, Collecting time, Shooting position, Annotation content, Collecting environment, and 1 more
    Description

    100 People - Face Detection & Face 96 Landmarks Annotation Data. The data includes multiple ages, multiple time periods and multiple races (Caucasian, Black, Indian). The driver behaviors include dangerous behavior, fatigue behavior and visual movement behavior. In terms of device, infrared cameras were used. In terms of annotation, each individual has 274 to 299 photos, annotated with detected facial bounding boxes and 96 facial landmarks. The data can be used for tasks such as facial detection and 96-point facial landmark recognition.

  14. AFLW-19 Dataset

    • paperswithcode.com
    Updated Dec 27, 2021
    + more versions
    Cite
    Shizhan Zhu; Cheng Li; Chen-Change Loy; Xiaoou Tang (2021). AFLW-19 Dataset [Dataset]. https://paperswithcode.com/dataset/aflw-19
    Explore at:
    Dataset updated
    Dec 27, 2021
    Authors
    Shizhan Zhu; Cheng Li; Chen-Change Loy; Xiaoou Tang
    Description

    The original AFLW provides at most 21 points for each face but excludes coordinates for invisible landmarks, which causes difficulties for training most of the existing baseline approaches. To make fair comparisons, the authors manually annotate the coordinates of these invisible landmarks to enable training of those baseline approaches. The new annotation does not include the two ear points because it is very difficult to decide the location of invisible ears. As a result, AFLW-19 has 19 points per face.

    The original AFLW does not provide a train-test partition. AFLW-19 adopts a partition with 20,000 images for training and 4,386 images for testing (AFLW-Full). In addition, a frontal subset (AFLW-Frontal) is proposed in which all landmarks are visible (1,165 images in total).

    The new 19-point annotation file is available at the project page.

  15. Registered landmarks csv files

    • springernature.figshare.com
    • figshare.com
    zip
    Updated Aug 23, 2024
    Cite
    Mariano Cabezas; Yago Diez; Clara Martinez-Diago; Anna Maroto (2024). Registered landmarks csv files [Dataset]. http://doi.org/10.6084/m9.figshare.24849075.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 23, 2024
    Dataset provided by
    figshare
    Authors
    Mariano Cabezas; Yago Diez; Clara Martinez-Diago; Anna Maroto
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Compressed file with all the registered landmark points. For each subject and structure a csv file with 2D point coordinates is provided.

  16. Landmark Finder Dataset

    • universe.roboflow.com
    zip
    Updated Aug 10, 2023
    Cite
    landmarkclass (2023). Landmark Finder Dataset [Dataset]. https://universe.roboflow.com/landmarkclass/landmark-finder/model/2
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 10, 2023
    Dataset authored and provided by
    landmarkclass
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Landmarks Bounding Boxes
    Description

    Landmark Finder

    ## Overview
    
    Landmark Finder is a dataset for object detection tasks - it contains Landmarks annotations for 1,048 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  17. Data from: MPIIFaceGaze

    • darus.uni-stuttgart.de
    Updated Mar 14, 2023
    Cite
    Andreas Bulling (2023). MPIIFaceGaze [Dataset]. http://doi.org/10.18419/DARUS-3240
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 14, 2023
    Dataset provided by
    DaRUS
    Authors
    Andreas Bulling
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Dataset funded by
    Alexander von Humboldt Postdoctoral Fellowship, Germany
    JST CREST Research Grant, Japan
    DFG
    Description

    We present the MPIIFaceGaze dataset, which is based on the MPIIGaze dataset and additionally provides human facial landmark annotations and the face regions. We added facial landmark and pupil center annotations for 37,667 face images. Facial landmark annotation was semi-automatic: a facial landmark detection method was run first, and its output was then checked by two human annotators. The pupil centers were annotated by two human annotators from scratch. For the sake of privacy, we only released the face region and blocked the background in images. See readme.txt for more details about the data.

  18. 87,871 Images of 106 Facial Landmarks Annotation Data (complicated scenes)

    • nexdata.ai
    • m.nexdata.ai
    Updated Oct 15, 2023
    Cite
    Nexdata (2023). 87,871 Images of 106 Facial Landmarks Annotation Data (complicated scenes) [Dataset]. https://www.nexdata.ai/datasets/computervision/961
    Explore at:
    Dataset updated
    Oct 15, 2023
    Dataset authored and provided by
    Nexdata
    Variables measured
    Accuracy, Data size, Data format, Data diversity, Age distribution, Race distribution, Annotation content, Gender distribution, Collecting environment
    Description

    87,871 Images of 106 Facial Landmarks Annotation Data (complicated scenes). This dataset includes yellow race, black race, white race and Indian people. In order to be more challenging, the data includes multiple scenes, multiple poses, different ages, light conditions and complicated expressions. This data can be used for tasks such as face detection and face recognition.

  19. Chest x-ray landmark dataset Dataset

    • paperswithcode.com
    Cite
    Nicolás Gaggion; Lucas Mansilla; Candelaria Mosquera; Diego H. Milone; Enzo Ferrante, Chest x-ray landmark dataset Dataset [Dataset]. https://paperswithcode.com/dataset/chest-x-ray-landmark-dataset
    Explore at:
    Authors
    Nicolás Gaggion; Lucas Mansilla; Candelaria Mosquera; Diego H. Milone; Enzo Ferrante
    Description

    Set of landmark annotations for JSRT, Montgomery, Shenzhen and a subset of Padchest datasets

  20. AI-derived annotations for the NLST and NSCLC-Radiomics computed tomography...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 22, 2024
    Cite
    Hugo Aerts (2024). AI-derived annotations for the NLST and NSCLC-Radiomics computed tomography imaging collections [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7473970
    Explore at:
    Dataset updated
    Jan 22, 2024
    Dataset provided by
    David Clunie
    Hugo Aerts
    Deepa Krishnaswamy
    Dennis Bontempi
    Andrey Fedorov
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Public imaging datasets are critical for the development and evaluation of automated tools in cancer imaging. Unfortunately, many of the available datasets do not provide annotations of tumors or organs-at-risk, crucial for the assessment of these tools. This is due to the fact that annotation of medical images is time consuming and requires domain expertise. It has been demonstrated that artificial intelligence (AI) based annotation tools can achieve acceptable performance and thus can be used to automate the annotation of large datasets. As part of the effort to enrich the public data available within NCI Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov/) [1], we introduce this dataset that consists of such AI-generated annotations for two publicly available medical imaging collections of Computed Tomography (CT) images of the chest. For detailed information concerning this dataset, please refer to our publication here [2].

    We use publicly available pre-trained AI tools to enhance CT lung cancer collections that are unlabeled or partially labeled. The first tool is the nnU-Net deep learning framework [3] for volumetric segmentation of organs, where we use a pretrained model (Task D18 using the SegTHOR dataset) for labeling volumetric regions in the image corresponding to the heart, trachea, aorta and esophagus. These are the major organs-at-risk for radiation therapy for lung cancer. We further enhance these annotations by computing 3D shape radiomics features using the pyradiomics package [4]. The second tool is a pretrained model for per-slice automatic labeling of anatomic landmarks and imaged body part regions in axial CT volumes [5].

    We focus on enhancing two publicly available collections, the Non-small Cell Lung Cancer Radiomics (NSCLC-Radiomics collection) [6,7], and the National Lung Screening Trial (NLST collection) [8,9]. The CT data for these collections are available both in The Cancer Imaging Archive (TCIA) [10] and in NCI Imaging Data Commons (IDC). Further, the NSCLC-Radiomics collection includes expert-generated manual annotations of several chest organs, allowing us to quantify performance of the AI tools in that subset of data.

    IDC is relying on the DICOM standard to achieve FAIR [10] sharing of data and interoperability. Generated annotations are saved as DICOM Segmentation objects (volumetric segmentations of regions of interest) created using the dcmqi [12], and DICOM Structured Report (SR) objects (per-slice annotations of the body part imaged, anatomical landmarks and radiomics features) created using dcmqi and highdicom [13]. 3D shape radiomics features and corresponding DICOM SR objects are also provided for the manual segmentations available in the NSCLC-Radiomics collection.

    The dataset is available in IDC, and is accompanied by our publication here [2]. This pre-print details how the data were generated, and how the resulting DICOM objects can be interpreted and used in tools. Additionally, for further information about how to interact with and explore the dataset, please refer to our repository and accompanying Google Colaboratory notebook.

    The annotations are organized as follows. For NSCLC-Radiomics, three nnU-Net models were evaluated ('2d-tta', '3d_lowres-tta' and '3d_fullres-tta'). Within each model folder, the PatientID and the StudyInstanceUID are nested subdirectories, and inside the StudyInstanceUID directory the DICOM Segmentation object and the DICOM SR for the 3D shape features are stored. Separate directories for the DICOM SR body part regression regions ('sr_regions') and landmarks ('sr_landmarks') are also provided, with the same folder structure as above. Lastly, the DICOM SR objects for the existing manual annotations are provided in the 'sr_gt' directory. For NSCLC-Radiomics, each patient has a single StudyInstanceUID. The DICOM Segmentation and SR objects are named according to the SeriesInstanceUID of the original CT files.

    nsclc
      2d-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      3d_lowres-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      3d_fullres-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      sr_regions
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_regions_SR.dcm
      sr_landmarks
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_landmarks_SR.dcm
      sr_gt
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_features_SR.dcm

    For NLST, the '3d_fullres-tta' model was evaluated. The data is organized the same as above, where within each folder the PatientID and the StudyInstanceUID are subdirectories. For the NLST collection, it is possible that some patients have more than one StudyInstanceUID subdirectory. Separate directories for the DICOM SR body part regions ('sr_regions') and landmarks ('sr_landmarks') are also provided. The DICOM Segmentation and SR objects are named according to the SeriesInstanceUID of the original CT files.

    nlst
      3d_fullres-tta
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_SEG.dcm
            ReferencedSeriesInstanceUID_features_SR.dcm
      sr_regions
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_regions_SR.dcm
      sr_landmarks
        PatientID
          StudyInstanceUID
            ReferencedSeriesInstanceUID_landmarks_SR.dcm
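
    The folder layout listed above can be turned into relative file paths with a small helper. This is a sketch under assumptions: `annotation_paths` is a hypothetical name, the identifiers in the example are made up, and the code takes "ReferencedSeriesInstanceUID" in the listing to be a placeholder for the SeriesInstanceUID of the original CT series, as the description suggests.

```python
from pathlib import PurePosixPath
from typing import Dict


def annotation_paths(collection: str, model: str, patient_id: str,
                     study_uid: str, series_uid: str) -> Dict[str, PurePosixPath]:
    """Build relative paths that mirror the directory listing.

    collection: top-level folder, "nsclc" or "nlst" in the listing.
    model: model folder such as "3d_fullres-tta".
    """
    base = PurePosixPath(collection)

    def leaf(folder: str, suffix: str) -> PurePosixPath:
        # <collection>/<folder>/<PatientID>/<StudyInstanceUID>/<SeriesInstanceUID>_<suffix>.dcm
        return base / folder / patient_id / study_uid / f"{series_uid}_{suffix}.dcm"

    return {
        "seg": leaf(model, "SEG"),
        "features_sr": leaf(model, "features_SR"),
        "regions_sr": leaf("sr_regions", "regions_SR"),
        "landmarks_sr": leaf("sr_landmarks", "landmarks_SR"),
    }


# Made-up identifiers, for illustration only.
paths = annotation_paths("nlst", "3d_fullres-tta", "P1", "1.2.3", "4.5.6")
for kind, path in paths.items():
    print(kind, path)
```

    The 'sr_gt' folder of the NSCLC-Radiomics collection follows the same nesting and could be handled by one more `leaf("sr_gt", "features_SR")` entry.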

    The query used for NSCLC-Radiomics is here, and a list of corresponding SeriesInstanceUIDs (along with PatientIDs and StudyInstanceUIDs) is here. The query used for NLST is here, and a list of corresponding SeriesInstanceUIDs (along with PatientIDs and StudyInstanceUIDs) is here. The two csv files that describe the series analyzed, nsclc_series_analyzed.csv and nlst_series_analyzed.csv, are also available as uploads to this repository.

    Version updates:

    Version 2: For the regions SR and landmarks SR, changed to use a distinct TrackingUniqueIdentifier for each MeasurementGroup. Also instead of using TargetRegion, changed to use FindingSite. Additionally for the landmarks SR, the TopographicalModifier was made a child of FindingSite instead of a sibling.

    Version 3: Added the two csv files that describe which series were analyzed

    Version 4: Modified the landmarks SR as the TopographicalModifier for the Kidney landmark (bottom) does not describe the landmark correctly. The Kidney landmark is the "first slice where both kidneys can be seen well." Instead, removed the use of the TopographicalModifier for that landmark. For the features SR, modified the units code for the Flatness and Elongation, as we incorrectly used mm units instead of no units.
