100+ datasets found
  1. Penn-Fudan Pedestrian dataset for segmentation

    • kaggle.com
    zip
    Updated Mar 4, 2023
    Cite
    Sovit Ranjan Rath (2023). Penn-Fudan Pedestrian dataset for segmentation [Dataset]. https://www.kaggle.com/datasets/sovitrath/penn-fudan-pedestrian-dataset-for-segmentation
    Explore at:
    Available download formats: zip (53687127 bytes)
    Dataset updated
    Mar 4, 2023
    Authors
    Sovit Ranjan Rath
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Penn-Fudan dataset for semantic segmentation. The dataset has been split into 146 training samples and 24 validation samples.

    Corresponding blog post => Training UNet from Scratch using PyTorch

    Original dataset => https://www.cis.upenn.edu/~jshi/ped_html/

  2. Dog Segmentation Dataset

    • kaggle.com
    zip
    Updated Mar 31, 2023
    Cite
    Santhoshkumar (2023). Dog Segmentation Dataset [Dataset]. https://www.kaggle.com/datasets/santhoshkumarv/dog-segmentation-dataset
    Explore at:
    Available download formats: zip (5252057 bytes)
    Dataset updated
    Mar 31, 2023
    Authors
    Santhoshkumar
    Description

    A dog segmentation dataset created manually typically involves the following steps:

    Image selection: Selecting a set of images that include dogs in various poses and backgrounds.

    Image labeling: Manually labeling the dogs in each image using a labeling tool, where each dog is segmented and assigned a unique label.

    Image annotation: Annotating the labeled images with the corresponding segmentation masks, where the dog region is assigned a value of 1 and the background region is assigned a value of 0.

    Dataset splitting: Splitting the annotated dataset into training, validation, and test sets.

    Dataset format: Saving the annotated dataset in a format suitable for use in machine learning frameworks such as TensorFlow or PyTorch.

    Dataset characteristics: The dataset may have varying image sizes and resolutions, different dog breeds, backgrounds, lighting conditions, and other variations that are typical of natural images.

    Dataset size: The size of the dataset can vary, but it should be large enough to provide a sufficient amount of training data for deep learning models.

    Dataset availability: The dataset may be made publicly available for research and educational purposes.

    Overall, a manually created dog segmentation dataset provides high-quality training data for deep learning models and is essential for developing robust segmentation models.
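    The annotation and splitting steps described above can be sketched in Python. This is a minimal illustration, not part of the dataset: the dog label value and the split fractions are assumptions.

```python
import numpy as np

def to_binary_mask(label_img: np.ndarray, dog_label: int) -> np.ndarray:
    """Turn a labeled image into a binary mask: dog pixels -> 1, background -> 0."""
    return (label_img == dog_label).astype(np.uint8)

def split_dataset(filenames, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle file names and split them into train/validation/test subsets."""
    rng = np.random.default_rng(seed)
    names = list(filenames)
    rng.shuffle(names)
    n_train = int(len(names) * train_frac)
    n_val = int(len(names) * val_frac)
    return (names[:n_train],
            names[n_train:n_train + n_val],
            names[n_train + n_val:])
```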

  3. Code for using and training spray segmentation models

    • search.nfdi4chem.de
    • darus.uni-stuttgart.de
    html
    Updated Jul 29, 2025
    Cite
    DaRUS (2025). Code for using and training spray segmentation models [Dataset]. https://search.nfdi4chem.de/dataset/doi-10-18419-darus-4739
    Explore at:
    Available download formats: html
    Dataset updated
    Jul 29, 2025
    Dataset provided by
    DaRUS
    Description

    This dataset contains the necessary code for using our spray segmentation model used in the paper, ML-based semantic segmentation for quantitative spray atomization description. See README for more information.

  4. The performance of different semantic segmentation models on the self-constructed training dataset

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated May 15, 2025
    Cite
    Li, Qingfu; Cheng, Yinlei (2025). The performance of different semantic segmentation models on the self-constructed training dataset. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002082718
    Explore at:
    Dataset updated
    May 15, 2025
    Authors
    Li, Qingfu; Cheng, Yinlei
    Description

    The performance of different semantic segmentation models on the self-constructed training dataset.

  5. Sentinel-2 urban green training dataset

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Oct 10, 2023
    Cite
    Ondřej Pešek (2023). Sentinel-2 urban green training dataset [Dataset]. http://doi.org/10.5281/zenodo.8413116
    Explore at:
    Available download formats: bin
    Dataset updated
    Oct 10, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ondřej Pešek
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Training dataset for urban green land cover and land use detection for Sentinel-2 satellite images. Samples are pixel-wise labelled scenes over the city of Prague, including larger parks as well as smaller vegetation patches within high-density urban areas.

    Contains four classes:

    * 0: Non-vegetated pixels
    * 1: Low recreational vegetation
    * 2: High recreational vegetation
    * 3: Non-recreational vegetation
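    As a usage sketch, the four classes above can be collapsed into a binary vegetated/non-vegetated mask. The merge itself is an illustrative choice, not part of the dataset.

```python
import numpy as np

# Class codes from the dataset description:
# 0 = non-vegetated, 1 = low recreational, 2 = high recreational, 3 = non-recreational
def vegetated_mask(labels: np.ndarray) -> np.ndarray:
    """Collapse the four classes into vegetated (1) vs. non-vegetated (0)."""
    return (labels > 0).astype(np.uint8)
```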

  6. Code for training and using the droplet segmentation models

    • darus.uni-stuttgart.de
    • search.nfdi4chem.de
    Updated Feb 10, 2025
    Cite
    Basil Jose; Fabian Hampp (2025). Code for training and using the droplet segmentation models [Dataset]. http://doi.org/10.18419/DARUS-4147
    Explore at:
    Available download formats: Croissant (a machine-learning dataset format; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 10, 2025
    Dataset provided by
    DaRUS
    Authors
    Basil Jose; Fabian Hampp
    License

    MIT License: https://spdx.org/licenses/MIT.html

    Dataset funded by
    SimTech
    DFG
    Description

    This dataset contains the necessary code for using our spray segmentation model used in the paper, Machine learning based spray process quantification. More information can be found in the README.md.

  7. Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation

    • data.usgs.gov
    • datasets.ai
    • +1more
    Updated Aug 26, 2024
    + more versions
    Cite
    Phillipe Wernette; Daniel Buscombe; Jaycee Favela; Sharon Fitzpatrick; Evan Goldstein; Nicholas Enwright; Erin Dunand (2024). Coast Train--Labeled imagery for training and evaluation of data-driven models for image segmentation [Dataset]. http://doi.org/10.5066/P91NP87I
    Explore at:
    Dataset updated
    Aug 26, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Phillipe Wernette; Daniel Buscombe; Jaycee Favela; Sharon Fitzpatrick; Evan Goldstein; Nicholas Enwright; Erin Dunand
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    Jan 1, 2008 - Dec 31, 2020
    Description

    Coast Train is a library of images of coastal environments, annotations, and corresponding thematic label masks (or ‘label images’) collated for the purposes of training and evaluating machine learning (ML), deep learning, and other models for image segmentation. It includes image sets from geospatial satellite, aerial, and UAV imagery and orthomosaics, as well as from non-geospatial oblique and nadir imagery. Images include a diverse range of coastal environments from the U.S. Pacific, Gulf of Mexico, Atlantic, and Great Lakes coastlines, consisting of time-series of high-resolution (≤1m) orthomosaics and satellite image tiles (10–30m). Each image, image annotation, and labelled image is available as a single NPZ zipped file. NPZ files follow the naming convention {datasource}_{numberofclasses}_{threedigitdatasetversion}.zip, where {datasource} is the source of the original images (for example, NAIP, Landsat 8, Sentinel 2), {numberofclasses} is the number of classes us ...
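    The archive naming convention described above can be parsed mechanically. The file names in the sketch below are hypothetical illustrations, not real files from the release.

```python
import re

# Coast Train naming convention:
# {datasource}_{numberofclasses}_{threedigitdatasetversion}.zip
NAME_PATTERN = re.compile(
    r"^(?P<datasource>.+)_(?P<numclasses>\d+)_(?P<version>\d{3})\.zip$")

def parse_coast_train_name(filename: str) -> dict:
    """Split a Coast Train archive name into its documented components."""
    m = NAME_PATTERN.match(filename)
    if m is None:
        raise ValueError(f"name does not match the convention: {filename}")
    return {
        "datasource": m.group("datasource"),
        "num_classes": int(m.group("numclasses")),
        "version": m.group("version"),
    }
```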

  8. Code for training and using the soot (instance) segmentation models

    • darus.uni-stuttgart.de
    • search.nfdi4chem.de
    Updated Oct 31, 2025
    Cite
    Basil Jose; Klaus Peter Geigle; Fabian Hampp (2025). Code for training and using the soot (instance) segmentation models [Dataset]. http://doi.org/10.18419/DARUS-5184
    Explore at:
    Available download formats: Croissant (a machine-learning dataset format; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 31, 2025
    Dataset provided by
    DaRUS
    Authors
    Basil Jose; Klaus Peter Geigle; Fabian Hampp
    License

    MIT License: https://spdx.org/licenses/MIT.html

    Dataset funded by
    DFG
    Description

    This dataset contains the necessary code for using our soot (instance) segmentation model used for segmenting soot filaments from PIV (Mie scattering) images. In the corresponding paper, an ablation study is conducted to delineate the effects of domain-randomisation parameters of synthetically generated training data on the segmentation accuracy. The best model is used to extract high-level statistics from soot filaments in an RQL-type model combustor to enhance the fundamental understanding of soot formation, transport, and oxidation.

    B. Jose, K. P. Geigle, F. Hampp, Domain-Randomised Instance-Segmentation Benchmark for Soot in PIV Images, submitted to Machine Learning: Science and Technology (2025)

  9. Data from: imageseg: An R package for deep learning-based image segmentation

    • datadryad.org
    • data.niaid.nih.gov
    zip
    Updated Aug 6, 2022
    Cite
    Jürgen Niedballa; Jan Axtner; Timm Döbert; Andrew Tilker; An Nguyen; Seth Wong; Christian Fiderer; Marco Heurich; Andreas Wilting (2022). imageseg: An R package for deep learning-based image segmentation [Dataset]. http://doi.org/10.5061/dryad.x0k6djhnj
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 6, 2022
    Dataset provided by
    Dryad
    Authors
    Jürgen Niedballa; Jan Axtner; Timm Döbert; Andrew Tilker; An Nguyen; Seth Wong; Christian Fiderer; Marco Heurich; Andreas Wilting
    Time period covered
    Jul 19, 2022
    Description
    1. Convolutional neural networks (CNNs) and deep learning are powerful and robust tools for ecological applications, and are particularly suited for image data. Image segmentation (the classification of all pixels in images) is one such application and can for example be used to assess forest structural metrics. While CNN-based image segmentation methods for such applications have been suggested, widespread adoption in ecological research has been slow, likely due to technical difficulties in implementation of CNNs and lack of toolboxes for ecologists.
    2. Here, we present R package imageseg which implements a CNN-based workflow for general-purpose image segmentation using the U-Net and U-Net++ architectures in R. The workflow covers data (pre)processing, model training, and predictions. We illustrate the utility of the package with image recognition models for two forest structural metrics: tree canopy density and understory vegetation density. We trained the models using large and dive...
  10. Resyris - a real-synthetic rock instance segmentation dataset for training and benchmarking

    • service.tib.eu
    • resodate.org
    Updated Dec 2, 2024
    Cite
    (2024). Resyris - a real-synthetic rock instance segmentation dataset for training and benchmarking - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/resyris---a-real-synthetic-rock-instance-segmentation-dataset-for-training-and-benchmarking
    Explore at:
    Dataset updated
    Dec 2, 2024
    Description

    A real-synthetic rock instance segmentation dataset for training and benchmarking.

  11. Fire Segmentation Dataset

    • kaggle.com
    zip
    Updated Mar 24, 2024
    Cite
    Bekhzod Olimov (2024). Fire Segmentation Dataset [Dataset]. https://www.kaggle.com/datasets/killa92/fire-segmentation-dataset
    Explore at:
    Available download formats: zip (494277783 bytes)
    Dataset updated
    Mar 24, 2024
    Authors
    Bekhzod Olimov
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains fire images and their corresponding segmentation masks for a semantic segmentation task. It can also be used for a binary image classification task (classifying images into fire and not-fire classes). The dataset has no train, validation, and test folders, so for training purposes it should first be split into the three sets required for machine learning and deep learning tasks, namely train, validation, and test splits. The structure of the data is as follows:

    ROOT
    ├── images
    │   ├── fire
    │   │   ├── img_file
    │   │   ├── ...
    │   │   └── img_file
    │   └── not fire
    │       ├── img_file
    │       ├── ...
    │       └── img_file
    └── masks
        ├── img_file
        ├── ...
        └── img_file

    For the semantic segmentation task, images and their corresponding label masks share the same file name. For the image classification task, the class labels can be obtained from the directory names (fire, not fire). Good luck!
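    Since images and masks share file names in the layout described above, (image, mask) pairs can be assembled with a short sketch. The root path and the name-based matching are assumptions based on the described structure.

```python
from pathlib import Path

def pair_fire_images_and_masks(root):
    """Pair each image in ROOT/images/fire with the mask of the same
    file name in ROOT/masks, following the layout described above."""
    root = Path(root)
    pairs = []
    for img in sorted((root / "images" / "fire").iterdir()):
        mask = root / "masks" / img.name
        if mask.exists():
            pairs.append((img, mask))
    return pairs
```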

  12. Dataset from "Synthetic Training Data for Semantic Segmentation of the Environment from UAV Perspective"

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Jul 12, 2023
    Cite
    Christoph Hinniger; Joachim Rüter (2023). Dataset from "Synthetic Training Data for Semantic Segmentation of the Environment from UAV Perspective" [Dataset]. http://doi.org/10.5281/zenodo.8133761
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 12, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Christoph Hinniger; Joachim Rüter
    Description

    This dataset contains the images and ground truth label masks for semantic segmentation created and described in "Hinniger, C.; Rüter, J. Synthetic Training Data for Semantic Segmentation of the Environment from UAV Perspective. Aerospace 2023, 10, 604. https://doi.org/10.3390/aerospace10070604".

  13. 2D Segmentation of Concrete Samples for Training AI Models

    • data.nist.gov
    • s.cnmilf.com
    • +1more
    Updated Nov 18, 2019
    Cite
    Peter Bajcsy (2019). 2D Segmentation of Concrete Samples for Training AI Models [Dataset]. http://doi.org/10.18434/M32155
    Explore at:
    Dataset updated
    Nov 18, 2019
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Authors
    Peter Bajcsy
    License

    https://www.nist.gov/open/license

    Description

    This web-based validation system has been designed to perform visual validation of automated multi-class segmentation of concrete samples from scanning electron microscopy (SEM) images. The goal is to automatically segment SEM images into no-damage and damage sub-classes, where the damage sub-classes consist of paste damage, aggregate damage, and air voids. While the no-damage sub-classes are not included in the goal, they provide context for assigning damage sub-classes. The motivation behind this web validation system is to prepare a large number of pixel-level multi-class annotated microscopy images for training artificial intelligence (AI) based segmentation models (U-Net and SegNet models). While the purpose of the AI models is to accurately predict the four labels (paste damage, aggregate damage, air voids, and no-damage), our goal is to assert trust in such predictions (a) by using contextual labels and (b) by enabling visual validation of predicted damage labels.

  14. Unsupervised domain adaptation for semantic segmentation via class-balanced self-training

    • service.tib.eu
    • resodate.org
    Updated Jan 3, 2025
    Cite
    (2025). Unsupervised domain adaptation for semantic segmentation via class-balanced self-training - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/unsupervised-domain-adaptation-for-semantic-segmentation-via-class-balanced-self-training
    Explore at:
    Dataset updated
    Jan 3, 2025
    Description

    Unsupervised domain adaptation for semantic segmentation via class-balanced self-training

  15. Labeled satellite imagery for training machine learning semantic segmentation models of coastal shorelines

    • gimi9.com
    Updated Mar 25, 2025
    + more versions
    Cite
    (2025). Labeled satellite imagery for training machine learning semantic segmentation models of coastal shorelines. | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_labeled-satellite-imagery-for-training-machine-learning-semantic-segmentation-models-of-co/
    Explore at:
    Dataset updated
    Mar 25, 2025
    Description

    A dataset of Landsat, Sentinel, and Planetscope satellite images of coastal shoreline regions, and corresponding semantic segmentations. The dataset consists of folders of images and label images. Label images are images where each pixel is given a discrete class by a human annotator, among the following classes: a) water, b) whitewater/surf, c) sediment, and d) other. These data are intended only to be used as a training and validation dataset for a machine learning based image segmentation model that is specifically designed for the task of coastal shoreline satellite image semantic segmentation.

  16. Roads Segmentation Dataset

    • gts.ai
    jpg, mask/annotation +1
    Updated Nov 29, 2023
    Cite
    GTS (2023). Roads Segmentation Dataset [Dataset]. https://gts.ai/dataset-download/dictation-notes/
    Explore at:
    Available download formats: jpg, mask/annotation, png
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Variables measured
    Vehicles, Road boundaries, Background objects, Traffic signs and markings
    Description

    The Roads Segmentation Dataset contains DVR-captured road scene images paired with pixel-level segmentation masks. The dataset includes diverse traffic scenarios, with detailed labels for roads, vehicles, signs, markings, and background objects. Each image undergoes strict quality checks to ensure accuracy, making it suitable for training computer vision models in autonomous driving, urban planning, and intelligent traffic systems.

  17. RGB Image Pine-seedling Dataset: Three Population with half-sib structure, dataset for segmentation model training and data of mean seedlings' color

    • figshare.com
    zip
    Updated Jun 19, 2025
    Cite
    Jiri Chuchlík; Jaroslav Čepl; Eva Neuwirthová; Jan Stejskal; Jiří Korecký (2025). RGB Image Pine-seedling Dataset: Three Population with half-sib structure, dataset for segmentation model training and data of mean seedlings' color [Dataset]. http://doi.org/10.6084/m9.figshare.28239326.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 19, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jiri Chuchlík; Jaroslav Čepl; Eva Neuwirthová; Jan Stejskal; Jiří Korecký
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The datasets contain RGB photos of Scots pine seedlings of three populations from two different ecotypes originating in the Czech Republic:

    * Plasy - lowland ecotype
    * Trebon - lowland ecotype
    * Decin - upland ecotype

    These photos were taken in three different periods (September 10th 2021, October 23rd 2021, January 22nd 2022). File dataset_for_YOLOv7_training.zip contains image data with annotations for training a YOLOv7 segmentation model (training and validation sets).

    The dataset also contains a table with information on individual Scots pine seedlings:

    * affiliation to parent tree (mum)
    * affiliation to population (site)
    * row and column in which the seedling was grown (row, col)
    * affiliation to the planter in which the seedling was grown (box)
    * mean RGB values of the pine seedling in three different periods (B_september, G_september, R_september, B_october, G_october, R_october, B_january, G_january, R_january)
    * mean HSV values of the pine seedling in three different periods (H_september, S_september, V_september, H_october, S_october, V_october, H_january, S_january, V_january)

  18. F1 score of segmented nuclei images trained on an automatically and a manually annotated dataset

    • plos.figshare.com
    xls
    Updated Jun 11, 2023
    Cite
    Fabian Englbrecht; Iris E. Ruider; Andreas R. Bausch (2023). F1 score of segmented nuclei images trained on an automatically and a manually annotated dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0250093.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 11, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Fabian Englbrecht; Iris E. Ruider; Andreas R. Bausch
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    F1 score of segmented nuclei images trained on an automatically and a manually annotated dataset.

  19. visuAAL Skin Segmentation Dataset

    • zenodo.org
    • observatorio-cientifico.ua.es
    • +2more
    Updated Aug 8, 2022
    Cite
    Kooshan Hashemifard; Francisco Florez-Revuelta (2022). visuAAL Skin Segmentation Dataset [Dataset]. http://doi.org/10.5281/zenodo.6973396
    Explore at:
    Dataset updated
    Aug 8, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kooshan Hashemifard; Francisco Florez-Revuelta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The visuAAL Skin Segmentation Dataset contains 46,775 high-quality images divided into a training set with 45,623 images and a validation set with 1,152 images. Skin areas have been obtained automatically from the FashionPedia garment dataset. The process to extract the skin areas is explained in detail in the paper 'From Garment to Skin: The visuAAL Skin Segmentation Dataset'.

    If you use the visuAAL Skin Segmentation Dataset, please cite:

    How to use:

    1. Download the FashionPedia dataset from https://fashionpedia.github.io/home/Fashionpedia_download.html
    2. Download the visuAAL Skin Segmentation Dataset. The dataset consists of two folders, namely train_masks and val_masks. Each folder corresponds to the training and validation sets in the original FashionPedia dataset.
    3. After extracting the images from FashionPedia, for each image existing in the visuAAL skin segmentation dataset, the original image can be found with the same name (file_name in the annotations file).

    A sample of image data in the FashionPedia dataset is:

    {'id': 12305,
     'width': 680,
     'height': 1024,
     'file_name': '064c8022b32931e787260d81ed5aafe8.jpg',
     'license': 4,
     'time_captured': 'March-August, 2018',
     'original_url': 'https://farm2.staticflickr.com/1936/8607950470_9d9d76ced7_o.jpg',
     'isstatic': 1,
     'kaggle_id': '064c8022b32931e787260d81ed5aafe8'}

    NOTE: Not all the images in the FashionPedia dataset have a corresponding skin mask in the visuAAL Skin Segmentation Dataset, as some images contain only garment parts and no people; these images were removed when creating the visuAAL Skin Segmentation Dataset. However, every instance in the visuAAL Skin Segmentation Dataset has a corresponding match in the FashionPedia dataset.
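    Step 3 of the how-to above (mask and original image share the same file name) can be automated with a small helper. The directory argument is an assumption for illustration.

```python
from pathlib import Path

def find_original_image(mask_name: str, fashionpedia_images_dir):
    """Locate the FashionPedia image matching a visuAAL mask file:
    per the instructions above, both share the same file name."""
    candidate = Path(fashionpedia_images_dir) / mask_name
    return candidate if candidate.exists() else None
```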

  20. Face Segmentation Dataset – 70,846 Human Face Images for AI Training

    • nexdata.ai
    Updated Oct 14, 2023
    Cite
    Nexdata (2023). Face Segmentation Dataset – 70,846 Human Face Images for AI Training [Dataset]. https://www.nexdata.ai/datasets/computervision/945
    Explore at:
    Dataset updated
    Oct 14, 2023
    Dataset authored and provided by
    Nexdata
    Variables measured
    Accuracy, Data size, Data diversity, Image Parameter, Annotation content, Collection environment, Population distribution
    Description

    This Human Face Segmentation Dataset contains 70,846 high-quality images featuring diverse subjects with pixel-level annotations. The dataset includes individuals across various age groups—from young children to the elderly—and represents multiple ethnicities, including Asian, Black, and Caucasian. Both males and females are included. The scenes range from indoor to outdoor environments, with pure-color backgrounds also present. Facial expressions vary from neutral to complex, including large-angle head tilts, eye closures, glowers, puckers, open mouths, and more. Each image is precisely annotated on a pixel-by-pixel basis, covering facial regions, five sense organs, body parts, and appendages. This dataset is ideal for applications such as facial recognition, segmentation, and other computer vision tasks involving human face parsing.
