100+ datasets found
  1. Total-Text Dataset

    • datasetninja.com
    • opendatalab.com
    • +1 more
    Updated Oct 27, 2017
    Cite
    Chee Kheng Chng; Chee Seng Chan (2017). Total-Text Dataset [Dataset]. https://datasetninja.com/total-text
    Explore at:
    Dataset updated
    Oct 27, 2017
    Dataset provided by
    Dataset Ninja
    Authors
    Chee Kheng Chng; Chee Seng Chan
    License

    https://opensource.org/license/bsd-3-clause/

    Description

    Total-Text is a dataset tailored for instance segmentation, semantic segmentation, and object detection tasks, containing 1555 images with 11165 labeled objects belonging to a single class — text with text label tag. Its primary aim is to open new research avenues in the scene text domain. Unlike traditional text datasets, Total-Text uniquely includes curved-oriented text in addition to horizontal and multi-oriented text, offering diverse text orientations in more than half of its images. This variety makes it a crucial resource for advancing text-related studies in computer vision and natural language processing.

  2. Plantseg Segmentation Dataset Dataset

    • universe.roboflow.com
    zip
    Updated Nov 22, 2024
    Cite
    uqtwei (2024). Plantseg Segmentation Dataset Dataset [Dataset]. https://universe.roboflow.com/uqtwei-6gmpn/plantseg-segmentation-dataset
    Explore at:
    zip
    Dataset updated
    Nov 22, 2024
    Dataset authored and provided by
    uqtwei
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Plant Diseases Masks
    Description

    We established a large-scale plant disease segmentation dataset named PlantSeg. PlantSeg comprises more than 11,400 images of 115 different plant diseases from various environments, each annotated with its corresponding segmentation label for diseased parts. To the best of our knowledge, PlantSeg is the largest plant disease segmentation dataset containing in-the-wild images. Our dataset enables researchers to evaluate their models and provides a valid foundation for the development and benchmarking of plant disease segmentation algorithms.

    Please note that, due to Roboflow's limits on the number of images, the dataset provided here is not complete.

    Project page: https://github.com/tqwei05/PlantSeg

    Paper: https://arxiv.org/abs/2409.04038

    Complete dataset download: https://zenodo.org/records/13958858

    Reference: @article{wei2024plantseg, title={PlantSeg: A Large-Scale In-the-wild Dataset for Plant Disease Segmentation}, author={Wei, Tianqi and Chen, Zhi and Yu, Xin and Chapman, Scott and Melloy, Paul and Huang, Zi}, journal={arXiv preprint arXiv:2409.04038}, year={2024} }

  3. The Mountain Habitats Segmentation and Change Detection Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Branzan Albu, Alexandra (2020). The Mountain Habitats Segmentation and Change Detection Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_12590
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Jean, Frédéric
    Higgs, Eric
    Branzan Albu, Alexandra
    Capson, David
    Fisher, Jason T.
    Starzomski, Brian M.
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This is the dataset presented in the paper The Mountain Habitats Segmentation and Change Detection Dataset accepted for publication in the IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Beach, HI, USA, January 6-9, 2015. The full-sized images and masks along with the accompanying files and results can be downloaded here. The size of the dataset is about 2.1 GB.

    The dataset is released under the Creative Commons Attribution-Non Commercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/legalcode).

    The dataset documentation is hosted on GitHub at the following address: http://github.com/fjean/mhscd-dataset-doc. Direct download links to the latest revision of the documentation are provided below:

    PDF format: http://github.com/fjean/mhscd-dataset-doc/raw/master/mhscd-dataset-doc.pdf

    Text format: http://github.com/fjean/mhscd-dataset-doc/raw/master/mhscd-dataset-doc.rst

  4. HaN-Seg: The head and neck organ-at-risk CT & MR segmentation dataset

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Feb 7, 2023
    Cite
    Gašper Podobnik; Primož Strojan; Primož Peterlin; Bulat Ibragimov; Tomaž Vrtovec (2023). HaN-Seg: The head and neck organ-at-risk CT & MR segmentation dataset [Dataset]. http://doi.org/10.5281/zenodo.7442914
    Explore at:
    zip
    Dataset updated
    Feb 7, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gašper Podobnik; Primož Strojan; Primož Peterlin; Bulat Ibragimov; Tomaž Vrtovec
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    The HaN-Seg: Head and Neck Organ-at-Risk CT & MR Segmentation Dataset is a publicly available dataset of anonymized head and neck (HaN) images of 42 patients that underwent both CT and T1-weighted MR imaging for the purpose of image-guided radiotherapy planning. In addition, the dataset also contains reference segmentations of 30 organs-at-risk (OARs) for CT images in the form of binary segmentation masks, which were obtained by curating manual pixel-wise expert image annotations. A full description of the HaN-Seg dataset can be found in:

    G. Podobnik, P. Strojan, P. Peterlin, B. Ibragimov, T. Vrtovec, "HaN-Seg: The head and neck organ-at-risk CT & MR segmentation dataset", Medical Physics, 2023. https://doi.org/10.1002/mp.16197,

    and any research originating from its usage is required to cite this paper.

    In parallel with the release of the dataset, the HaN-Seg: The Head and Neck Organ-at-Risk CT & MR Segmentation Challenge was launched to promote the development of new, and the application of existing, state-of-the-art fully automated techniques for OAR segmentation in the HaN region from CT images that exploit information from multiple imaging modalities, in this case CT and MR images. The task of the HaN-Seg challenge is to automatically segment up to 30 OARs in the HaN region from CT images in the devised test set, consisting of 14 CT and MR images of the same patients, given the availability of the training set (i.e. the herein publicly available HaN-Seg dataset), consisting of 42 CT and MR images of the same patients with reference 3D OAR binary segmentation masks for CT images.

  5. MedSeg Liver segments dataset

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    MedSeg; Håvard Bjørke Jenssen; Tomas Sakinis (2023). MedSeg Liver segments dataset [Dataset]. http://doi.org/10.6084/m9.figshare.13643252.v2
    Explore at:
    txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    MedSeg; Håvard Bjørke Jenssen; Tomas Sakinis
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    In total, 50 cases segmented with liver segments 1-8. Free to use and download. Check out our 5-segment model at www.medseg.ai. All cases were obtained from Decathlon's dataset; see details and reference here: https://arxiv.org/abs/1902.09063. Segmentations done by MedSeg. Update 2/4/21: 40 new cases added; case 1 replaced with a new case.

  6. Dataset: Segmentation of cortical bone, trabecular bone, and medullary pores...

    • search.dataone.org
    • datadryad.org
    Updated Mar 5, 2025
    Cite
    Andrew Lee; Julian Moore; Brandon Vera Covarrubias; Leigha Lynch (2025). Dataset: Segmentation of cortical bone, trabecular bone, and medullary pores from micro-CT images using 2D and 3D deep learning models [Dataset]. https://search.dataone.org/view/sha256%3Aa60ce1962a1e77843d080bb8d99d112b49f7e0bc06730699c0c83006de17e3b2
    Explore at:
    Dataset updated
    Mar 5, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Andrew Lee; Julian Moore; Brandon Vera Covarrubias; Leigha Lynch
    Description

    Computed tomography (CT) enables rapid imaging of large-scale studies of bone, but those datasets typically require manual segmentation, which is time-consuming and prone to error. Convolutional neural networks (CNNs) offer an automated solution, achieving superior performance on image data. Here, we used CNNs to train segmentation models from scratch on 2D and 3D patches from micro-CT scans of otter long bones. These new models, collectively called BONe (Bone One-shot Network), aimed to be fast and accurate, and we expected enhanced results from 3D training due to better spatial context. Our results showed that 3D training required substantially more memory. Contrary to expectations, 2D models performed slightly better than 3D models in labeling details such as thin trabecular bone. Although lacking in some detail, 3D models appeared to generalize better and predict smoother internal surfaces than 2D models. However, the massive computational c...

    Materials: Limb bones from the North American river otter (Lontra canadensis) were borrowed from four museums: OMNH (SNM), the Sam Noble Oklahoma Museum of Natural History (Norman, Oklahoma); UAM, the University of Alaska Museum of the North (Fairbanks, Alaska); UF, the Florida Museum of Natural History (Gainesville, Florida); and UWBM (BM), the Burke Museum of Natural History and Culture (Seattle, Washington). In total, the sample consisted of 38 elements (humerus, radius, ulna, femur, tibia, and fibula) from nine individuals.

    | Specimen   | kV  | µA  | Filter | Res. (µm) | Provenance | Sex | Side | Element | Group   |
    | ---------- | --- | --- | ------ | --------- | ---------- | --- | ---- | ------- | ------- |
    | OMNH 44262 | 160 | 312 | Copper | 49.99     | Tennessee  | F   | R    | Humerus | Fitting |
    |            |     |     |        |           |            |     | L    | Radius  | Fitting |
    |            |     |     |        |           |            |     | L    | Ulna    | Fitting |
    | OMNH 53994 | 160 | 312 | Copper | 49.99     | Tennessee  | M   | L    | Femur   | Fitting |

    &...

    # Dataset: Segmentation of cortical bone, trabecular bone, and medullary pores from micro-CT images using 2D and 3D deep learning models

    https://doi.org/10.5061/dryad.b2rbnzsq4

    Introduction: The following document outlines the structure of the data repository.

    "_composite.zip" contains separate folders for the raw micro-CT slices and CTP labels used for model fitting. Because Avizo's deep learning segmentation module can only load a single pair of data files (i.e., one file consists of the raw slices and the second file consists of the labels), we appended the samples used for fitting into a composite sample.

    "_Models.zip" contains the deep learning models for bone-pore and cortical-trabecular-pores segmentation. Model files are intended for use in the commercial software Avizo/Amira but are compatible with a variety of both open source and commercial software (e.g., TensorFlow or Comet Dragonfly). Files consist of model weights (HDF5 format...
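
    As a rough illustration, HDF5 weight files like these can be inspected outside Avizo with standard tools. A minimal Python sketch, assuming h5py is installed; the filename is a hypothetical placeholder, not a file documented in this record:

      import h5py

      def list_weights(path):
          """Print every dataset (weight tensor) stored in an HDF5 file."""
          with h5py.File(path, "r") as f:
              def visit(name, obj):
                  if isinstance(obj, h5py.Dataset):
                      print(f"{name}: shape={obj.shape}, dtype={obj.dtype}")
              f.visititems(visit)

      list_weights("BONe_model_weights.h5")  # hypothetical filename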

  7. Alabama Buildings Segmentation Dataset

    • datasetninja.com
    Updated Oct 2, 2023
    Cite
    Duy Cao (2023). Alabama Buildings Segmentation Dataset [Dataset]. https://datasetninja.com/alabama-buildings-segmentation
    Explore at:
    Dataset updated
    Oct 2, 2023
    Dataset provided by
    Dataset Ninja
    Authors
    Duy Cao
    License

    https://spdx.org/licenses/

    Description

    The Alabama Buildings Segmentation dataset combines Bing Maps satellite images with building masks from Microsoft Maps. Almost all of the images (99%) are from Alabama, US; the rest are from Colombia. The dataset contains 10,200 satellite images and 10,200 masks, with a total size of about 17 GB. The satellite images have a resolution of 0.5 m/pixel, an image size of 1024x1024 pixels, and a file size of about 1.5 MB per image. The dataset only contains images whose mask has a total building area of at least 1% of the image area, so there are no images without any buildings.
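
    The "at least 1%" coverage rule is simple to reproduce. A minimal sketch, assuming single-channel mask images in which nonzero pixels mark buildings (the file path is hypothetical):

      import numpy as np
      from PIL import Image

      def building_fraction(mask_path):
          """Fraction of mask pixels labeled as building."""
          mask = np.array(Image.open(mask_path).convert("L"))
          return np.count_nonzero(mask) / mask.size

      # Keep a 1024x1024 tile only if buildings cover at least 1% of its pixels.
      if building_fraction("masks/tile_0001.png") >= 0.01:
          print("tile kept")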

  8. SAROS - A large, heterogeneous, and sparsely annotated segmentation dataset...

    • cancerimagingarchive.net
    csv, n/a +1
    Updated Mar 7, 2024
    Cite
    The Cancer Imaging Archive (2024). SAROS - A large, heterogeneous, and sparsely annotated segmentation dataset on CT imaging data [Dataset]. http://doi.org/10.25737/SZ96-ZG60
    Explore at:
    csv, n/a, nifti and zip
    Dataset updated
    Mar 7, 2024
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Mar 7, 2024
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description
    Sparsely Annotated Region and Organ Segmentation (SAROS) contributes a large heterogeneous semantic segmentation annotation dataset for existing CT imaging cases on TCIA. The goal of this dataset is to provide high-quality annotations for building body composition analysis tools (References: Koitka 2020 and Haubold 2023). Existing in-house segmentation models were employed to generate annotation candidates on randomly selected cases. All generated annotations were manually reviewed and corrected by medical residents and students on every fifth axial slice, while other slices were set to an ignore label (numeric value 255).

    900 CT series from 882 patients were randomly selected from the following TCIA collections (number of CTs per collection in parentheses): ACRIN-FLT-Breast (32), ACRIN-HNSCC-FDG-PET/CT (48), ACRIN-NSCLC-FDG-PET (129), Anti-PD-1_Lung (12), Anti-PD-1_MELANOMA (2), C4KC-KiTS (175), COVID-19-NY-SBU (1), CPTAC-CM (1), CPTAC-LSCC (3), CPTAC-LUAD (1), CPTAC-PDA (8), CPTAC-UCEC (26), HNSCC (17), Head-Neck Cetuximab (12), LIDC-IDRI (133), Lung-PET-CT-Dx (17), NSCLC Radiogenomics (7), NSCLC-Radiomics (56), NSCLC-Radiomics-Genomics (20), Pancreas-CT (58), QIN-HEADNECK (94), Soft-tissue-Sarcoma (6), TCGA-HNSC (1), TCGA-LIHC (33), TCGA-LUAD (2), TCGA-LUSC (3), TCGA-STAD (2), TCGA-UCEC (1).

    A script to download and resample the images is provided in our GitHub repository: https://github.com/UMEssen/saros-dataset

    The annotations are provided in NIfTI format and were performed on 5 mm slice thickness. The annotation files define foreground labels on the same axial slices and match pixel-perfect. In total, 13 semantic body regions and 6 body part labels were annotated, with an index that corresponds to a numeric value in the segmentation file.

    Body Regions

    1. Subcutaneous Tissue
    2. Muscle
    3. Abdominal Cavity
    4. Thoracic Cavity
    5. Bones
    6. Parotid Glands
    7. Pericardium
    8. Breast Implant
    9. Mediastinum
    10. Brain
    11. Spinal Cord
    12. Thyroid Glands
    13. Submandibular Glands

    Body Parts

    1. Torso
    2. Head
    3. Right Leg
    4. Left Leg
    5. Right Arm
    6. Left Arm
    The labels that were modified or that require further commentary are listed and explained below:
    • Subcutaneous Adipose Tissue: The cutis was included into this label due to its limited differentiation in 5mm-CT.
    • Muscle: All muscular tissue was segmented contiguously and not separated into single muscles. Thus, fascias and intermuscular fat were included into the label. Inter- and intramuscular fat is subtracted automatically in the process.
    • Abdominal Cavity: This label includes the pelvis. The label does not separate between the positional relationships of the peritoneum.
    • Mediastinum: The International Thymic Malignancy Group (ITMIG) scheme was used for the segmentation guidelines.
    • Head + Neck: The neck is confined by the base of the trapezius muscle.
    • Right + Left Leg: The legs are separated from the torso by the line between the two lowest points of the Rami ossa pubis.
    • Right + Left Arm: The arms are separated from the torso by the diagonal between the most lateral point of the acromion and the tuberculum infraglenoidale.
    For reproducibility on downstream tasks, five cross-validation folds and a test set were pre-defined and are described in the provided spreadsheet. Segmentation was conducted strictly in accordance with anatomical guidelines and was only modified where required to improve segmentation efficiency.
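
    A minimal sketch of reading one of the NIfTI annotation files under this scheme, assuming nibabel is installed; the filename is hypothetical, and the label index (3 = Abdominal Cavity) comes from the Body Regions list above:

      import nibabel as nib
      import numpy as np

      seg = nib.load("saros_case_0001.nii.gz").get_fdata().astype(np.uint16)

      IGNORE = 255                   # unannotated slices carry this value
      annotated = seg != IGNORE      # voxels on the reviewed (every fifth) slices
      abdominal_cavity = seg == 3    # index 3 in the Body Regions list

      print(f"{np.count_nonzero(annotated)} annotated voxels, "
            f"{np.count_nonzero(abdominal_cavity)} labeled Abdominal Cavity")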

  9. Ct Brain Segmentation Dataset

    • universe.roboflow.com
    zip
    Updated Nov 1, 2024
    Cite
    Joshua (2024). Ct Brain Segmentation Dataset [Dataset]. https://universe.roboflow.com/joshua-zgc7b/ct-brain-segmentation/model/1
    Explore at:
    zip
    Dataset updated
    Nov 1, 2024
    Dataset authored and provided by
    Joshua
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Stroke Polygons
    Description

    CT Brain Segmentation

    ## Overview
    
    CT Brain Segmentation is a dataset for instance segmentation tasks - it contains Stroke annotations for 5,511 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  10. Replication Data for: "A Topic-based Segmentation Model for Identifying...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Sep 25, 2024
    Cite
    Kim, Sunghoon; Lee, Sanghak; McCulloch, Robert (2024). Replication Data for: "A Topic-based Segmentation Model for Identifying Segment-Level Drivers of Star Ratings from Unstructured Text Reviews" [Dataset]. http://doi.org/10.7910/DVN/EE3DE2
    Explore at:
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Kim, Sunghoon; Lee, Sanghak; McCulloch, Robert
    Description

    We provide instructions, code, and datasets for replicating the article by Kim, Lee and McCulloch (2024), "A Topic-based Segmentation Model for Identifying Segment-Level Drivers of Star Ratings from Unstructured Text Reviews." This repository provides a user-friendly R package for researchers or practitioners to apply the topic-based segmentation model with unstructured texts (latent class regression with group variable selection) to their own datasets. First, we provide R code to replicate the illustrative simulation study (see file 1). Second, we provide the user-friendly R package with a very simple example (see file 2, Package_MixtureRegression_GroupVariableSelection.R and Dendrogram.R). Third, we provide a set of codes and instructions to replicate the empirical studies of customer-level segmentation and restaurant-level segmentation with Yelp reviews data (see files 3-a, 3-b, 4-a, 4-b). Note that, due to the dataset terms of use by Yelp and the restriction on data size, we instead provide a link to download the same Yelp datasets (https://www.kaggle.com/datasets/yelp-dataset/yelp-dataset/versions/6). Fourth, we provide a set of codes and datasets to replicate the empirical study with professor ratings reviews data (see file 5). Please see more details in the description text and comments of each file.

    [A guide on how to use the code to reproduce each study in the paper]

    1. Full codes for replicating Illustrative simulation study.txt [see Table 2 and Figure 2 in main text]: R source code to replicate the illustrative simulation study. Please run it from beginning to end in R. In addition to estimated coefficients (posterior means of coefficients), indicators of variable selection, and segment memberships, you will get dendrograms of selected groups of variables (Figure 2). Computing time is approximately 20 to 30 minutes.

    3-a. Preprocessing raw Yelp Reviews for Customer-level Segmentation.txt: Code for preprocessing the downloaded unstructured Yelp review data and preparing the DV and IV matrices for the customer-level segmentation study.

    3-b. Instruction for replicating Customer-level Segmentation analysis.txt [see Table 10 in main text; Tables F-1, F-2, and F-3 and Figure F-1 in Web Appendix]: Code for replicating the customer-level segmentation study with Yelp data. You will get estimated coefficients (posterior means of coefficients), indicators of variable selection, and segment memberships. Computing time is approximately 3 to 4 hours.

    4-a. Preprocessing raw Yelp reviews_Restaruant Segmentation (1).txt: R code for preprocessing the downloaded unstructured Yelp data and preparing the DV and IV matrices for the restaurant-level segmentation study.

    4-b. Instructions for replicating restaurant-level segmentation analysis.txt [see Tables 5, 6 and 7 in main text; Tables E-4 and E-5 and Figure H-1 in Web Appendix]: Code for replicating the restaurant-level segmentation study with Yelp data. You will get estimated coefficients (posterior means of coefficients), indicators of variable selection, and segment memberships. Computing time is approximately 10 to 12 hours.

    [Guidelines for running benchmark models in Table 6]

    • Unsupervised topic model: 'topicmodels' package in R. After determining the number of topics (e.g., with the 'ldatuning' R package), run the 'LDA' function in the 'topicmodels' package. Then compute topic probabilities per restaurant (with the 'posterior' function in the package), which can be used as predictors, and conduct prediction with regression.
    • Hierarchical topic model (HDP): 'gensimr' R package; use the 'model_hdp' function for identifying topics (see https://radimrehurek.com/gensim/models/hdpmodel.html or https://gensimr.news-r.org/).
    • Supervised topic model: 'lda' R package; use the 'slda.em' function for training and 'slda.predict' for prediction.
    • Aggregate regression: the default 'lm' function in R.
    • Latent class regression without variable selection: the 'flexmix' function in the 'flexmix' R package. Run flexmix with a certain number of segments (e.g., 3 segments in this study). Then, with the estimated coefficients and memberships, conduct prediction of the dependent variable per segment.
    • Latent class regression with variable selection: the 'Unconstraind_Bayes_Mixture' function in Kim, Fong and DeSarbo (2012)'s package. Run the Kim et al. (2012) model with a certain number of segments (e.g., 3 segments in this study). Then, with the estimated coefficients and memberships, conduct prediction of the dependent variable per segment. The same R package ('KimFongDeSarbo2012.zip') can be downloaded at: https://sites.google.com/scarletmail.rutgers.edu/r-code-packages/home

    5. Instructions for replicating Professor ratings review study.txt [see Tables G-1, G-2, G-4 and G-5, and Figures G-1 and H-2 in Web Appendix]: Code to replicate the professor ratings reviews study. Computing time is approximately 10 hours.

    [A list of the versions of R, packages, and computer...

  11. Temporal Segmentation Annotation for the REAL-Colon dataset

    • figshare.com
    txt
    Updated Aug 2, 2024
    Cite
    Carlo Biffi (2024). Temporal Segmentation Annotation for the REAL-Colon dataset [Dataset]. http://doi.org/10.6084/m9.figshare.26472913.v1
    Explore at:
    txt
    Dataset updated
    Aug 2, 2024
    Dataset provided by
    figshare
    Authors
    Carlo Biffi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the temporal annotation of the REAL-Colon dataset. The REAL-Colon dataset comprises 60 recordings of real-world colonoscopies and can be accessed here. The dataset encompasses four different clinical studies (001 to 004), each contributing 15 videos. At the dataset location, compressed folders titled SSS-VVV_frames contain the video frames, where SSS indicates the clinical study (001 to 004) and VVV represents the video name (001 to 015). Here, we upload additional temporal annotations: for each video, a distinct SSS-VVV.csv file (60 in total) gives the temporal annotation of every frame in SSS-VVV_frames. The annotations categorize each frame into three colonoscopy phase labels: outside, insertion, and withdrawal. The withdrawal phase is further divided into seven colon segment classes: cecum, ileum, ascending colon, transverse colon, descending colon, sigmoid colon, and rectum. This results in a total of 9 distinct labels.
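
    A minimal sketch for tallying the per-frame labels in one of these files; the column name used here is an assumption, since the listing does not specify the CSV header:

      import csv
      from collections import Counter

      counts = Counter()
      with open("001-001.csv", newline="") as f:   # SSS-VVV.csv for study 001, video 001
          for row in csv.DictReader(f):
              counts[row["label"]] += 1            # "label" is an assumed column name

      # Expected labels: outside, insertion, and the seven withdrawal segment
      # classes (cecum, ileum, ascending/transverse/descending colon, sigmoid, rectum).
      print(counts)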

  12. Dataset related to article "Deep learning and atlas-based models to...

    • zenodo.org
    • data.niaid.nih.gov
    Updated Oct 6, 2023
    Cite
    Damiano Dei; Nicola Lambri; Leonardo Crespi; Ricardo Coimbra Brioso; Daniele Loiacono; Elena Clerici; Luisa Bellu; Chiara De Philippis; Pierina Navarria; Stefania Bramanti; Carmelo Carlo-Stella; Roberto Rusconi; Giacomo Reggiori; Stefano Tomatis; Marta Scorsetti; Pietro Mancosu (2023). Dataset related to article "Deep learning and atlas-based models to streamline the segmentation workflow of Total Marrow and Lymphoid Irradiation" [Dataset]. http://doi.org/10.5281/zenodo.8411234
    Explore at:
    Dataset updated
    Oct 6, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Damiano Dei; Nicola Lambri; Leonardo Crespi; Ricardo Coimbra Brioso; Daniele Loiacono; Elena Clerici; Luisa Bellu; Chiara De Philippis; Pierina Navarria; Stefania Bramanti; Carmelo Carlo-Stella; Roberto Rusconi; Giacomo Reggiori; Stefano Tomatis; Marta Scorsetti; Pietro Mancosu
    Description

    This record contains raw data related to the article "Deep learning and atlas-based models to streamline the segmentation workflow of Total Marrow and Lymphoid Irradiation".

    Abstract:

    Purpose: To improve the workflow of Total Marrow and Lymphoid Irradiation (TMLI) by enhancing the delineation of organs-at-risk (OARs) and clinical target volume (CTV) using deep learning (DL) and atlas-based (AB) segmentation models.

    Materials and Methods: Ninety-five TMLI plans optimized in our institute were analyzed. Two commercial DL software packages were tested for segmenting 18 OARs. An AB model for lymph node CTV (CTV_LN) delineation was built using 20 TMLI patients. The AB model was evaluated on 20 independent patients, and a semi-automatic approach was tested by correcting the automatic contours. The generated OAR and CTV_LN contours were compared to manual contours in terms of topological agreement, dose statistics, and time workload. A clinical decision tree was developed to define a specific contouring strategy for each OAR.

    Results: The two DL models achieved a median Dice Similarity Coefficient (DSC) of 0.84 [0.73;0.92] and 0.84 [0.77;0.93] across the OARs. The absolute median dose (Dmedian) difference between the manual contours and the two DL models was 2% [1%;5%] and 1% [0.2%;1%]. The AB model achieved a median DSC of 0.70 [0.66;0.74] for CTV_LN delineation, increasing to 0.94 [0.94;0.95] after manual revision, with minimal Dmedian differences. Since September 2022, our institution has implemented DL and AB models for all TMLI patients, reducing the time required to complete the entire segmentation process from 5 to 2 hours.

    Conclusion: DL models can streamline the TMLI contouring process of OARs. Manual revision is still necessary for lymph node delineation using AB models.
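
    As a reminder of the metric, the DSC reported above measures overlap between two binary masks. A minimal sketch, not part of this record:

      import numpy as np

      def dice(a, b):
          """Dice Similarity Coefficient between two binary masks."""
          a, b = a.astype(bool), b.astype(bool)
          denom = a.sum() + b.sum()
          return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0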

    Statements & Declarations

    Funding: This work was funded by the Italian Ministry of Health, grant AuToMI (GR-2019-12370739).

    Competing Interests: The authors have no conflict of interests to disclose.

    Author Contributions: All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by D.D., N.L., L.C., R.C.B., D.L., and P.M. The first draft of the manuscript was written by D.D. and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

    Ethics approval: The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Ethics Committee of IRCCS Humanitas Research Hospital (ID 2928, 26 January 2021). ClinicalTrials.gov identifier: NCT04976205.

    Consent to participate: Informed consent was obtained from all individual participants included in the study.

  13. Representative Sample Dataset for Resolution-Agnostic Tissue Segmentation in...

    • zenodo.org
    • data.niaid.nih.gov
    tiff, xml
    Updated Jul 22, 2024
    Cite
    Péter Bándi (2024). Representative Sample Dataset for Resolution-Agnostic Tissue Segmentation in Whole-Slide Histopathology Images [Dataset]. http://doi.org/10.5281/zenodo.3375528
    Explore at:
    tiff, xml
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Péter Bándi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a representative sample from the dataset that was used to develop resolution-agnostic convolutional neural networks for tissue segmentation in whole-slide histopathology images.

    The dataset is composed of two parts: a development set and a dissimilar set.

    Sample images from the development set:

    • breast_hne_00.tif
    • breast_lymph_node_hne_00.tif
    • tongue_ae1ae3_00.tif
    • tongue_hne_00.tif
    • tongue_ki67_00.tif

    Sample images from the dissimilar set:

    • brain_alcianblue_00.tif
    • cornea_grocott_00.tif
    • kidney_cab_00.tif
    • skin_perls_00.tif
    • uterus_vonkossa_00.tif
  14. Full_img_leaf Segmentation Dataset

    • universe.roboflow.com
    zip
    Updated May 12, 2023
    Cite
    National university of sciences and technology (2023). Full_img_leaf Segmentation Dataset [Dataset]. https://universe.roboflow.com/national-university-of-sciences-and-technology-w7v9e/full_img_leaf-segmentation
    Explore at:
    zip
    Dataset updated
    May 12, 2023
    Dataset authored and provided by
    National university of sciences and technology
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Wheat Leaf Masks
    Description

    Full_img_leaf Segmentation

    ## Overview
    
    Full_img_leaf Segmentation is a dataset for semantic segmentation tasks - it contains Wheat Leaf annotations for 537 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  15. Fraunhofer EZRT XXL-CT Instance Segmentation Me163

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jun 25, 2024
    + more versions
    Cite
    Gruber Roland; Nils Reims; Andreas Hempfer; Stefan Gerth; Michael Böhnel; Theobald Fuchs; Michael Salamon; Thomas Wittenberg (2024). Fraunhofer EZRT XXL-CT Instance Segmentation Me163 [Dataset]. http://doi.org/10.5281/zenodo.10651746
    Explore at:
    zip
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gruber Roland; Nils Reims; Andreas Hempfer; Stefan Gerth; Michael Böhnel; Theobald Fuchs; Michael Salamon; Thomas Wittenberg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The 'Me 163' was a Second World War fighter airplane and a result of the German air force's secret development programs. One of these airplanes is currently owned and displayed in the historic aircraft exhibition of the 'Deutsches Museum' in Munich, Germany. To gain insights into its history, design, and state of preservation, a complete CT scan was obtained using an industrial XXL computed tomography scanner.

    Using the CT data from the Me 163, all of its details can be visually examined at various levels, ranging from the complete hull down to single sprockets and rivets. While a trained human observer can identify and interpret the volumetric data with all its parts and connections, a virtual dissection of the airplane into all its different parts would be quite desirable. This, however, requires an instance segmentation of all components and objects of interest into disjoint entities from the CT data.

    Since no adequate computer-assisted tools for automated or semi-automated segmentation of such XXL airplane data are currently available, an interactive data annotation and object labeling process has been established as a first step. So far, seven sub-volumes of the Me 163 airplane have been annotated and labeled; the results can potentially be used for various new applications in the fields of digital heritage, non-destructive testing, and machine learning. These annotated and labeled data sets are available here.

  16. Shifts Multiple Sclerosis Lesion Segmentation Dataset Part 1

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 29, 2022
    + more versions
    Cite
    Molchanova, Nataliia (2022). Shifts Multiple Sclerosis Lesion Segmentation Dataset Part 1 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7051657
    Explore at:
    Dataset updated
    Nov 29, 2022
    Dataset provided by
    Raina, Vatsal
    Bach Cuadra, Meritxell
    La Rosa, Francesco
    Tsompopoulou, Efi
    Athanasopoulos, Andreas
    Molchanova, Nataliia
    Granziera, Cristina
    Malinin, Andrey
    Graziani, Mara
    Tsarsitalidis, Vasileios
    Barakovic, Muhamed
    Volf, Elena
    Sivena, Eli
    Nikitakis, Antonis
    Kartashev, Nikolay
    Gales, Mark
    Kyriakopoulos, Konstantinos
    Lu, Po-Jui
    Description

    This archive contains part 1 of the Shifts Benchmark on Multiple Sclerosis lesion segmentation data. This dataset is provided by the Shifts Project to enable assessment of the robustness of models to distributional shift and of the quality of their uncertainty estimates. This part is the MSSEG data collected in the digital repository of the OFSEP Cohort, provided in the context of the MICCAI 2016 and 2021 challenges. A full description of the benchmark is available at https://arxiv.org/pdf/2206.15407. Part 2 of the data is available here. To find out more about the Shifts Project, please visit https://shifts.ai.

  17. Medical Segmentation Decathlon Dataset

    • paperswithcode.com
    • opendatalab.com
    Cite
    Amber L. Simpson; Michela Antonelli; Spyridon Bakas; Michel Bilello; Keyvan Farahani; Bram van Ginneken; Annette Kopp-Schneider; Bennett A. Landman; Geert Litjens; Bjoern Menze; Olaf Ronneberger; Ronald M. Summers; Patrick Bilic; Patrick F. Christ; Richard K. G. Do; Marc Gollub; Jennifer Golia-Pernicka; Stephan H. Heckers; William R. Jarnagin; Maureen K. McHugo; Sandy Napel; Eugene Vorontsov; Lena Maier-Hein; M. Jorge Cardoso, Medical Segmentation Decathlon Dataset [Dataset]. https://paperswithcode.com/dataset/medical-segmentation-decathlon
    Explore at:
    Authors
    Amber L. Simpson; Michela Antonelli; Spyridon Bakas; Michel Bilello; Keyvan Farahani; Bram van Ginneken; Annette Kopp-Schneider; Bennett A. Landman; Geert Litjens; Bjoern Menze; Olaf Ronneberger; Ronald M. Summers; Patrick Bilic; Patrick F. Christ; Richard K. G. Do; Marc Gollub; Jennifer Golia-Pernicka; Stephan H. Heckers; William R. Jarnagin; Maureen K. McHugo; Sandy Napel; Eugene Vorontsov; Lena Maier-Hein; M. Jorge Cardoso
    Description

    The Medical Segmentation Decathlon is a collection of medical image segmentation datasets. It contains a total of 2,633 three-dimensional images collected across multiple anatomies of interest, multiple modalities and multiple sources. Specifically, it contains data for the following body organs or parts: Brain, Heart, Liver, Hippocampus, Prostate, Lung, Pancreas, Hepatic Vessel, Spleen and Colon.

  18. bald-people-segmentation-dataset

    • huggingface.co
    Updated Aug 4, 2023
    + more versions
    Cite
    Training Data (2023). bald-people-segmentation-dataset [Dataset]. https://huggingface.co/datasets/TrainingDataPro/bald-people-segmentation-dataset
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets; learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 4, 2023
    Authors
    Training Data
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Bald People Segmentation & Object Detection dataset

    The balding dataset consists of images of bald people and corresponding segmentation masks. The segmentation masks highlight the regions of each image that delineate the bald scalp. By using these segmentation masks, researchers and practitioners can focus only on the areas of interest, and the dataset can also be used to study androgenetic alopecia. The alopecia dataset is designed to be accessible and easy to use, providing… See the full description on the dataset page: https://huggingface.co/datasets/TrainingDataPro/bald-people-segmentation-dataset.

  19. Generated Metastatic CT 3D Femurs with lesions segmentation 1/3

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 25, 2024
    + more versions
    Cite
    Follet, Hélène (2024). Generated Metastatic CT 3D Femurs with lesions segmentation 1/3 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13771567
    Explore at:
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    Saillard, Emile
    Follet, Hélène
    Grenier, Thomas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file groups together the 5675 synthetic femurs used and described in the paper "Enhanced segmentation of femoral bone metastasis in CT scans of patients using synthetic data generation with 3D diffusion models" by Saillard et al.

    Three repositories are needed to form the full archive. Once all are downloaded, they can be unpacked with the following bash command:

    cat Generated_3DCT_Metastatic_Femurs.tar.gz* | tar faxv -

    This is the first part (1/3)

    part 2/3: 10.5281/zenodo.13824177

    part 3/3: 10.5281/zenodo.13824179

    The generated CT scans are organized in two directories: one for files generated with DDPM (this one) and one for files generated without DDPM.

    ├── Generated_3DCT_Metastatic_Femurs_with_DDPM
    │   ├── img
    │   ├── lbl
    │   └── msk
    └── Generated_3DCT_Metastatic_Femurs_NO_DDPM
        ├── img
        ├── lbl
        └── msk

    The content of each directory is as follows:

    img: the generated 3D CT scans (nii.gz format) from the 26 healthy femurs

    lbl: the lesion segmentations (nii.gz format) of the generated 3D CT scans

    msk: the segmentations of the 26 healthy femurs (nii.gz)

    A given filename corresponds across the img and lbl directories:

    ./lbl/MEK03les0MEK14.nii.gz is the lesions segmentation of ./img/MEK03les0MEK14.nii.gz.

    ./lbl/MEK03les197MEK28.nii.gz is the lesions segmentation of ./img/MEK03les197MEK28.nii.gz

    In msk, however, the corresponding femur mask for both of these examples is ./msk/MEK03.nii.gz.
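
    A minimal sketch of this naming convention, assuming the healthy-femur identifier is always the prefix before "les" in the generated filename:

      import re

      def mask_for(img_name):
          """Map a generated scan filename (img/ or lbl/) to its femur mask in msk/."""
          m = re.match(r"(.+?)les", img_name)
          if m is None:
              raise ValueError(f"unexpected filename: {img_name}")
          return m.group(1) + ".nii.gz"

      assert mask_for("MEK03les0MEK14.nii.gz") == "MEK03.nii.gz"
      assert mask_for("MEK03les197MEK28.nii.gz") == "MEK03.nii.gz"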

  20. Prithvi - Burn Scars Segmentation

    • hub.arcgis.com
    • arc-gis-hub-home-arcgishub.hub.arcgis.com
    Updated Jan 3, 2024
    Cite
    Esri (2024). Prithvi - Burn Scars Segmentation [Dataset]. https://hub.arcgis.com/content/9af7af28dd91473bbc8ad40942e74563
    Explore at:
    Dataset updated
    Jan 3, 2024
    Dataset authored and provided by
    Esri (http://esri.com/)
    Area covered
    Earth
    Description

    One of the key challenges in monitoring wildfires lies in distinguishing burn scars from non-burnt areas and assessing the extent of damage. This differentiation is crucial to assist emergency responders in their decision-making. Satellite imagery enriched with high temporal and spectral information, coupled with advancements in machine learning methods, presents an avenue for automated monitoring and management of post-wildfire landscapes on a large scale. The burn scar deep learning model can emerge as an indispensable tool for accurately identifying and mapping the aftermath of wildfires from satellite imagery. The Prithvi-100M-burn-scar model has been developed by NASA and IBM by fine-tuning their foundation model for earth observation, Prithvi-100m, using the HLS Burn Scar Scenes dataset. Use this model to automate the process of identifying burn scars in multispectral satellite imagery.

    Using the model: Follow the guide to use the model. Before using this model, ensure that the supported deep learning libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.

    Fine-tuning the model: This model can be fine-tuned using the ArcGIS API for Python. Follow the guide to fine-tune this model.

    Input: Raster, mosaic dataset, or image service. The input must be a 6-band composite raster derived from either Harmonized Landsat 8 (HLSL30) or Harmonized Sentinel 2 (HLSS30). The model can also be used with level-2 products of Sentinel-2 and Landsat-8, but it performs most effectively with HLSL30 and HLSS30. The composite raster should contain the following 6 bands: Blue, Green, Red, Narrow NIR, SWIR, SWIR 2. The band numbers for these bands are: for HLSS30 and Sentinel-2, Band2, Band3, Band4, Band8A, Band11, Band12; for HLSL30 and Landsat 8, Band2, Band3, Band4, Band5, Band6, Band7.

    Output: Classified raster with 2 classes (no burn and burn scar).

    Applicable geographies: This model is expected to work well across the globe.

    Model architecture: This model packages IBM and NASA's Prithvi-100M-burn-scar and uses a self-supervised encoder developed with a ViT architecture and a Masked AutoEncoder (MAE) learning strategy.

    Accuracy metrics: This model has an IoU of 0.73 on the burn scar class and 96.00 percent overall accuracy.

    Training data: This model fine-tunes the pretrained Prithvi-100m model to segment the extent of burned area in multispectral satellite images from the HLS Burn Scar Scenes dataset.

    Limitations: This model may not give satisfactory results in areas near water bodies.

    Sample results: Here are a few results from the model.

    Citations: Phillips, C., Roy, S., Ankur, K., & Ramachandran, R. (2023). HLS Foundation Burnscars Dataset.
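
    For quick reference, the band selection above can be written as a small lookup table; this is just a restatement of the listing, not part of the model package:

      # 6-band composite order: Blue, Green, Red, Narrow NIR, SWIR, SWIR 2
      COMPOSITE_BANDS = {
          "HLSS30 / Sentinel-2": ["Band2", "Band3", "Band4", "Band8A", "Band11", "Band12"],
          "HLSL30 / Landsat 8":  ["Band2", "Band3", "Band4", "Band5", "Band6", "Band7"],
      }

      for product, bands in COMPOSITE_BANDS.items():
          print(f"{product}: {', '.join(bands)}")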
