Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
This dataset includes CT data and segmentation masks from patients diagnosed with COVID-19, as well as data from subjects without the infection.
This study is approved under the ethical approval codes of IR.TUMS.IKHC.REC.1399.255 and IR.TUMS.VCR.REC.1399.488 at Tehran University of Medical Sciences.
The code for loading the dataset and running an AI model is available on: https://github.com/SamanSotoudeh/COVID19-segmentation
Please use the following citations:
1- Arian, Arvin; Mehrabinejad, Mohammad-Mehdi; Zoorpaikar, Mostafa; Hasanzadeh, Navid; Sotoudeh-Paima, Saman; Kolahi, Shahriar; Gity, Masoumeh; Soltanian-Zadeh. "Accuracy of Artificial Intelligence CT Quantification in Predicting COVID-19 Subjects’ Prognosis." PLoS ONE (2023).
2- Sotoudeh-Paima, Saman, et al. "A Multi-centric Evaluation of Deep Learning Models for Segmentation of COVID-19 Lung Lesions on Chest CT Scans." Iranian Journal of Radiology 19.4 (2022).
3- Hasanzadeh, Navid, et al. "Segmentation of COVID-19 Infections on CT: Comparison of four UNet-based networks." 2020 27th National and 5th International Iranian Conference on Biomedical Engineering (ICBME). IEEE, 2020.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Common Objects Segmentation Dataset serves the e-commerce and visual entertainment industries with a large collection of internet-sourced images, at resolutions ranging from 800 × 600 to 4160 × 3120. The dataset covers a wide variety of everyday scenes and objects, including people, animals, and furniture, annotated for segmentation.
KSSD2025 – CT Kidney Stone Segmentation Dataset
A High-Quality Annotated Dataset for Deep Learning-Based Kidney Stone Segmentation
📌 Overview
KSSD2025 is a dataset of axial CT images with expert-annotated kidney stone segmentation masks, created to support deep learning research in medical image segmentation. It is derived from the public dataset by Islam et al. (2022), which contains CT images with different kidney conditions. KSSD2025 focuses exclusively on kidney stone cases, offering precise ground-truth masks for developing and benchmarking AI-based segmentation models.
🎈 Description
This dataset presents a carefully refined subset of the original "CT Kidney Dataset: Normal-Cyst-Tumor and Stone" by Islam et al., comprising only axial CT images that exhibit kidney stones. Out of 12,446 images in the original collection, 838 images were selected for manual annotation based on the presence of stones and the axial orientation, which offers better anatomical context for segmentation tasks.
To ensure high-quality ground-truth segmentation, a three-step preprocessing pipeline was applied:
1) Thresholding: Pixel intensity thresholding at 150 was used to isolate high-density structures, which often correspond to kidney stones.
2) Connected Component Filtering: Regions larger than 300 pixels were discarded to remove bones and other non-stone structures.
3) Manual Refinement: Remaining artifacts were removed and stone regions refined in collaboration with specialists in urology and radiology.
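For illustration, the two automatic steps can be sketched with scikit-image as follows; this is our own sketch (assuming 8-bit grayscale slices), not part of the dataset's tooling:

```python
import numpy as np
from skimage import io, measure

def candidate_stone_mask(image_path, intensity_thresh=150, max_region_px=300):
    """Steps 1-2: keep high-density pixels, then drop connected components
    larger than max_region_px (bones and other non-stone structures)."""
    img = io.imread(image_path)          # assumes an 8-bit grayscale slice
    binary = img > intensity_thresh      # step 1: intensity thresholding
    labels = measure.label(binary, connectivity=2)
    mask = np.zeros_like(binary)
    for region in measure.regionprops(labels):
        if region.area <= max_region_px:  # step 2: size filtering
            mask[labels == region.label] = True
    return mask  # step 3 (manual refinement) is done by the experts
```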
Each image in the dataset is paired with a binary mask that precisely delineates kidney stone regions, making it ideal for training and evaluating deep learning models in tasks like medical image segmentation and object detection.
📊 Dataset Details
- Total Annotated Images: 838
- View: Axial
- Annotations: Binary segmentation masks (kidney stone regions)
- Image Format: TIF
- Size: 305.38 MB
- Source Dataset: CT KIDNEY DATASET: Normal-Cyst-Tumor and Stone
- Annotation Method: Semi-automatic (thresholding + connected components) followed by expert manual refinement
🔍 Use Cases
✔️ Deep Learning-Based Kidney Stone Segmentation
✔️ AI-Powered Medical Imaging Tools
✔️ Benchmarking Medical Image Segmentation Models
✔️ Educational Applications in Radiology and Urology
🔬 Research Potential
KSSD2025 addresses the scarcity of annotated kidney stone segmentation datasets. By offering pixel-level annotations, it opens new opportunities for developing robust segmentation models and AI-assisted diagnostic systems in urology.
⚖️ License
Datafiles © Nazmul Islam
🏫 Institutions Involved
📢 Citation
If you use this dataset in your research, please cite:
Islam MN, Hasan M, Hossain M, Alam M, Rabiul G, Uddin MZ, Soylu A. Vision transformer and explainable transfer learning models for auto detection of kidney cyst, stone and tumor from CT-radiography. Scientific Reports. 2022.
M. F. Bouzon et al., "KSSD2025: A New Annotated Dataset for Automatic Kidney Stone Segmentation and Evaluation with Modified U-Net Based Deep Learning Models," in IEEE Access, doi: 10.1109/ACCESS.2025.3610027
🙏 If you find this dataset helpful, please give it an upvote and share your feedback. Thank you! 😊
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Normal Multi Aperio Semantic is a dataset for semantic segmentation tasks - it contains Mitotic Cells annotations for 865 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The BUS-UCLM dataset is a collection of breast ultrasound images from 38 patients, specifically designed for lesion segmentation research. It comprises a total of 683 images categorized into benign (174), malignant (90), and normal (419) cases. The images were acquired using a Siemens ACUSON S2000TM Ultrasound System between 2022 and 2023. Ground truth segmentation masks are provided as separate RGB files.
The dataset contains:
The ground truth for lesion segmentation is provided in separate files as RGB images. The color coding is as follows:
These annotations were created by expert radiologists, ensuring high-quality ground truth for training and evaluation.
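Since the masks are provided as RGB files, a common first step is converting each mask to an integer label map by matching colors; the mapping below is a hypothetical placeholder for the dataset's actual color coding:

```python
import numpy as np
from PIL import Image

# Hypothetical color-to-label mapping; substitute the dataset's actual coding.
COLOR_TO_LABEL = {
    (0, 255, 0): 1,   # e.g., benign lesion
    (255, 0, 0): 2,   # e.g., malignant lesion
}

def rgb_mask_to_labels(mask_path):
    """Convert an RGB ground-truth mask into an integer label map (0 = background)."""
    rgb = np.asarray(Image.open(mask_path).convert("RGB"))
    labels = np.zeros(rgb.shape[:2], dtype=np.uint8)
    for color, label in COLOR_TO_LABEL.items():
        labels[np.all(rgb == color, axis=-1)] = label
    return labels
```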
This dataset is a valuable resource for research in:
Please cite this dataset as follows:
Vallez, Noelia; Bueno, Gloria; Deniz, Oscar; Rienda, Miguel Angel; Pastor, Carlos (2024), “BUS-UCLM: Breast ultrasound lesion segmentation dataset”, Mendeley Data, V1, doi: 10.17632/7fvgj4jsp7.1
Breast Cancer, Image Segmentation, Object Detection, Ultrasound, Breast Ultrasonography, Instance Segmentation
This research was supported by:
This dataset is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0). This license allows for the sharing and adaptation of the material for any purpose, even commercially, as long as appropriate credit is given to the authors.
Please consider upvoting this dataset if you find it useful! 👍
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Kidney Normal is a dataset for instance segmentation tasks - it contains Kidney annotations for 1,025 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
One of the largest (colorectal) epithelium segmentation datasets to date, featuring H&E and IHC images with pathologist-annotated segmentations (the full dataset can be downloaded at https://dataverse.no/dataset.xhtml?persistentId=doi:10.18710/DIGQGQ).
With over 500,000 epithelium annotations, it represents a significant advance in scale:
The collection includes Tissue Microarray (TMA) cores from 100 patients, featuring both normal colorectal mucosa and cancer tissue. Each patient has:
All images are approximately 10,000 x 10,000 pixels at 40X magnification. Each marker includes matched pairs of JPG images and PNG segmentation masks:
H&E stained cores (552 image/mask pairs)
Immunohistochemistry for 13 proteins:
ISH stains:
Total: 13,179 image/mask pairs (26,358 files)
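At roughly 10,000 × 10,000 pixels per core, image/mask pairs are usually tiled into patches before model training; here is a minimal sketch (filenames are hypothetical):

```python
import numpy as np
from PIL import Image

Image.MAX_IMAGE_PIXELS = None  # permit ~10,000 x 10,000 px inputs

def tile_pair(image_path, mask_path, tile=512, stride=512):
    """Yield aligned (image, mask) patches from one JPG/PNG pair."""
    img = np.asarray(Image.open(image_path).convert("RGB"))
    msk = np.asarray(Image.open(mask_path))
    h, w = msk.shape[:2]
    for y in range(0, h - tile + 1, stride):
        for x in range(0, w - tile + 1, stride):
            yield img[y:y + tile, x:x + tile], msk[y:y + tile, x:x + tile]

# e.g.: for im, m in tile_pair("core_001.jpg", "core_001.png"): ...
```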
Ideal for AI training in:
Dataset use must adhere to the CC0 licence, and the dataset should be cited as: Pettersen, Henrik Sahlin; Wiik, Erik Nesje, 2025, "The Colorectal_Cancer_IHC_CISH_HE_Epithelium_Segmentation dataset", https://doi.org/10.18710/DIGQGQ, DataverseNO, V1
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Urban sewer pipelines, as critical guarantors of urban resilience and sustainable development, undertake the tasks of sewage disposal and flood prevention. However, in many countries most municipal sewer systems have been in service for 60 to 100 years, with a poor condition rating (D+) as evaluated by ASCE.
As laser scanning is fast becoming the state-of-the-art inspection technique for underground sewers, semantic segmentation of pipeline point clouds is an essential intermediate step for pipeline condition assessment and digital twinning. Currently, as with other built structures, the scarcity of real-world point clouds has hindered the application of deep learning techniques to automated sewer pipeline semantic segmentation.
We provide a high-quality, realistic, semantically rich public dataset named "**Sewer3D Semantic Segmentation**" (S3DSS), comprising 800 synthetic scans and 500 real-world scans, for point cloud semantic segmentation in the sewer pipeline domain, for which no public datasets previously existed. S3DSS contains over 917 million points covering 8 categories of common sewer defects. We hope it can serve as a starting point for benchmarking new approaches and promote deep learning research on point clouds of sewer pipeline defects.
The two sub-datasets were obtained in the following way.
The real point cloud data were captured in laboratory scenarios using a FARO Focus S laser scanner. We used two prototype reinforced concrete sewer pipes to create most of the defect scenes; for misalignment and displacement defects, which are difficult to reproduce with concrete pipes, we used two purpose-built steel pipes. A total of 500 real scans were collected.
The synthetic point cloud data were obtained by our automated synthetic data generator in Unity3D. The introduction to the synthetic point cloud data generation methodology can be found in our paper. We generated 800 scans of sewer defect scenes. If you need more data, please contact Minghao Li (liminghao@dlut.edu.cn). S3DSS uses 8 common defect classes, which include:
This work was supported by the National Key R & D Program of China (Grant No. 2022YFC3801000) and the National Natural Science Foundation of China (Grant No. 52479118). We also thank Haurum et al. for sharing their great work "Sewer Defect Classification using Synthetic Point Clouds" as a reference for this work.
M. Li, X. Feng, Z. Wu, J. Bai, F. Yang, Game engine-driven synthetic point cloud generation method for LiDAR-based defect detection in sewers, Tunnelling and Underground Space Technology 163 (2025) 106755. https://doi.org/10.1016/j.tust.2025.106755.
Z. Wu, M. Li, Y. Han, X. Feng, Semantic segmentation of 3D point cloud for sewer defect detection using an integrated global and local deep learning network, Measurement 253 (2025) 117434. https://doi.org/10.1016/j.measurement.2025.117434.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For more details and the most up-to-date information please consult our project page: https://kainmueller-lab.github.io/fisbe.
Instance segmentation of neurons in volumetric light microscopy images of nervous systems enables groundbreaking research in neuroscience by facilitating joint functional and morphological analyses of neural circuits at cellular resolution. Yet said multi-neuron light microscopy data exhibits extremely challenging properties for the task of instance segmentation: Individual neurons have long-ranging, thin filamentous and widely branching morphologies, multiple neurons are tightly inter-weaved, and partial volume effects, uneven illumination and noise inherent to light microscopy severely impede local disentangling as well as long-range tracing of individual neurons. These properties reflect a current key challenge in machine learning research, namely to effectively capture long-range dependencies in the data. While respective methodological research is buzzing, to date methods are typically benchmarked on synthetic datasets. To address this gap, we release the FlyLight Instance Segmentation Benchmark (FISBe) dataset, the first publicly available multi-neuron light microscopy dataset with pixel-wise annotations. In addition, we define a set of instance segmentation metrics for benchmarking that we designed to be meaningful with regard to downstream analyses. Lastly, we provide three baselines to kick off a competition that we envision to both advance the field of machine learning regarding methodology for capturing long-range data dependencies, and facilitate scientific discovery in basic neuroscience.
We provide a detailed documentation of our dataset, following the Datasheet for Datasets questionnaire:
Our dataset originates from the FlyLight project, where the authors released a large image collection of nervous systems of ~74,000 flies, available for download under CC BY 4.0 license.
Each sample consists of a single 3D MCFO image of neurons of the fruit fly.
For each image, we provide a pixel-wise instance segmentation for all separable neurons.
Each sample is stored as a separate zarr file (zarr is a file storage format for chunked, compressed, N-dimensional arrays based on an open-source specification).
The image data ("raw") and the segmentation ("gt_instances") are stored as two arrays within a single zarr file.
The segmentation mask for each neuron is stored in a separate channel.
The order of dimensions is CZYX.
We recommend working in a virtual environment, e.g., using conda:
```
conda create -y -n flylight-env -c conda-forge python=3.9
conda activate flylight-env
pip install zarr
```
```python
import zarr
import numpy as np

# open one sample (path placeholder); arrays follow the layout described above
raw = zarr.open("<sample>.zarr", mode='r', path="volumes/raw")
seg = zarr.open("<sample>.zarr", mode='r', path="volumes/gt_instances")

# optional: convert to an in-memory numpy array
raw_np = np.array(raw)
```
Zarr arrays are read lazily on-demand.
Many functions that expect numpy arrays also work with zarr arrays.
Optionally, the arrays can also explicitly be converted to numpy arrays.
We recommend using napari to view the image data.
pip install "napari[all]"
```python
# view_data.py
import zarr, sys, napari

raw = zarr.load(sys.argv[1], mode='r', path="volumes/raw")
gts = zarr.load(sys.argv[1], mode='r', path="volumes/gt_instances")

viewer = napari.Viewer(ndisplay=3)
for idx, gt in enumerate(gts):
    viewer.add_labels(gt, rendering='translucent', blending='additive',
                      name=f'gt_{idx}')
viewer.add_image(raw[0], colormap="red", name='raw_r', blending='additive')
viewer.add_image(raw[1], colormap="green", name='raw_g', blending='additive')
viewer.add_image(raw[2], colormap="blue", name='raw_b', blending='additive')
napari.run()
```

```
python view_data.py <path/to/sample>.zarr
```
For more information on our selected metrics and formal definitions please see our paper.
To showcase the FISBe dataset together with our selection of metrics, we provide evaluation results for three baseline methods, namely PatchPerPix (ppp), Flood Filling Networks (FFN), and a non-learnt, application-specific color clustering from Duan et al.
For detailed information on the methods and the quantitative results please see our paper.
The FlyLight Instance Segmentation Benchmark (FISBe) dataset is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
If you use FISBe in your research, please use the following BibTeX entry:
```bibtex
@misc{mais2024fisbe,
  title         = {FISBe: A real-world benchmark dataset for instance
                   segmentation of long-range thin filamentous structures},
  author        = {Lisa Mais and Peter Hirsch and Claire Managan and Ramya
                   Kandarpa and Josef Lorenz Rumberger and Annika Reinke and
                   Lena Maier-Hein and Gudrun Ihrke and Dagmar Kainmueller},
  year          = 2024,
  eprint        = {2404.00130},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}
```
We thank Aljoscha Nern for providing unpublished MCFO images, as well as Geoffrey W. Meissner and the entire FlyLight Project Team for valuable discussions.
P.H., L.M. and D.K. were supported by the HHMI Janelia Visiting Scientist Program.
This work was co-funded by Helmholtz Imaging.
There have been no changes to the dataset so far.
All future changes will be listed on the changelog page.
If you would like to contribute, have encountered any issues, or have any suggestions, please open an issue for the FISBe dataset in the accompanying GitHub repository.
All contributions are welcome!
Creating a dog segmentation dataset manually typically involves the following steps:
Image selection: Selecting a set of images that include dogs in various poses and backgrounds.
Image labeling: Manually labeling the dogs in each image using a labeling tool, where each dog is segmented and assigned a unique label.
Image annotation: Annotating the labeled images with the corresponding segmentation masks, where the dog region is assigned a value of 1 and the background region is assigned a value of 0.
Dataset splitting: Splitting the annotated dataset into training, validation, and test sets.
Dataset format: Saving the annotated dataset in a format suitable for use in machine learning frameworks such as TensorFlow or PyTorch.
Dataset characteristics: The dataset may have varying image sizes and resolutions, different dog breeds, backgrounds, lighting conditions, and other variations that are typical of natural images.
Dataset size: The size of the dataset can vary, but it should be large enough to provide a sufficient amount of training data for deep learning models.
Dataset availability: The dataset may be made publicly available for research and educational purposes.
Overall, a manually created dog segmentation dataset provides high-quality training data for deep learning models and is essential for developing robust segmentation models.
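To make the format concrete, here is a minimal sketch of a PyTorch dataset over such image/mask pairs; the directory layout and filename conventions are assumptions for illustration:

```python
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class DogSegmentationDataset(Dataset):
    """Pairs each images/<name>.jpg with masks/<name>.png (hypothetical layout)."""

    def __init__(self, image_dir, mask_dir):
        self.image_paths = sorted(Path(image_dir).glob("*.jpg"))
        self.mask_dir = Path(mask_dir)

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img_path = self.image_paths[idx]
        mask_path = self.mask_dir / (img_path.stem + ".png")
        image = np.asarray(Image.open(img_path).convert("RGB"), dtype=np.float32) / 255.0
        # dog pixels = 1, background = 0, as described above
        mask = (np.asarray(Image.open(mask_path)) > 0).astype(np.float32)
        return torch.from_numpy(image).permute(2, 0, 1), torch.from_numpy(mask)
```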
Synthetic image data is generated with 3D game engines, ready to use and fully annotated (bounding box, segmentation, keypoint, depth, normal) without any errors. Synthetic data:
- Solves cold-start problems
- Reduces development time and costs
- Enables more experimentation
- Covers edge cases
- Removes privacy concerns
- Improves existing dataset performance
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
GREEN NIR NORMAL is a dataset for instance segmentation tasks - it contains Bunch AFXa annotations for 1,865 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Skin cancer is one of the most common malignant tumors worldwide, and early detection is crucial for improving its cure rate. In the field of medical imaging, accurate segmentation of lesion areas within skin images is essential for precise diagnosis and effective treatment. Due to the capacity of deep learning models to conduct adaptive feature learning through end-to-end training, they have been widely applied in medical image segmentation tasks. However, challenges such as boundary ambiguity between normal skin and lesion areas, significant variations in the size and shape of lesion areas, and different types of lesions in different samples pose significant obstacles to skin lesion segmentation. Therefore, this study introduces a novel network model called HDS-Net (Hybrid Dynamic Sparse Network), aiming to address the challenges of boundary ambiguity and variations in lesion areas in skin image segmentation. Specifically, the proposed hybrid encoder can effectively extract local feature information and integrate it with global features. Additionally, a dynamic sparse attention mechanism is introduced, mitigating the impact of irrelevant redundancies on segmentation performance by precisely controlling the sparsity ratio. Experimental results on multiple public datasets demonstrate a significant improvement in Dice coefficients, reaching 0.914, 0.857, and 0.898 on the respective datasets.
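For reference, the Dice coefficient reported above measures the overlap between a predicted mask and the ground truth; a standard implementation for binary masks:

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice = 2|A ∩ B| / (|A| + |B|) for binary masks; 1.0 means perfect overlap."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```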
This dataset of CT scans includes over 1,000 studies that highlight various pathologies such as cancer, emphysema, and hydrothorax.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Indoor Objects Segmentation Dataset serves the advertisement, gaming, and visual entertainment sectors, offering high-resolution images ranging from 1024 × 1024 to 3024 × 4032. This dataset includes over 50 types of common indoor objects and architectural elements, such as furniture and room structures, annotated for instance, semantic, and contour segmentation.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Multi-atlas bundle segmentation
This data is made to be used with the following script:
https://github.com/scilus/scilpy/blob/master/scripts/scil_tractogram_segment_with_bundleseg.py
Or the following Nextflow pipeline:
https://github.com/scilus/rbx_flow
Etienne St-Onge, Kurt Schilling, Francois Rheault, "BundleSeg: A versatile, reliable and reproducible approach to white matter bundle segmentation.", arXiv, 2308.10958 (2023)
Rheault, François. "Analyse et reconstruction de faisceaux de la matière blanche." Computer Science (Université de Sherbrooke) (2020), https://savoirs.usherbrooke.ca/handle/11143/17255
Usage
Here is an example (for more details use `scil_tractogram_segment_with_bundleseg.py -h`):

```
antsRegistrationSyNQuick.sh -d 3 -f ${T1} -m mni_masked.nii.gz -t a -n 4
scil_tractogram_segment_with_bundleseg.py ${TRACTOGRAM} config_fss_1.json atlas/*/ \
    output0GenericAffine.mat --out_dir ${OUTPUT_DIR}/ --log_level DEBUG \
    --minimal_vote 0.4 --processes 8 --seed 0 --inverse -f
```
To facilitate interpretation, all endpoints were uniformized head/tail. To see which side of a bundle is the head or the tail, you can load the atlas bundle into the MI-Brain software.
Notes on bundles
- AC and PC were added mostly in case the atlas is used for lesion mapping or figures. Segmentation will likely not produce good results, mostly because tracking is difficult for these bundles.
- The CC is split by lobe. However, for technical considerations, the frontal portion was split in two to facilitate clustering and segmentation. For the same reason, the portion fanning to the pre/post central gyri was separated.
- The streamlines present in the CC are homotopic; RecoBundles will allow for variation and thus lead to 'some' heterotopy. However, the results are expected to be mostly homotopic.
- CG has 3 possible endpoint locations. However, the full extent of the tail is difficult to track and is often missing.
- FPT and POPT should terminate in the pons. However, to fully capture candidate streamlines and improve segmentation quality, even streamlines reaching down the brainstem are selected.
- PYT should reach down the brainstem. For reasons similar to the FPT/POPT, streamlines ending in the pons are also selected; otherwise, the fanning is affected and the bundle is too skinny.
- OR_ML will most likely have difficulty capturing the full ML. However, this is often due to difficult tracking.
- The cerebellum is often cut due to acquisition FOV. In such a case, all projection bundles will be more difficult to recognize and most cerebellum bundles will be missing (ICP, MCP, SCP).
See Mosaic of bundles here.
Acronym
AC - Anterior commissure
AF - Arcuate fasciculus
CC_Fr_1 - Corpus callosum, Frontal lobe (most anterior part)
CC_Fr_2 - Corpus callosum, Frontal lobe (most posterior part)
CC_Oc - Corpus callosum, Occipital lobe
CC_Pa - Corpus callosum, Parietal lobe
CC_Pr_Po - Corpus callosum, Pre/Post central gyri
CC_Te - Corpus callosum, Temporal lobe
CG - Cingulum
FAT - Frontal aslant tract
FPT - Fronto-pontine tract
FX - Fornix
ICP - Inferior cerebellar peduncle
IFOF - Inferior fronto-occipital fasciculus
ILF - Inferior longitudinal fasciculus
MCP - Middle cerebellar peduncle
MdLF - Middle longitudinal fascicle
OR_ML - Optic radiation and Meyer's loop
PC - Posterior commissure
POPT - Parieto-occipito-pontine tract
PYT - Pyramidal tract
SCP - Superior cerebellar peduncle
SLF - Superior longitudinal fasciculus
UF - Uncinate fasciculus
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Image Segmentation is a crucial task in computer vision that involves dividing an image into meaningful regions or segments. These segments can correspond to objects, boundaries, or other relevant parts of the image. One common approach for image segmentation is the use of Region of Interest (ROI) techniques.
What Is Image Segmentation?
Region of Interest (ROI) in Image Segmentation:
Skin Classification Using Image Segmentation:
Challenges in Skin Segmentation:
Applications of Skin Segmentation:
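As a simple illustration of the ROI/color-thresholding approach to skin segmentation discussed above, here is a generic OpenCV sketch; the HSV bounds are common heuristics, not values taken from this dataset:

```python
import cv2
import numpy as np

def skin_roi_mask(bgr_image):
    """Rough skin segmentation by thresholding in HSV space (heuristic bounds)."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    lower = np.array([0, 40, 60], dtype=np.uint8)
    upper = np.array([25, 255, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)
    # remove small speckles with a morphological opening
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```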
This dataset provides processed and normalized/standardized indices for the management tool 'Customer Segmentation', including the closely related concept of Market Segmentation. Derived from five distinct raw data sources, these indices are specifically designed for comparative longitudinal analysis, enabling the examination of trends and relationships across different empirical domains (web search, literature, academic publishing, and executive adoption). The data presented here represent transformed versions of the original source data, aimed at achieving metric comparability. Users requiring the unprocessed source data should consult the corresponding Customer Segmentation dataset in the Management Tool Source Data (Raw Extracts) Dataverse.

Data Files and Processing Methodologies:

Google Trends File (Prefix: GT_): Normalized Relative Search Interest (RSI)
- Input Data: Native monthly RSI values from Google Trends (Jan 2004 - Jan 2025) for the query "customer segmentation" + "market segmentation" + "customer segmentation marketing".
- Processing: None. Utilizes the original base-100 normalized Google Trends index.
- Output Metric: Monthly Normalized RSI (Base 100).
- Frequency: Monthly.

Google Books Ngram Viewer File (Prefix: GB_): Normalized Relative Frequency
- Input Data: Annual relative frequency values from Google Books Ngram Viewer (1950-2022, English corpus, no smoothing) for the query Customer Segmentation + Market Segmentation.
- Processing: Annual relative frequency series normalized (peak year = 100).
- Output Metric: Annual Normalized Relative Frequency Index (Base 100).
- Frequency: Annual.

Crossref.org File (Prefix: CR_): Normalized Relative Publication Share Index
- Input Data: Absolute monthly publication counts matching Customer Segmentation-related keywords [("customer segmentation" OR ...) AND (...) - see raw data for full query] in titles/abstracts (1950-2025), alongside total monthly Crossref publications. Deduplicated via DOIs.
- Processing: Monthly relative share calculated (Segmentation Count / Total Count); the monthly relative share series is normalized (peak month's share = 100).
- Output Metric: Monthly Normalized Relative Publication Share Index (Base 100).
- Frequency: Monthly.

Bain & Co. Survey - Usability File (Prefix: BU_): Normalized Usability Index
- Input Data: Original usability percentages (%) from Bain surveys for specific years: Customer Segmentation (1999, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2017). Note: Not reported in 2022 survey data.
- Processing: Original usability percentages normalized relative to the historical peak (Max % = 100).
- Output Metric: Biennial Estimated Normalized Usability Index (Base 100 relative to historical peak).
- Frequency: Biennial (approx.).

Bain & Co. Survey - Satisfaction File (Prefix: BS_): Standardized Satisfaction Index
- Input Data: Original average satisfaction scores (1-5 scale) from Bain surveys for specific years: Customer Segmentation (1999-2017). Note: Not reported in 2022 survey data.
- Processing: Standardization (Z-scores) using Z = (X - 3.0) / 0.891609, followed by the index scale transformation Index = 50 + (Z * 22).
- Output Metric: Biennial Standardized Satisfaction Index (Center = 50, Range ≈ [1, 100]).
- Frequency: Biennial (approx.).

File Naming Convention: Files generally follow the pattern PREFIX_Tool_Processed.csv or similar, where the PREFIX indicates the data source (GT_, GB_, CR_, BU_, BS_).
Consult the parent Dataverse description (Management Tool Comparative Indices) for general context and the methodological disclaimer. For original extraction details (specific keywords, URLs, etc.), refer to the corresponding Customer Segmentation dataset in the Raw Extracts Dataverse. Comprehensive project documentation provides full details on all processing steps.
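For concreteness, the satisfaction transformation described above can be written out directly; the constants 3.0 and 0.891609 are the center and scale stated in the processing notes:

```python
def satisfaction_index(score: float) -> float:
    """Map a 1-5 Bain satisfaction score to the standardized index (center = 50)."""
    z = (score - 3.0) / 0.891609
    return 50 + z * 22

# A neutral score of 3.0 maps to 50.0; a score of 4.0 maps to ~74.7,
# and the 1-5 endpoints land at ~0.65 and ~99.35 (hence Range ≈ [1, 100]).
```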
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
HOWS-CL-25 (Household Objects Within Simulation dataset for Continual Learning) is a synthetic dataset especially designed for object classification on mobile robots operating in a changing environment (like a household), where it is important to learn new, never seen objects on the fly.
This dataset can also be used for other learning use cases, such as instance segmentation or depth estimation, or wherever household objects or continual learning are of interest.
Our dataset contains 150,795 unique synthetic images covering 25 different household categories with 925 3D models in total. For each of those categories, we generated about 6,000 RGB images. In addition, we also provide a corresponding depth, segmentation, and normal image.
The dataset was created with BlenderProc [Denninger et al. (2019)], a procedural pipeline to generate images for deep learning.
This tool creates a virtual room with randomly textured floors and walls and a light source with randomly chosen intensity and color. A 3D model is then placed in the resulting room and customized by randomly assigning materials, including different textures, to achieve a diverse dataset. Moreover, each object might be deformed with a random displacement texture.
We use 774 3D models from the ShapeNet dataset [A. X. Chang et al. (2015)] and the other models from various internet sites. Please note that we had to manually fix and filter most of the models with Blender before using them in the pipeline!
For continual learning (CL), we provide two different loading schemes:
- Five sequences with five categories each
- Twelve sequences with three categories in the first and two in the other sequences.
In addition to the RGB, depth, segmentation, and normal images, we also provide the calculated features of the RGB images (by ResNet50) as used in our RECALL paper.
In those two loading schemes, ten percent of the images are used for validation, where we ensure that an object instance is either in the training or the validation set, not in both. This avoids learning to recognize certain instances by heart.
We recommend using those loading schemes to compare your approach with others.
Here we provide three files for download:
- HOWS_CL_25.zip [124GB]: This is the original dataset with the RGB, depth, segmentation, and normal images, as well as the loading schemes. It is divided into three archive parts. To open the dataset, please ensure to download all three parts.
- HOWS_CL_25_hdf5_features.zip [2.5GB]: This only contains the calculated features from the RGB input by a ResNet50 in a .hdf5 file. Download this if you want to use the dataset for learning and/or want to compare your approach to our RECALL approach (where we used the same features).
- README.md: Some additional explanation.
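A sketch of reading the precomputed ResNet50 features with h5py is shown below; the file and dataset names ("features", "labels") are hypothetical placeholders, so inspect the file layout (or consult the README) first:

```python
import h5py
import numpy as np

with h5py.File("HOWS_CL_25_features.hdf5", "r") as f:
    f.visit(print)                      # print every group/dataset name in the file
    # the keys below are hypothetical; replace them with the names printed above
    feats = np.asarray(f["features"])   # ResNet50 feature vectors
    labels = np.asarray(f["labels"])    # category indices
print(feats.shape, labels.shape)
```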
For further information and code examples, please have a look at our website: https://github.com/DLR-RM/RECALL.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the realm of digital image applications, image processing technology occupies a pivotal position, with image segmentation serving as a foundational component. As the digital image application domain expands across industries, conventional segmentation techniques increasingly struggle to meet modern demands. To address this gap, this paper introduces an MCMC-based image segmentation algorithm built on the Markov Random Field (MRF) model, marking a significant stride in the field. The novelty of this research lies in its method of capitalizing on neighborhood information in pixel space, improving the local precision of image segmentation algorithms. Further innovation is manifested in the development of an adaptive segmentation-based image denoising algorithm using MCMC sampling. This algorithm not only improves image segmentation outcomes but also effectively denoises the image. In the experiments, MRF-MCMC achieves better segmentation performance, with an average segmentation accuracy of 94.26% on the Lena image, significantly superior to other common image segmentation algorithms. In addition, the proposed denoising model outperforms other algorithms in peak signal-to-noise ratio and structural similarity in environments with noise standard deviations of 15, 25, and 50. In essence, these experimental findings affirm the efficacy of this study, opening avenues for refining digital image segmentation methodologies.
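To make the MRF-MCMC idea concrete, the sketch below implements a generic Gibbs sampler for a Potts-style MRF segmentation of a grayscale image; it is a minimal illustration of the general technique, not the paper's exact algorithm:

```python
import numpy as np

def gibbs_potts_segmentation(image, k=3, beta=1.5, iters=20, seed=0):
    """Generic Gibbs sampler for a Potts-MRF segmentation of a 2D grayscale
    image: the data term pulls pixels toward class means, while the prior
    term (weighted by beta) rewards agreement with the 4-neighborhood."""
    rng = np.random.default_rng(seed)
    h, w = image.shape
    labels = rng.integers(0, k, size=(h, w))
    for _ in range(iters):
        # re-estimate class means from the current labeling
        means = np.array([image[labels == c].mean() if np.any(labels == c)
                          else rng.uniform(image.min(), image.max())
                          for c in range(k)])
        for i in range(h):
            for j in range(w):
                nbrs = [labels[x, y]
                        for x, y in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                        if 0 <= x < h and 0 <= y < w]
                # energy per candidate label: Gaussian data term minus prior bonus
                energy = np.array([(image[i, j] - means[c]) ** 2 / 2.0
                                   - beta * sum(n == c for n in nbrs)
                                   for c in range(k)])
                p = np.exp(-(energy - energy.min()))
                labels[i, j] = rng.choice(k, p=p / p.sum())
    return labels
```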