https://cubig.ai/store/terms-of-service
1) Data Introduction • The Random Sample of NIH Chest X-ray Dataset is a sample drawn from a large public medical imaging dataset; the full dataset contains 112,120 chest X-ray images with 15 labels (14 diseases plus normal) collected from 30,805 patients.
2) Data Utilization
(1) Characteristics of the Random Sample of NIH Chest X-ray Dataset:
• Each sample comes with metadata such as image file name, disease label, patient ID, age, gender, view position, and image size. The labels were extracted from the radiology reports with NLP, with an estimated accuracy of more than 90%.
• It contains 5,606 images of size 1024x1024 covering 14 diseases and a 'No Finding' class; because it is a sample, some diseases have very few examples.
(2) The Random Sample of NIH Chest X-ray Dataset can be used for:
• Development of AI for reading chest disease images: using X-ray images with various chest disease labels, deep learning-based automatic diagnosis and classification models can be trained and evaluated.
• Medical image data preprocessing and labeling research: it can be used for medical artificial intelligence research and algorithm development such as automatic labeling of large medical image datasets, data quality evaluation, and weakly supervised learning.
https://choosealicense.com/licenses/unknown/
The NIH Chest X-ray dataset consists of 100,000 de-identified images of chest x-rays. The images are in PNG format.
The data is provided by the NIH Clinical Center and is available through the NIH download site: https://nihcc.app.box.com/v/ChestXray-NIHCC
https://creativecommons.org/publicdomain/zero/1.0/
Chest X-ray exams are one of the most frequent and cost-effective medical imaging examinations available. However, clinical diagnosis of a chest X-ray can be challenging and sometimes more difficult than diagnosis via chest CT imaging. The lack of large publicly available datasets with annotations means it is still very difficult, if not impossible, to achieve clinically relevant computer-aided detection and diagnosis (CAD) in real-world medical sites with chest X-rays. One major hurdle in creating large X-ray image datasets is the lack of resources for labeling so many images. Prior to the release of this dataset, Openi was the largest publicly available source of chest X-ray images with 4,143 images available.
This NIH Chest X-ray Dataset is comprised of 112,120 X-ray images with disease labels from 30,805 unique patients. To create these labels, the authors used Natural Language Processing to text-mine disease classifications from the associated radiological reports. The labels are expected to be >90% accurate and suitable for weakly-supervised learning. The original radiology reports are not publicly available but you can find more details on the labeling process in this Open Access paper: "ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases." (Wang et al.)
The image labels are NLP extracted so there could be some erroneous labels but the NLP labeling accuracy is estimated to be >90%.
Very limited numbers of disease region bounding boxes (See BBoxlist2017.csv)
Chest x-ray radiology reports are not anticipated to be publicly shared. Parties who use this public dataset are encouraged to share their "updated" image labels and/or new bounding boxes in their own studies later, for example through manual annotation.
Image format: 112,120 total images with size 1024 x 1024
images_001.zip: Contains 4,999 images
images_002.zip: Contains 10,000 images
images_003.zip: Contains 10,000 images
images_004.zip: Contains 10,000 images
images_005.zip: Contains 10,000 images
images_006.zip: Contains 10,000 images
images_007.zip: Contains 10,000 images
images_008.zip: Contains 10,000 images
images_009.zip: Contains 10,000 images
images_010.zip: Contains 10,000 images
images_011.zip: Contains 10,000 images
images_012.zip: Contains 7,121 images
README_ChestXray.pdf: Original README file
BBoxlist2017.csv: Bounding box coordinates. Note: Start at x,y, extend horizontally w pixels, and vertically h pixels
Image Index: File name
Finding Label: Disease type (Class label)
Bbox x
Bbox y
Bbox w
Bbox h
Dataentry2017.csv: Class labels and patient data for the entire dataset
Image Index: File name
Finding Labels: Disease type (Class label)
Follow-up #
Patient ID
Patient Age
Patient Gender
View Position: X-ray orientation
OriginalImageWidth
OriginalImageHeight
OriginalImagePixelSpacing_x
OriginalImagePixelSpacing_y
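The bounding-box convention noted for BBoxlist2017.csv (start at x,y, extend horizontally by w pixels and vertically by h pixels) can be turned into corner coordinates as in the sketch below. The class and function names are hypothetical, the example values are made up, and the sketch assumes the origin is the top-left corner of a 1024 x 1024 image, which the listing does not state explicitly.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """One bounding-box row: start corner plus extent, in pixels."""
    x: float
    y: float
    w: float
    h: float


def to_corners(box: BBox) -> tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) for a box given as x, y, w, h."""
    return (box.x, box.y, box.x + box.w, box.y + box.h)


def clip_to_image(corners, width=1024, height=1024):
    """Clamp corner coordinates to the image extent (assumed 1024 x 1024)."""
    x0, y0, x1, y1 = corners
    return (max(0.0, x0), max(0.0, y0), min(float(width), x1), min(float(height), y1))


example = BBox(x=100.0, y=200.0, w=50.0, h=80.0)  # made-up values for illustration
print(to_corners(example))
```

Corner form is what most plotting and detection libraries expect, which is the reason for converting before any cropping or drawing.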
There are 15 classes (14 diseases, and one for "No findings"). Images can be classified as "No findings" or one or more disease classes:
Atelectasis
Consolidation
Infiltration
Pneumothorax
Edema
Emphysema
Fibrosis
Effusion
Pneumonia
Pleural_thickening
Cardiomegaly
Nodule
Mass
Hernia
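Because an image can carry one or more disease classes, the label field is naturally encoded as a multi-hot vector. The sketch below is one way to do that, treating Nodule and Mass as separate classes so that the list has 14 diseases. It assumes that multiple labels in the "Finding Labels" field are joined with a "|" separator and that the negative class is spelled "No Finding"; both should be checked against the CSV before use. The function name is hypothetical.

```python
DISEASES = [
    "Atelectasis", "Consolidation", "Infiltration", "Pneumothorax", "Edema",
    "Emphysema", "Fibrosis", "Effusion", "Pneumonia", "Pleural_thickening",
    "Cardiomegaly", "Nodule", "Mass", "Hernia",
]
NO_FINDING = "No Finding"  # assumed spelling of the negative class


def encode_labels(finding_labels: str, sep: str = "|") -> list[int]:
    """Turn a 'Finding Labels' string into a 14-element multi-hot vector."""
    # Compare case-insensitively so capitalization differences do not matter.
    names = {part.strip().lower() for part in finding_labels.split(sep) if part.strip()}
    if names == {NO_FINDING.lower()}:
        return [0] * len(DISEASES)
    unknown = names - {d.lower() for d in DISEASES}
    if unknown:
        raise ValueError(f"unrecognized labels: {sorted(unknown)}")
    return [1 if d.lower() in names else 0 for d in DISEASES]


vector = encode_labels("Cardiomegaly|Emphysema")
print([d for d, bit in zip(DISEASES, vector) if bit])
```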
There are 12 zip files in total, ranging from ~2 GB to 4 GB in size. Additionally, we randomly sampled 5% of these images and created a smaller dataset for use in Kernels. The random sample contains 5,606 X-ray images and class labels.
Sample: sample.zip
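The per-archive counts and the 5% sample size quoted in this listing can be cross-checked with a few lines of arithmetic. The counts are copied from the listing; the variable names are illustrative only.

```python
# Image counts per archive, copied from the file listing.
ARCHIVE_COUNTS = {"images_001.zip": 4999}
ARCHIVE_COUNTS.update({f"images_{i:03d}.zip": 10000 for i in range(2, 12)})
ARCHIVE_COUNTS["images_012.zip"] = 7121

total_images = sum(ARCHIVE_COUNTS.values())
sample_images = 5606
sample_fraction = sample_images / total_images

print(len(ARCHIVE_COUNTS), total_images, f"{sample_fraction:.2%}")
# prints: 12 112120 5.00%
```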
Original TAR archives were converted to ZIP archives to be compatible with the Kaggle platform
CSV headers slightly modified to be more explicit in comma separation and also to allow fields to be self-explanatory
Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM. ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases. IEEE CVPR 2017, ChestX-ray8Hospital-ScaleChestCVPR2017_paper.pdf
NIH News release: NIH Clinical Center provides one of the largest publicly available chest x-ray datasets to scientific community
Original source files and documents: https://nihcc.app.box.com/v/ChestXray-NIHCC/folder/36938765345
http://bimcv.cipf.es/bimcv-projects/padchest/padchest-dataset-research-use-agreement/
A sample of a labeled large-scale, high resolution chest x-ray dataset for automated exploration of medical images along with their associated reports. This dataset includes more than 160,000 images from 67,000 patients that were interpreted and reported by radiologists at Hospital San Juan (Spain) from 2009 to 2017, covering six different position views and additional information on image acquisition and patient demography - 167MB
This data contains X-ray computed tomography (XCT) reconstructed slices of additively manufactured cobalt chrome samples produced with varying laser powder bed fusion (LPBF) processing parameters (scan speed and hatch spacing). A constant laser power of 195 W and a layer thickness of 20 µm were used. Unoptimized processing parameters created defects in these parts. The as-built CoCr disks were 40 mm in diameter and 10 mm in height, with no post-processing step (e.g. heat treatment or hot isostatic pressing) used. Five mm diameter cylinders were cored out of each disk, and regions of interest (ROIs) within the cylinders were measured with XCT. The voxel size is approximately 2.5 µm, and approximately 1000 x 1000 x 1000 voxel three-dimensional images were obtained, for an actual volume of about (pi/4) x (2.5 mm)^3 in the case of the approximately 2.5 µm voxel data sets. The data set contains two folders ('raw' and 'segmented') with 5 zipped tiff image folders, one for each sample. The images in the 'raw' folder are the original 16-bit XCT reconstructed images. The images in the 'segmented' folder are the segmented images. 'setn' in the file name represents the sample set and 'samplen' represents the sample number. The final trailing -n represents the number of the image in the stack, where a higher number is toward the top of the sample.
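The quoted volume of about (pi/4) x (2.5 mm)^3 follows from a cylinder whose diameter and height both span roughly 1000 voxels of 2.5 µm. The sketch below works through that arithmetic and also estimates the uncompressed size of one raw 16-bit stack. The size estimate is an inference from the stated voxel count and bit depth, not a figure given in the description.

```python
import math

VOXEL_SIZE_UM = 2.5      # approximate voxel size from the description
VOXELS_PER_SIDE = 1000   # approximate image extent along each axis
BYTES_PER_VOXEL = 2      # 16-bit raw reconstructions

# 1000 voxels of 2.5 um span 2.5 mm along each axis.
side_mm = VOXEL_SIZE_UM * VOXELS_PER_SIDE / 1000.0

# Cylinder with diameter = height = side_mm: V = (pi / 4) * d^2 * h.
cylinder_volume_mm3 = math.pi / 4.0 * side_mm ** 3

# Rough uncompressed size of one raw stack, ignoring TIFF headers.
raw_size_gb = VOXELS_PER_SIDE ** 3 * BYTES_PER_VOXEL / 1e9

print(f"{side_mm} mm, {cylinder_volume_mm3:.2f} mm^3, {raw_size_gb:.1f} GB")
```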
X-ray computed tomography (XCT) datasets with known ground truth pores were developed using realistic XCT simulation. Non-overlapping spherical pores of varying sizes are randomly distributed in a cylindrical part near surfaces and within the core. Ground truth data, ground truth binary data, and reconstructed data for three different signal-to-noise ratios (SNRs) are provided. The data set can be used for evaluation and comparison of image segmentation/detection algorithms.
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
The MIMIC Chest X-ray (MIMIC-CXR) Database v2.0.0 is a large publicly available dataset of chest radiographs in DICOM format with free-text radiology reports. The dataset contains 377,110 images corresponding to 227,835 radiographic studies performed at the Beth Israel Deaconess Medical Center in Boston, MA. The dataset is de-identified to satisfy the US Health Insurance Portability and Accountability Act of 1996 (HIPAA) Safe Harbor requirements. Protected health information (PHI) has been removed. The dataset is intended to support a wide body of research in medicine including image understanding, natural language processing, and decision support.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This research attempts to provide the first public dataset of its type for silicosis detection. The dataset contains frontal chest X-rays collected over three years. The data has been collected from the stone workers in the primary health centres of Rajasthan, India. The dataset contains samples of silicosis, STB, TB, and normal. The average size of each sample is around 3567 x 2898. The samples in the dataset are made available as three-channel JPG images. The dataset is divided into two sets:
Set A: It contains images with only disease labels.
Set B: It contains images with lung segmentation maps, annotations, and disease labels.
To obtain access to the dataset, please email the duly filled-out license agreement to databases@iab-rubric.org with the subject line "Licence Agreement for the Silicodata Dataset."
NOTE: The license agreement has to be signed by someone having the legal authority to sign on behalf of the institute, such as the head of the institution or registrar. If a license agreement is signed by someone else, it will not be processed further.
This database is available only for research and educational purposes and not for any commercial use.
Of the more than 1000 QSOs in the Large Bright Quasar Survey (LBQS), we study the X-ray properties of 908 that were covered by the ROSAT All-Sky Survey (RASS). These data constitute among the largest, most homogeneous X-ray surveys of QSOs to date, and as such are well suited to the study of the multiwavelength properties of QSOs. Due to the ~600 s RASS exposure times, only 10% of the QSOs are detected in X-rays. However, by stacking X-ray counts, we obtain effectively much more sensitive observations for an average QSO in bins of redshift or luminosity, and for several classes of QSOs. We confirm a correlation of alpha_ox (slope of a hypothetical power law connecting 2500A and 2keV) with luminosity for the overall sample. For higher redshifts and optical luminosities, radio-loud QSOs appear to become progressively more luminous in X-rays than radio-quiet QSOs. The X-ray properties of a subsample of 36 broad absorption line QSOs suggest that they are strongly absorbed or underluminous in the X-rays, while a subsample of 22 Fe II-strong QSOs is anomalously X-ray bright.
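The quantity alpha_ox described here, the slope of a hypothetical power law connecting 2500 A and 2 keV, can be computed from two monochromatic flux densities as sketched below. The function names are hypothetical and the flux values in the example are made up. The sketch returns the slope a of f_nu proportional to nu^a; some papers quote the negated value as alpha_ox, so the sign convention is an assumption to check against the source paper.

```python
import math

H_EV_S = 4.135667696e-15   # Planck constant in eV s
C_M_S = 2.99792458e8       # speed of light in m/s


def frequency_from_energy_kev(energy_kev: float) -> float:
    """Photon frequency in Hz for a given energy in keV."""
    return energy_kev * 1e3 / H_EV_S


def frequency_from_wavelength_angstrom(wavelength_a: float) -> float:
    """Photon frequency in Hz for a given wavelength in Angstrom."""
    return C_M_S / (wavelength_a * 1e-10)


LOG_NU_RATIO = math.log10(
    frequency_from_energy_kev(2.0) / frequency_from_wavelength_angstrom(2500.0)
)


def spectral_slope(f_2500: float, f_2kev: float) -> float:
    """Slope a of a power law f_nu ~ nu^a through the two flux densities."""
    return math.log10(f_2kev / f_2500) / LOG_NU_RATIO


# Made-up flux densities (same units for both) four decades apart.
print(round(1.0 / LOG_NU_RATIO, 4), round(spectral_slope(1.0, 1e-4), 3))
```

The factor 1 / LOG_NU_RATIO is about 0.384, which is the coefficient commonly seen in front of the log flux ratio in the literature.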
We present photometric and spectroscopic observations of the X-ray sources detected in the wide-area, moderately deep Chandra Large Area Synoptic X-ray Survey (CLASXS) of the Lockman Hole-Northwest field. We have B, V, R, I, and z' photometry for 521 (99%) of the 525 sources in the X-ray catalog and spectroscopic redshifts for 271 (52%), including 20 stars. We do not find evidence for redshift groupings of the X-ray sources, like those found in the Chandra Deep Field surveys, because of the larger solid angle covered by this survey. We separate the X-ray sources by optical spectral type and examine the colors, apparent and absolute magnitudes, and redshift distributions for the broad-line and nonbroad-line active galactic nuclei. Combining our wide-area survey with other Chandra and XMM-Newton hard X-ray surveys, we find a definite lack of luminous, high accretion rate sources at z<1, consistent with previous observations that showed that supermassive black hole growth is dominated at low redshifts by sources with low accretion rates.
Comprehensive dataset of 40 X-ray labs in Hungary as of July, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This upload contains samples 1 - 8 from the data collection described in
Henri Der Sarkissian, Felix Lucka, Maureen van Eijnatten, Giulia Colacicco, Sophia Bethany Coban, Kees Joost Batenburg, "A Cone-Beam X-Ray CT Data Collection Designed for Machine Learning", Sci Data 6, 215 (2019). https://doi.org/10.1038/s41597-019-0235-y or arXiv:1905.04787 (2019)
Abstract:
"Unlike previous works, this open data collection consists of X-ray cone-beam (CB) computed tomography (CT) datasets specifically designed for machine learning applications and high cone-angle artefact reduction: Forty-two walnuts were scanned with a laboratory X-ray setup to provide not only data from a single object but from a class of objects with natural variability. For each walnut, CB projections on three different orbits were acquired to provide CB data with different cone angles as well as being able to compute artefact-free, high-quality ground truth images from the combined data that can be used for supervised learning. We provide the complete image reconstruction pipeline: raw projection data, a description of the scanning geometry, pre-processing and reconstruction scripts using open software, and the reconstructed volumes. Due to this, the dataset can not only be used for high cone-angle artefact reduction but also for algorithm development and evaluation for other tasks, such as image reconstruction from limited or sparse-angle (low-dose) scanning, super resolution, or segmentation."
The scans are performed using a custom-built, highly flexible X-ray CT scanner, the FleX-ray scanner, developed by XRE NV and located in the FleX-ray Lab at the Centrum Wiskunde & Informatica (CWI) in Amsterdam, Netherlands. The general purpose of the FleX-ray Lab is to conduct proof of concept experiments directly accessible to researchers in the field of mathematics and computer science. The scanner consists of a cone-beam microfocus X-ray point source that projects polychromatic X-rays onto a 1536-by-1944 pixel, 14-bit flat panel detector (Dexela 1512NDT) and a rotation stage in between, upon which a sample is mounted. All three components are mounted on translation stages which allow them to move independently from one another.
Please refer to the paper for all further technical details.
The complete data set can be found via the following links: 1-8, 9-16, 17-24, 25-32, 33-37, 38-42
The corresponding Python scripts for loading, pre-processing, and reconstructing the projection data in the way described in the paper can be found on GitHub.
For more information or guidance in using these dataset, please get in touch with
Comprehensive dataset of 7 X-ray labs in State of Acre, Brazil as of July, 2025. Includes verified contact information (email, phone), geocoded addresses, customer ratings, reviews, business categories, and operational details. Perfect for market research, lead generation, competitive analysis, and business intelligence. Download a complimentary sample to evaluate data quality and completeness.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary
This dataset is a collection of X-ray projection images of a walnut imaged in a cone-beam computed tomography (CBCT) scanner, using four different dose levels. The dataset also includes a metadata file for each of the scans, specifying the scan geometry and other important scan parameters.
Description
Sample Information
The sample is a walnut in its shell. For the scanning process double-sided tape was used to attach the sample to a plastic tube placed into the rotation stage.
Scanner
The measurements were acquired using a cone-beam computed tomography scanner designed and constructed in-house in the Industrial Mathematics Computed Tomography Laboratory at the University of Helsinki. The scanner consists of a molybdenum target X-ray tube (Oxford Instruments XTF5011), a motorized rotation stage (Thorlabs CR1-Z7), and a 12-bit, 2240x2368 pixel, energy-integrating flat panel detector (Hamamatsu Photonics C7942CA-22).
Scan Settings
The dataset consists of four different scans of the same sample. For each scan, 360 X-ray projections were acquired using an angle increment of 1 degree, with one additional frame taken at the end to estimate sample movement. The X-ray source was set at 40 kV with a 0.5 mm aluminum filter. For the different scans, the relative doses, tube currents, and exposure times were:
Data Post-Processing
Before the scans, two correction images were acquired for each scan setting. A dark current image was created by averaging 255 images taken with the X-ray source off. A flat-field image was created by averaging 255 images taken with the X-ray source switched on with no sample placed in the scanner. After the scan, dark current and flat-field corrections were applied to each projection image using the Hamamatsu HiPic imaging software version 9.3.
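For readers unfamiliar with the correction step, the sketch below shows the standard dark current and flat-field normalization, (projection - dark) / (flat - dark), with the correction images formed by averaging repeated frames. It is a generic NumPy illustration, not the HiPic implementation, and the projections in this dataset already have the correction applied. Function names and the tiny synthetic arrays are hypothetical.

```python
import numpy as np


def average_frames(frames) -> np.ndarray:
    """Average a stack of repeated frames (e.g. 255 dark or flat images)."""
    return np.mean(np.asarray(frames, dtype=np.float64), axis=0)


def flat_field_correct(projection, dark, flat, eps=1e-6) -> np.ndarray:
    """Return (projection - dark) / (flat - dark), guarding against division by zero."""
    numerator = np.asarray(projection, dtype=np.float64) - dark
    denominator = np.maximum(np.asarray(flat, dtype=np.float64) - dark, eps)
    return numerator / denominator


# Tiny synthetic example: constant images stand in for real detector frames.
dark = average_frames([np.full((2, 3), 10.0)] * 5)
flat = average_frames([np.full((2, 3), 110.0)] * 5)
projection = np.full((2, 3), 60.0)
corrected = flat_field_correct(projection, dark, flat)
print(corrected)
```

The result is a transmission image in the range 0 to 1 for a well-exposed scan; taking the negative logarithm then gives line integrals suitable for reconstruction.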
Data Format
The X-ray projections are stored in .tif format. The metadata are contained in .txt files with formatting that is both human-readable and machine-readable.
Notes
Due to a slightly misaligned center of rotation in the scanner, the CT reconstructions can appear blurry. It was empirically observed that this problem can be compensated for quite well by shifting each projection left by 4 pixels, using circular boundary conditions, before performing any other operations on the projections. It was also observed that the scans are not entirely aligned, with a small angular discrepancy between each reconstruction.
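The compensation described above, shifting each projection left by 4 pixels with circular boundary conditions, maps directly onto a circular roll of the image columns. The sketch below assumes projections are stored as 2D arrays with rows along axis 0 and columns (the horizontal detector direction) along axis 1; the function name is hypothetical.

```python
import numpy as np


def shift_left_circular(projection: np.ndarray, pixels: int = 4) -> np.ndarray:
    """Shift a projection left by `pixels` columns, wrapping around the edge."""
    # A negative shift along axis 1 moves content toward lower column indices.
    return np.roll(projection, -pixels, axis=1)


demo = np.arange(6).reshape(1, 6)
print(shift_left_circular(demo))
# prints: [[4 5 0 1 2 3]]
```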
Research Group
This dataset was produced by the Inverse Problems research group at the Department of Mathematics and Statistics at the University of Helsinki, Finland: https://www.helsinki.fi/en/researchgroups/inverse-problems.
Additional Links
To get started with the data, we recommend looking at the HelTomo toolbox, specifically created for working with CBCT data collected in the Industrial Mathematics Computed Tomography Laboratory, and available at https://se.mathworks.com/matlabcentral/fileexchange/74417-heltomo-helsinki-tomography-toolbox.
Please note that this is an entirely separate dataset from the Walnut datasets accessible at https://zenodo.org/record/1254206 and https://doi.org/10.5281/zenodo.6986012, although both datasets have been created by the same research group.
Contact Details
For more information or guidance in using these datasets, please contact alexander.meaney [at] helsinki.fi.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Samples were analyzed using a handheld portable X-ray fluorescence (pXRF) analyzer as part of a study examining the occurrence of chromium and of natural and anthropogenic hexavalent chromium, Cr(VI), in groundwater. Data will be used to estimate naturally-occurring background Cr(VI) concentrations upgradient, near the plume margins, and downgradient from a mapped Cr(VI) contamination plume near Hinkley, CA (Izbicki and Groover, 2016). Relative concentrations for 18 elements of interest were measured on the less than 2mm and sized- fraction splits of the 36 field samples. Greater than 20 percent of the samples analyzed using pXRF also have Contract Laboratory results for comparison. These pXRF results are part of a data release including grain size distribution, photographic, and associated chemical and mineral analysis data for 36 sediment core and alluvium samples as well as Scanning Electron Microscopy analyses on select grains from magnetic and heavy mineral separates collected ...
Splits of the less than 38 micron size fraction were processed to make oriented clay mounts and analyzed using X-ray diffraction (XRD) as part of a study examining the occurrence of chromium and natural and anthropogenic hexavalent Chromium, Cr(VI) in groundwater. Data will be used to estimate naturally-occurring background Cr(VI) concentrations upgradient, near the plume margins, and downgradient from a mapped Cr(VI) contamination plume near Hinkley, CA (Izbicki and Groover, 2016). These clay mineralogy XRD results are part of a data release including grain size distribution, photographic, and associated chemical and mineral analysis data for 36 sediment core and alluvium samples as well as select grains from magnetic and heavy mineral separates collected near Hinkley, CA. The cooperator for this study is the Lahontan Regional Water Quality Control Board. Results are provided in this data release organized by analysis type.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary
This dataset is a collection of X-ray projection data of a biological imaging phantom (a bird chest) imaged in an X-ray microtomography scanner, using three different X-ray spectra. The dataset also includes a metadata file for each of the scans, specifying the scan geometry and other important scan parameters, as well as photographs and example reconstructions. The dataset is designed for use in algorithm development for multienergy computed tomography.
Description
Sample Information
The sample is the chest of a common quail (Coturnix coturnix) bird obtained frozen from a local supermarket. The chest section of the frozen bird was removed using a handsaw, and left to melt and settle in a sample holder before imaging.
Scanner
The measurement data were acquired using an X-ray microtomography scanner in the University of Helsinki Micro-CT Laboratory. The scanner uses cone beam geometry and it is equipped with an end-window tube with a tungsten target.
Scan Settings
The dataset consists of three consecutive scans made using identical geometry but different X-ray spectra and detector exposure times. For each scan, 720 X-ray projections were acquired using an angle increment of 0.5 degrees. Multiple frames were averaged for each projection in order to increase signal-to-noise ratio. The scan geometry and the energy-specific settings are summarized in the following two tables.
Table 1. Imaging geometry used for collecting the data.
Parameter                  Value
Focus-center distance      252 mm
Focus-detector distance    420 mm
Geometric magnification    5/3
Detector pixel size        0.200 mm
Effective pixel size       0.120 mm
Projection size            552 x 576 pixels
Angular range              360°
Number of projections      720
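The geometric quantities in Table 1 are related by two simple formulas: the magnification is the focus-detector distance divided by the focus-center distance, and the effective pixel size is the detector pixel size divided by the magnification. The sketch below recomputes them from the tabulated distances; the variable names are illustrative.

```python
from fractions import Fraction

FOCUS_CENTER_MM = 252      # focus-center distance from Table 1
FOCUS_DETECTOR_MM = 420    # focus-detector distance from Table 1
DETECTOR_PIXEL_MM = 0.200  # detector pixel size from Table 1
N_PROJECTIONS = 720
ANGULAR_RANGE_DEG = 360

# Exact ratio of the two distances.
magnification = Fraction(FOCUS_DETECTOR_MM, FOCUS_CENTER_MM)

# Pixel size referred to the rotation center.
effective_pixel_mm = DETECTOR_PIXEL_MM / float(magnification)

# Angle between consecutive projections.
angle_increment_deg = ANGULAR_RANGE_DEG / N_PROJECTIONS

print(magnification, round(effective_pixel_mm, 3), angle_increment_deg)
# prints: 5/3 0.12 0.5
```

The recomputed effective pixel size of 0.120 mm and the 0.5 degree increment agree with the values quoted in the scan settings.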
Table 2. Energy-specific settings used for collecting the data.
Energy label  U (kV)  Filtration  I (μA)  Exposure time (ms)  Frame averaging
E1            50      None        300     125                 4
E2            80      1 mm Al     180     125                 4
E3            120     0.5 mm Cu   120     250                 4
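Two derived quantities help when comparing the three spectra in Table 2: the electrical tube power (voltage times current) and the minimum beam-on time per scan (exposure time times frame averaging times number of projections). The sketch below computes both from the tabulated values. These are back-of-the-envelope figures that ignore detector readout and stage motion, and they are not stated in the dataset description.

```python
# Values copied from Table 2; keys are the energy labels.
SETTINGS = {
    "E1": {"kv": 50, "ua": 300, "exposure_ms": 125, "frames": 4},
    "E2": {"kv": 80, "ua": 180, "exposure_ms": 125, "frames": 4},
    "E3": {"kv": 120, "ua": 120, "exposure_ms": 250, "frames": 4},
}
N_PROJECTIONS = 720  # projections per scan, from the scan settings


def tube_power_w(kv: float, ua: float) -> float:
    """Electrical tube power in watts (kV times uA gives mW)."""
    return kv * ua / 1000.0


def beam_time_s(exposure_ms: float, frames: int, n_projections: int) -> float:
    """Lower bound on beam-on time per scan, in seconds."""
    return exposure_ms * frames * n_projections / 1000.0


for label, s in SETTINGS.items():
    power = tube_power_w(s["kv"], s["ua"])
    seconds = beam_time_s(s["exposure_ms"], s["frames"], N_PROJECTIONS)
    print(f"{label}: {power:.1f} W, at least {seconds:.0f} s")
```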
Data Post-Processing
Before the scans were made, a dark current image and flat-field image were acquired for each scan setting. During the scans, dark current subtraction and flat-field correction were automatically applied to the X-ray projections by the measurement software.
Data Contents
This dataset contains the following files:
The raw projection data (.tif format) for each scan and a metadata file (.txt format) describing the measurement setup, with formatting that is both human-readable and machine-readable.
Pre-created 2D sinograms for each energy level. The sinograms have been created from the central plane of the cone beam, which reduces to fan beam geometry. The sinograms are stored in Matlab's .mat file format in data structures which also contain metadata on the measurement.
Photographs taken during the measurement process.
Example filtered backprojection (FBP) reconstructions of the central plane of the phantom for each energy. The reconstructions were computed using the Phoenix datos|x CT software provided with the microtomography scanner
Research Group
This dataset was produced by the Inverse Problems research group at the Department of Mathematics and Statistics at the University of Helsinki, Finland (https://www.helsinki.fi/en/researchgroups/inverse-problems) in collaboration with the Computational Physics and Inverse Problems research group at the University of Eastern Finland, Finland (https://sites.uef.fi/inverse) and the X-ray Laboratory at the Department of Physics at the University of Helsinki, Finland (https://www.helsinki.fi/en/researchgroups/x-ray-laboratory).
Previous Use
This dataset has been used in the following publications:
Jussi Toivanen, Alexander Meaney, Samuli Siltanen, Ville Kolehmainen. Joint reconstruction in low dose multi-energy CT. Inverse Problems and Imaging, 2020, 14(4): 607-629. doi: 10.3934/ipi.2020028.
E. Cueva, A. Meaney, S. Siltanen, M. J. Ehrhardt. Synergistic multi-spectral CT reconstruction with directional total variation. Philos Trans A Math Phys Eng Sci. 2021 Aug 23;379(2204):20200198. doi: 10.1098/rsta.2020.0198.
Additional Links
To get started with the data, we recommend looking at the HelTomo toolbox, specifically created for working with CBCT data collected by the Inverse Problems research group, and available at https://se.mathworks.com/matlabcentral/fileexchange/74417-heltomo-helsinki-tomography-toolbox.
Acknowledgements
We wish to thank laboratory engineer Heikki Suhonen for his guidance and assistance in conducting the measurements.
Contact Details
For more information or guidance in using these datasets, please contact alexander.meaney [at] helsinki.fi.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary
This submission contains a collection of 235800 X-ray projections of 131 pieces of modeling clay (Play-Doh) with various numbers of stones inserted. The submission is intended as an extensive and easy-to-use training dataset for supervised machine learning driven object detection. The ground truth locations of the stones are included. The data is supplementary material to the paper titled "A tomographic workflow enabling deep learning for X-ray based foreign object detection" [Zeegers 2022].
Description
Sample information
The samples are modeling clay (Play-Doh, Hasbro, RI, USA) with various numbers of pieces of gravel included. In total 131 samples are prepared, of which 20 samples contain 5-8 inserted stones, 3 samples contain three stones, 35 contain two stones, 62 contain one stone and 11 contain no stones. The stones have an average diameter of ca. 7mm (ranging from 3mm to 11mm). The Play-Doh is remolded for every sample.
Apparatus
The dataset is acquired in the FleX-ray Laboratory, developed by TESCAN-XRE, located at CWI in Amsterdam. The CT scanner consists of a cone-beam microfocus polychromatic X-ray point source and a 1944x1536 pixel, 14-bit, flat detector panel (Dexela 1512NDT). Full details can be found in [Coban 2020].
Scanning setup
For each sample, 1800 radiographs are collected by rotating the sample over 360 degrees in a circular and continuous motion. A peak voltage of 90kV is used, and the target power is set to 20W. The distance between the source and detector is 69.80 cm and the distance between the source and the object is 44.14 cm. An exposure time of 20 ms is used for each projection.
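A few derived quantities follow directly from the scanning setup: the geometric magnification (source-detector distance over source-object distance), the angular step between radiographs, and the total exposure time per sample. The sketch below computes them from the stated values. The total exposure is a lower bound on scan duration, since readout overhead is not described.

```python
SOURCE_DETECTOR_CM = 69.80  # source to detector distance
SOURCE_OBJECT_CM = 44.14    # source to object distance
N_PROJECTIONS = 1800        # radiographs per sample over 360 degrees
EXPOSURE_S = 0.020          # exposure time per projection (20 ms)

magnification = SOURCE_DETECTOR_CM / SOURCE_OBJECT_CM
angular_step_deg = 360.0 / N_PROJECTIONS
total_exposure_s = N_PROJECTIONS * EXPOSURE_S

print(f"magnification {magnification:.3f}, "
      f"step {angular_step_deg:.1f} deg, "
      f"exposure {total_exposure_s:.0f} s per sample")
```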
Experimental plan
This data is the result of a demonstration of a workflow to collect annotated data for supervised machine learning for X-ray based object detection. The ground truth locations are retrieved by tomographic reconstruction, segmentation and virtual projections with the same acquisition angles. A detailed description for the workflow to obtain a training dataset is given in [Zeegers 2022].
Technical details
All projections have been corrected with flatfield images (averaged over 10 pre and 10 post radiographs) and darkfield images (averaged over 10 pre and 10 post images). Both the X-ray projections and the ground truth images are resized to 128x128 pixels. The raw data is made available in another (larger) submission for complete reproduction (https://zenodo.org/record/5866228). All images are stored in .tif format. The data for samples with 5-8 stones are put in a separate folder from the data with 0-3 stones. The size of the completely unpacked dataset is 19.6 GB.
NOTE: Because the dataset consists of 471600 files, fully extracting the dataset may take a while. Therefore, an additional and significantly smaller zip-file is included for previewing the data, with one X-ray projection for each sample.
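The counts quoted in this description are mutually consistent, which can be verified quickly. The sketch below sums the sample groups, multiplies by the radiographs per sample, and compares with the stated totals. Reading the file count as one ground truth image per projection is an interpretation, not something the description states outright.

```python
# Number of samples per stone count, from the sample information.
SAMPLES_BY_STONES = {
    "5-8 stones": 20,
    "3 stones": 3,
    "2 stones": 35,
    "1 stone": 62,
    "0 stones": 11,
}
RADIOGRAPHS_PER_SAMPLE = 1800

n_samples = sum(SAMPLES_BY_STONES.values())
n_projections = n_samples * RADIOGRAPHS_PER_SAMPLE
# Assumed: each projection is paired with one ground truth image.
n_files = 2 * n_projections

print(n_samples, n_projections, n_files)
# prints: 131 235800 471600
```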
Additional Links
These datasets are produced by the Computational Imaging group at Centrum Wiskunde & Informatica (CI-CWI) in Amsterdam, The Netherlands: https://www.cwi.nl/research/groups/computational-imaging
Contact details
zeegers [at] cwi [dot] nl
Acknowledgments
The authors would like to acknowledge the funding from the Netherlands Organisation for Scientific Research (NWO), project number 639.073.506. The authors also acknowledge TESCAN-XRE NV for their collaboration and support of the FleX-ray laboratory.
References
[Zeegers 2022] M. T. Zeegers, T. van Leeuwen, D. M. Pelt, S. B. Coban, R. van Liere, K. J. Batenburg, "A tomographic workflow enabling deep learning for X-ray based foreign object detection", 2022 (in preparation)
[Coban 2020] S. B. Coban, F. Lucka, W. J. Palenstijn, D. Van Loo, and K. J. Batenburg, "Explorative imaging and its implementation at the FleX-ray Laboratory," J. Imaging, vol. 6, no. 18, 2020, doi: 10.3390/jimaging6040018.
If you use (parts of) this data in a publication, we would appreciate it if you would refer to the first article.
The Serendipitous Extragalactic X-ray Source Identification (SEXSI) Program is designed to expand significantly the sample of identified extragalactic hard X-ray sources at intermediate fluxes, 10^-15 ergs/cm^2/s < 2-10 keV Flux <~ 10^-13 ergs/cm^2/s. SEXSI, which includes sources derived from more than 2 square degrees of Chandra images, provides the largest hard X-ray-selected sample yet studied, offering an essential complement to the Chandra Deep Fields (total area of 0.2 square degrees). In Eckart et al. (2005, Paper II) R-band optical imaging of the SEXSI fields from the Palomar P60 and P200, the MDM 2.4m and 1.3m, and the Keck I telescopes is described. The authors have identified counterparts or derived flux limits for nearly 1000 hard X-ray sources. Using the optical images, they have derived accurate source positions. They have investigated correlations between optical and X-ray flux, and optical flux and X-ray hardness ratio. They have also studied the density of optical sources surrounding X-ray counterparts, as well as the properties of optically faint, hard X-ray sources. In Eckart et al. (2006, Paper III) optical spectra of 477 counterparts are presented. These spectra reach to R-band magnitudes of <~24 and have produced identifications and redshifts for 438 hard X-ray sources. Typical completeness levels in the 27 Chandra fields studied are 40-70%. The vast majority of the 2-10 keV selected sample are AGNs with redshifts between 0.1 and 3; the highest redshift source lies at z = 4.33. This table, which combines data presented in Eckart et al. (2005, 2006), has links to the list of SEXSI X-ray sources (the HEASARC Browse table CHANSEXSI: see Paper I = Harrison et al. 2003, ApJ, 596, 944). This table was originally created by the HEASARC in June 2005 based on the CDS version of Table 3 from Eckart et al. (2005: CDS table J/ApJS/156/35/table3.dat). It was updated in August 2006 to include information from Table 2 of Eckart et al. (2006: the electronic version available at the electronic ApJ web site). This is a service provided by NASA HEASARC.
The ROSAT All-Sky Survey (RASS) was the first imaging X-ray survey of the entire sky. Combining the RASS Bright and Faint Source Catalogs yields an average of about three X-ray sources per square degree. However, while X-ray source counterparts are known to range from distant quasars to nearby M dwarfs, the RASS data alone are often insufficient to determine the nature of an X-ray source. As a result, large-scale follow-up programs are required to construct samples of known X-ray emitters. The authors use optical data produced by the Sloan Digital Sky Survey (SDSS) to identify 709 stellar X-ray emitters cataloged in the RASS and falling within the SDSS Data Release 1 footprint. Most of these are bright stars with coronal X-ray emission unsuitable for SDSS spectroscopy, which is designed for fainter objects (g > 15mag). Instead, the authors use SDSS photometry, correlations with the Two Micron All Sky Survey (2MASS) and other catalogs, and spectroscopy from the Apache Point Observatory 3.5 m telescope to identify these stellar X-ray counterparts. Their sample of 707 X-ray-emitting F, G, K, and M stars is one of the largest X-ray-selected samples of such stars. The authors derive distances to these stars using photometric parallax relations appropriate for dwarfs on the main sequence, and use these distances to calculate their X-ray luminosities LX. They also identify a previously unknown cataclysmic variable (CV) as a RASS counterpart. Much more information on the SDSS is available at the project's web site at http://www.sdss.org/. This table was created by the HEASARC in April 2009 based on the machine-readable version of Table 4 from the reference paper, which was obtained from the ApJ web site. This is a service provided by NASA HEASARC.
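The step from distance to X-ray luminosity mentioned above is the usual inverse-square relation, LX = 4 pi d^2 fX, for an isotropically emitting source. The sketch below applies it with the distance in parsecs and the flux in erg/cm^2/s. The function name and the example values are hypothetical, and the paper's own band definitions and flux conversions are not reproduced here.

```python
import math

CM_PER_PARSEC = 3.0857e18  # centimeters in one parsec


def xray_luminosity(flux_erg_cm2_s: float, distance_pc: float) -> float:
    """Isotropic luminosity in erg/s from an observed flux and a distance."""
    distance_cm = distance_pc * CM_PER_PARSEC
    return 4.0 * math.pi * distance_cm ** 2 * flux_erg_cm2_s


# Made-up example: a star at 100 pc with a flux of 1e-12 erg/cm^2/s.
print(f"{xray_luminosity(1e-12, 100.0):.3e} erg/s")
```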