This database summarizes records of sea turtle size, tags applied, and release and capture locations. It is derived from paper data sheets submitted to the Cooperative Marine Turtle Tagging Program (CMTTP) at the Archie Carr Center for Sea Turtle Research (ACCSTR), University of Florida, Gainesville.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The COVID-19 pandemic is a global healthcare emergency. Prediction models for COVID-19 imaging are rapidly being developed to support medical decision making in imaging. However, inadequate availability of a diverse annotated dataset has limited the performance and generalizability of existing models.
The goal was to create the first multi-institutional, multi-national, expert-annotated COVID-19 imaging dataset made freely available to the machine learning community as a research and educational resource for COVID-19 chest imaging. The Radiological Society of North America (RSNA) assembled the RSNA International COVID-19 Open Radiology Database (RICORD), a collection of COVID-related imaging datasets and expert annotations to support research and education. RICORD data will be incorporated in the Medical Imaging and Data Resource Center (MIDRC), a multi-institutional research data repository funded by the National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health.
This dataset was created through a collaboration between the RSNA and the Society of Thoracic Radiology (STR). Clinical annotation by thoracic radiology subspecialists was performed for all COVID-positive chest radiography (CXR) imaging studies using a labeling schema based upon guidelines for reporting classification of COVID-19 findings in CXRs (see Review of Chest Radiograph Findings of COVID-19 Pneumonia and Suggested Reporting Language, Journal of Thoracic Imaging).
The RSNA International COVID-19 Open Radiology Database (RICORD) consists of 998 chest x-rays from 361 patients at four international sites, annotated with diagnostic labels.
Patient Selection: patients at least 18 years of age with a positive diagnosis of COVID-19.
998 chest x-ray examinations from 361 patients.
Annotations with labels:
Classification
Typical Appearance
Multifocal bilateral, peripheral opacities, and/or opacities with rounded morphology
Lower lung-predominant distribution (Required Feature - must be present with either or both of the first two opacity patterns)
Indeterminate Appearance
Absence of typical findings AND unilateral, central, or upper lung-predominant distribution of airspace disease
Negative for Pneumonia
No lung opacities
Airspace Disease Grading (required when the classification is not Negative for Pneumonia)
Lungs are divided on the frontal chest x-ray into 3 zones per lung (6 zones total). The upper zone extends from the apices to the superior hilum. The mid zone spans between the superior and inferior hilar margins. The lower zone extends from the inferior hilar margins to the costophrenic sulci.
Mild: opacities in 1-2 lung zones
Moderate: opacities in 3-4 lung zones
Severe: opacities in >4 lung zones
Supporting clinical variables: MRN*, Age, Study Date*, Exam Description, Sex, Study UID*, Image Count, Modality, Testing Result, Specimen Source (* pseudonymous values).
How to use the JSON annotations
More information about how the JSON annotations are organized can be found at https://docs.md.ai/data/json/. Steps 2 & 3 in this example code demonstrate how to load the JSON into a DataFrame. The JSON file can be downloaded via the data access table below; it is not available via MD.ai. This Jupyter Notebook may also be helpful.
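As a minimal sketch of that loading pattern (assuming the MD.ai export layout documented at the link above, with annotation records nested under top-level "datasets"; adjust the keys to match the downloaded file):

    import json
    import pandas as pd

    # Path is illustrative; download the JSON via the data access table below.
    with open("ricord_annotations.json") as f:
        data = json.load(f)

    # Collect annotation records. The "datasets" -> "annotations" nesting is
    # an assumption based on the MD.ai JSON documentation linked above.
    records = []
    for ds in data.get("datasets", []):
        records.extend(ds.get("annotations", []))

    # Flatten nested fields into columns.
    df = pd.json_normalize(records)
    print(df.shape)
    print(df.head())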
RICORD is available for non-commercial use (and further enrichment) by the research and education communities. Uses may include development of educational resources for COVID-19, creation of AI systems for diagnosis and quantification, benchmarking performance of existing solutions, exploration of distributed/federated learning, further annotation or data augmentation efforts, and evaluation of the examinations for disease entities beyond COVID-19 pneumonia. Deliberate consideration of the detailed annotation schema, demographics, and other included metadata will be critical when generating cohorts with RICORD, particularly as more public COVID-19 imaging datasets are made available via complementary and parallel efforts. It is important to emphasize that there are limitations to the clinical “ground truth”: SARS-CoV-2 RT-PCR tests have widely documented limitations and are subject to both false-negative and false-positive results, which affect the distribution of the included imaging data and may have led to an unknown epidemiologic distortion of the patient population given the inclusion criteria. These limitations notwithstanding, RICORD has achieved its stated objectives for data complexity, heterogeneity, and high-quality expert annotations as a comprehensive COVID-19 thoracic imaging data resource.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Doodleverse/Segmentation Zoo/Seg2Map Res-UNet models for CoastTrain 5-class segmentation of RGB 768x768 NAIP images
These Residual-UNet model data are based on Coast Train images and associated labels. https://coasttrain.github.io/CoastTrain/docs/Version%201:%20March%202022/data
Models were created with Segmentation Gym* using the following dataset**: https://doi.org/10.1038/s41597-023-01929-2
Image size used by model: 768 x 768 x 3 pixels
classes:
water
whitewater
sediment
other_bare_natural_terrain
other_terrain
File descriptions
For each model, there are 5 files with the same root name:
'.json' config file: this is the file that was used by Segmentation Gym* to create the weights file. It contains instructions for how to make the model and the data it used, as well as instructions for how to use the model for prediction. It is a handy wee thing and mastering it means mastering the entire Doodleverse.
'.h5' weights file: this is the file that was created by the Segmentation Gym* function train_model.py. It contains the trained model's parameter weights. It can be called by the Segmentation Gym* function seg_images_in_folder.py. Models may be ensembled.
'_modelcard.json' model card file: this is a json file containing fields that collectively describe the model origins, training choices, and dataset that the model is based upon. There is some redundancy between this file and the config file (described above) that contains the instructions for the model training and implementation. The model card file is not used by the program, but it is important metadata, so it should be kept with the other files that collectively make up the model; as such it is considered part of the model.
'_model_history.npz' model training history file: this numpy archive file contains numpy arrays describing the training and validation losses and metrics. It is created by the Segmentation Gym function train_model.py.
'.png' model training loss and mean IoU plot: this png file contains plots of training and validation losses and mean IoU scores during model training, a subset of the data inside the .npz file. It is created by the Segmentation Gym function train_model.py.
Additionally, BEST_MODEL.txt contains the name of the model with the best validation loss and mean IoU.
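For orientation, here is a minimal sketch of inspecting one model's files in Python (the root name is hypothetical, and config key names such as MODEL and TARGET_SIZE are assumptions based on Segmentation Gym conventions):

    import json
    import numpy as np

    # Hypothetical shared root name of one model's five files.
    root = "CoastTrain_NAIP_768_resunet"

    # Read the Segmentation Gym config used to train the model.
    with open(root + ".json") as f:
        config = json.load(f)
    print(config.get("MODEL"), config.get("TARGET_SIZE"), config.get("NCLASSES"))

    # List the arrays stored in the training history archive
    # (training/validation losses and metrics).
    history = np.load(root + "_model_history.npz")
    print(history.files)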
References *Segmentation Gym: Buscombe, D., & Goldstein, E. B. (2022). A reproducible and reusable pipeline for segmentation of geoscientific imagery. Earth and Space Science, 9, e2022EA002332. https://doi.org/10.1029/2022EA002332 See: https://github.com/Doodleverse/segmentation_gym
**Buscombe, D., Wernette, P., Fitzpatrick, S. et al. A 1.2 Billion Pixel Human-Labeled Dataset for Data-Driven Classification of Coastal Environments. Sci Data 10, 46 (2023). https://doi.org/10.1038/s41597-023-01929-2
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
A benchmark of 77.1% overall accuracy (OA) was established in: https://doi.org/10.1117/1.JRS.14.048503
The dataset consists of 60,000 images corresponding to Landsat patches of 33x33 pixels with 102 bands, randomly selected from across Mexico. Each patch is labeled with one of 12 Land Use and Vegetation classes according to the classification described at https://doi.org/10.3390/rs6053923.
The zip file contains 12 folders numbered 1-12, each holding 5,000 .npy Python files (loadable with the NumPy library).
The labeled classes correspond to the following identifiers:
1. Temperate Coniferous Forest
2. Temperate Deciduous Forest
3. Temperate Mixed Forest
4. Tropical Evergreen Forest
5. Tropical Deciduous Forest
6. Scrubland
7. Wetland Vegetation
8. Agriculture
9. Grassland
10. Water Body
11. Barren Land
12. Urban Area
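A minimal sketch of loading one patch with NumPy (the file name is hypothetical; folder 1 holds Temperate Coniferous Forest patches per the mapping above):

    import numpy as np

    # Hypothetical file name inside folder "1".
    patch = np.load("1/patch_00001.npy")

    # Expected to be a 33 x 33 patch with 102 channels per the description
    # above; the axis order is an assumption, so check the shape.
    print(patch.shape, patch.dtype)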
To build the dataset, we used the information of the National Continuum of Land Use and Vegetation, series number 5, generated by Mexico's National Institute of Statistics and Geography (INEGI), obtained from The National Commission for the Knowledge and Use of Biodiversity (CONABIO) web page (http://geoportal.conabio.gob.mx/metadatos/doc/html/usv250s5ugw.html).
The file used for this dataset construction is the shapefile with geographic coordinates located at http://www.conabio.gob.mx/informacion/gis/maps/geo/usv250s5ugw.zip. Later, a transformation to the Albers equal-area conic projection was done with the following parameters:
False easting: 2500000.0
False northing: 0.0
Longitude of origin: -102.0º
Latitude of origin: 12.0º
First standard parallel: 17.5º
Second standard parallel: 29.5º
Linear unit: meter (1.0)
Reference ellipsoid: GRS80
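For illustration, this projection can be expressed as a PROJ definition (a sketch; pyproj is an assumed tool, and the dataset itself does not ship a CRS file):

    from pyproj import CRS

    # Albers equal-area conic with the parameters listed above.
    crs = CRS.from_proj4(
        "+proj=aea +lat_1=17.5 +lat_2=29.5 +lat_0=12.0 +lon_0=-102.0 "
        "+x_0=2500000 +y_0=0 +ellps=GRS80 +units=m +no_defs"
    )
    print(crs.to_wkt())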
Once the data were projected, the classes identified in the National Continuum of Land Use and Vegetation were mapped to the classes identified in https://doi.org/10.3390/rs6053923, namely: Agriculture, Barren Land, Grassland, Scrubland, Temperate Coniferous Forest, Temperate Deciduous Forest, Temperate Mixed Forest, Tropical Deciduous Forest, Tropical Evergreen Forest, Urban Area, Water Body, and Wetland Vegetation.
Once the information layer with the 12 classes indicated above was generated, the reference layer was rasterized. A national grid of 1,975,940 regions of 1 x 1 km was then created, and each region was assigned the percentage of its pixels belonging to the dominant class.
A total of 1,640,827 cells (83% of the Mexican territory) have 70% or more of their pixels belonging to one dominant class; that is, only 17% of cells fall below that threshold. Then, 5,000 regions were randomly selected from each land-cover class at the national level, restricted to cells with 70% or more of their pixels from one dominant class, in order to obtain consistent and reliable data for the automatic classification task. This random selection produced a total of 60,000 regions.
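As a sketch of the dominant-class filter described above (a hypothetical helper, not code shipped with the dataset):

    import numpy as np

    def dominant_class(cell_labels, threshold=0.70):
        """Return (class, fraction) if one class covers >= threshold of the
        rasterized label pixels in a 1 km cell, else (None, fraction)."""
        values, counts = np.unique(cell_labels, return_counts=True)
        fraction = counts.max() / cell_labels.size
        if fraction >= threshold:
            return values[counts.argmax()], fraction
        return None, fraction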
Image patches were extracted from the selected regions in the sample.
The image used is the result of applying multiple time-series analysis algorithms to a cube of image data of mainly Tier 1 (T1) quality and a few Tier 2 (T2), as described at https://www.usgs.gov/land-resources/nli/landsat/landsat-collection-1. An Open Data Cube (ODC, https://www.opendatacube.org/) was constructed from 3,515 Landsat 5 and 7 images corresponding to the year 2011, the same reference year as the National Continuum of Land Use and Vegetation Series 5.
From the analysis of the ODC images, the geomedian (https://doi.org/10.1109/TGRS.2017.2723896) was calculated, generating a national cloud-free mosaic for 2011 with pixels at 30-meter resolution and 6 spectral bands (blue, green, red, nir, swir 1, swir 2). From the geomedian, 15 spectral indices and a 6-component tasseled cap transformation were also computed. Finally, for each pixel, time-series statistics (min, mean, max, std, median) were calculated for each of the 15 normalized-difference combinations possible with the 6 bands, giving 15 x 5 = 75 channels; together with the 6 geomedian bands, 15 geomedian-based indices, and 6 tasseled-cap components, this results in 6 + 15 + 6 + 75 = 102 information channels. Since Landsat images have a resolution of 30 meters, each 1 km x 1 km region yields an image of 33 x 33 pixels.
The 102 channels in the patches correspond to:
Geomedian bands (6): blue, green, red, nir, swir 1, swir 2
Geomedian-based indexes (15): evi, bu, sr, arvi, ui, ndbi, ibi, ndvi, ndwi, mndwi, nbi, brba, nbai, baei, bi
Geomedian-based tasseled cap transformation (6): brightness, greenness, wetness, fourth, fifth, sixth
2011 Landsat time series analysis by pixel (15 indices x 5 statistics = 75):
(red-swir 1)/(red+swir 1); (5): min, mean, max, std, median
(red-nir)/(red+nir); (5): min, mean, max, std, median
(swir 1-swir 2)/(swir 1+swir 2); (5): min, mean, max, std, median
(nir-swir 2)/(nir+swir 2); (5): min, mean, max, std, median
(nir-swir 1)/(nir+swir 1); (5): min, mean, max, std, median
(red-swir 2)/(red+swir 2); (5): min, mean, max, std, median
(green-swir 2)/(green+swir 2); (5): min, mean, max, std, median
(green-swir 1)/(green+swir 1); (5): min, mean, max, std, median
(green-red)/(green+red); (5): min, mean, max, std, median
(green-nir)/(green+nir); (5): min, mean, max, std, median
(blue-swir 2)/(blue+swir 2); (5): min, mean, max, std, median
(blue-swir 1)/(blue+swir 1); (5): min, mean, max, std, median
(blue-red)/(blue+red); (5): min, mean, max, std, median
(blue-nir)/(blue+nir); (5): min, mean, max, std, median
(blue-green)/(blue+green); (5): min, mean, max, std, median
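For illustration, assuming a channels-last layout that follows the listing above (an assumption to verify against the data), the channel groups could be sliced as:

    import numpy as np

    # Hypothetical file name; folder "8" holds Agriculture patches.
    patch = np.load("8/patch_00001.npy")

    # Assumed channel order: 6 geomedian bands, 15 geomedian indexes,
    # 6 tasseled-cap components, then 15 indices x 5 statistics = 75.
    geomedian_bands = patch[..., 0:6]
    geomedian_indexes = patch[..., 6:21]
    tasseled_cap = patch[..., 21:27]
    time_series_stats = patch[..., 27:102]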