100+ datasets found

Historical Unidata Internet Data Distribution (IDD) Global Observational...
data.ucar.edu
rda-web-prod.ucar.edu
+2more
netcdf
Updated Jul 11, 2025
Cite
Unidata, University Corporation for Atmospheric Research (2025). Historical Unidata Internet Data Distribution (IDD) Global Observational Data [Dataset]. http://doi.org/10.5065/9235-WJ24
Explore at:
Available download formats: netcdf
Unique identifier
https://doi.org/10.5065/9235-WJ24
Dataset updated
Jul 11, 2025
Dataset provided by
Research Data Archive at the National Center for Atmospheric Research, Computational and Information Systems Laboratory
Authors
Unidata, University Corporation for Atmospheric Research
Time period covered
Jan 1, 1970 - Dec 31, 2029
Area covered
Earth
Description
This dataset contains the historical Unidata Internet Data Distribution (IDD) Global Observational Data that are derived from real-time Global Telecommunications System (GTS) reports distributed via the Unidata Internet Data Distribution System (IDD). Reports include surface station (SYNOP) reports at 3-hour intervals, upper air (RAOB) reports at 3-hour intervals, surface station (METAR) reports at 1-hour intervals, and marine surface (BUOY) reports at 1-hour intervals. Select variables found in all report types include pressure, temperature, wind speed, and wind direction. Data may be available at mandatory or significant levels from 1000 millibars to 1 millibar, and at surface levels. Online archives are populated daily with reports generated two days prior to the current date.
Distribution of waiting times and displacements: A comparison of over 30...
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Laura Alessandretti; Piotr Sapiezynski; Sune Lehmann; Andrea Baronchelli (2023). Distribution of waiting times and displacements: A comparison of over 30 datasets on human mobility. [Dataset]. http://doi.org/10.1371/journal.pone.0171686.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0171686.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Laura Alessandretti; Piotr Sapiezynski; Sune Lehmann; Andrea Baronchelli
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The table reports for each dataset: the reference to the journal article/book where the study was published, the type of data (LBSN stands for Location Based Social Networks, CDR for Call Detail Record), the number of individuals (or vehicles in the case of car/taxi data) involved in the data collection, the duration of the data collection (M → months, Y → years, D → days, W → weeks), the minimum and maximum length of spatial displacements, the shape of the probability distribution of displacements with the corresponding parameters, the temporal sampling, and the shape of the distribution of waiting times with the corresponding parameters. Power-law (T) indicates a truncated power-law. The table can also be found at http://lauraalessandretti.weebly.com/plosmobilityreview.html.
Data from: A 24-hour dynamic population distribution dataset based on mobile...
data.niaid.nih.gov
explore.openaire.eu
+1more
Updated Feb 16, 2022
Cite
Henrikki Tenkanen (2022). A 24-hour dynamic population distribution dataset based on mobile phone data from Helsinki Metropolitan Area, Finland [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4724388
Explore at:
Dataset updated
Feb 16, 2022
Dataset provided by
Henrikki Tenkanen
Claudia Bergroth
Olle Järv
Tuuli Toivonen
Matti Manninen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Helsinki Metropolitan Area, Finland
Description
Related article: Bergroth, C., Järv, O., Tenkanen, H., Manninen, M., Toivonen, T., 2022. A 24-hour population distribution dataset based on mobile phone data from Helsinki Metropolitan Area, Finland. Scientific Data 9, 39.

In this dataset:

We present temporally dynamic population distribution data from the Helsinki Metropolitan Area, Finland, at the level of 250 m by 250 m statistical grid cells. Three datasets of hourly population distribution are provided, one each for regular workdays (Mon – Thu), Saturdays and Sundays. The data are based on aggregated mobile phone data collected by the biggest mobile network operator in Finland. Mobile phone data are assigned to statistical grid cells using an advanced dasymetric interpolation method based on ancillary data about land cover, buildings and a time use survey. The data were validated against population register data from Statistics Finland for night-time hours and against a daytime workplace registry. The resulting 24-hour population data can be used to reveal the temporal dynamics of the city and examine population variations relevant to, for instance, spatial accessibility analyses, crisis management and planning.

Please cite this dataset as:

Bergroth, C., Järv, O., Tenkanen, H., Manninen, M., Toivonen, T., 2022. A 24-hour population distribution dataset based on mobile phone data from Helsinki Metropolitan Area, Finland. Scientific Data 9, 39. https://doi.org/10.1038/s41597-021-01113-4

Organization of data

The dataset is packaged into a single zip file, Helsinki_dynpop_matrix.zip, which contains the following files:

HMA_Dynamic_population_24H_workdays.csv represents the dynamic population for an average workday in the study area.

HMA_Dynamic_population_24H_sat.csv represents the dynamic population for an average Saturday in the study area.

HMA_Dynamic_population_24H_sun.csv represents the dynamic population for an average Sunday in the study area.

target_zones_grid250m_EPSG3067.geojson represents the statistical grid in ETRS89/ETRS-TM35FIN projection that can be used to visualize the data on a map using e.g. QGIS.

Column names

YKR_ID : a unique identifier for each statistical grid cell (n=13,231). The identifier is compatible with the statistical YKR grid cell data by Statistics Finland and Finnish Environment Institute.

H0, H1 ... H23 : Each field represents the proportional distribution of the total population in the study area between grid cells during a one-hour period. In total, 24 fields are formatted as “Hx”, where x stands for the hour of the day (values ranging from 0 to 23). For example, H0 stands for the first hour of the day: 00:00 - 00:59. The sum of all cell values for each field equals 100 (i.e. 100% of the total population for each one-hour period).
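
The column layout described above implies a simple sanity check: each Hx column should sum to 100 across all grid cells. The following is a minimal sketch using only the Python standard library; the three-row sample, its cell identifiers and its values are hypothetical and only mimic the documented layout (two hour columns are shown to keep it short).

```python
import csv
import io

# Hypothetical sample in the documented layout (YKR_ID, H0 ... H23).
# Only H0 and H1 are included; the real files carry 24 hour columns.
SAMPLE_CSV = """YKR_ID,H0,H1
1001,50.0,40.0
1002,30.0,35.0
1003,20.0,25.0
"""

def hourly_totals(csv_text):
    """Sum every Hx column across all grid cells."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hour_columns = [
        name for name in reader.fieldnames
        if name.startswith("H") and name[1:].isdigit()
    ]
    totals = {name: 0.0 for name in hour_columns}
    for row in reader:
        for name in hour_columns:
            totals[name] += float(row[name])
    return totals

totals = hourly_totals(SAMPLE_CSV)
for hour, total in totals.items():
    # Tolerance is an assumption: published values may be rounded.
    assert abs(total - 100.0) < 1e-6, (hour, total)
```

With the real CSV files, the text passed in would come from reading the file; a looser tolerance may be needed if the published percentages are rounded.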

In order to visualize the data on a map, the result tables can be joined with the target_zones_grid250m_EPSG3067.geojson data. The data can be joined by using the field YKR_ID as a common key between the datasets.
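
The join on YKR_ID can also be done outside a GIS. The sketch below is an illustration under stated assumptions, not the authors' workflow: it uses hypothetical in-memory stand-ins for the GeoJSON grid and the CSV rows, and copies each row's Hx values into the properties of the feature with the same YKR_ID.

```python
import json

# Hypothetical stand-ins for the grid file and one population table.
grid = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"YKR_ID": 1001}, "geometry": None},
        {"type": "Feature", "properties": {"YKR_ID": 1002}, "geometry": None},
    ],
}
table_rows = [
    {"YKR_ID": "1001", "H0": "60.0"},
    {"YKR_ID": "1002", "H0": "40.0"},
]

def join_on_ykr_id(feature_collection, rows):
    """Copy the Hx values of each row into the matching feature's properties."""
    by_id = {int(row["YKR_ID"]): row for row in rows}
    joined = json.loads(json.dumps(feature_collection))  # deep copy of the input
    for feature in joined["features"]:
        row = by_id.get(int(feature["properties"]["YKR_ID"]))
        if row is None:
            continue  # a cell without a table row is left unchanged
        for key, value in row.items():
            if key != "YKR_ID":
                feature["properties"][key] = float(value)
    return joined

joined = join_on_ykr_id(grid, table_rows)
```

Writing `joined` back out with `json.dump` gives a GeoJSON file whose features carry the hourly values as attributes.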

License: Creative Commons Attribution 4.0 International.

Related datasets

Järv, Olle; Tenkanen, Henrikki & Toivonen, Tuuli. (2017). Multi-temporal function-based dasymetric interpolation tool for mobile phone data. Zenodo. https://doi.org/10.5281/zenodo.252612

Tenkanen, Henrikki, & Toivonen, Tuuli. (2019). Helsinki Region Travel Time Matrix [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3247564
Real-World Distribution Network and Loading Data
data.ncl.ac.uk
xlsx
Updated Sep 1, 2021
Cite
Ilias Sarantakos; David Greenwood; Peter Davison; Haris Patsios (2021). Real-World Distribution Network and Loading Data [Dataset]. http://doi.org/10.25405/data.ncl.16456014.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.25405/data.ncl.16456014.v1
Dataset updated
Sep 1, 2021
Dataset provided by
Newcastle University
Authors
Ilias Sarantakos; David Greenwood; Peter Davison; Haris Patsios
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Network and loading data for a real-world distribution network in the North-East of England.
Promote Implementation of the Model Data Distribution Policy
datadiscoverystudio.org
Cite
Promote Implementation of the Model Data Distribution Policy [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/283d1d85cae9449ebbd80089fd760ac4/html
Explore at:
Description
Link to the ScienceBase Item Summary page for the item described by this metadata record. Service Protocol: Link to the ScienceBase Item Summary page for the item described by this metadata record. Application Profile: Web Browser. Link Function: information
Data from: METASHIFT: A DATASET OF DATASETS FOR EVALUATING CONTEXTUAL...
zenodo.org
bin, json, txt
Updated Jul 7, 2022
Cite
Xinyu Yang; Xinyu Yang (2022). METASHIFT: A DATASET OF DATASETS FOR EVALUATING CONTEXTUAL DISTRIBUTION SHIFTS [Dataset]. http://doi.org/10.5281/zenodo.6804766
Explore at:
Available download formats: txt, json, bin
Unique identifier
https://doi.org/10.5281/zenodo.6804766
Dataset updated
Jul 7, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Xinyu Yang; Xinyu Yang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Understanding the performance of machine learning models across diverse data distributions is critically important for reliable applications. Motivated by this, there is a growing focus on curating benchmark datasets that capture distribution shifts. In this work, we present MetaShift, a collection of 12,868 sets of natural images across 410 classes, to address this challenge. We leverage the natural heterogeneity of Visual Genome and its annotations to construct MetaShift. The key construction idea is to cluster images using their metadata, which provides context for each image (e.g. cats with cars or cats in bathroom); these contexts represent distinct data distributions. MetaShift has two important benefits: first, it contains orders of magnitude more natural data shifts than previously available. Second, it provides explicit explanations of what is unique about each of its data sets and a distance score that measures the amount of distribution shift between any two of its data sets. Importantly, to support evaluating ImageNet-trained models on MetaShift, we match MetaShift with the ImageNet hierarchy. The matched version covers 867 out of 1,000 classes in ImageNet-1k. Each class in the ImageNet-matched MetaShift contains 2301.6 images on average, and 19.3 subsets capturing images in different contexts. We also propose a method to construct tasks on the matched version, with an example that constructs 19,024 binary classification tasks on it.
Data_Sheet_1_Raw Data Visualization for Common Factorial Designs Using SPSS:...
frontiersin.figshare.com
zip
Updated Jun 2, 2023
Cite
Florian Loffing (2023). Data_Sheet_1_Raw Data Visualization for Common Factorial Designs Using SPSS: A Syntax Collection and Tutorial.ZIP [Dataset]. http://doi.org/10.3389/fpsyg.2022.808469.s001
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.3389/fpsyg.2022.808469.s001
Dataset updated
Jun 2, 2023
Dataset provided by
Frontiers
Authors
Florian Loffing
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Transparency in data visualization is an essential ingredient for scientific communication. The traditional approach of visualizing continuous quantitative data solely in the form of summary statistics (i.e., measures of central tendency and dispersion) has repeatedly been criticized for not revealing the underlying raw data distribution. Remarkably, however, systematic and easy-to-use solutions for raw data visualization using the most commonly reported statistical software package for data analysis, IBM SPSS Statistics, are missing. Here, a comprehensive collection of more than 100 SPSS syntax files and an SPSS dataset template is presented and made freely available that allow the creation of transparent graphs for one-sample designs, for one- and two-factorial between-subject designs, for selected one- and two-factorial within-subject designs as well as for selected two-factorial mixed designs and, with some creativity, even beyond (e.g., three-factorial mixed-designs). Depending on graph type (e.g., pure dot plot, box plot, and line plot), raw data can be displayed along with standard measures of central tendency (arithmetic mean and median) and dispersion (95% CI and SD). The free-to-use syntax can also be modified to match with individual needs. A variety of example applications of syntax are illustrated in a tutorial-like fashion along with fictitious datasets accompanying this contribution. The syntax collection is hoped to provide researchers, students, teachers, and others working with SPSS a valuable tool to move towards more transparency in data visualization.
Nitrates Data Distribution - Annual
home-pugonline.hub.arcgis.com
Updated Oct 24, 2023
Cite
The PUG User Group (2023). Nitrates Data Distribution - Annual [Dataset]. https://home-pugonline.hub.arcgis.com/datasets/nitrates-data-distribution-annual
Explore at:
Dataset updated
Oct 24, 2023
Dataset authored and provided by
The PUG User Group
Area covered

Description
Number of in situ measurements obtained from instruments carried aboard oceanographic research and merchant ships. This layer shows the annual data distribution. The spatial and temporal coverage of nitrates data in the Gulf of Mexico is not uniform, and most of the historical data were collected over the continental shelf near shallow intertidal areas (<200 m depth).
Data for the paper, "On Efficient Spectroscopy Calculations for Thermal...
catalog.data.gov
data.nist.gov
Updated Jun 7, 2023
Cite
National Institute of Standards and Technology (2023). Data for the paper, "On Efficient Spectroscopy Calculations for Thermal Distributions of Atoms" [Dataset]. https://catalog.data.gov/dataset/data-for-the-paper-on-efficient-spectroscopy-calculations-for-thermal-distributions-of-ato
Explore at:
Dataset updated
Jun 7, 2023
Dataset provided by
National Institute of Standards and Technology (http://www.nist.gov/)
Description
Simulated transmission curves illustrating an efficient new calculation method. Data was produced for a publication, and is indexed by figure.
Coho Distribution [ds326]
catalog.data.gov
data.ca.gov
+4more
Updated Nov 27, 2024
Cite
California Department of Fish and Wildlife (2024). Coho Distribution [ds326] [Dataset]. https://catalog.data.gov/dataset/coho-distribution-ds326-cc8ae
Explore at:
Dataset updated
Nov 27, 2024
Dataset provided by
California Department of Fish and Wildlife
Description
November 2022 Version

This dataset represents the "Observed Distribution" for coho salmon in California by using observations made only between 1990 and the present. It was developed for the express purpose of assisting with species recovery planning efforts. The process for developing this dataset was to collect as many observations of the species as possible and derive the stream-based geographic distribution for the species based solely on these positive observations.

For the purpose of this dataset an observation is defined as a report of a sighting or other evidence of the presence of the species at a given place and time. As such, observations are modeled by year observed as point locations in the GIS. All such observations were collected with information regarding who reported the observation, their agency/organization/affiliation, the date that they observed the species, who compiled the information, etc. This information is maintained in the developer's file geodatabase (©Environmental Systems Research Institute (ESRI) 2016).

To develop this distribution dataset, the species observations were applied to California Streams, a CDFW derivative of USGS National Hydrography Dataset (NHD) High Resolution hydrography. For each observation, a path was traced down the hydrography from the point of observation to the ocean, thereby deriving the shortest migration route from the point of observation to the sea. By appending all of these migration paths together, the "Observed Distribution" for the species is developed.

It is important to note that this layer does not attempt to model the entire possible distribution of the species. Rather, it only represents the known distribution based on where the species has been observed and reported. While some observations indeed represent the upstream extent of the species (e.g., an observation made at a hard barrier), the majority of observations only indicate where the species was sampled for or otherwise observed. Because of this, this dataset likely underestimates the absolute geographic distribution of the species.

It is also important to note that the species may not be found on an annual basis in all indicated reaches due to natural variations in run size, water conditions, and other environmental factors. As such, the information in this dataset should not be used to verify that the species are currently present in a given stream. Conversely, the absence of distribution linework for a given stream does not necessarily indicate that the species does not occur in that stream. The observation data were compiled from a variety of disparate sources including but not limited to CDFW, USFS, NMFS, timber companies, and the public. Forms of documentation include CDFW administrative reports, personal communications with biologists, observation reports, and literature reviews. The source of each feature (to the best available knowledge) is included in the data attributes for the observations in the geodatabase, but not for the resulting linework. The spatial data has been referenced to California Streams, a CDFW derivative of USGS National Hydrography Dataset (NHD) High Resolution hydrography.

Usage of this dataset:

Examples of appropriate uses include:
- Species recovery planning
- Evaluation of future survey sites for the species
- Validating species distribution models

Examples of inappropriate uses include:
- Assuming absence of a line feature means that the species are not present in that stream.
- Using this data to make parcel or ground level land use management decisions.
- Using this dataset to prove or support non-existence of the species at any spatial scale.
- Assuming that the line feature represents the maximum possible extent of species distribution.

All users of this data should seek the assistance of qualified professionals such as surveyors, hydrologists, or fishery biologists as needed to ensure that such users possess complete, precise, and up to date information on species distribution and water body location.

Any copy of this dataset is considered to be a snapshot of the species distribution at the time of release. It is incumbent upon the user to ensure that they have the most recent version prior to making management or planning decisions.

Please refer to the "Use Constraints" section below.
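
The path-tracing step in this description (follow the hydrography from each observation point down to the ocean, then append all paths) can be illustrated with a toy example. This is a sketch under stated assumptions, not CDFW's implementation: the stream network is modeled as a hypothetical table in which every reach has exactly one downstream neighbour, so the downstream path is unique, and all reach names are invented.

```python
# Toy network: each reach points to the reach directly downstream of it.
# None marks the ocean outlet. All reach names are hypothetical.
DOWNSTREAM = {
    "headwater_a": "fork_1",
    "headwater_b": "fork_1",
    "fork_1": "mainstem",
    "side_creek": "mainstem",
    "mainstem": "estuary",
    "estuary": None,
}

def trace_to_ocean(reach, downstream):
    """Return the reaches from an observation point down to the outlet."""
    path = []
    seen = set()
    while reach is not None:
        if reach in seen:
            raise ValueError(f"cycle detected at {reach}")
        seen.add(reach)
        path.append(reach)
        reach = downstream[reach]
    return path

def observed_distribution(observations, downstream):
    """Union of the downstream paths of all observed reaches."""
    reaches = set()
    for reach in observations:
        reaches.update(trace_to_ocean(reach, downstream))
    return reaches

distribution = observed_distribution(["headwater_a", "side_creek"], DOWNSTREAM)
```

In this example `headwater_b` stays out of the result because nothing was observed on it, which mirrors the caveat above: absence of linework does not imply absence of the species.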
DSR-Bench-spatial
huggingface.co
Updated May 15, 2025
Cite
Vitercik Lab (2025). DSR-Bench-spatial [Dataset]. https://huggingface.co/datasets/vitercik-lab/DSR-Bench-spatial
Explore at:
Dataset updated
May 15, 2025
Dataset authored and provided by
Vitercik Lab
Description
Dataset Card for "DSR-Bench-spatial"

DSR-Bench-spatial extends 3 data structures in DSR-Bench (K-D Heap, K-D Tree, Geometric Graphs) into variants in terms of dimensionality and data distribution. It contains the 1D, 2D, 3D, and 5D data versions of all three data structures, and 3 non-uniform data distributions (moons, circles, blobs) versions of K-D Tree, all containing short, medium, and long prompts, yielding a total of 450 questions. DSR-Bench-spatial is designed to highlight… See the full description on the dataset page: https://huggingface.co/datasets/vitercik-lab/DSR-Bench-spatial.
Distribution latina inc Import Company US
seair.co.in
Cite
Seair Exim, Distribution latina inc Import Company US [Dataset]. https://www.seair.co.in
Explore at:
Available download formats: .bin, .xml, .csv, .xls
Dataset provided by
Seair Exim Solutions
Authors
Seair Exim
Area covered
United States
Description
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
Big Data Processing and Distribution Software Report
datainsightsmarket.com
doc, pdf, ppt
Updated May 10, 2025
Cite
Data Insights Market (2025). Big Data Processing and Distribution Software Report [Dataset]. https://www.datainsightsmarket.com/reports/big-data-processing-and-distribution-software-1395953
Explore at:
Available download formats: pdf, ppt, doc
Dataset updated
May 10, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The Big Data Processing and Distribution Software market is experiencing robust growth, driven by the exponential increase in data volume across industries and the rising need for efficient data management and analytics. The market, estimated at $50 billion in 2025, is projected to witness a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching approximately $150 billion by 2033. This growth is fueled by several key factors, including the increasing adoption of cloud-based solutions, the proliferation of Internet of Things (IoT) devices generating massive data streams, and the growing demand for real-time analytics and data-driven decision-making across various sectors like finance, healthcare, and retail. Large enterprises are leading the adoption, followed by a rapidly growing segment of Small and Medium-sized Enterprises (SMEs) leveraging cloud-based solutions for cost-effectiveness and scalability. The market is characterized by a competitive landscape with both established players like Google, Amazon Web Services, and Microsoft, and emerging niche providers offering specialized solutions. While the North American market currently holds a significant share, regions like Asia-Pacific are showing exceptional growth potential, driven by rapid digitalization and increasing investments in data infrastructure. However, the market also faces certain restraints. These include the complexities associated with data integration and management, the high costs of implementing and maintaining big data solutions, and the need for skilled professionals to manage and analyze the data effectively. Furthermore, ensuring data security and compliance with evolving regulations poses a challenge for organizations. Despite these hurdles, the overall market outlook remains positive, fueled by continuous technological advancements, increasing data generation, and the growing understanding of the value of data-driven insights. 
The shift towards cloud-based solutions continues to be a significant trend, facilitating easier access, scalability, and reduced infrastructure costs. The market's future hinges on the continued development of innovative solutions addressing security, scalability, and ease of use, catering to the diverse needs of various industry segments and geographical locations.
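
The growth figures quoted in this description can be checked with the standard compound-growth formula. The sketch below is only an arithmetic illustration; the function names are hypothetical, and the inputs are the figures stated above ($50 billion in 2025, 15% CAGR, 2033 horizon).

```python
def project(value, rate, years):
    """Compound `value` at annual `rate` for `years` years."""
    return value * (1 + rate) ** years

def implied_cagr(start, end, years):
    """Annual growth rate that takes `start` to `end` in `years` years."""
    return (end / start) ** (1 / years) - 1

years = 2033 - 2025                                  # 8 compounding periods
projected_2033 = project(50.0, 0.15, years)          # roughly 153 (billion USD)
rate_for_150 = implied_cagr(50.0, 150.0, years)      # roughly 0.147
```

Compounding $50 billion at 15% for eight years gives about $153 billion, consistent with the "approximately $150 billion" in the text; reaching exactly $150 billion would correspond to a rate of about 14.7%.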
Modelled Global Distribution of the Seagrass Biome - United Nations...
palau-data.sprep.org
fsm-data.sprep.org
+13more
Updated Feb 20, 2025
Cite
Secretariat of the Pacific Regional Environment Programme (2025). Modelled Global Distribution of the Seagrass Biome - United Nations Environment Programme World Conservation Monitoring Centre [Dataset]. https://palau-data.sprep.org/dataset/modelled-global-distribution-seagrass-biome-united-nations-environment-programme-world
Explore at:
Dataset updated
Feb 20, 2025
Dataset provided by
Pacific Regional Environment Programme (https://www.sprep.org/)
License
Public Domain Mark 1.0: https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
Area covered
Worldwide; POLYGON ((-172.11181640625 -86.91338962312, 192.10693359375 -86.91338962312, 192.10693359375 84.897146951603, -172.11181640625 84.897146951603, -172.11181640625 -86.91338962312))
Description
This is a MaxEnt model map of the global distribution of the seagrass biome. Species occurrence records were extracted from the Global Biodiversity Information Facility (GBIF), United Nations Environment Programme-World Conservation Monitoring Centre (UNEP-WCMC) Ocean Data Viewer and Ocean biogeographic information system (OBIS). This map shows the suitable habitats for the seagrass distribution at global scale.

Citation: Jayathilake D.R.M., Costello M.J. 2018. A modelled global distribution of the seagrass biome. Biological Conservation. https://doi.org/10.1016/j.biocon.2018.07.009

Use Constraints: Creative Commons Attribution 4.0 Unported (CC BY 4.0). https://creativecommons.org/licenses/by/4.0/.

Free to (1) copy and redistribute the material in any medium or format, (2) remix, transform, and build upon the material for any purpose, even commercially. You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Data distribution service USA Import & Buyer Data
seair.co.in
Updated Nov 29, 2014
Cite
Seair Exim (2014). Data distribution service USA Import & Buyer Data [Dataset]. https://www.seair.co.in
Explore at:
Available download formats: .bin, .xml, .csv, .xls
Dataset updated
Nov 29, 2014
Dataset provided by
Seair Exim Solutions
Authors
Seair Exim
Area covered
United States
Description
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
Hazardous Substances Data Bank (HSDB)
data.virginia.gov
html
Updated Jun 18, 2025
Cite
National Library of Medicine (2025). Hazardous Substances Data Bank (HSDB) [Dataset]. https://data.virginia.gov/dataset/hazardous-substances-data-bank-hsdb
Explore at:
Available download formats: html
Dataset updated
Jun 18, 2025
Dataset provided by
National Library of Medicine
Description
Hazardous Substances Data Bank (HSDB) was a toxicology database that focused on the toxicology of potentially hazardous chemicals. It provided information on human exposure, industrial hygiene, emergency handling procedures, environmental fate, regulatory requirements, nanomaterials, and related areas. The information in HSDB has been assessed by a Scientific Review Panel.

This version of HSDB data includes a subset of HSDB for downloading, but is no longer updated. HSDB data has been incorporated into PubChem.
Digitized NHANES II X-ray Films
data.amerigeoss.org
html
Updated Dec 7, 2020
Cite
United States (2020). Digitized NHANES II X-ray Films [Dataset]. https://data.amerigeoss.org/dataset/digitized-nhanes-ii-x-ray-films
Explore at:
Available download formats: html
Dataset updated
Dec 7, 2020
Dataset provided by
United States
License
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Description
The National Health and Nutrition Examination Surveys (NHANES), conducted by the National Center for Health Statistics, Centers for Disease Control (NCHS/CDC), were designed to assess the health and nutritional status of adults and children in the United States through interviews and direct physical examinations. The NHANES radiographs were scanned by Dr. Bernie Huang at the University of California at Los Angeles and the University of California at San Francisco. Dr. Huang’s group used a Lumysis 100 with a 175 micron spot to scan the first 6000 radiographs. The remaining radiographs were scanned on the Lumysis 150, again with a 175 micron spot size. NOTE: This dataset is no longer updated with new content.
misinfo-general Dataset
paperswithcode.com
Updated Oct 11, 2024
Cite
Ivo Verhoeven; Pushkar Mishra; Ekaterina Shutova (2024). misinfo-general Dataset [Dataset]. https://paperswithcode.com/dataset/misinfo-general
Explore at:
Dataset updated
Oct 11, 2024
Authors
Ivo Verhoeven; Pushkar Mishra; Ekaterina Shutova
Description
We introduce misinfo-general, a benchmark dataset for evaluating misinformation models’ ability to perform out-of-distribution generalisation. Misinformation changes rapidly, much quicker than moderators can annotate at scale, resulting in a shift between the training and inference data distributions. As a result, misinformation models need to be able to perform out-of-distribution generalisation, an understudied problem in existing datasets.

Constructed on top of the various NELA corpora (2017, 2018, 2019, 2020, 2021, 2022), misinfo-general is a large, diverse dataset consisting of news articles from reliable and unreliable publishers. Unlike NELA, we apply several rounds of deduplication and filtering to ensure all articles are of reasonable quality.

We use distant labelling to provide each publisher with rich metadata annotations. These annotations allow for simulating various generalisation splits that misinformation models are confronted with during deployment. We focus on 6 such splits (time, event, topic, publisher, political bias and misinformation type), but more are possible.

By releasing this dataset publicly, we hope to encourage future works that design misinformation models specifically with out-of-distribution generalisation in mind.
Clinical Questions Collection
healthdata.gov
data.virginia.gov
+3more
application/rdfxml +5
Updated Feb 26, 2021
Cite
datadiscovery.nlm.nih.gov (2021). Clinical Questions Collection [Dataset]. https://healthdata.gov/dataset/Clinical-Questions-Collection/3g22-yf4h
Explore at:
Available download formats: application/rssxml, xml, csv, json, application/rdfxml, tsv
Dataset updated
Feb 26, 2021
Dataset provided by
datadiscovery.nlm.nih.gov
Description
The Clinical Questions Collection is a repository of questions that were collected between 1991 and 2003 from healthcare providers in clinical settings across the country. The questions have been submitted by investigators who wish to share their data with other researchers. This dataset is no longer updated with new content. The collection is used in developing approaches to clinical and consumer-health question answering, as well as researching information needs of clinicians and the language they use to express their information needs. All files are formatted in XML.
Meta-Dataset Dataset
paperswithcode.com
Cite
Eleni Triantafillou; Tyler Zhu; Vincent Dumoulin; Pascal Lamblin; Utku Evci; Kelvin Xu; Ross Goroshin; Carles Gelada; Kevin Swersky; Pierre-Antoine Manzagol; Hugo Larochelle, Meta-Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/meta-dataset
Authors
Eleni Triantafillou; Tyler Zhu; Vincent Dumoulin; Pascal Lamblin; Utku Evci; Kelvin Xu; Ross Goroshin; Carles Gelada; Kevin Swersky; Pierre-Antoine Manzagol; Hugo Larochelle
Description
The Meta-Dataset benchmark is a large few-shot learning benchmark and consists of multiple datasets of different data distributions. It does not restrict few-shot tasks to have fixed ways and shots, thus representing a more realistic scenario. It consists of 10 datasets from diverse domains:

ILSVRC-2012 (the ImageNet dataset, consisting of natural images with 1000 categories)
Omniglot (hand-written characters, 1623 classes)
Aircraft (dataset of aircraft images, 100 classes)
CUB-200-2011 (dataset of Birds, 200 classes)
Describable Textures (different kinds of texture images with 43 categories)
Quick Draw (black and white sketches of 345 different categories)
Fungi (a large dataset of mushrooms with 1500 categories)
VGG Flower (dataset of flower images with 102 categories)
Traffic Signs (German traffic sign images with 43 classes)
MSCOCO (images collected from Flickr, 80 classes)

All datasets except Traffic Signs and MSCOCO have training, validation, and test splits (roughly 70%, 15%, and 15%). Traffic Signs and MSCOCO are reserved for testing only.

Historical Unidata Internet Data Distribution (IDD) Global Observational Data

8 scholarly articles cite this dataset (View in Google Scholar)

