100+ datasets found

f
Data from: Development and validation of HBV surveillance models using big...
tandf.figshare.com
docx
Updated Dec 3, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Weinan Dong; Cecilia Clara Da Roza; Dandan Cheng; Dahao Zhang; Yuling Xiang; Wai Kay Seto; William C. W. Wong (2024). Development and validation of HBV surveillance models using big data and machine learning [Dataset]. http://doi.org/10.6084/m9.figshare.25201473.v1
Explore at:
docxAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.25201473.v1
Dataset updated
Dec 3, 2024
Dataset provided by
Taylor & Francis
Authors
Weinan Dong; Cecilia Clara Da Roza; Dandan Cheng; Dahao Zhang; Yuling Xiang; Wai Kay Seto; William C. W. Wong
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The construction of a robust healthcare information system is fundamental to enhancing countries’ capabilities in the surveillance and control of hepatitis B virus (HBV). Making use of China’s rapidly expanding primary healthcare system, this innovative approach using big data and machine learning (ML) could help towards the World Health Organization’s (WHO) HBV infection elimination goals of reaching 90% diagnosis and treatment rates by 2030. We aimed to develop and validate HBV detection models using routine clinical data to improve the detection of HBV and support the development of effective interventions to mitigate the impact of this disease in China. Relevant data records extracted from the Family Medicine Clinic of the University of Hong Kong-Shenzhen Hospital’s Hospital Information System were structuralized using state-of-the-art Natural Language Processing techniques. Several ML models have been used to develop HBV risk assessment models. The performance of the ML model was then interpreted using the Shapley value (SHAP) and validated using cohort data randomly divided at a ratio of 2:1 using a five-fold cross-validation framework. The patterns of physical complaints of patients with and without HBV infection were identified by processing 158,988 clinic attendance records. After removing cases without any clinical parameters from the derivation sample (n = 105,992), 27,392 cases were analysed using six modelling methods. A simplified model for HBV using patients’ physical complaints and parameters was developed with good discrimination (AUC = 0.78) and calibration (goodness of fit test p-value >0.05). Suspected case detection models of HBV, showing potential for clinical deployment, have been developed to improve HBV surveillance in primary care setting in China. (Word count: 264) This study has developed a suspected case detection model for HBV, which can facilitate early identification and treatment of HBV in the primary care setting in China, contributing towards the achievement of WHO’s elimination goals of HBV infections.We utilized the state-of-art natural language processing techniques to structure the data records, leading to the development of a robust healthcare information system which enhances the surveillance and control of HBV in China. This study has developed a suspected case detection model for HBV, which can facilitate early identification and treatment of HBV in the primary care setting in China, contributing towards the achievement of WHO’s elimination goals of HBV infections. We utilized the state-of-art natural language processing techniques to structure the data records, leading to the development of a robust healthcare information system which enhances the surveillance and control of HBV in China.
FDA Drug Product Labels Validation Method Data Package
johnsnowlabs.com
csv
Updated Jan 20, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
John Snow Labs (2021). FDA Drug Product Labels Validation Method Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/fda-drug-product-labels-validation-method-data-package/
Explore at:
csvAvailable download formats
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Description
This data package contains information on Structured Product Labeling (SPL) Terminology for SPL validation procedures and information on performing SPL validations.
m
PEN-Method: Predictor model and Validation Data
data.mendeley.com
narcis.nl
Updated Sep 3, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alex Halle (2021). PEN-Method: Predictor model and Validation Data [Dataset]. http://doi.org/10.17632/459f33wxf6.4
Explore at:
Unique identifier
https://doi.org/10.17632/459f33wxf6.4
Dataset updated
Sep 3, 2021
Authors
Alex Halle
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This Data contains the PEN-Predictor-Keras-Model as well as the 100 validation data sets.
Supplementary Materials: Choosing Validation Methods for Agent-Based Models...
zenodo.org
bin, csv, png
Updated Jun 10, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Artem Serdyuk; Artem Serdyuk (2025). Supplementary Materials: Choosing Validation Methods for Agent-Based Models - R Scripts, Data, and Visualizations" [Dataset]. http://doi.org/10.5281/zenodo.15633195
Explore at:
bin, csv, pngAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.15633195
Dataset updated
Jun 10, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Artem Serdyuk; Artem Serdyuk
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains supplementary materials, including R scripts, data files, figures, and documentation for the agent-based model validation framework presented in the article. README.md includes a detailed description.
D
Python functions -- cross-validation methods from a data-driven perspective
phys-techsciences.datastations.nl
zenodo.org
docx, png +4
Updated Aug 16, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Y. Wang; Y. Wang (2024). Python functions -- cross-validation methods from a data-driven perspective [Dataset]. http://doi.org/10.17026/PT/TXAU9W
Explore at:
tiff(2474294), tiff(2412540), tsv(49141), txt(1220), tiff(2413148), tsv(20072), tsv(30174), tiff(4833081), tiff(12196238), tiff(1606453), tiff(4729349), tiff(5695336), tsv(29), tiff(6478950), tiff(6534556), tiff(6466131), text/x-python(8210), docx(63366), tsv(12056), tiff(6567360), tsv(28), tiff(5385805), tsv(263901), tiff(6385076), text/x-python(5598), tiff(2423836), tiff(3417568), text/x-python(8181), png(110251), tiff(5726045), tsv(48948), tsv(1564525), tiff(3031197), tiff(2059260), tiff(2880005), tiff(6135064), tiff(3648419), tsv(102), tiff(3060978), tiff(3802696), tiff(4396561), tiff(1385025), text/x-python(1184), tiff(2817752), tiff(2516606), tsv(27725), text/x-python(12795), tiff(2282443)Available download formats
Unique identifier
https://doi.org/10.17026/PT/TXAU9W
Dataset updated
Aug 16, 2024
Dataset provided by
DANS Data Station Physical and Technical Sciences
Authors
Y. Wang; Y. Wang
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This is the organized python functions of proposed methods in Yanwen Wang PhD research. Researchers can directly use these functions to conduct spatial+ cross-validation, dissimilarity quantification method, and dissimilarity-adaptive cross-validation.
Method Validation Data.xlsx
figshare.com
xlsx
Updated Jan 28, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Norberto Gonzalez; Alanah Fitch (2020). Method Validation Data.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.11741703.v1
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.11741703.v1
Dataset updated
Jan 28, 2020
Dataset provided by
Figsharehttp://figshare.com/
figshare
Authors
Norberto Gonzalez; Alanah Fitch
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data for method validation on detecting pmp-glucose by HPLC
H
Replication Data for: Beyond Standardization: A Comprehensive Review of...
dataverse.harvard.edu
Updated Apr 28, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jana Bernhard-Harrer (2025). Replication Data for: Beyond Standardization: A Comprehensive Review of Topic Modeling Validation Methods for Computational Social Science Research [Dataset]. http://doi.org/10.7910/DVN/N67BDI
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.7910/DVN/N67BDI
Dataset updated
Apr 28, 2025
Dataset provided by
Harvard Dataverse
Authors
Jana Bernhard-Harrer
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
copy directly from abstract in PSRM publicationAs the use of computational text analysis in the social sciences has increased, topic modeling has emerged as a popular method for identifying latent themes in textual data. Nevertheless, concerns have been raised regarding the validity of the results produced by this method, given that it is largely automated and inductive in nature, and the lack of clear guidelines for validating topic models has been identified by scholars as an area of concern. In response, we conducted a comprehensive systematic review of 789 studies that employ topic modeling. Our goal is to investigate whether the field is moving towards a common framework for validating these models. The findings of our review indicate a notable absence of standardized validation practices and a lack of convergence towards specific methods of validation. This gap may be attributed to the inherent incompatibility between the inductive, qualitative approach of topic modeling and the deductive, quantitative tradition that favors standardized validation. To address this, we advocate for incorporating qualitative validation approaches, emphasizing transparency and detailed reporting to improve the credibility of findings in computational social science research, when using topic modeling.
f
Data from: Selection of optimal validation methods for quantitative...
tandf.figshare.com
xlsx
Updated Jun 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
K. Héberger (2023). Selection of optimal validation methods for quantitative structure–activity relationships and applicability domain [Dataset]. http://doi.org/10.6084/m9.figshare.23185916.v1
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.23185916.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Taylor & Francis
Authors
K. Héberger
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This brief literature survey groups the (numerical) validation methods and emphasizes the contradictions and confusion considering bias, variance and predictive performance. A multicriteria decision-making analysis has been made using the sum of absolute ranking differences (SRD), illustrated with five case studies (seven examples). SRD was applied to compare external and cross-validation techniques, indicators of predictive performance, and to select optimal methods to determine the applicability domain (AD). The ordering of model validation methods was in accordance with the sayings of original authors, but they are contradictory within each other, suggesting that any variant of cross-validation can be superior or inferior to other variants depending on the algorithm, data structure and circumstances applied. A simple fivefold cross-validation proved to be superior to the Bayesian Information Criterion in the vast majority of situations. It is simply not sufficient to test a numerical validation method in one situation only, even if it is a well defined one. SRD as a preferable multicriteria decision-making algorithm is suitable for tailoring the techniques for validation, and for the optimal determination of the applicability domain according to the dataset in question.
Prognostics of Power Electronics, methods and validation testbeds - Dataset...
data.nasa.gov
data.staging.idas-ds1.appdat.jsc.nasa.gov
Updated Mar 31, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
nasa.gov (2025). Prognostics of Power Electronics, methods and validation testbeds - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/prognostics-of-power-electronics-methods-and-validation-testbeds
Explore at:
Dataset updated
Mar 31, 2025
Dataset provided by
NASAhttp://nasa.gov/
Description
An overview of the current results of prognostics for DC- DC power converters is presented, focusing on the output filter capacitor component. The electrolytic capacitor used typically as fileter capacitor is one of the components of the power supply with higher failure rate, hence the effort in devel- oping component level prognostics methods for capacitors. An overview of prognostics algorithms based on electrical overstress and thermal overstress accelerated aging data is presented and a discussion on the current efforts in terms of validation of the algorithms is included. The focus of current and future work is to develop a methodology that allows for algoritm development using accelerated aging data and then transform that to a valid algorithm on the real usage time scale.
c
ckanext-validation
catalog.civicdataecosystem.org
Updated Dec 16, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2024). ckanext-validation [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-validation
Explore at:
Dataset updated
Dec 16, 2024
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
The Validation extension for CKAN enhances data quality within the CKAN ecosystem by leveraging the Frictionless Framework to validate tabular data. This extension allows for automated data validation, generating comprehensive reports directly accessible within the CKAN interface. The validation process helps identify structural and schema-level issues, ensuring data consistency and reliability. Key Features: Automated Data Validation: Performs data validation automatically in the background or during dataset creation, streamlining the quality assurance process. Comprehensive Validation Reports: Generates detailed reports on data quality, highlighting issues such as missing headers, blank rows, incorrect data types, or values outside of defined ranges. Frictionless Framework Integration: Utilizes the Frictionless Framework library for robust and standardized data validation. Exposed Actions: Provides accessible action functions that allows data validation to be integrated into custom workflows from other CKAN extensions. Command Line Interface: Offers a command-line interface (CLI) to manually trigger validation jobs for specific datasets, resources, or based on search criteria. Reporting Utilities: Enables the generation of global reports summarizing validation statuses across all resources. Use Cases: Improve Data Quality: Ensures data integrity and adherence to defined schemas, leading to better data-driven decision-making. Streamline Data Workflows: Integrates validation as part of data creation or update processes, automating quality checks and saving time. Customize Data Validation Rules: Allows developers to extend the validation process with their own custom workflows and integrations using the exposed actions. Technical Integration: The Validation extension integrates deeply within CKAN by providing new action functions (resourcevalidationrun, resourcevalidationshow, resourcevalidationdelete, resourcevalidationrunbatch) that can be called via the CKAN API. It also includes a plugin interface (IPipeValidation) for more advanced customization, which allows other extensions to receive and process validation reports. Users can utilize the command-line interface to trigger validation jobs and generate overview reports. Benefits & Impact: By implementing the Validation extension, CKAN installations can significantly improve the quality and reliability of their data. This leads to increased trust in the data, better data governance, and reduced errors in downstream applications that rely on the data. Automated validation helps to proactively identify and resolve data issues, contributing to a more efficient data management process.
d
Prognostics of Power Electronics, methods and validation testbeds
catalog.data.gov
Updated Apr 11, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Dashlink (2025). Prognostics of Power Electronics, methods and validation testbeds [Dataset]. https://catalog.data.gov/dataset/prognostics-of-power-electronics-methods-and-validation-testbeds
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
Dashlink
Description
An overview of the current results of prognostics for DC- DC power converters is presented, focusing on the output filter capacitor component. The electrolytic capacitor used typically as fileter capacitor is one of the components of the power supply with higher failure rate, hence the effort in devel- oping component level prognostics methods for capacitors. An overview of prognostics algorithms based on electrical overstress and thermal overstress accelerated aging data is presented and a discussion on the current efforts in terms of validation of the algorithms is included. The focus of current and future work is to develop a methodology that allows for algoritm development using accelerated aging data and then transform that to a valid algorithm on the real usage time scale.
Z
Data for training, validation and testing of methods in the thesis:...
data.niaid.nih.gov
Updated May 1, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Lucia Hajduková (2021). Data for training, validation and testing of methods in the thesis: Camera-based Accuracy Improvement of Indoor Localization [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4730337
Explore at:
Dataset updated
May 1, 2021
Dataset authored and provided by
Lucia Hajduková
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The package contains files for two modules designed to improve the accuracy of the indoor positioning system, namely the following:

door detection

videos_test - videos used to demonstrate the application of door detector

videos_res - videos from videos_test directory with detected doors marked

parts detection

frames_train_val - images generated from videos used for training and validation of VGG16 neural network model

frames_test - images generated from videos used for testing of the trained model

videos_test - videos used to demonstrate the application of parts detector

videos_res - videos from videos_test directory with detected parts marked
f
Data from: Cross-Validation With Confidence
tandf.figshare.com
zip
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jing Lei (2023). Cross-Validation With Confidence [Dataset]. http://doi.org/10.6084/m9.figshare.9976901.v3
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.9976901.v3
Dataset updated
May 31, 2023
Dataset provided by
Taylor & Francis
Authors
Jing Lei
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Cross-validation is one of the most popular model and tuning parameter selection methods in statistics and machine learning. Despite its wide applicability, traditional cross-validation methods tend to overfit, due to the ignorance of the uncertainty in the testing sample. We develop a novel statistically principled inference tool based on cross-validation that takes into account the uncertainty in the testing sample. This method outputs a set of highly competitive candidate models containing the optimal one with guaranteed probability. As a consequence, our method can achieve consistent variable selection in a classical linear regression setting, for which existing cross-validation methods require unconventional split ratios. When used for tuning parameter selection, the method can provide an alternative trade-off between prediction accuracy and model interpretability than existing variants of cross-validation. We demonstrate the performance of the proposed method in several simulated and real data examples. Supplemental materials for this article can be found online.
Z
Data from: Synthetic Smart Card Data for the Analysis of Temporal and...
data.niaid.nih.gov
Updated Jan 24, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Paul Bouman (2020). Synthetic Smart Card Data for the Analysis of Temporal and Spatial Patterns [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_776718
Explore at:
Dataset updated
Jan 24, 2020
Dataset authored and provided by
Paul Bouman
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a synthetic smart card data set that can be used to test pattern detection methods for the extraction of temporal and spatial data. The data set is tab seperated and based on a stylized travel pattern description for city of Utrecht in The Netherlands and is developed and used in Chapter 6 of the PhD Thesis of Paul Bouman.

This dataset contains the following files:

journeys.tsv : the actual data set of synthetic smart card data

utrecht.xml : the activity pattern definition that was used to randomly generate the synthethic smart card data

validate.ref : a file derived from the activity pattern definition that can be used for validation purposes. It specifies which activity types occur at each location in the smart card data set.
d
Summary report of the 4th IAEA Technical Meeting on Fusion Data Processing,...
search.dataone.org
Updated Sep 24, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
S.M. Gonzalez de Vicente, D. Mazon, M. Xu, S. Pinches, M. Churchill, A. Dinklage, R. Fischer, A. Murari, P. Rodriguez-Fernandez, J. Stillerman, J. Vega, G. Verdoolaege (2024). Summary report of the 4th IAEA Technical Meeting on Fusion Data Processing, Validation and Analysis (FDPVA) [Dataset]. http://doi.org/10.7910/DVN/ZZ9UKO
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/ZZ9UKO
Dataset updated
Sep 24, 2024
Dataset provided by
Harvard Dataverse
Authors
S.M. Gonzalez de Vicente, D. Mazon, M. Xu, S. Pinches, M. Churchill, A. Dinklage, R. Fischer, A. Murari, P. Rodriguez-Fernandez, J. Stillerman, J. Vega, G. Verdoolaege
Description
The objective of the fourth Technical Meeting on Fusion Data Processing, Validation and Analysis was to provide a platform during which a set of topics relevant to fusion data processing, validation and analysis are discussed with the view of extrapolating needs to next step fusion devices such as ITER. The validation and analysis of experimental data obtained from diagnostics used to characterize fusion plasmas are crucial for a knowledge-based understanding of the physical processes governing the dynamics of these plasmas. This paper presents the recent progress and achievements in the domain of plasma diagnostics and synthetic diagnostics data analysis (including image processing, regression analysis, inverse problems, deep learning, machine learning, big data and physics-based models for control) reported at the meeting. The progress in these areas highlight trends observed in current major fusion confinement devices. A special focus is dedicated on data analysis requirements for ITER and DEMO with a particular attention paid to Artificial Intelligence for automatization and improving reliability of control processes.
d
Validation Data for: Demand for family planning satisfied through modern...
search.dataone.org
Updated Nov 8, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jolivet, Rima; Gausman, Jewel; Adanu, Richard; Bandoh, Delia; Berrueta, Mabel; Chakraborty, Suchandrima; Kenu, Ernest; Khan, Nizamuddin; Odikro, Magdalene; Pingray, Veronica; Ramesh, Sowmya; Vázquez, Paula; Williams, Caitlin; Langer, Ana (2023). Validation Data for: Demand for family planning satisfied through modern methods of contraception [Dataset]. http://doi.org/10.7910/DVN/DZSODD
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/DZSODD
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Jolivet, Rima; Gausman, Jewel; Adanu, Richard; Bandoh, Delia; Berrueta, Mabel; Chakraborty, Suchandrima; Kenu, Ernest; Khan, Nizamuddin; Odikro, Magdalene; Pingray, Veronica; Ramesh, Sowmya; Vázquez, Paula; Williams, Caitlin; Langer, Ana
Description
This dataset contains data from household surveys conducted with women in Argentina, Ghana, and India to validate the construct of "Demand for family planning satisfied through modern methods of contraception." Metadata Demand for family planning satisfied through modern methods of contraception Definition Percentage of women of reproductive age (15−49 years) who have their need for family planning satisfied with modern methods Numerator: Number of women of reproductive age (15 – 49 years) who have their need for family planning satisfied with modern methods Denominator: Total number of women of reproductive age (15–49 years) in need of family planning Disaggregator(s) • Wealth • Age • Education • Residence Data Source • MICS • DHS • RHS • Other national surveys Indicator Reference Countdown to 2030 Construct for Validation Women’s self-identified satisfaction of demand for family planning through a modern method of contraception; search for convergent validity comparing women’s subjective perception of satisfaction with family planning method, with an estimation of the concept derived via a constructed measure. Validation Question(s) 1. How does a direct measure of demand satisfaction for family planning (woman’s self-report) compare to the assigned result provided by the DHS algorithm derived from the responses to the series of questions used to calculate the indicator (same woman surveyed) (construct validity)? 2. How does the value of the indicator vary based on a new data source/estimation method compared to an established source/method? II. Study Aims This study aims to validate the DHS algorithm used to determine demand for family planning satisfied by comparing the results of the derived measure for a sample of women to the gold standard of those women’s own subjective perceptions as to whether their demand for family planning was actually satisfied.
GPM GROUND VALIDATION NOAA CPC MORPHING TECHNIQUE (CMORPH) IFLOODS V1
catalog.data.gov
datasets.ai
+3more
Updated Apr 11, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
NASA/MSFC/GHRC (2025). GPM GROUND VALIDATION NOAA CPC MORPHING TECHNIQUE (CMORPH) IFLOODS V1 [Dataset]. https://catalog.data.gov/dataset/gpm-ground-validation-noaa-cpc-morphing-technique-cmorph-ifloods-v1-f4b35
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
NASAhttp://nasa.gov/
Description
The GPM Ground Validation NOAA CPC Morphing Technique (CMORPH) IFloodS dataset consists of global precipitation analyses data produced by the NOAA Climate Prediction Center (CPC). The Iowa Flood Studies (IFloodS) campaign was a ground measurement campaign that took place in eastern Iowa from May 1 to June 15, 2013. The goals of the campaign were to collect detailed measurements of precipitation at the Earth's surface using ground instruments and advanced weather radars and, simultaneously, collect data from satellites passing overhead. The CPC morphing technique uses precipitation estimates from low orbiter satellite microwave observations to produce global precipitation analyses at a high temporal and spatial resolution. Data has been selected for the Iowa Flood Studies (IFloodS) field campaign which took place from April 1, 2013 to June 30, 2013. The dataset includes both the near real-time raw data and bias corrected data from NOAA in binary and netCDF format.
Data from: Validation of Methods to Assess the Immunoglobulin Gene...
data.nasa.gov
data.staging.idas-ds1.appdat.jsc.nasa.gov
+1more
Updated Apr 1, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
data.nasa.gov (2025). Validation of Methods to Assess the Immunoglobulin Gene Repertoire in Tissues Obtained from Mice on the International Space Station [Dataset]. https://data.nasa.gov/dataset/validation-of-methods-to-assess-the-immunoglobulin-gene-repertoire-in-tissues-obtained-fro-e1070
Explore at:
Dataset updated
Apr 1, 2025
Dataset provided by
NASAhttp://nasa.gov/
Description
Spaceflight is known to affect immune cell populations. In particular, splenic B-cell numbers decrease during spaceflight and in ground-based physiological models. Although antibody isotype changes have been assessed during and after spaceflight, an extensive characterization of the impact of spaceflight on antibody composition has not been conducted in mice. Next Generation Sequencing and bioinformatic tools are now available to assess antibody repertoires. We can now identify immunoglobulin gene- segment usage, junctional regions, and modifications that contribute to specificity and diversity. Due to limitations on the International Space Station, alternate sample collection and storage methods must be employed. Our group compared Illumina MiSeq sequencing data from multiple sample preparation methods in normal C57Bl/6J mice to validate that sample preparation and storage would not bias the outcome of antibody repertoire characterization. In this report, we also compared sequencing techniques and a bioinformatic workflow on the data output when we assessed the IgH and Igκ variable gene usage. Our bioinformatic workflow has been optimized for Illumina HiSeq and MiSeq datasets, and is designed specifically to reduce bias, capture the most information from Ig sequences, and produce a data set that provides other data mining options.
d
Data from: Partnering for science: proceedings of the USGS Workshop on...
datadiscoverystudio.org
pdf
Updated Jan 14, 2017
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2017). Partnering for science: proceedings of the USGS Workshop on Citizen Science [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/aee29daa83444cb7bdb59539ad895bc8/html
Explore at:
pdfAvailable download formats
Dataset updated
Jan 14, 2017
Area covered

Description
Link to the ScienceBase Item Summary page for the item described by this metadata record. Service Protocol: Link to the ScienceBase Item Summary page for the item described by this metadata record. Application Profile: Web Browser. Link Function: information
H
Replication Data for: An Empirical Validation Study of Popular Survey...
dataverse.harvard.edu
application/x-gzip +6
Updated Oct 19, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Harvard Dataverse (2020). Replication Data for: An Empirical Validation Study of Popular Survey Methodologies for Sensitive Questions [Dataset]. http://doi.org/10.7910/DVN/29911
Explore at:
application/x-rlang-transport(1439200), pdf(26067), tsv(3673), zip(13725), txt(3726), application/x-gzip(61656), text/plain; charset=us-ascii(455)Available download formats
Unique identifier
https://doi.org/10.7910/DVN/29911
Dataset updated
Oct 19, 2020
Dataset provided by
Harvard Dataverse
License
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/3.0/customlicense?persistentId=doi:10.7910/DVN/29911https://dataverse.harvard.edu/api/datasets/:persistentId/versions/3.0/customlicense?persistentId=doi:10.7910/DVN/29911
Description
When studying sensitive issues including corruption, prejudice, and sexual behavior, researchers have increasingly relied upon indirect questioning techniques to mitigate such known problems of direct survey questions as under-reporting and nonresponse. However, there have been surprisingly few empirical validation studies of these indirect techniques, because the information required to verify the resulting estimates is often difficult to access. This paper reports findings from the first comprehensive validation study of indirect methods. We estimate whether people voted for an anti-abortion referendum held during the 2011 Mississippi General Election using direct questioning and three popular indirect methods: list experiment, endorsement experiment, and randomized response. We then validate these estimates against the official election outcome. While direct questioning leads to significant under-estimation of sensitive votes against the referendum, these survey techniques yield estimates much closer to the actual vote count, with endorsement experiment and randomized response yielding least bias.

Facebook

Twitter

Click to copy link

Link copied

Cite

Weinan Dong; Cecilia Clara Da Roza; Dandan Cheng; Dahao Zhang; Yuling Xiang; Wai Kay Seto; William C. W. Wong (2024). Development and validation of HBV surveillance models using big data and machine learning [Dataset]. http://doi.org/10.6084/m9.figshare.25201473.v1

Data from: Development and validation of HBV surveillance models using big data and machine learning

Explore at:

docxAvailable download formats

Unique identifier

https://doi.org/10.6084/m9.figshare.25201473.v1

Dataset updated

Dec 3, 2024

Dataset provided by

Taylor & Francis

Authors

Weinan Dong; Cecilia Clara Da Roza; Dandan Cheng; Dahao Zhang; Yuling Xiang; Wai Kay Seto; William C. W. Wong

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The construction of a robust healthcare information system is fundamental to enhancing countries’ capabilities in the surveillance and control of hepatitis B virus (HBV). Making use of China’s rapidly expanding primary healthcare system, this innovative approach using big data and machine learning (ML) could help towards the World Health Organization’s (WHO) HBV infection elimination goals of reaching 90% diagnosis and treatment rates by 2030. We aimed to develop and validate HBV detection models using routine clinical data to improve the detection of HBV and support the development of effective interventions to mitigate the impact of this disease in China. Relevant data records extracted from the Family Medicine Clinic of the University of Hong Kong-Shenzhen Hospital’s Hospital Information System were structuralized using state-of-the-art Natural Language Processing techniques. Several ML models have been used to develop HBV risk assessment models. The performance of the ML model was then interpreted using the Shapley value (SHAP) and validated using cohort data randomly divided at a ratio of 2:1 using a five-fold cross-validation framework. The patterns of physical complaints of patients with and without HBV infection were identified by processing 158,988 clinic attendance records. After removing cases without any clinical parameters from the derivation sample (n = 105,992), 27,392 cases were analysed using six modelling methods. A simplified model for HBV using patients’ physical complaints and parameters was developed with good discrimination (AUC = 0.78) and calibration (goodness of fit test p-value >0.05). Suspected case detection models of HBV, showing potential for clinical deployment, have been developed to improve HBV surveillance in primary care setting in China. (Word count: 264) This study has developed a suspected case detection model for HBV, which can facilitate early identification and treatment of HBV in the primary care setting in China, contributing towards the achievement of WHO’s elimination goals of HBV infections.We utilized the state-of-art natural language processing techniques to structure the data records, leading to the development of a robust healthcare information system which enhances the surveillance and control of HBV in China. This study has developed a suspected case detection model for HBV, which can facilitate early identification and treatment of HBV in the primary care setting in China, contributing towards the achievement of WHO’s elimination goals of HBV infections. We utilized the state-of-art natural language processing techniques to structure the data records, leading to the development of a robust healthcare information system which enhances the surveillance and control of HBV in China.

Clear search

Close search

Google apps

Main menu

Data from: Development and validation of HBV surveillance models using big...

FDA Drug Product Labels Validation Method Data Package

PEN-Method: Predictor model and Validation Data

Supplementary Materials: Choosing Validation Methods for Agent-Based Models...

Python functions -- cross-validation methods from a data-driven perspective

Method Validation Data.xlsx

Replication Data for: Beyond Standardization: A Comprehensive Review of...

Data from: Selection of optimal validation methods for quantitative...

Prognostics of Power Electronics, methods and validation testbeds - Dataset...

ckanext-validation

Prognostics of Power Electronics, methods and validation testbeds

Data for training, validation and testing of methods in the thesis:...

Data from: Cross-Validation With Confidence

Data from: Synthetic Smart Card Data for the Analysis of Temporal and...

Summary report of the 4th IAEA Technical Meeting on Fusion Data Processing,...

Validation Data for: Demand for family planning satisfied through modern...

GPM GROUND VALIDATION NOAA CPC MORPHING TECHNIQUE (CMORPH) IFLOODS V1

Data from: Validation of Methods to Assess the Immunoglobulin Gene...

Data from: Partnering for science: proceedings of the USGS Workshop on...

Replication Data for: An Empirical Validation Study of Popular Survey...

Data from: Development and validation of HBV surveillance models using big data and machine learning