100+ datasets found
  1. Data from: Development and validation of HBV surveillance models using big data and machine learning

    • tandf.figshare.com
    docx
    Updated Dec 3, 2024
    Cite
    Weinan Dong; Cecilia Clara Da Roza; Dandan Cheng; Dahao Zhang; Yuling Xiang; Wai Kay Seto; William C. W. Wong (2024). Development and validation of HBV surveillance models using big data and machine learning [Dataset]. http://doi.org/10.6084/m9.figshare.25201473.v1
    Explore at:
    docx (available download formats)
    Dataset updated
    Dec 3, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Weinan Dong; Cecilia Clara Da Roza; Dandan Cheng; Dahao Zhang; Yuling Xiang; Wai Kay Seto; William C. W. Wong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The construction of a robust healthcare information system is fundamental to enhancing countries’ capabilities in the surveillance and control of hepatitis B virus (HBV). Making use of China’s rapidly expanding primary healthcare system, this innovative approach using big data and machine learning (ML) could help towards the World Health Organization’s (WHO) HBV infection elimination goals of reaching 90% diagnosis and treatment rates by 2030. We aimed to develop and validate HBV detection models using routine clinical data to improve the detection of HBV and support the development of effective interventions to mitigate the impact of this disease in China. Relevant data records extracted from the Family Medicine Clinic of the University of Hong Kong-Shenzhen Hospital’s Hospital Information System were structured using state-of-the-art Natural Language Processing techniques. Several ML models were used to develop HBV risk assessment models. The performance of the ML model was then interpreted using the Shapley value (SHAP) and validated using cohort data randomly divided at a ratio of 2:1 within a five-fold cross-validation framework. The patterns of physical complaints of patients with and without HBV infection were identified by processing 158,988 clinic attendance records. After removing cases without any clinical parameters from the derivation sample (n = 105,992), 27,392 cases were analysed using six modelling methods. A simplified model for HBV using patients’ physical complaints and parameters was developed with good discrimination (AUC = 0.78) and calibration (goodness of fit test p-value >0.05). Suspected case detection models of HBV, showing potential for clinical deployment, have been developed to improve HBV surveillance in the primary care setting in China.
    This study developed a suspected case detection model for HBV that can facilitate early identification and treatment of HBV in the primary care setting in China, contributing towards the achievement of the WHO’s HBV elimination goals. We utilized state-of-the-art natural language processing techniques to structure the data records, supporting the development of a robust healthcare information system that enhances the surveillance and control of HBV in China.
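The validation scheme described above (a 2:1 random split of cases, with cross-validation on the derivation portion) can be sketched as follows. This is an illustrative sketch only, not the authors' actual code; the function name and fold construction are assumptions:

```python
import random

def split_and_folds(n_cases, seed=42, k=5):
    """Randomly split case indices 2:1 into derivation and validation sets,
    then partition the derivation set into k cross-validation folds."""
    rng = random.Random(seed)
    idx = list(range(n_cases))
    rng.shuffle(idx)
    cut = (2 * n_cases) // 3                      # 2:1 derivation:validation ratio
    derivation, validation = idx[:cut], idx[cut:]
    folds = [derivation[i::k] for i in range(k)]  # k near-equal folds
    return derivation, validation, folds

# 27,392 analysable cases, as in the abstract
derivation, validation, folds = split_and_folds(27392)
```

In practice a library routine such as scikit-learn's `KFold` would typically replace the hand-rolled fold construction.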

  2. Validation

    • data.va.gov
    • datahub.va.gov
    • +3 more
    application/rdfxml +5
    Updated Sep 12, 2019
    Cite
    (2019). Validation [Dataset]. https://www.data.va.gov/dataset/Validation/a23g-khhe
    Explore at:
    csv, xml, application/rdfxml, tsv, application/rssxml, json (available download formats)
    Dataset updated
    Sep 12, 2019
    Description

    Validation to ensure data and identity integrity. DAS will also ensure that security-compliance standards are met.

  3. MIPS Data Validation Criteria

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). MIPS Data Validation Criteria [Dataset]. https://www.johnsnowlabs.com/marketplace/mips-data-validation-criteria/
    Explore at:
    csv (available download formats)
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Time period covered
    2017 - 2020
    Area covered
    United States
    Description

    This dataset includes the MIPS Data Validation Criteria. The Medicare Access and CHIP Reauthorization Act of 2015 (MACRA) streamlines a patchwork collection of programs into a single system in which providers can be rewarded for better care. Providers will be able to practice as they always have, but they may receive higher Medicare payments based on their performance.

  4. Data from: Synthetic Smart Card Data for the Analysis of Temporal and Spatial Patterns

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Paul Bouman (2020). Synthetic Smart Card Data for the Analysis of Temporal and Spatial Patterns [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_776718
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset authored and provided by
    Paul Bouman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a synthetic smart card data set that can be used to test pattern detection methods for the extraction of temporal and spatial data. The data set is tab-separated, is based on a stylized travel pattern description for the city of Utrecht in the Netherlands, and was developed and used in Chapter 6 of the PhD thesis of Paul Bouman.

    This dataset contains the following files:

    journeys.tsv : the actual data set of synthetic smart card data

    utrecht.xml : the activity pattern definition that was used to randomly generate the synthetic smart card data

    validate.ref : a file derived from the activity pattern definition that can be used for validation purposes. It specifies which activity types occur at each location in the smart card data set.

  5. Model validation data from 2018 to 2020

    • portal.zero.govt.nz
    • catalogue.data.govt.nz
    Updated Nov 1, 2020
    + more versions
    Cite
    zero.govt.nz (2020). Model validation data from 2018 to 2020 - Dataset - data.govt.nz - discover and use data [Dataset]. https://portal.zero.govt.nz/77d6ef04507c10508fcfc67a7c24be32/dataset/oai-figshare-com-article-12278786
    Explore at:
    Dataset updated
    Nov 1, 2020
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was used to validate the global distribution of the kelp biome model. These data were downloaded from the GBIF online database and cleaned to maintain the highest georeference accuracy. The MaxEnt probability value of each record is given in the last column.

  6. Data pipeline Validation And Load Testing using Multiple JSON Files

    • data.niaid.nih.gov
    Updated Mar 26, 2021
    Cite
    Afsana Khan (2021). Data pipeline Validation And Load Testing using Multiple JSON Files [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4636789
    Explore at:
    Dataset updated
    Mar 26, 2021
    Dataset provided by
    Mainak Adhikari
    Pelle Jakovits
    Afsana Khan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The datasets were used to validate and test the data pipeline deployment following the RADON approach. The dataset contains temperature and humidity sensor readings for a particular day, which were synthetically generated with a data generator and stored as JSON files to validate and test (performance/load testing) the data pipeline components.
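A synthetic generator of the kind described (temperature/humidity readings for one day, serialized as JSON) might look like this sketch. The field names, value ranges, and sampling interval are assumptions for illustration, not the dataset's actual schema:

```python
import json
import random
from datetime import datetime, timedelta

def generate_readings(n, seed=1):
    """Synthetically generate n temperature/humidity readings spread over one
    day, each serialized as a JSON record (hypothetical field names)."""
    rng = random.Random(seed)
    start = datetime(2021, 3, 26)
    records = []
    for i in range(n):
        records.append(json.dumps({
            "timestamp": (start + timedelta(seconds=i * 86400 // n)).isoformat(),
            "temperature_c": round(rng.uniform(15, 30), 2),   # assumed range
            "humidity_pct": round(rng.uniform(30, 90), 1),    # assumed range
        }))
    return records

# e.g. one reading per hour; each record could then be written to its own file
records = generate_readings(24)
```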

  7. Rainbow training and validation data

    • dataverse.harvard.edu
    Updated Nov 26, 2022
    Cite
    Kimberly Carlson (2022). Rainbow training and validation data [Dataset]. http://doi.org/10.7910/DVN/YTRMGN
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 26, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Kimberly Carlson
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset includes the date and time, latitude (“lat”), longitude (“lon”), sun angle (“sun_angle”, in degrees), rainbow presence (TRUE = rainbow, FALSE = no rainbow), cloud cover (“cloud_cover”, proportion), and liquid precipitation (“liquid_precip”, kg m^-2 s^-1) for each record used to train and/or validate the models.

  8. Validations on the surface network: Number of validations per day (2nd quarter 2023)

    • ckan.mobidatalab.eu
    Updated Sep 12, 2023
    Cite
    Île-de-France Mobilités (2023). Validations on the surface network: Number of validations per day (2nd quarter 2023) [Dataset]. https://ckan.mobidatalab.eu/dataset/validations-on-the-surface-network-number-of-validations-per-day-2nd-quarter-2023
    Explore at:
    octet-stream, xlsx, ld+json, json, n3, csv, turtle, rdf+xml (available download formats)
    Dataset updated
    Sep 12, 2023
    Dataset provided by
    Île-de-France Mobilités
    License

    http://vvlibri.org/fr/licence/odbl-10/legalcode/unofficial

    Description

    Validation data is updated at the end of February and the end of August.

    This dataset presents the number of traveler validations by day per line and per transport ticket on the surface network.

    Data updates

    Data is updated semi-annually.

    Documentation and dataset information may be updated more regularly.

    Documentation

    Documentation for the validation data is available: consult the documentation on validation data.

    Validation data sets

    To access validation data from previous years and other validation datasets, you can consult the validation data history: https://data.iledefrance-mobilites.fr/explore/dataset/histo-validations

  9. Structure tensor validation

    • portal.zero.govt.nz
    • catalogue.data.govt.nz
    Updated Feb 22, 2024
    Cite
    zero.govt.nz (2024). Structure tensor validation - Dataset - data.govt.nz - discover and use data [Dataset]. https://portal.zero.govt.nz/77d6ef04507c10508fcfc67a7c24be32/dataset/oai-figshare-com-article-25216145
    Explore at:
    Dataset updated
    Feb 22, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Structure tensor validation

    General information

    This item contains test data to validate the structure tensor algorithms and a supplemental paper describing how the data was generated and used.

    Contents

    The test_data.zip archive contains 101 slices of a cylinder (701x701 pixels) with two artificially created fibre orientations. The outer fibres are oriented longitudinally, and the inner fibres are oriented circumferentially, similar to the ones found in the rat uterus.

    The SupplementaryMaterials_rat_uterus_texture_validation.pdf file is a short supplemental paper describing the generation of the test data and the results after being processed with the structure tensor code.

  10. Refining and validating change requests from a crowd to derive requirements

    • heidata.uni-heidelberg.de
    docx, pdf, xlsx
    Updated Nov 14, 2024
    + more versions
    Cite
    Leon Radeck; Barbara Paech; Leon Radeck; Barbara Paech (2024). Refining and validating change requests from a crowd to derive requirements [data] [Dataset]. http://doi.org/10.11588/DATA/N1T5T8
    Explore at:
    docx (21014), xlsx (40387), pdf (10485117), xlsx (137970) (available download formats)
    Dataset updated
    Nov 14, 2024
    Dataset provided by
    heiDATA
    Authors
    Leon Radeck; Barbara Paech; Leon Radeck; Barbara Paech
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    Carl Zeiss Foundation
    Description

    [Context/Motivation] Integrating user feedback into software development enhances system acceptance, decreases the likelihood of project failure and strengthens customer loyalty. Moreover, user feedback plays an important role in software evolution, because it can be the basis for deriving requirements. [Problems] However, to be able to derive requirements from feedback, the feedback must contain actionable change requests, that is, detailed information regarding a change to the application. Furthermore, requirements engineers must know how many users support the change request. [Principal ideas] To address these challenges, we propose an approach that uses structured questions to transform non-actionable change requests into actionable ones, and validates the change requests to assess their support among the users. We evaluate the approach in the large-scale research project SMART-AGE with over 200 older adults, aged 67 and older. [Contribution] We contribute a set of templates for our questions and our process, and we evaluate the approach’s feasibility, effectiveness and user satisfaction, resulting in very positive outcomes.

  11. Data set of "Smart" Brace Validation Study

    • borealisdata.ca
    • search.dataone.org
    Updated Nov 29, 2023
    Cite
    Vincent Nguyen; William Gage (2023). Data set of "Smart" Brace Validation Study [Dataset]. http://doi.org/10.5683/SP3/MGQYBR
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    Borealis
    Authors
    Vincent Nguyen; William Gage
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This data set was collected to validate a 'smart' knee brace, with an IMU embedded on the thigh and shank area, against a reference motion capture system (Vicon). There were 10 participants in total, and each participant came into the lab for 2 sessions on separate days. For each session, participants completed three trials of 2-minute treadmill walking at their preferred walking speed, three trials of 15 squats to parallel, three trials of 10 sit-to-stands on a chair that was about knee level, three trials of 15 total alternating lunges, and three trials of 2-minute treadmill jogging at their preferred speed, all in that order. Some participants did 10 squats and 10 lunges in their first session but 15 of each in the second (a .txt file is included for each participant's session to specify). This dataset only contains the IMU data.

  12. Gulf of Maine - Control Points Used to Validate the Accuracies of the Interpolated Water Density Rasters

    • s.cnmilf.com
    • datasets.ai
    • +2 more
    Updated Oct 18, 2024
    + more versions
    Cite
    (Point of Contact, Custodian) (2024). Gulf of Maine - Control Points Used to Validate the Accuracies of the Interpolated Water Density Rasters [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/gulf-of-maine-control-points-used-to-validate-the-accuracies-of-the-interpolated-water-density-1
    Explore at:
    Dataset updated
    Oct 18, 2024
    Dataset provided by
    (Point of Contact, Custodian)
    Area covered
    Gulf of Maine, Maine
    Description

    This feature dataset contains the control points used to validate the accuracies of the interpolated water density rasters for the Gulf of Maine. These control points were selected randomly from the water density data points, using Hawth's Create Random Selection Tool. Twenty-five percent of each seasonal bin (for each year and at each depth) were randomly selected and set aside for validation. For example, if there were 1,000 water density data points for the fall (September, October, November) of 2003 at 0 meters, then 250 of those points were randomly selected, removed and set aside to assess the accuracy of the interpolated surface. The naming convention of the validation point feature class includes the year (or years), the season, and the depth (in meters) it was selected from. For example, the name ValidationPoints_1997_2004_Fall_0m would indicate that this point feature class was randomly selected from water density points that were at 0 meters in the fall between 1997-2004. The seasons were defined using the same months as the remote sensing data, namely: Fall = September, October, November; Winter = December, January, February; Spring = March, April, May; and Summer = June, July, August.
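The per-bin random holdout described above can be sketched as follows. This is an illustrative sketch only (the actual selection used Hawth's Create Random Selection Tool in ArcGIS); the dictionary field names are hypothetical:

```python
import random
from collections import defaultdict

def holdout_per_bin(points, frac=0.25, seed=0):
    """Set aside a random fraction of points from each (year, season, depth)
    bin for validation, returning (training, validation) lists."""
    rng = random.Random(seed)
    bins = defaultdict(list)
    for p in points:                               # group points by seasonal bin
        bins[(p["year"], p["season"], p["depth_m"])].append(p)
    train, valid = [], []
    for members in bins.values():
        rng.shuffle(members)
        n_valid = round(len(members) * frac)       # 25% of this bin
        valid.extend(members[:n_valid])
        train.extend(members[n_valid:])
    return train, valid

# e.g. 1,000 fall-2003 points at 0 m -> 250 set aside for validation
points = [{"year": 2003, "season": "Fall", "depth_m": 0, "density": 1025.0}
          for _ in range(1000)]
train, valid = holdout_per_bin(points)
```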

  13. GPS Validation Mark (DOT-031)

    • catalogue.data.wa.gov.au
    Updated Jul 30, 2020
    + more versions
    Cite
    (2020). GPS Validation Mark (DOT-031) - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/gps-validation-mark-dot-031
    Explore at:
    Dataset updated
    Jul 30, 2020
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Western Australia
    Description

    Global Positioning System (GPS) satellite navigation validation marks are unique visible markers located at a number of public boat ramps and associated jetties, which mariners or owners of portable GPS units can use to validate their position and map datum settings.

  14. Data from: RM3 Wave Tank Validation Model

    • catalog.data.gov
    • mhkdr.openei.org
    • +2 more
    Updated Jan 20, 2025
    Cite
    National Renewable Energy Laboratory (2025). RM3 Wave Tank Validation Model [Dataset]. https://catalog.data.gov/dataset/rm3-wave-tank-validation-model-ee6aa
    Explore at:
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    National Renewable Energy Laboratory
    Description

    An approximately 1/75th-scale point absorber wave energy converter was built to validate the testing systems of a 16,000-gallon single-paddle wave tank. The model was built based on the RM3 design and incorporated a linear position sensor, a force transducer, and wetness detection sensors. The data set also includes motion tracking data of the device's two bodies acquired from four Qualisys cameras. The tank wave spectrum is measured by four ultrasonic water height sensors.

  15. Validation data for a coupled water-heat-salt multi-field transport model

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 3, 2024
    Cite
    Validation data for a coupled water-heat-salt multi-field transport model [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10452990
    Explore at:
    Dataset updated
    Jan 3, 2024
    Dataset authored and provided by
    Ruan, Dongmei
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These data were utilized to validate the accuracy of the newly developed multi-field coupling model. Water and salt transport experiments under freezing conditions were carried out in the laboratory to determine the mass salt content and volumetric water content at different heights of the soil column after freezing.

  16. Data pipeline Validation And Load Testing using Multiple CSV Files

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1 more
    Updated Mar 26, 2021
    Cite
    Afsana Khan (2021). Data pipeline Validation And Load Testing using Multiple CSV Files [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4636797
    Explore at:
    Dataset updated
    Mar 26, 2021
    Dataset provided by
    Mainak Adhikari
    Pelle Jakovits
    Afsana Khan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The datasets were used to validate and test the data pipeline deployment following the RADON approach. The dataset is a CSV file containing around 32,000 Twitter tweets. From this single file, 100 CSV files were created, each containing 320 tweets. Those 100 CSV files are used to validate and test (performance/load testing) the data pipeline components.
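A chunking step like the one described (one 32,000-row CSV split into 100 files of 320 rows each) can be sketched as follows; this is an illustrative sketch, not the project's actual code:

```python
import csv
import io

def split_csv(rows, chunk_size):
    """Split a list of CSV rows into chunks of chunk_size rows each, rendering
    each chunk as its own CSV text (one chunk per output file)."""
    chunks = []
    for start in range(0, len(rows), chunk_size):
        buf = io.StringIO()
        csv.writer(buf).writerows(rows[start:start + chunk_size])
        chunks.append(buf.getvalue())
    return chunks

# hypothetical rows standing in for the tweet CSV
rows = [[i, f"tweet text {i}"] for i in range(32000)]
chunks = split_csv(rows, 320)   # 100 chunks of 320 tweets each
```

In a real pipeline, each chunk would then be written to its own file and fed to the components under load test.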

  17. Adelaide Metrocard Validations

    • data.sa.gov.au
    • researchdata.edu.au
    Updated Jul 13, 2023
    + more versions
    Cite
    (2023). Adelaide Metrocard Validations [Dataset]. https://data.sa.gov.au/data/dataset/adelaide-metrocard-validations
    Explore at:
    Dataset updated
    Jul 13, 2023
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Adelaide
    Description

    Total onboard validations (banded) by date, mode, route, direction, stop and media type

  18. Validation Data UT-SAFT Resolution

    • data.mendeley.com
    Updated Oct 7, 2024
    Cite
    Hubert Mooshofer (2024). Validation Data UT-SAFT Resolution [Dataset]. http://doi.org/10.17632/w9rsywyd43.1
    Explore at:
    Dataset updated
    Oct 7, 2024
    Authors
    Hubert Mooshofer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Experimental data obtained from the inspection of steel disks, used to validate the UT-SAFT resolution formulas in the referenced paper 'UT-SAFT resolution'. The purpose is to show that the indication size of small test reflectors matches well with the resolution formulas derived in that paper.

  19. Forage Fish Aerial Validation Data from Prince William Sound, Alaska

    • data.usgs.gov
    • s.cnmilf.com
    • +1 more
    Updated Mar 4, 2025
    + more versions
    Cite
    Forage Fish Aerial Validation Data from Prince William Sound, Alaska [Dataset]. https://data.usgs.gov/datacatalog/data/USGS:ASC523
    Explore at:
    Dataset updated
    Mar 4, 2025
    Dataset provided by
    United States Geological Survey: http://www.usgs.gov/
    Authors
    Mayumi Arimitsu; John Piatt; Brielle Heflin; Caitlin Marsteller
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    2014 - 2022
    Area covered
    Prince William Sound, Alaska
    Description

    These data are part of the Gulf Watch Alaska (GWA) long-term monitoring program, pelagic monitoring component. This dataset consists of one table, providing fish aerial validation data from summer surveys in Prince William Sound, Alaska. Data includes: date, location, latitude, longitude, aerial ID, validation ID, total length and validation method. Various catch methods were used to obtain fish samples for aerial validations, including: cast net, GoPro, hydroacoustics, jig, dip net, gill net, purse seine, photo and visual identification.

  20. Validations on the rail network: Number of validations per day (1st half 2024)

    • data.iledefrance-mobilites.fr
    csv, excel, json
    Updated Jan 20, 2025
    + more versions
    Cite
    Validations sur le réseau ferré : Nombre de validations par jour (1er semestre 2024) [Dataset]. https://data.iledefrance-mobilites.fr/explore/dataset/validations-reseau-ferre-nombre-validations-par-jour-1er-semestre/
    Explore at:
    json, csv, excel (available download formats)
    Dataset updated
    Jan 20, 2025
    License

    https://doc.transport.data.gouv.fr/le-point-d-acces-national/cadre-juridique/conditions-dutilisation-des-donnees/licence-odbl

    Description

    Validation data is updated at the end of February and the end of August.

    This dataset presents the number of traveler validations per day, per stop and per transport ticket on the rail network.

    Data updates: data is updated semi-annually. Documentation and dataset information may be updated more regularly.

    Documentation: documentation for the validation data is available; consult the documentation on validation data.

    Validation datasets

    To access validation data from previous years and other validation datasets, you can consult the validation data history.
