8 datasets found

Data_Sheet_1_The impact of transitive annotation on the training of...
figshare.com
pdf
Updated Jan 3, 2024
Harihara Subrahmaniam Muralidharan; Noam Y. Fox; Mihai Pop (2024). Data_Sheet_1_The impact of transitive annotation on the training of taxonomic classifiers.PDF [Dataset]. http://doi.org/10.3389/fmicb.2023.1240957.s001
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fmicb.2023.1240957.s001
Dataset updated
Jan 3, 2024
Dataset provided by
Frontiers
Authors
Harihara Subrahmaniam Muralidharan; Noam Y. Fox; Mihai Pop
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: A common task in the analysis of microbial communities involves assigning taxonomic labels to the sequences derived from organisms found in the communities. Frequently, such labels are assigned using machine learning algorithms that are trained to recognize individual taxonomic groups based on training data sets that comprise sequences with known taxonomic labels. Ideally, the training data should rely on labels that are experimentally verified; formal taxonomic labels require knowledge of physical and biochemical properties of organisms that cannot be directly inferred from sequence alone. However, the labels associated with sequences in biological databases are most commonly computational predictions which themselves may rely on computationally-generated data, a process commonly referred to as “transitive annotation.”
Methods: In this manuscript we explore the implications of training a machine learning classifier (the Ribosomal Database Project’s Bayesian classifier in our case) on data that itself has been computationally generated. We generate new training examples based on 16S rRNA data from a metagenomic experiment, and evaluate the extent to which the taxonomic labels predicted by the classifier change after re-training.
Results: We demonstrate that even a few computationally-generated training data points can significantly skew the output of the classifier to the point where entire regions of the taxonomic space can be disturbed.
Discussion and conclusions: We conclude with a discussion of key factors that affect the resilience of classifiers to transitively-annotated training data, and propose best practices to avoid the artifacts described in our paper.
Fight Lead Poisoning with a Healthy Diet
data.virginia.gov
pdf
Updated Sep 25, 2024
U.S. Environmental Protection Agency (2024). Fight Lead Poisoning with a Healthy Diet [Dataset]. https://data.virginia.gov/dataset/fight-lead-poisoning-with-a-healthy-diet
Available download formats: pdf (314493), pdf (165456)
Dataset updated
Sep 25, 2024
Dataset provided by
United States Environmental Protection Agency, http://www.epa.gov/
Authors
U.S. Environmental Protection Agency
Description
Lead is a poisonous metal that our bodies cannot use. Lead poisoning can cause learning, hearing, and behavioral problems, and can harm your child’s brain, kidneys, and other organs. Lead in the body stops good minerals such as iron and calcium from working right. Some of these effects may be permanent.
Data from: S1 Dataset -
plos.figshare.com
xlsx
Updated May 30, 2023
Lufeng Hu; Huaizhong Li; Zhennao Cai; Feiyan Lin; Guangliang Hong; Huiling Chen; Zhongqiu Lu (2023). S1 Dataset - [Dataset]. http://doi.org/10.1371/journal.pone.0186427.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0186427.s001
Dataset updated
May 30, 2023
Dataset provided by
PLOS ONE
Authors
Lufeng Hu; Huaizhong Li; Zhennao Cai; Feiyan Lin; Guangliang Hong; Huiling Chen; Zhongqiu Lu
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The file contains the first-time test results for coagulation, liver, and kidney indices, with groups coded as deceased (1) and survival (2). (XLSX)
Data from: Puerarin protects against damage to spatial learning and memory...
scielo.figshare.com
jpeg
Updated Jun 4, 2023
S.Q. Cui; Q. Wang; Y. Zheng; B. Xiao; H.W. Sun; X.L. Gu; Y.C. Zhang; C.H. Fu; P.X. Dong; X.M. Wang (2023). Puerarin protects against damage to spatial learning and memory ability in mice with chronic alcohol poisoning [Dataset]. http://doi.org/10.6084/m9.figshare.7898927.v1
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.7898927.v1
Dataset updated
Jun 4, 2023
Dataset provided by
SciELO, http://www.scielo.org/
Authors
S.Q. Cui; Q. Wang; Y. Zheng; B. Xiao; H.W. Sun; X.L. Gu; Y.C. Zhang; C.H. Fu; P.X. Dong; X.M. Wang
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We evaluated the effect of puerarin on spatial learning and memory ability of mice with chronic alcohol poisoning. A total of 30 male C57BL/6 mice were randomly divided into model, puerarin, and control groups (n=10 each). The model group received 60% (v/v) ethanol by intragastric administration followed by intraperitoneal injection of normal saline 30 min later. The puerarin group received intragastric 60% ethanol followed by intraperitoneal puerarin 30 min later, and the control group received intragastric saline followed by intraperitoneal saline. Six weeks after treatment, the Morris water maze and Tru Scan behavioral tests and immunofluorescence staining of cerebral cortex and hippocampal neurons (by Neu-N) and microglia (by Ib1) were conducted. Glutamic acid (Glu) and gamma amino butyric acid (GABA) in the cortex and hippocampus were assayed by high-performance liquid chromatography (HPLC), and tumor necrosis factor (TNF)-α and interleukin (IL)-1β were determined by ELISA. Compared with mice in the control group, escape latency and distance were prolonged, and spontaneous movement distance was shortened (P
Year, State-wise number of unnatural elephant deaths by cause
dataful.in
Updated Mar 18, 2025
Dataful (Factly) (2025). Year, State-wise number of unnatural elephant deaths by cause [Dataset]. https://dataful.in/datasets/20250
Available download formats: application/x-parquet, xlsx, csv
Dataset updated
Mar 18, 2025
Dataset authored and provided by
Dataful (Factly)
License
https://dataful.in/terms-and-conditions
Area covered
States of India
Variables measured
Number of deaths
Description
The dataset gives the annual number of unnatural elephant deaths across states, broken down by cause, such as poaching, electrocution, train accidents, and poisoning.
Statistics concerning different logistic regression models’ abilities to...
plos.figshare.com
xls
Updated Jul 10, 2023
Mohammad Howard-Azzeh; David L. Pearl; Terri L. O’Sullivan; Olaf Berke (2023). Statistics concerning different logistic regression models’ abilities to predict opioid poisoning calls to the APCC in US dogs (2005–2014). [Dataset]. http://doi.org/10.1371/journal.pone.0288339.t006
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0288339.t006
Dataset updated
Jul 10, 2023
Dataset provided by
PLOS ONE
Authors
Mohammad Howard-Azzeh; David L. Pearl; Terri L. O’Sullivan; Olaf Berke
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
Statistics concerning different logistic regression models’ abilities to predict opioid poisoning calls to the APCC in US dogs (2005–2014).
Number of coefficients in models fitted using various logistic regression...
plos.figshare.com
xls
Updated Jul 10, 2023
Number of coefficients in models fitted using various logistic regression models examining the associations between dog-level variables and a poisoning call to the APCC being related to cannabinoids or opioids (2005–2014). [Dataset]. https://plos.figshare.com/articles/dataset/Number_of_coefficients_in_models_fitted_using_various_logistic_regression_models_examining_the_associations_between_dog-level_variables_and_a_poisoning_call_to_the_APCC_sup_a_sup_being_related_to_cannabinoids_or_opioids_2005_2014_/23655732
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0288339.t004
Dataset updated
Jul 10, 2023
Dataset provided by
PLOS ONE
Authors
Mohammad Howard-Azzeh; David L. Pearl; Terri L. O’Sullivan; Olaf Berke
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Number of coefficients in models fitted using various logistic regression models examining the associations between dog-level variables and a poisoning call to the APCC being related to cannabinoids or opioids (2005–2014).
The impact of label noise on the performance of ranking models.
plos.figshare.com
xls
Updated Jun 21, 2023
Shahabeddin Sotudian; Ruidi Chen; Ioannis Ch. Paschalidis (2023). The impact of label noise on the performance of ranking models. [Dataset]. http://doi.org/10.1371/journal.pone.0283574.t003
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0283574.t003
Dataset updated
Jun 21, 2023
Dataset provided by
PLOS ONE
Authors
Shahabeddin Sotudian; Ruidi Chen; Ioannis Ch. Paschalidis
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The impact of label noise on the performance of ranking models.