57 datasets found

Toxicity Reference Database
catalog.data.gov
datasets.ai
Updated Dec 3, 2020
Cite
U.S. EPA Office of Research and Development (ORD) - National Center for Computational Toxicology (NCCT) (2020). Toxicity Reference Database [Dataset]. https://catalog.data.gov/dataset/toxicity-reference-database
Dataset updated
Dec 3, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
The Toxicity Reference Database (ToxRefDB) contains approximately 30 years and $2 billion worth of animal studies. ToxRefDB allows scientists and the interested public to search and download thousands of animal toxicity testing results for hundreds of chemicals that were previously found only in paper documents. Currently, there are 474 chemicals in ToxRefDB, primarily the data-rich pesticide active ingredients, but the number will continue to expand.
[Therapeutics Data Commons] Acute Toxicity LD50
kaggle.com
Updated Jan 11, 2025
Cite
Seongjin Kim (2025). [Therapeutics Data Commons] Acute Toxicity LD50 [Dataset]. https://www.kaggle.com/datasets/iapetus509/therapeutics-data-commons-acute-toxicity-ld50
Dataset updated
Jan 11, 2025
Dataset provided by
Kaggle
Authors
Seongjin Kim
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
https://tdcommons.ai/single_pred_tasks/tox#acute-toxicity-ld50

Dataset Description: Acute toxicity LD50 measures the most conservative dose that can lead to lethal adverse effects. The lower the dose, the more lethal the drug. This dataset is kindly provided by the authors of [1].

Task Description: Regression. Given a drug SMILES string, predict its acute toxicity.

Dataset Statistics: 7,385 drugs.

Dataset Split: Random Split, Scaffold Split

from tdc.single_pred import Tox
data = Tox(name = 'LD50_Zhu')
split = data.get_split()

References: [1] Zhu, Hao, et al. “Quantitative structure-activity relationship modeling of rat acute toxicity by oral exposure.” Chemical Research in Toxicology 22.12 (2009): 1913-1921.

Dataset License: CC BY 4.0.
The toxicity data of compounds
figshare.com
zip
Updated Mar 21, 2025
Cite
Jiang Lu (2025). The toxicity data of compounds [Dataset]. http://doi.org/10.6084/m9.figshare.27195339.v5
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.27195339.v5
Dataset updated
Mar 21, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
Jiang Lu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
toxric_30_datasets.zip: The expanded predictive toxicology dataset is sourced from TOXRIC, a comprehensive and standardized toxicology database. It contains 30 assay datasets with ~150,000 measurements across five categories of toxicity assessment: genetic toxicity, organic toxicity, clinical toxicity, developmental and reproductive toxicity, and reactive toxicity.

multiple_endpoint_acute_toxicity_dataset.zip & all_descriptors.txt: This 59-endpoint acute toxicity dataset is sourced from TOXRIC. It covers 59 toxicity endpoints and 80,081 unique compounds represented as SMILES strings, with 122,594 usable toxicity measurements given as continuous values in a unified unit, -log(mol/kg). The larger the value, the more toxic the compound is toward that endpoint. The 59 endpoints involve 15 species (mouse, rat, rabbit, guinea pig, dog, cat, bird wild, quail, duck, chicken, frog, mammal, man, women, and human), 8 administration routes (intraperitoneal, intravenous, oral, skin, subcutaneous, intramuscular, parenteral, and unreported), and 3 measurement indicators: LD50 (lethal dose 50%), LDLo (lethal dose low), and TDLo (toxic dose low). Each compound has measurements for only a small number of endpoints, so the dataset is very sparse, with nearly 97.4% of compound-to-endpoint measurements missing.

The dataset is also extremely imbalanced. Some endpoints have tens of thousands of measurements (e.g., mouse-intraperitoneal-LD50 has 36,295, mouse-oral-LD50 has 23,373, and rat-oral-LD50 has 10,190), while others have only around 100, such as mouse-intravenous-LDLo, rat-intravenous-LDLo, frog-subcutaneous-LD50, and human-oral-TDLo. This sparsity and imbalance make acute toxicity evaluation challenging. Among the 59 endpoints, the 21 endpoints with fewer than 200 measurements were considered small-sized endpoints, and the 11 endpoints with more than 1000 measurements were treated as large-sized endpoints. The three endpoints targeting humans (human-oral-TDLo, women-oral-TDLo, and man-oral-TDLo) are typical small-sized endpoints, with only 140, 156, and 163 available measurements, respectively. The acute toxicity measurement values of the 80,081 compounds for the 59 endpoints, together with the 5-fold random splits, are provided in multiple_endpoint_acute_toxicity_dataset.zip. The molecular fingerprints and feature descriptors of the 80,081 compounds, such as Avalon, Morgan, and AtomPair, are given in all_descriptors.txt.

115-endpoint_acute_toxiciy_dataset.zip: We collected more acute toxicity data of compounds from the PubChem database through web crawling. We unified all toxicity measurement units into -log(mol/kg) and retained the endpoints with no fewer than 30 available samples. This produced a new acute toxicity dataset containing 115 endpoints. Compared with the 59-endpoint dataset from TOXRIC, the number of acute toxicity endpoints has doubled, adding more species (such as goat, monkey, and hamster), administration routes (such as intracerebral and intratracheal), and measurement indicators (such as LD10 and LD20). The sample imbalance among endpoints and the missing-data rate are more severe in this dataset: its sparsity rate reaches 98.7%, and it contains 68 small-sample endpoints (i.e., endpoints with fewer than 200 toxicity measurements), the smallest of which has only 30 measurements. This dataset is therefore more challenging for all current acute toxicity prediction models.
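The unit and sparsity figures quoted above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the source doses are given in mg/kg and that a molecular weight in g/mol is available for each compound. It is not the authors' preprocessing code.

```python
import math


def to_neg_log_mol_per_kg(dose_mg_per_kg: float, mol_weight_g_per_mol: float) -> float:
    """Convert a dose in mg/kg to -log10(mol/kg) (assumed input unit: mg/kg)."""
    if dose_mg_per_kg <= 0 or mol_weight_g_per_mol <= 0:
        raise ValueError("dose and molecular weight must be positive")
    # mg/kg -> g/kg -> mol/kg
    mol_per_kg = (dose_mg_per_kg / 1000.0) / mol_weight_g_per_mol
    return -math.log10(mol_per_kg)


def missing_rate(n_compounds: int, n_endpoints: int, n_measurements: int) -> float:
    """Fraction of empty cells in a compounds x endpoints label matrix."""
    return 1.0 - n_measurements / (n_compounds * n_endpoints)


# Hypothetical compound: 100 mg/kg with a molecular weight of 200 g/mol.
print(round(to_neg_log_mol_per_kg(100.0, 200.0), 3))

# Figures quoted in the description of the 59-endpoint dataset.
print(round(missing_rate(80081, 59, 122594), 3))
```

Because the transform is a negative logarithm, a smaller lethal dose maps to a larger value, which is why larger values indicate stronger toxicity in these files.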
ld50-smiles-descriptors-dataset
kaggle.com
zip
Updated Sep 15, 2025
Cite
Hillary Mongare (2025). ld50-smiles-descriptors-dataset [Dataset]. https://www.kaggle.com/datasets/hillarymongare/ld50-smiles-descriptors-dataset
Explore at:
Available download formats: zip (906311 bytes)
Dataset updated
Sep 15, 2025
Authors
Hillary Mongare
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset is a curated and feature-enhanced collection of 7,413 chemical compounds for computational toxicology studies. It's a good resource for a machine learning exercise. Its primary purpose is to enable the prediction of acute toxicity (LD₅₀) using machine learning methods.

Each row represents one unique chemical compound and includes:

Identifiers:

Name – IUPAC name of the compound

SMILES – cheminformatics string describing molecular structure

Target Variable:

LD50 – median lethal dose (lower values = higher toxicity)

Molecular Descriptors (25 features)

Physicochemical and structural properties calculated using RDKit and PubChemPy, such as:

MolWt, HeavyAtomCount, NumHDonors, NumHAcceptors, RingCount

Topological indices (e.g. Chi2v, Chi3v, Kappa3)

Electronic properties (MaxPartialCharge, MinEStateIndex)

Drug-likeness (qed)

MOE-type descriptors (SMR_VSA10, SlogP_VSA5, VSA_EState4, etc.)
Data from: Multimodal Dataset for LD50 Toxicity Prediction of Pesticides...
portalinvestigacion.uniovi.es
Updated 2025
Cite
Junquera Álvarez, Enol; Febbraio, Ferdinando; Díaz, Irene (2025). Multimodal Dataset for LD50 Toxicity Prediction of Pesticides Using Deep Learning [Dataset]. https://portalinvestigacion.uniovi.es/documentos/688b601c17bb6239d2d488c9?lang=de
Dataset updated
2025
Authors
Junquera Álvarez, Enol; Febbraio, Ferdinando; Díaz, Irene
Description
This dataset supports the study "Saving Mice: ChenseNet121, a New Deep Learning Architecture for LD50 Toxicity Estimation", and was specifically designed for training and evaluating multimodal deep learning models for acute oral toxicity (LD50) prediction in pesticides. It integrates multiple data representations for each compound:

2D images of molecular structures (folder: images/, PNG format), downloaded from PubChem and identified by compound CID.

3D voxelized volumes derived from molecular docking simulations against human acetylcholinesterase (hAChE, PDB: 7E3H), formatted as tensors and stored as .npy files (not shown in screenshot).

Physicochemical descriptors, extracted from SMILES using RDKit, including molecular weight, logP, TPSA, number of rotatable bonds, and docking binding affinities. These are stored in plain text files:

dataset_descriptores_bool.txt

dataset_descriptores_float.txt

dataset_descriptores_2x2x2_bool.txt

dataset_descriptores_2x2x2_float.txt

CSV files containing the integrated dataset (combined_dataset.csv) and a balanced test subset for classification tasks (balanced_test.csv).

The dataset is aligned with EFSA guidelines and enables the training of machine learning models using image-based, structural, and biochemical features. It was used to develop and evaluate the ChenseNet121 architecture, which outperforms ResNet, Inception, and EfficientNet variants in LD50 regression and WHO-aligned toxicity classification.

Suggested Citation: Junquera, E., Remeseiro, B., Febbraio, F., & Díaz, I. (2025). Multimodal Dataset for LD50 Toxicity Prediction of Pesticides Using Deep Learning. Zenodo.

Related Publication: Junquera et al. (2025). Saving Mice: ChenseNet121, a New Deep Learning Architecture for LD50 Toxicity Estimation.
Data from: Integrating QSAR models predicting acute contact toxicity and...
nde-dev.biothings.io
Updated Oct 20, 2020
Cite
Carnesecchi, Edoardo (2020). Integrating QSAR models predicting acute contact toxicity and mode of action profiling in honey bees (A. mellifera): Data curation using open source databases, performance testing and validation [Dataset]. https://nde-dev.biothings.io/resources?id=zenodo_4104276
Dataset updated
Oct 20, 2020
Dataset provided by
Dorne, Jean-Lou
Benfenati, Emilio
Carnesecchi, Edoardo
Roncaglioni, Alessandra
Toma, Cosimo
Kramer, Nynke
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This Excel file (DOI: https://doi.org/10.5281/zenodo.3755675) provides the raw data used to develop the first integrative Quantitative Structure-Activity Relationship (QSAR) model using EFSA's OpenFoodTox, US-EPA ECOTOX and the Pesticide Properties DataBase i) to predict acute contact toxicity (LD50) and ii) to profile the Mode of Action (MoA) of pesticide active substances in honey bees (Apis mellifera). Chemical identifiers (e.g. SMILES, CAS no., InChI) and acute contact toxicity data (LD50) on honey bees were used to develop and validate i) a two-category QSAR model (toxic/non-toxic; n=411; sensitivity=0.93, specificity=0.85, balanced accuracy=0.90, Matthews correlation coefficient MCC=0.78), and ii) a regression-based model (n=113; R2=0.74; MAE=0.52). The study also proposes the first MoA profiling for 113 pesticide active substances and the first harmonised MoA classification scheme for acute contact toxicity in honey bees, including LD50 data points from three databases: EFSA's OpenFoodTox, US-EPA ECOTOX and the Pesticide Properties DataBase. This classification allows the MoAs and target sites of Plant Protection Product (PPP) active substances to be defined further, enabling regulators and scientists to refine chemical grouping and toxicity extrapolations for single chemicals and component-based mixture risk assessment of multiple chemicals.
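For readers unfamiliar with the classification metrics quoted above, the following minimal sketch shows how sensitivity, specificity, balanced accuracy, and MCC are computed from a binary confusion matrix. The counts are hypothetical and are not taken from the study; `binary_metrics` is an illustrative helper, not part of any released code.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    balanced_accuracy = (sensitivity + specificity) / 2
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 100 toxic and 100 non-toxic compounds.
example = binary_metrics(tp=93, fn=7, tn=85, fp=15)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, so it stays informative when the toxic and non-toxic classes are of unequal size.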

The full data collection and analysis of QSAR models, toxicity data (LD50) and Mode of Action (MoA) data are described in Carnesecchi et al., 2020 (DOI: doi.org/10.1016/j.scitotenv.2020.139243).

This work was supported by the European Food Safety Authority (EFSA) [contract number: OC/EFSA/SCER/2018/01 and NP/EFSA/AFSCO/2016/02 (Edoardo Carnesecchi)].
Data: Variation in pesticide toxicity in the western honey bee (Apis...
databank.illinois.edu
Updated Apr 23, 2024
Cite
Ling-Hsiu Liao; Wen-Yen Wu; May Berenbaum (2024). Data: Variation in pesticide toxicity in the western honey bee (Apis mellifera) associated with consuming phytochemically different monofloral honeys [Dataset]. http://doi.org/10.13012/B2IDB-6733018_V1
Explore at:
Unique identifier
https://doi.org/10.13012/B2IDB-6733018_V1
Dataset updated
Apr 23, 2024
Authors
Ling-Hsiu Liao; Wen-Yen Wu; May Berenbaum
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Dataset funded by
U.S. Department of Agriculture (USDA)
Description
Includes:

Identification and quantification of phenolic components of honeys: Raw_data_JOCE.xlsx – sheet: “HoneyPhytochemicals”

Effects of honey phytochemicals on acute pesticide toxicity: Raw_data_JOCE.xlsx – sheet: “raw_LD50”; Raw_data_JOCE.xlsx – sheet: “raw_LD50_hive_based”
Determination and comparison of the HQ and the revisited HQ.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Nov 20, 2014
Cite
Fusellier, Marion; Cousin, Marianne; Bodin, Laurent; Lafay, Florent; Giroud, Barbara; Buleté, Audrey; Pélissier, Michel; Tchamitchian, Marc; Poquet, Yannick; Brunet, Jean-Luc; Tchamitchian, Sylvie; Belzunces, Luc P. (2014). Determination and comparison of the HQ and the revisited HQ. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001170399
Dataset updated
Nov 20, 2014
Authors
Fusellier, Marion; Cousin, Marianne; Bodin, Laurent; Lafay, Florent; Giroud, Barbara; Buleté, Audrey; Pélissier, Michel; Tchamitchian, Marc; Poquet, Yannick; Brunet, Jean-Luc; Tchamitchian, Sylvie; Belzunces, Luc P.
Description
g a.s./ha: mass of active substance per hectare expressed in grams.
µg a.s./bee: mass of active substance per bee expressed in micrograms.
ng a.s./bee: nanogram of active substance per bee.
E.D.: Experimental Data.
LD50: Median Lethal Dose.
HQ: Hazard Quotient (field rate (g/ha)/LD50 (µg/bee)).
Revisited HQ: exposure (ng/bee)/LD50 (µg/bee).
N.C.: Not Calculated because of the low toxicity of the active substance.
DAR EFSA: Draft Assessment Report of the European Food Safety Authority.
PED US EPA: Pesticides Ecotoxicity Database of the United States Environmental Protection Agency.
a) Time at which the LD50 was determined. The LD50 values resulting from the experimental data were calculated at the time corresponding to a stabilized mortality.
b) For each active substance, 2 scenarios of exposure are presented: the lowest and the highest homologated application rate.
c) For each active substance, the highest and the lowest known LD50 values were compared to the lowest and highest homologated application rates, respectively.
d) HQ is the ratio between the application rate (g a.s./ha) and the LD50 (µg a.s./bee).
e) The exposure was calculated from the application rate (ng a.s./cm2) and the mean exposure surface area determined with the 20 active substances (1.05 cm2/bee).
f) The LD50 values from the experimental data were calculated with the BMD software from the US EPA.
g) The revisited HQ is the ratio between the exposure (ng a.s./bee) and the LD50 (ng a.s./bee).
h) For each active substance, the dose-mortality relationship was modeled at the time corresponding to a stabilized mortality.
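The two quotients defined in these notes can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the input values are invented, and the revisited HQ follows footnote g by expressing both exposure and LD50 in ng a.s./bee. The 1.05 cm2/bee surface area comes from footnote e.

```python
SURFACE_CM2_PER_BEE = 1.05  # mean exposure surface area quoted in footnote e


def hazard_quotient(rate_g_per_ha: float, ld50_ug_per_bee: float) -> float:
    """Classical HQ: application rate (g a.s./ha) divided by LD50 (µg a.s./bee)."""
    return rate_g_per_ha / ld50_ug_per_bee


def exposure_ng_per_bee(rate_g_per_ha: float,
                        surface_cm2: float = SURFACE_CM2_PER_BEE) -> float:
    """Exposure per bee from the application rate.

    1 g/ha = 1e9 ng / 1e8 cm2 = 10 ng/cm2.
    """
    rate_ng_per_cm2 = rate_g_per_ha * 10.0
    return rate_ng_per_cm2 * surface_cm2


def revisited_hq(rate_g_per_ha: float, ld50_ug_per_bee: float) -> float:
    """Revisited HQ: exposure (ng a.s./bee) divided by LD50 (ng a.s./bee)."""
    ld50_ng_per_bee = ld50_ug_per_bee * 1000.0
    return exposure_ng_per_bee(rate_g_per_ha) / ld50_ng_per_bee


# Hypothetical active substance: 100 g a.s./ha, LD50 of 0.5 µg a.s./bee.
print(hazard_quotient(100.0, 0.5))
print(exposure_ng_per_bee(100.0))
print(revisited_hq(100.0, 0.5))
```

Unlike the classical HQ, the revisited quotient is dimensionless, because the numerator and denominator share the same unit.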
Comparison of larval honey bee (Apis mellifera) acute (LAO; single dose;...
plos.figshare.com
xls
Updated Jun 15, 2023
Cite
Frank T. Farruggia; Kristina Garber; Christine Hartless; Kristin Jones; Lee Kyle; Nicholas Mastrota; Joseph P. Milone; Sujatha Sankula; Keith Sappington; Katherine Stebbins; Thomas Steeger; Holly Summers; Pamela G. Thompson; Michael Wagman (2023). Comparison of larval honey bee (Apis mellifera) acute (LAO; single dose; OECD TG 237) and chronic (LCO; repeat dose; OECD GD 239) LD50 values (N = 15) expressed in terms of μg ai larva-1 day-1. [Dataset]. http://doi.org/10.1371/journal.pone.0265962.t002
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0265962.t002
Dataset updated
Jun 15, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Frank T. Farruggia; Kristina Garber; Christine Hartless; Kristin Jones; Lee Kyle; Nicholas Mastrota; Joseph P. Milone; Sujatha Sankula; Keith Sappington; Katherine Stebbins; Thomas Steeger; Holly Summers; Pamela G. Thompson; Michael Wagman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Comparison of larval honey bee (Apis mellifera) acute (LAO; single dose; OECD TG 237) and chronic (LCO; repeat dose; OECD GD 239) LD50 values (N = 15) expressed in terms of μg ai larva-1 day-1.
Data from: Exposure of dengue-1 virus to Aedes aegypti and sensitivity to...
catalog.data.gov
agdatacommons.nal.usda.gov
Updated Jun 5, 2025
Cite
Agricultural Research Service (2025). Data from: Exposure of dengue-1 virus to Aedes aegypti and sensitivity to insecticides [Dataset]. https://catalog.data.gov/dataset/data-from-exposure-of-dengue-1-virus-to-aedes-aegypti-and-sensitivity-to-insecticides
Dataset updated
Jun 5, 2025
Dataset provided by
Agricultural Research Service
Description
"PhD obj3 LD50 data submitted April 21 2025": applies to Table 1 of the manuscript. These data were used to calculate the lethal doses at which 25%, 50%, and 90% of the organisms (Aedes aegypti) die (LD25, LD50, LD90) using probit software (PoloPlus, LeOra Software). They were generated by RLA under the supervision of KJL in July-August 2021 at the USDA-ARS Center for Medical, Agricultural and Veterinary Entomology (USDA-ARS-CMAVE) facility in Gainesville, FL, using dilutions of technical malathion, technical permethrin, and rapeseed methyl ester. The values were generated to identify an LD50 dose to apply to 17-d-old female Ae. aegypti (7 d old at the blood meal) 10 d after feeding on a blood meal containing dengue-1 virus. The Ae. aegypti used to determine the LD50 were blood fed at 7 d old, aged for 10 d after blood feeding, and then exposed to the dilutions of technical malathion, technical permethrin, and rapeseed methyl ester.

"PhD Obj3 dengue mortality check submitted April 21 2025": used to populate the results for Table 2 and Table 3, which describe the differences in mortality associated with exposure to a dengue-1 treated blood meal and LD50 pesticide/control treatment in Aedes aegypti mosquitoes. Analysis was conducted using R. These data were generated by RLA under the supervision of BWA at the University of Florida - Florida Medical Entomology Laboratory (UF-FMEL) facility in Vero Beach, FL, between September and November 2021. Ae. aegypti females were aged to 7 d, provided a dengue-1 tainted blood meal or a sham blood meal, and allowed to age for 10 d after blood feeding. A topical LD50 dose of technical permethrin or technical malathion diluted in rapeseed methyl ester, or neat rapeseed methyl ester as a control, was then applied, and mortality was recorded at 24 h and 48 h. Following mortality checks, Ae. aegypti were frozen (-80 deg C) and checked for the presence of virus using RT-qPCR.
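To illustrate how LD25, LD50, and LD90 relate to one another under a probit model, the sketch below inverts an already fitted probit line. It is not the PoloPlus procedure: it assumes the common parameterisation probit(p) = intercept + slope * log10(dose), and the intercept and slope values are hypothetical rather than estimated from this dataset.

```python
from statistics import NormalDist


def lethal_dose(p: float, intercept: float, slope: float) -> float:
    """Dose killing a fraction p, given a fitted probit line on log10(dose)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if slope <= 0:
        raise ValueError("slope must be positive for a dose-mortality curve")
    z = NormalDist().inv_cdf(p)           # standard normal quantile for p
    return 10 ** ((z - intercept) / slope)


# Hypothetical fitted parameters (not taken from the dataset).
INTERCEPT, SLOPE = -2.0, 2.0

ld25 = lethal_dose(0.25, INTERCEPT, SLOPE)
ld50 = lethal_dose(0.50, INTERCEPT, SLOPE)
ld90 = lethal_dose(0.90, INTERCEPT, SLOPE)
print(f"LD25={ld25:.2f}  LD50={ld50:.2f}  LD90={ld90:.2f}")
```

A steeper slope pulls LD25 and LD90 closer to LD50, which reflects a more uniform response across the tested population.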
Data from: Large-Scale Modeling of Multispecies Acute Toxicity End Points...
datasetcatalog.nlm.nih.gov
acs.figshare.com
Updated Feb 3, 2021
Cite
Kleinstreuer, Nicole; Siramshetty, Vishal B.; Muratov, Eugene N.; Simeonov, Anton; Nicklaus, Marc C.; Tropsha, Alexander; Zakharov, Alexey V.; Alves, Vinicius M.; Jain, Sankalp (2021). Large-Scale Modeling of Multispecies Acute Toxicity End Points Using Consensus of Multitask Deep Learning Methods [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000773506
Dataset updated
Feb 3, 2021
Authors
Kleinstreuer, Nicole; Siramshetty, Vishal B.; Muratov, Eugene N.; Simeonov, Anton; Nicklaus, Marc C.; Tropsha, Alexander; Zakharov, Alexey V.; Alves, Vinicius M.; Jain, Sankalp
Description
Computational methods to predict molecular properties regarding safety and toxicology represent alternative approaches to expedite drug development, screen environmental chemicals, and thus significantly reduce associated time and costs. There is a strong need and interest in the development of computational methods that yield reliable predictions of toxicity, and many approaches, including the recently introduced deep neural networks, have been leveraged towards this goal. Herein, we report on the collection, curation, and integration of data from the public data sets that were the source of the ChemIDplus database for systemic acute toxicity. These efforts generated the largest publicly available such data set comprising > 80,000 compounds measured against a total of 59 acute systemic toxicity end points. This data was used for developing multiple single- and multitask models utilizing random forest, deep neural networks, convolutional, and graph convolutional neural network approaches. For the first time, we also reported the consensus models based on different multitask approaches. To the best of our knowledge, prediction models for 36 of the 59 end points have never been published before. Furthermore, our results demonstrated a significantly better performance of the consensus model obtained from three multitask learning approaches that particularly predicted the 29 smaller tasks (less than 300 compounds) better than other models developed in the study. The curated data set and the developed models have been made publicly available at https://github.com/ncats/ld50-multitask, https://predictor.ncats.io/, and https://cactus.nci.nih.gov/download/acute-toxicity-db (data set only) to support regulatory and research applications.
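The listing does not say how the consensus of the three multitask models was formed. As a rough illustration only, the sketch below assumes the simplest option, an unweighted mean of the per-endpoint predictions that each model provides; the function name, endpoint names, and values are hypothetical.

```python
from statistics import fmean


def consensus(predictions: list[dict[str, float]]) -> dict[str, float]:
    """Average per-endpoint predictions over the models that cover each endpoint."""
    endpoints = set().union(*predictions)
    return {
        ep: fmean(p[ep] for p in predictions if ep in p)
        for ep in sorted(endpoints)
    }


# Hypothetical predictions (in -log(mol/kg)) from three multitask models.
model_outputs = [
    {"mouse-oral-LD50": 2.4, "rat-oral-LD50": 2.1},
    {"mouse-oral-LD50": 2.6, "rat-oral-LD50": 2.3},
    {"mouse-oral-LD50": 2.5},
]
print(consensus(model_outputs))
```

Averaging only over the models that cover an endpoint keeps the sketch usable when one model lacks a prediction for a small task.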
Registry of Toxic Effects of Chemical Substances
bioregistry.io
Updated Apr 25, 2023
Cite
(2023). Registry of Toxic Effects of Chemical Substances [Dataset]. https://bioregistry.io/rtecs
Dataset updated
Apr 25, 2023
Description
RTECS is a compendium of data extracted from the open scientific literature. The data are recorded in the format developed by the RTECS staff and arranged in alphabetical order by prime chemical name. Six types of toxicity data are included in the file: (1) primary irritation; (2) mutagenic effects; (3) reproductive effects; (4) tumorigenic effects; (5) acute toxicity; and (6) other multiple dose toxicity. Specific numeric toxicity values such as LD50, LC50, TDLo, and TCLo are noted as well as species studied and route of administration used. For each citation, the bibliographic source is listed thereby enabling the user to access the actual studies cited. No attempt has been made to evaluate the studies cited in RTECS. The user has the responsibility of making such assessments.
Data from: Random forest algorithm-based accurate prediction of rat acute...
tandf.figshare.com
datasetcatalog.nlm.nih.gov
xlsx
Updated Feb 29, 2024
Cite
Linrong Xiao; Jiyong Deng; Liping Yang; Xianwei Huang; Xinliang Yu (2024). Random forest algorithm-based accurate prediction of rat acute oral toxicity [Dataset]. http://doi.org/10.6084/m9.figshare.21444642.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.21444642.v1
Dataset updated
Feb 29, 2024
Dataset provided by
Taylor & Francis (https://taylorandfrancis.com/)
Authors
Linrong Xiao; Jiyong Deng; Liping Yang; Xianwei Huang; Xinliang Yu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Predicting the acute oral toxicity (LD50) of chemicals in rats is a challenge since many factors affect toxicity data. In this paper, 40 descriptors were successfully used to develop a quantitative structure–activity relationship model for 8448 rat acute oral toxicity logLD50 values by applying the random forest (RF) algorithm. To develop the optimal RF model, a training set (5914 chemicals) was used to establish models, a validation set (1267 chemicals) was used to tune RF parameters, and a test set (1267 chemicals) was used to assess the performance of the RF models. The model yielded a correlation coefficient R of 0.9695 and an rms error (log unit) of 0.3171 for the training set, R = 0.8322 and rms = 0.2889 for the validation set, and R = 0.8335 and rms = 0.3060 for the test set. More than 99% of the rat acute oral toxicity logLD50 values in the dataset can be accurately predicted, although the dataset is large.
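The two statistics reported per subset, the correlation coefficient R and the rms error, can be computed as in the sketch below. It assumes R is the Pearson correlation between observed and predicted logLD50 values, which the description does not state explicitly, and the sample values are invented for illustration.

```python
import math

# Subset sizes quoted in the description; they add up to the full dataset.
N_TRAIN, N_VALID, N_TEST = 5914, 1267, 1267
N_TOTAL = N_TRAIN + N_VALID + N_TEST


def pearson_r(observed: list[float], predicted: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def rms_error(observed: list[float], predicted: list[float]) -> float:
    """Root-mean-square error in the same (log) unit as the inputs."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


# Hypothetical observed and predicted logLD50 values.
obs = [2.1, 2.8, 3.3, 3.9, 1.7]
pred = [2.3, 2.6, 3.4, 3.6, 1.9]
print(N_TOTAL, round(pearson_r(obs, pred), 4), round(rms_error(obs, pred), 4))
```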
Data from: Data release for toxicity of Antimycin A incorporated management...
catalog.data.gov
data.usgs.gov
Updated Nov 21, 2025
Cite
U.S. Geological Survey (2025). Data release for toxicity of Antimycin A incorporated management bait for grass carp [Dataset]. https://catalog.data.gov/dataset/data-release-for-toxicity-of-antimycin-a-incorporated-management-bait-for-grass-carp
Dataset updated
Nov 21, 2025
Dataset provided by
U.S. Geological Survey
Description
The goal of this study was to develop a management bait and examine whether it can be used for selective control of grass carp. Our objectives were to 1) quantify the water-based 24-h LC50 of Antimycin-A for grass carp and rainbow trout, 2) quantify the 96-h LD50 of orally administered Antimycin-A laden bait for grass carp and rainbow trout, 3) quantify the leaching rate of Antimycin-A from the bait in water, and 4) determine if a management bait laden with Antimycin-A will be consumed by grass carp and cause lethality in the laboratory. To meet our objectives, Antimycin-A was encapsulated in a wax microparticle similar to Poole et al. (2018) and incorporated into a rapeseed bait for oral gavage feeding and consumption trials to demonstrate whether Antimycin-A can be orally delivered, protected from degradation, and readily consumed by grass carp. The dataset includes raw files of toxicity survival information, water quality, bait leaching, and Antimycin-A analytical measurements.
Data from: An effective machine learning model for rat acute oral toxicity...
tandf.figshare.com
docx
Updated Jul 31, 2025
Cite
J. Yan; Z. Shen (2025). An effective machine learning model for rat acute oral toxicity prediction of emerging chemicals: multi-domain applications and structure-activity relationships [Dataset]. http://doi.org/10.6084/m9.figshare.29712744.v1
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.6084/m9.figshare.29712744.v1
Dataset updated
Jul 31, 2025
Dataset provided by
Taylor & Francis
Authors
J. Yan; Z. Shen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Given the widespread presence of emerging contaminants in the environment, assessing and ensuring their biosafety is urgent. Under the Globally Harmonized System (GHS), the LD50 parameter of acute oral toxicity (AOT) is crucial for chemical safety classification. Animal testing limitations have highlighted the need for alternative methods, and machine learning offers a new approach to predict LD50 through quantitative structure-activity relationship (QSAR) models. This study developed and optimized a machine learning model for LD50 classification of emerging contaminants based on data from more than 6000 known AOT values. Using molecular descriptors and fingerprints, the model achieves an accuracy above 0.86 and a recall score over 0.84, outperforming previous models. The model’s robustness was confirmed across various types of emerging contaminants. Shapley additive explanations (SHAP) identified key descriptors like BCUTp_1h, ATSC1pe, and SLogP_VSA4, while the information gain (IG) method highlighted alert substructures [P-O, P-S]. These findings suggest that compounds with high polarizability, mean electronegativity and significant surface area may adversely affect rats. This model enhances understanding of acute toxicity mechanisms and serves as a tool for early screening of safer compounds, promoting the design of greener chemicals.
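As background on the GHS classification mentioned above, the sketch below bins an oral LD50 into the standard GHS acute oral toxicity categories (upper bounds of 5, 50, 300, 2000, and 5000 mg/kg body weight). The listing does not say which class boundaries the study's model was trained on, so this mapping is an assumption about the labels, and `ghs_oral_category` is a hypothetical helper name.

```python
from typing import Optional

# (upper bound in mg/kg body weight, GHS acute oral toxicity category)
GHS_ORAL_CUTOFFS = [(5.0, 1), (50.0, 2), (300.0, 3), (2000.0, 4), (5000.0, 5)]


def ghs_oral_category(ld50_mg_per_kg: float) -> Optional[int]:
    """Return the GHS acute oral category, or None above the Category 5 limit."""
    if ld50_mg_per_kg <= 0:
        raise ValueError("LD50 must be positive")
    for upper_bound, category in GHS_ORAL_CUTOFFS:
        if ld50_mg_per_kg <= upper_bound:
            return category
    return None  # not classified for acute oral toxicity


# Hypothetical LD50 values in mg/kg.
for value in (3.0, 40.0, 250.0, 1500.0, 4000.0, 8000.0):
    print(value, ghs_oral_category(value))
```

Lower category numbers correspond to lower LD50 values, that is, to more acutely toxic substances.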
Guidelines used in selecting LD50 values from multiple sources of data.
plos.figshare.com
xls
Updated May 31, 2023
Cite
Michael DiBartolomeis; Susan Kegley; Pierre Mineau; Rosemarie Radford; Kendra Klein (2023). Guidelines used in selecting LD50 values from multiple sources of data. [Dataset]. http://doi.org/10.1371/journal.pone.0220029.t002
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0220029.t002
Dataset updated
May 31, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Michael DiBartolomeis; Susan Kegley; Pierre Mineau; Rosemarie Radford; Kendra Klein
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Guidelines used in selecting LD50 values from multiple sources of data.
Data for: Comparability of comparative toxicity: insect sensitivity to...
datasets.ai
researchdata.se
Updated Apr 30, 2024
Cite
Sveriges dataportal (2024). Data for: Comparability of comparative toxicity: insect sensitivity to imidacloprid reveals huge variations across species but also within species. [Dataset]. https://datasets.ai/datasets/https-doi-org-10-5878-w6ct-z602
Dataset updated
Apr 30, 2024
Dataset authored and provided by
Sveriges dataportal
Description
Our tabular data consist of a comprehensive collection of LC50 and LD50 values for imidacloprid across insect taxa. The collection is limited to adult insects and contains a range of variables such as body mass, pesticide formulation, exposure duration, geographic origin, and insect strain. The file can be opened by any program that reads spreadsheets (e.g. Excel, R, MATLAB).
Data from: Adult mosquito and butterfly exposure to permethrin and relative...
s.cnmilf.com
data.usgs.gov
+1 more
Updated Oct 1, 2025
Cite
U.S. Geological Survey (2025). Adult mosquito and butterfly exposure to permethrin and relative risk following ULV sprays [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/adult-mosquito-and-butterfly-exposure-to-permethrin-and-relative-risk-following-ulv-sprays
Explore at:
Dataset updated
Oct 1, 2025
Dataset provided by
U.S. Geological Survey
Description
This dataset contains data on permethrin residues on adult mosquitoes and adult butterflies following their exposure to ultra-low volume (ULV) sprays containing permethrin. The dataset also contains toxicity information for permethrin: first, for adult mosquitoes and adult butterflies following their exposure to the ULV sprays, and second, for adult mosquitoes exposed during toxicity tests to determine median lethal dose levels (LD50).
MOESM4 of SAR and QSAR modeling of a large collection of LD50 rat acute oral...
springernature.figshare.com
txt
Updated May 31, 2023
Cite
Domenico Gadaleta; Kristijan Vuković; Cosimo Toma; Giovanna Lavado; Agnes Karmaus; Kamel Mansouri; Nicole Kleinstreuer; Emilio Benfenati; Alessandra Roncaglioni (2023). MOESM4 of SAR and QSAR modeling of a large collection of LD50 rat acute oral toxicity data [Dataset]. http://doi.org/10.6084/m9.figshare.9756041.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.9756041.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
figshare
Authors
Domenico Gadaleta; Kristijan Vuković; Cosimo Toma; Giovanna Lavado; Agnes Karmaus; Kamel Mansouri; Nicole Kleinstreuer; Emilio Benfenati; Alessandra Roncaglioni
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 4. The list of descriptors used for model derivation is included.
Distribution of all available acute toxicity data (LD50 values) for honey...
plos.figshare.com
xls
Updated Jun 4, 2023
Cite
Frank T. Farruggia; Kristina Garber; Christine Hartless; Kristin Jones; Lee Kyle; Nicholas Mastrota; Joseph P. Milone; Sujatha Sankula; Keith Sappington; Katherine Stebbins; Thomas Steeger; Holly Summers; Pamela G. Thompson; Michael Wagman (2023). Distribution of all available acute toxicity data (LD50 values) for honey bee (Apis mellifera) adults and larvae. [Dataset]. http://doi.org/10.1371/journal.pone.0265962.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0265962.t003
Dataset updated
Jun 4, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Frank T. Farruggia; Kristina Garber; Christine Hartless; Kristin Jones; Lee Kyle; Nicholas Mastrota; Joseph P. Milone; Sujatha Sankula; Keith Sappington; Katherine Stebbins; Thomas Steeger; Holly Summers; Pamela G. Thompson; Michael Wagman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
AAO: Adult Acute Oral; LAO: Larval Acute Oral; LCO: Larval Chronic Oral.

Toxicity Reference Database

351 scholarly articles cite this dataset (View in Google Scholar)

Toxicity Reference Database

[Therapeutics Data Commons] Acute Toxicity LD50

The toxicity data of compounds

ld50-smiles-descriptors-dataset

Data from: Multimodal Dataset for LD50 Toxicity Prediction of Pesticides...

Data from: Integrating QSAR models predicting acute contact toxicity and...

Data: Variation in pesticide toxicity in the western honey bee (Apis...

Determination and comparison of the HQ and the revisited HQ.

Comparison of larval honey bee (Apis mellifera) acute (LAO; single dose;...

Data from: Exposure of dengue-1 virus to Aedes aegypti and sensitivity to...

Data from: Large-Scale Modeling of Multispecies Acute Toxicity End Points...

Registry of Toxic Effects of Chemical Substances

Data from: Random forest algorithm-based accurate prediction of rat acute...

Data from: Data release for toxicity of Antimycin A incorporated management...

Data from: An effective machine learning model for rat acute oral toxicity...

Guidelines used in selecting LD50 values from multiple sources of data.

Data for: Comparability of comparative toxicity: insect sensitivity to...

Data from: Adult mosquito and butterfly exposure to permethrin and relative...

MOESM4 of SAR and QSAR modeling of a large collection of LD50 rat acute oral...

Distribution of all available acute toxicity data (LD50 values) for honey...
