100+ datasets found
  1. ChEMBL EBI Small Molecules Database

    • kaggle.com
    zip
    Updated Feb 12, 2019
    Cite
    Google BigQuery (2019). ChEMBL EBI Small Molecules Database [Dataset]. https://www.kaggle.com/bigquery/ebi-chembl
    Available download formats: zip (0 bytes)
    Dataset provided by: BigQuery (https://cloud.google.com/bigquery)
    Authors: Google BigQuery
    License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Context

    ChEMBL is maintained by the European Bioinformatics Institute (EBI), of the European Molecular Biology Laboratory (EMBL), based at the Wellcome Trust Genome Campus, Hinxton, UK.

    Content

    ChEMBL is a manually curated database of bioactive molecules with drug-like properties used in drug discovery, including information about existing patented drugs.

    Schema: http://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_23/chembl_23_schema.png

    Documentation: http://ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/releases/chembl_23/schema_documentation.html

    Fork this notebook to get started on accessing data in the BigQuery dataset using the BQhelper package to write SQL queries.
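
    As a rough sketch of what such a SQL query could look like, the snippet below builds a query that lists the tables in the dataset. The dataset identifier is taken from the "Data Origin" URL in this listing. The example uses the general google-cloud-bigquery client rather than the BQhelper package named above, and the helper names `list_tables_sql` and `run_query` are hypothetical. Running the query assumes the client library is installed and Google Cloud credentials are configured.

```python
# Dataset identifier taken from the "Data Origin" URL in this listing.
DATASET = "patents-public-data.ebi_chembl"


def list_tables_sql(dataset: str = DATASET) -> str:
    """Build a query that lists the tables in a BigQuery dataset."""
    return (
        "SELECT table_name "
        f"FROM `{dataset}.INFORMATION_SCHEMA.TABLES` "
        "ORDER BY table_name"
    )


def run_query(sql: str):
    """Run the query; needs google-cloud-bigquery and GCP credentials."""
    # Imported lazily so the SQL helper works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()
    return [dict(row.items()) for row in client.query(sql).result()]
```

    Listing the tables first avoids guessing table names, which are not given in this listing; `run_query(list_tables_sql())` would return one dictionary per table.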

    Acknowledgements

    “ChEMBL” by the European Bioinformatics Institute (EMBL-EBI), used under CC BY-SA 3.0. Modifications have been made to add normalized publication numbers.

    Data Origin: https://bigquery.cloud.google.com/dataset/patents-public-data:ebi_chembl

    Banner photo by rawpixel on Unsplash

  2. ChEMBL

    • neuinfo.org
    • scicrunch.org
    Updated Jun 18, 2025
    Cite
    (2025). ChEMBL [Dataset]. http://identifiers.org/RRID:SCR_014042
    Description

    Collection of bioactive drug-like small molecules that contains 2D structures, calculated properties and abstracted bioactivities. Used for drug discovery and chemical biology research. Clinical progress of new compounds is continuously integrated into the database.

  3. ChEMBL - Data Lakehouse Ready

    • registry.opendata.aws
    Updated Sep 15, 2020
    Cite
    Amazon Web Services (2020). ChEMBL - Data Lakehouse Ready [Dataset]. https://registry.opendata.aws/chembl/
    Dataset provided by: Amazon Web Services (https://aws.amazon.com/)
    Description

    ChEMBL is a manually curated database of bioactive molecules with drug-like properties. It brings together chemical, bioactivity and genomic data to aid the translation of genomic information into effective new drugs. This representation of ChEMBL is stored in Parquet format and most easily utilized through Amazon Athena. Follow the documentation for install instructions (< 2 minute install). New ChEMBL releases occur sporadically; the most up to date information on ChEMBL releases can be found here.

  4. ChEMBL Dataset

    • paperswithcode.com
    Cite
    ChEMBL Dataset [Dataset]. https://paperswithcode.com/dataset/chembl-v-27
    Description

    ChEMBL is a manually curated database of bioactive molecules with drug-like properties. It brings together chemical, bioactivity and genomic data to aid the translation of genomic information into effective new drugs.

  5. Drug Targets and Drug Lists Data Package

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Drug Targets and Drug Lists Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/drug-targets-and-drug-lists-data-package/
    Available download formats: csv
    Dataset authored and provided by: John Snow Labs
    Description

    This data package contains information on approved, researched and proven drug targets and drug lists.

  6. Compound activity data sets for 15 biological targets compiled from the...

    • zenodo.org
    tsv, txt, zip
    Updated Mar 8, 2022
    Cite
    Itsuki Maeda; Akinori Sato; Shunsuke Tamura; Tomoyuki Miyao (2022). Compound activity data sets for 15 biological targets compiled from the ChEMBL and PubChem databases. [Dataset]. http://doi.org/10.5281/zenodo.5748597
    Available download formats: tsv, zip, txt
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Itsuki Maeda; Akinori Sato; Shunsuke Tamura; Tomoyuki Miyao
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Compound activity data sets for the 15 biological targets are deposited, along with structure-activity relationship matrix IDs. Active compounds were extracted from the ChEMBL database and inactive compounds from the PubChem database. Details of the data sets are described in the original publication, and a summary of the data sets is given in the readme.txt file.

  7. Data from: A consensus compound/bioactivity dataset for data-driven drug...

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1 more
    Updated May 13, 2022
    Cite
    Isigkeit, Laura (2022). A consensus compound/bioactivity dataset for data-driven drug design and chemogenomics [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6320760
    Dataset provided by: Merk, Daniel; Isigkeit, Laura; Chaikuad, Apirat
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the updated version of the dataset from 10.5281/zenodo.6320761

    Information

    The diverse publicly available compound/bioactivity databases constitute a key resource for data-driven applications in chemogenomics and drug design. Analysis of their coverage of compound entries and biological targets revealed considerable differences, however, suggesting the benefit of a consensus dataset. Therefore, we have combined and curated information from five esteemed databases (ChEMBL, PubChem, BindingDB, IUPHAR/BPS and Probes&Drugs) to assemble a consensus compound/bioactivity dataset comprising 1144648 compounds with 10915362 bioactivities on 5613 targets (including defined macromolecular targets as well as cell-lines and phenotypic readouts). It also provides simplified information on assay types underlying the bioactivity data and on bioactivity confidence by comparing data from different sources. We have unified the source databases, brought them into a common format and combined them, which makes the data easy to use in applications such as chemogenomics and data-driven drug design.

    The consensus dataset provides increased target coverage and contains a higher number of molecules compared to the source databases, which is also evident from a larger number of scaffolds. These features render the consensus dataset a valuable tool for machine learning and other data-driven applications in (de novo) drug design and bioactivity prediction. The increased chemical and bioactivity coverage of the consensus dataset may improve robustness of such models compared to the single source databases. In addition, semi-automated structure and bioactivity annotation checks with flags for divergent data from different sources may help data selection and further accurate curation.

    This dataset belongs to the publication: https://doi.org/10.3390/molecules27082513

    Structure and content of the dataset

    Dataset structure (column headers)

    ChEMBL ID | PubChem ID | IUPHAR ID | Target | Activity type | Assay type | Unit | Mean C (0) | ... | Mean PC (0) | ... | Mean B (0) | ... | Mean I (0) | ... | Mean PD (0) | ... | Activity check annotation | Ligand names | Canonical SMILES C | ... | Structure check (Tanimoto) | Source

    The dataset was created using the Konstanz Information Miner (KNIME) (https://www.knime.com/) and was exported as a CSV-file and a compressed CSV-file.

    Except for the canonical SMILES columns, all columns are filled with the datatype ‘string’. The datatype for the canonical SMILES columns is the smiles-format. We recommend the File Reader node for using the dataset in KNIME. With the help of this node the data types of the columns can be adjusted exactly. In addition, only this node can read the compressed format.

    Column content:

    ChEMBL ID, PubChem ID, IUPHAR ID: chemical identifier of the databases

    Target: biological target of the molecule expressed as the HGNC gene symbol

    Activity type: for example, pIC50

    Assay type: Simplification/Classification of the assay into cell-free, cellular, functional and unspecified

    Unit: unit of bioactivity measurement

    Mean columns of the databases: mean of bioactivity values or activity comments denoted with the frequency of their occurrence in the database, e.g. Mean C = 7.5 *(15) -> the value for this compound-target pair occurs 15 times in ChEMBL database

    Activity check annotation: a bioactivity check was performed by comparing values from the different sources and adding an activity check annotation to provide automated activity validation for additional confidence

    no comment: bioactivity values are within one log unit;

    check activity data: bioactivity values are not within one log unit;

    only one data point: only one value was available, no comparison and no range calculated;

    no activity value: no precise numeric activity value was available;

    no log-value could be calculated: no negative decadic logarithm could be calculated, e.g., because the reported unit was not a compound concentration

    Ligand names: all unique names contained in the five source databases are listed

    Canonical SMILES columns: Molecular structure of the compound from each database

    Structure check (Tanimoto): To denote matching or differing compound structures in different source databases

    match: molecule structures are the same between different sources;

    no match: the structures differ. We calculated the Jaccard-Tanimoto similarity coefficient from Morgan Fingerprints to reveal true differences between sources and reported the minimum value;

    1 structure: no structure comparison is possible, because there was only one structure available;

    no structure: no structure comparison is possible, because there was no structure available.

    Source: the databases from which the data come
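
    The column descriptions above can be illustrated with a small sketch. It assumes that every mean-column cell follows the single example given ("7.5 *(15)"), that "within one log unit" means the largest and smallest values differ by at most 1.0, and that a fingerprint is represented as a set of on-bit indices (real Morgan fingerprints would come from a cheminformatics toolkit). The function names are hypothetical and this is not the authors' KNIME workflow; the "no log-value could be calculated" case is not modelled.

```python
import re
from typing import Sequence, Set, Tuple

# Assumed cell layout, taken from the single example "7.5 *(15)".
_MEAN_CELL = re.compile(r"^\s*(.*?)\s*\*\((\d+)\)\s*$")


def parse_mean_cell(cell: str) -> Tuple[str, int]:
    """Split a mean-column cell into (value or comment, occurrence count)."""
    match = _MEAN_CELL.match(cell)
    if match is None:
        raise ValueError(f"unexpected mean-column cell: {cell!r}")
    return match.group(1), int(match.group(2))


def activity_check(log_values: Sequence[float]) -> str:
    """Label a compound-target pair from its per-source log-scale values."""
    if not log_values:
        return "no activity value"
    if len(log_values) == 1:
        return "only one data point"
    # Assumption: "within one log unit" means max - min <= 1.0.
    spread = max(log_values) - min(log_values)
    return "no comment" if spread <= 1.0 else "check activity data"


def tanimoto(bits_a: Set[int], bits_b: Set[int]) -> float:
    """Jaccard-Tanimoto coefficient of two sets of on-bit indices."""
    union = bits_a | bits_b
    if not union:
        return 1.0  # assumption: two empty fingerprints count as identical
    return len(bits_a & bits_b) / len(union)
```

    The value part of a mean cell is returned as text because, per the description, it may hold an activity comment instead of a number.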

  8. The ChEMBL database in 2017 - Dataset - LDM

    • service.tib.eu
    Updated Dec 3, 2024
    Cite
    (2024). The ChEMBL database in 2017 - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/the-chembl-database-in-2017
    Description

    The ChEMBL database is a large collection of bioactive compounds and their biological activities.

  9. smiles-molecules-chembl

    • huggingface.co
    Updated Aug 9, 2024
    Cite
    Antoine Branchoux (2024). smiles-molecules-chembl [Dataset]. https://huggingface.co/datasets/antoinebcx/smiles-molecules-chembl
    Authors: Antoine Branchoux
    Description

    ChEMBL Molecule Generation Dataset

    Dataset Description

    ChEMBL is a manually curated database of bioactive molecules with drug-like properties. It brings together chemical, bioactivity and genomic data to aid the translation of genomic information into effective new drugs.

    Task Description

    For both distribution learning-based and goal-oriented molecule generation, that is, generating new molecules that have desirable properties as measured by some oracles.… See the full description on the dataset page: https://huggingface.co/datasets/antoinebcx/smiles-molecules-chembl.

  10. ChEMBL Data

    • console.cloud.google.com
    Updated Aug 4, 2020
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:Google%20Patents%20Public%20Datasets&inv=1&invt=Ab2qSw (2020). ChEMBL Data [Dataset]. https://console.cloud.google.com/marketplace/product/google_patents_public_datasets/chembl
    Dataset provided by: Google (http://google.com/)
    Description

    ChEMBL Data is a manually curated database of small molecules used in drug discovery, including information about existing patented drugs.

  11. Data from: hERG Me Out

    • acs.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Paul Czodrowski (2023). hERG Me Out [Dataset]. http://doi.org/10.1021/ci400308z.s002
    Available download formats: txt
    Dataset provided by: ACS Publications
    Authors: Paul Czodrowski
    License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    A detailed analysis of the hERG content inside the ChEMBL database is performed. The correlation between the outcome from binding assays and functional assays is probed. On the basis of descriptor distributions, design paradigms with respect to structural and physicochemical properties of hERG active and hERG inactive compounds are challenged. Finally, classification models with different data sets are trained. All source code is provided, which is based on the Python open source packages RDKit and scikit-learn to enable the community to rerun the experiments. The code is stored on github (https://github.com/pzc/herg_chembl_jcim).

  12. ChEMBL data against CHEMBL367, CHEMBL368 and CHEMBL612348

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 20, 2023
    Cite
    Arnaud Gaudry (2023). ChEMBL data against CHEMBL367, CHEMBL368 and CHEMBL612348 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7953283
    Dataset authored and provided by: Arnaud Gaudry
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data from ChEMBL compounds reported with an activity against one of the following targets: CHEMBL367 (Leishmania donovani), CHEMBL368 (Trypanosoma cruzi), and CHEMBL612348 (Trypanosoma brucei rhodesiense).

  13. Data_Sheet_1_How to Achieve Better Results Using PASS-Based Virtual...

    • figshare.com
    • frontiersin.figshare.com
    docx
    Updated Jun 3, 2023
    + more versions
    Cite
    Pavel V. Pogodin; Alexey A. Lagunin; Anastasia V. Rudik; Dmitry A. Filimonov; Dmitry S. Druzhilovskiy; Mark C. Nicklaus; Vladimir V. Poroikov (2023). Data_Sheet_1_How to Achieve Better Results Using PASS-Based Virtual Screening: Case Study for Kinase Inhibitors.docx [Dataset]. http://doi.org/10.3389/fchem.2018.00133.s001
    Available download formats: docx
    Dataset provided by: Frontiers
    Authors: Pavel V. Pogodin; Alexey A. Lagunin; Anastasia V. Rudik; Dmitry A. Filimonov; Dmitry S. Druzhilovskiy; Mark C. Nicklaus; Vladimir V. Poroikov
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Discovery of new pharmaceutical substances is currently boosted by the possibility of utilization of the Synthetically Accessible Virtual Inventory (SAVI) library, which includes about 283 million molecules, each annotated with a proposed synthetic one-step route from commercially available starting materials. The SAVI database is well-suited for ligand-based methods of virtual screening to select molecules for experimental testing. In this study, we compare the performance of three approaches for the analysis of structure-activity relationships that differ in their criteria for selecting “active” and “inactive” compounds included in the training sets. PASS (Prediction of Activity Spectra for Substances), which is based on a modified Naïve Bayes algorithm, was applied since it had been shown to be robust and to provide good predictions of many biological activities based on just the structural formula of a compound even if the information in the training set is incomplete. We used different subsets of kinase inhibitors for this case study because many data are currently available on this important class of drug-like molecules. Based on the subsets of kinase inhibitors extracted from the ChEMBL 20 database we performed the PASS training, and then applied the model to ChEMBL 23 compounds not yet present in ChEMBL 20 to identify novel kinase inhibitors. As one may expect, the best prediction accuracy was obtained if only the experimentally confirmed active and inactive compounds for distinct kinases were used in the training procedure. However, for some kinases, reasonable results were obtained even if we used merged training sets, in which we designated as inactives the compounds not tested against the particular kinase. Thus, depending on the availability of data for a particular biological activity, one may choose the first or the second approach for creating ligand-based computational tools to achieve the best possible results in virtual screening.

  14. ChEMBL

    • registry.identifiers.org
    • bioregistry.io
    Updated Apr 24, 2021
    + more versions
    Cite
    (2021). ChEMBL [Dataset]. https://registry.identifiers.org/registry/chembl
    Description

    ChEMBL is a database of bioactive compounds, their quantitative properties and bioactivities (binding constants, pharmacology and ADMET, etc). The data is abstracted and curated from the primary scientific literature.

  15. Analog series-based scaffolds from ChEMBL with associated activity...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Jan 24, 2020
    Cite
    Dilyana Dimova; Dagmar Stumpfe; Ye Hu; Jürgen Bajorath (2020). Analog series-based scaffolds from ChEMBL with associated activity information [Dataset]. http://doi.org/10.5281/zenodo.155302
    Available download formats: bin
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Dilyana Dimova; Dagmar Stumpfe; Ye Hu; Jürgen Bajorath
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Reported is the activity information for the 12,294 analog series-based (ASB) scaffolds extracted from the ChEMBL database. For each ASB scaffold, structural and activity information for all analogs comprising the analog series is provided.

  16. ChEMBL target

    • registry.identifiers.org
    Updated Nov 26, 2024
    Cite
    (2024). ChEMBL target [Dataset]. https://registry.identifiers.org/registry/chembl.target
    Description

    ChEMBL is a database of bioactive compounds, their quantitative properties and bioactivities (binding constants, pharmacology and ADMET, etc). The data is abstracted and curated from the primary scientific literature.

  17. 31 ChEMBL data sets for regression modeling

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 21, 2020
    Cite
    Jenny Balfer; Jürgen Bajorath (2020). 31 ChEMBL data sets for regression modeling [Dataset]. http://doi.org/10.5281/zenodo.13986
    Available download formats: zip
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Jenny Balfer; Jürgen Bajorath
    License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    From ChEMBL version 17, 31 compound data sets have been selected for regression modeling. Compounds had to be active against human targets in a direct inhibition/binding assay with highest ChEMBL confidence score and Ki values below 100 micromolar. Multiple Ki values for the same compound were averaged if they fell into the same order of magnitude, or else they were disregarded. Duplicates, known pan-assay interference, and other reactive molecules were removed. Only sets with at least 500 compounds were considered.

    Note: The SD files contain a field "pKi"; note however that this field contains the Ki value in nM units, not the logarithmic value.
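
    Because the "pKi" field holds Ki in nM rather than a logarithmic value, a conversion step is needed before regression on pKi. The sketch below uses the standard definition pKi = -log10(Ki in mol/L), which for nM input becomes 9 - log10(Ki). The averaging helper is only one possible reading of "same order of magnitude" (equal integer part of log10(Ki)); that reading and both function names are assumptions, not the authors' code.

```python
import math
from typing import Optional, Sequence


def ki_nm_to_pki(ki_nm: float) -> float:
    """Convert a Ki value in nM to pKi = -log10(Ki in mol/L)."""
    if ki_nm <= 0:
        raise ValueError("Ki must be positive")
    return 9.0 - math.log10(ki_nm)


def average_ki_nm(values: Sequence[float]) -> Optional[float]:
    """Average Ki values that share an order of magnitude, else None."""
    if not values:
        return None
    # Assumption: same order of magnitude = same floor(log10(Ki)).
    orders = {math.floor(math.log10(v)) for v in values}
    if len(orders) != 1:
        return None  # values disagree too much; the record is disregarded
    return sum(values) / len(values)
```

    Under this definition the 100 micromolar activity cutoff (100,000 nM) corresponds to a pKi of 4.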

  18. Data from: PDEStrIAn: A Phosphodiesterase Structure and Ligand Interaction...

    • acs.figshare.com
    zip
    Updated Jun 1, 2023
    + more versions
    Cite
    Chimed Jansen; Albert J. Kooistra; Georgi K. Kanev; Rob Leurs; Iwan J. P. de Esch; Chris de Graaf (2023). PDEStrIAn: A Phosphodiesterase Structure and Ligand Interaction Annotated Database As a Tool for Structure-Based Drug Design [Dataset]. http://doi.org/10.1021/acs.jmedchem.5b01813.s001
    Available download formats: zip
    Dataset provided by: ACS Publications
    Authors: Chimed Jansen; Albert J. Kooistra; Georgi K. Kanev; Rob Leurs; Iwan J. P. de Esch; Chris de Graaf
    License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    A systematic analysis is presented of the 220 phosphodiesterase (PDE) catalytic domain crystal structures present in the Protein Data Bank (PDB) with a focus on PDE–ligand interactions. The consistent structural alignment of 57 PDE ligand binding site residues enables the systematic analysis of PDE–ligand interaction fingerprints (IFPs), the identification of subtype-specific PDE–ligand interaction features, and the classification of ligands according to their binding modes. We illustrate how systematic mining of this phosphodiesterase structure and ligand interaction annotated (PDEStrIAn) database provides new insights into how conserved and selective PDE interaction hot spots can accommodate the large diversity of chemical scaffolds in PDE ligands. A substructure analysis of the cocrystallized PDE ligands in combination with those in the ChEMBL database provides a toolbox for scaffold hopping and ligand design. These analyses lead to an improved understanding of the structural requirements of PDE binding that will be useful in future drug discovery studies.

  19. ChEMBL

    • opendatalab.com
    zip
    Updated May 7, 2024
    + more versions
    Cite
    (2024). ChEMBL [Dataset]. https://opendatalab.com/OpenDataLab/ChEMBL
    Available download formats: zip
    License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    ChEMBL is a database of bioactive drug-like small molecules. It contains 2-D structures, calculated properties (e.g. logP, Molecular Weight, Lipinski Parameters, etc.) and abstracted bioactivities (e.g. binding constants, pharmacology and ADMET data). The data is abstracted and curated from the primary scientific literature, and covers a significant fraction of the SAR and discovery of modern drugs. We attempt to normalise the bioactivities into a uniform set of end-points and units where possible, and also to tag the links between a molecular target and a published assay with a set of varying confidence levels. Additional data on clinical progress of compounds is being integrated into ChEMBL at the current time.

  20. Raw data extracted from ChEMBL

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Aug 1, 2022
    Cite
    Nicolas Drizard; Lukas Friedrich (2022). Raw data extracted from ChEMBL [Dataset]. http://doi.org/10.5281/zenodo.5045055
    Available download formats: zip
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Nicolas Drizard; Lukas Friedrich
    License: Attribution 3.0 (CC BY 3.0), https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    Raw data files extracted from ChEMBL for the MELLODDY project.
