100+ datasets found
  1. Comprehensive A-Z Pharmaceutical Drug Database

    • kaggle.com
    zip
    Updated Sep 22, 2025
    Cite
    Shayan Husain (2025). Comprehensive A-Z Pharmaceutical Drug Database [Dataset]. https://www.kaggle.com/datasets/shayanhusain/comprehensive-a-z-pharmaceutical-drug-database
    Available download formats: zip (43473 bytes)
    Dataset updated
    Sep 22, 2025
    Authors
    Shayan Husain
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides a comprehensive, structured overview of hundreds of commonly used pharmaceutical drugs, listed alphabetically by generic name. It serves as a valuable resource for healthcare students, professionals, data analysts, and anyone interested in pharmacology.

    Compiled from reputable sources like the FDA Prescribing Information, Lexicomp, and Micromedex, each entry includes detailed information on drug properties, safety, and usage. This dataset is ideal for educational purposes, data analysis projects, and as a reference for building healthcare applications.

    Key Features (Columns):

    Generic Name: The common name of the drug.

    Drug Class: The pharmacological category (e.g., SSRI, Beta-Blocker, Statin).

    Indications: The medical conditions the drug is used to treat.

    Dosage Form: The physical form of the drug (e.g., Tablet, Capsule, Injection, Cream).

    Strength: The potency of the drug (e.g., 500 mg, 0.1%).

    Route of Administration: How the drug is administered (e.g., Oral, Topical, Intravenous).

    Side Effects: Common adverse reactions associated with the drug.

    Contraindications: Conditions or factors that serve as a reason to not use the drug.

    Interaction warnings & Precautions: Important information on how the drug interacts with others and key safety measures.

    Storage Conditions: Recommended storage instructions (e.g., Room Temperature, Refrigerate).

    Reference: The primary source(s) of the information.

    Availability: Whether the drug is typically available by prescription or over-the-counter (OTC).

    Potential Use Cases:

    Educational Tool: For students of medicine, pharmacy, and nursing to learn about drug properties.

    Data Analysis & Visualization: Analyze the distribution of drug classes, common side effects, or storage requirements.

    Drug Interaction Checker (Basic Foundation): Use as a base dataset to build a simple drug interaction screening tool.

    Clinical Reference Application: Populate a mobile or web app with essential drug information.

    Natural Language Processing (NLP): Train models to extract drug information from text or to classify drugs based on their descriptions.

    File(s):

    drugs_from_a_to_z.csv (The Excel data converted to a CSV for broader compatibility)
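    As a minimal sketch of the "Data Analysis & Visualization" use case above, the following Python counts how often each drug class appears. The header names `Generic Name` and `Drug Class` come from the column list in the description, but their exact spelling in drugs_from_a_to_z.csv is an assumption, and the three sample rows are illustrative placeholders rather than records from the dataset.

```python
import csv
import io
from collections import Counter

# Illustrative stand-in for drugs_from_a_to_z.csv; the header spelling is assumed.
SAMPLE_CSV = """Generic Name,Drug Class,Availability
Atorvastatin,Statin,Prescription
Sertraline,SSRI,Prescription
Simvastatin,Statin,Prescription
"""

def count_drug_classes(handle):
    """Return a Counter mapping each drug class to its number of rows."""
    reader = csv.DictReader(handle)
    return Counter(row["Drug Class"].strip() for row in reader)

class_counts = count_drug_classes(io.StringIO(SAMPLE_CSV))
for drug_class, n in class_counts.most_common():
    print(f"{drug_class}: {n}")
```

    To try this on the real file, one could pass `open("drugs_from_a_to_z.csv", newline="", encoding="utf-8")` in place of the in-memory sample, after checking the actual header names.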

    Acknowledgements:

    This dataset synthesizes information from publicly available drug monographs and prescribing information from sources including the U.S. Food and Drug Administration (FDA), Lexicomp, and Micromedex.

  2. Drug Targets and Drug Lists Data Package

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Drug Targets and Drug Lists Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/drug-targets-and-drug-lists-data-package/
    Available download formats: csv
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Description

    This data package contains information on approved, researched and proven drug targets and drug lists.

  3. Drug Reference App Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Oct 23, 2025
    Cite
    Data Insights Market (2025). Drug Reference App Report [Dataset]. https://www.datainsightsmarket.com/reports/drug-reference-app-1963370
    Available download formats: doc, ppt, pdf
    Dataset updated
    Oct 23, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global Drug Reference App market is poised for substantial expansion, projected to reach a market size of approximately $1,800 million by 2025, with a robust Compound Annual Growth Rate (CAGR) of 16.5% anticipated through 2033. This significant growth is propelled by a confluence of factors, primarily the escalating need for accurate, up-to-date pharmaceutical information among healthcare professionals, researchers, and students. The increasing prevalence of chronic diseases necessitates continuous access to comprehensive drug databases for effective patient management and treatment. Furthermore, the proliferation of smartphones and the widespread adoption of digital health solutions are creating fertile ground for the adoption of these essential applications. Key drivers include the demand for enhanced clinical decision support, streamlined drug discovery and development processes, and the growing emphasis on evidence-based medicine. The market is segmented into distinct applications, with Doctors representing the largest segment due to their direct involvement in prescribing and managing medications, followed by Researchers, Students, and Other users.

    The landscape of the Drug Reference App market is characterized by several prevailing trends and a few inherent restraints. A significant trend is the integration of advanced features such as artificial intelligence (AI) and machine learning (ML) to provide personalized drug recommendations, identify potential drug interactions, and predict treatment outcomes. The development of user-friendly interfaces, offline access capabilities, and multilingual support is also crucial for broadening accessibility and enhancing user experience. The rise of specialized drug reference apps catering to specific therapeutic areas or professional niches is another notable trend. However, challenges such as data security and privacy concerns, the cost of maintaining extensive and updated drug databases, and the need for continuous regulatory compliance can act as restraints. Despite these hurdles, the market is expected to witness strong growth driven by continuous innovation and the indispensable role these apps play in modern healthcare. Key players like Epocrates, Wolters Kluwer (Lexicomp), and Medscape are at the forefront, continually evolving their offerings to meet the dynamic needs of the healthcare ecosystem.

    This comprehensive report delves into the dynamic Drug Reference App market, providing in-depth analysis and actionable insights for stakeholders. Covering a study period from 2019 to 2033, with a base year of 2025 and a forecast period extending from 2025 to 2033, the report meticulously examines historical trends and future projections. The estimated market size for 2025 is projected to reach $3.5 million, with significant growth anticipated throughout the forecast period.
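    The growth figures in this description can be checked with the standard compound-growth formula, end value = start value × (1 + CAGR)^years. The sketch below applies it to the first set of figures quoted ($1,800 million in 2025, 16.5% CAGR through 2033). Treating 2025 as the start year and 2033 as the end year (8 compounding periods) is an assumption, and the function name is illustrative.

```python
def project_market_size(start_value, cagr, years):
    """Compound a starting value forward at a constant annual growth rate."""
    return start_value * (1.0 + cagr) ** years

# Figures quoted in the description; the 8-year horizon (2025 -> 2033) is assumed.
START_MILLION_USD = 1800.0
CAGR = 0.165
YEARS = 2033 - 2025

projected_2033 = project_market_size(START_MILLION_USD, CAGR, YEARS)
print(f"Projected 2033 market size: ${projected_2033:,.0f} million")
```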

  4. Druggable Genome Comprehensive Drug Targets

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Druggable Genome Comprehensive Drug Targets [Dataset]. https://www.johnsnowlabs.com/marketplace/druggable-genome-comprehensive-drug-targets/
    Available download formats: csv
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Area covered
    N/A
    Description

    This dataset, Druggable Genome Comprehensive Drug Targets, is a selection of supplementary data from "The Druggable Genome: Evaluation of Drug Targets in Clinical Trials Suggests Major Shifts in Molecular Class and Indication" (2013) [PMID:24016212]. The comprehensive list includes 461 targets of approved drugs.

  5. Comprehensive Drug Self-administration and Discrimination Bibliographic...

    • neuinfo.org
    • scicrunch.org
    • +2 more
    Updated Jan 29, 2022
    Cite
    (2022). Comprehensive Drug Self-administration and Discrimination Bibliographic Databases [Dataset]. http://identifiers.org/RRID:SCR_000707
    Dataset updated
    Jan 29, 2022
    Description

    Database of bibliographic details of over 9,000 references published between 1951 and the present day, including abstracts, journal articles, book chapters and books. It replaces the two former separate websites for Ian Stolerman's drug discrimination database and Dick Meisch's drug self-administration database. Lists of standardized keywords are used to index the citations. Most of the keywords are generic drug names but they also include methodological terms, species studied and drug classes. This index makes it possible to selectively retrieve references according to the drugs used as the training stimuli, drugs used as test stimuli, drugs used as pretreatments, species, etc. by entering your own terms or by using our comprehensive lists of search terms.

    Drug Discrimination

    Drug Discrimination is widely recognized as one of the major methods for studying the behavioral and neuropharmacological effects of drugs and plays an important role in drug discovery and investigations of drug abuse. In Drug Discrimination studies, effects of drugs serve as discriminative stimuli that indicate how reinforcers (e.g. food pellets) can be obtained. For example, animals can be trained to press one of two levers to obtain food after receiving injections of a drug, and to press the other lever to obtain food after injections of the vehicle. After the discrimination has been learned, the animal starts pressing the appropriate lever according to whether it has received the training drug or vehicle; accuracy is very good in most experiments (90% or more correct). Discriminative stimulus effects of drugs are readily distinguished from the effects of food alone by collecting data in brief test sessions where responses are not differentially reinforced. Thus, trained subjects can be used to determine whether test substances are identified as like or unlike the drug used for training.

    Drug Self-administration

    Drug Self-administration methodology is central to the experimental analysis of drug abuse and dependence (addiction). It constitutes a key technique in numerous investigations of drug intake and its neurobiological basis and has even been described by some as the gold standard among methods in the area. Self-administration occurs when, after a behavioral act or chain of acts, a feedback loop results in the introduction of a drug or drugs into a human or infra-human subject. The drug is usually conceptualized as serving the role of a positive reinforcer within a framework of operant conditioning. For example, animals can be given the opportunity to press a lever to obtain an infusion of a drug through a chronically-indwelling venous catheter. If the available dose of the drug serves as a positive reinforcer then the rate of lever-pressing will increase and a sustained pattern of responding at a high rate may develop. Reinforcing effects of drugs are distinguishable from other actions such as increases in general activity by means of one or more control procedures. Trained subjects can be used to investigate the behavioral and neuropharmacological basis of drug-taking and drug-seeking behaviors and the reinstatement of these behaviors in subjects with a previous history of drug intake (relapse models). Other applications include evaluating novel compounds for liability to produce abuse and dependence and for their value in the treatment of drug dependence and addiction.

    The bibliography is updated about four times per year.

  6. DrugBank Database Data Package

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). DrugBank Database Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/drugbank-database-data-package/
    Available download formats: csv
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Description

    DrugBank Vocabulary contains information on DrugBank identifiers, names, and synonyms to permit easy linking and integration into any type of project. DrugBank is a richly annotated resource that combines detailed drug data with comprehensive drug target and drug action information. DrugBank is widely used to facilitate in silico drug target discovery, drug design, drug docking or screening, drug metabolism prediction, drug interaction prediction and general pharmaceutical education.

  7. Drug Labels & Side Effects Dataset | 1400+ Records

    • kaggle.com
    zip
    Updated Aug 2, 2025
    Cite
    Pratyush Puri (2025). Drug Labels & Side Effects Dataset | 1400+ Records [Dataset]. https://www.kaggle.com/datasets/pratyushpuri/drug-labels-and-side-effects-dataset-1400-records
    Available download formats: zip (51886 bytes)
    Dataset updated
    Aug 2, 2025
    Authors
    Pratyush Puri
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Drug Labels and Side Effects Dataset

    Dataset Overview

    This synthetic pharmaceutical dataset contains 1,393 records of drug information across 15 columns, designed for data science projects focusing on healthcare analytics, drug safety analysis, and pharmaceutical research. The dataset simulates real-world pharmaceutical data with appropriate variety and realistic constraints for machine learning applications.

    Dataset Specifications

    Attribute | Value
    Total Records | 1,393
    Total Columns | 15
    File Format | CSV
    Data Types | Mixed (intentional for data cleaning practice)
    Domain | Pharmaceutical/Healthcare
    Use Case | ML Training, Data Analysis, Healthcare Research

    Column Specifications

    Categorical Features

    Column Name | Data Type | Unique Values | Description | Example Values
    drug_name | Object | 1,283 unique | Pharmaceutical drug names with realistic naming patterns | "Loxozepam32", "Amoxparin43", "Virazepam10"
    manufacturer | Object | 10 unique | Major pharmaceutical companies | Pfizer Inc., AstraZeneca, Johnson & Johnson
    drug_class | Object | 10 unique | Therapeutic drug classifications | Antibiotic, Analgesic, Antidepressant, Vaccine
    indications | Object | 10 unique | Medical conditions the drug treats | "Pain relief", "Bacterial infections", "Depression treatment"
    side_effects | Object | 434 unique | Combination of side effects (1-3 per drug) | "Nausea, Dizziness", "Headache, Fatigue, Rash"
    administration_route | Object | 7 unique | Method of drug delivery | Oral, Intravenous, Topical, Inhalation, Sublingual
    contraindications | Object | 10 unique | Medical warnings for drug usage | "Pregnancy", "Heart disease", "Liver disease"
    warnings | Object | 10 unique | Safety instructions and precautions | "Take with food", "Avoid alcohol", "Monitor blood pressure"
    batch_number | Object | 1,393 unique | Manufacturing batch identifiers | "xr691zv", "Ye266vU", "Rm082yX"
    expiry_date | Object | 782 unique | Drug expiration dates (YYYY-MM-DD) | "2025-12-13", "2027-03-09", "2026-10-06"
    side_effect_severity | Object | 3 unique | Severity classification | Mild, Moderate, Severe
    approval_status | Object | 3 unique | Regulatory approval status | Approved, Pending, Rejected

    Numerical Features

    Column Name | Data Type | Range | Mean | Std Dev | Description
    approval_year | Float/String* | 1990-2024 | 2006.7 | 10.0 | FDA/regulatory approval year
    dosage_mg | Float/String* | 10-990 mg | 499.7 | 290.0 | Medication strength in milligrams
    price_usd | Float/String* | $2.32-$499.24 | $251.12 | $144.81 | Drug price in US dollars

    *Intentionally stored as mixed types for data cleaning practice
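    Because the numeric columns are stored as mixed types, a cleaning step is needed before any statistics can be computed. The sketch below shows one possible coercion helper. The description does not document what the non-numeric cells look like, so the sample values (currency symbols, thousands separators, free text) and the helper name `to_float` are assumptions for illustration.

```python
import re

# Matches the first signed decimal number in a string, e.g. "251.12" in "$251.12".
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

def to_float(value):
    """Coerce a mixed-type cell (number or string) to float, or None if unparseable."""
    if value is None:
        return None
    if isinstance(value, (int, float)):
        return float(value)
    # Drop thousands separators before searching for a number.
    match = _NUMBER.search(str(value).replace(",", ""))
    return float(match.group()) if match else None

# Hypothetical raw cells; the actual dirty formats in the CSV are not documented.
raw_prices = ["$251.12", 99.5, "1,200.00", "unknown", None]
clean_prices = [to_float(v) for v in raw_prices]
print(clean_prices)
```

    Returning None for unparseable cells keeps missing values explicit, so a later step can decide whether to drop or impute them.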

    Key Statistics

    Manufacturer Distribution

    Manufacturer | Count | Percentage
    Pfizer Inc. | 170 | 12.2%
    AstraZeneca | ~140 | ~10.0%
    Merck & Co. | ~140 | ~10.0%
    Johnson & Johnson | ~140 | ~10.0%
    GlaxoSmithKline | ~140 | ~10.0%
    Others | ~623 | ~44.8%

    Drug Class Distribution

    Drug Class | Count | Most Common
    Anti-inflammatory | 154 |
    Antibiotic | ~140 |
    Antidepressant | ~140 |
    Antiviral | ~140 |
    Vaccine | ~140 |
    Others | ~679 |

    Side Effect Severity

    Severity | Count | Percentage
    Severe | 488 | 35.0%
    Moderate | ~453 | ~32.5%
    Mild | ~452 | ~32.5%

    Potential Use Cases

    1. Machine Learning Applications

    • Drug Approval Prediction: Predict approval likelihood based on drug characteristics
    • Price Prediction: Estimate drug pricing using features like class, manufacturer, dosage
    • Side Effect Classification: Classify severity based on drug properties
    • Market Success Analysis: Analyze factors contributing to drug market performance

    2. Data Engineering Projects

    • ETL Pipeline Development: Practice data cleaning and transformation
    • Data Quality Assessment: Implement data validation and quality checks
    • Database Design: Create normalized pharmaceutical database schema
    • Real-time Processing: Stream processing for drug monitoring systems

    3. Business Intelligence

    • Pharmaceutical Market Analysis: Manufacturer market share and competitive analysis
    • Drug Safety Analytics: Side effect patterns and safety profile analysis
    • Regulatory Compliance: Approval trends and regulatory timeline analysis
    • Pricing Strategy: Competitive pricing analysis across drug classes

    Recommended Next Steps

    1. Data Cleaning Pipeline: Implement comprehe...
  8. DrugBank

    • bioregistry.io
    Updated Apr 12, 2021
    + more versions
    Cite
    (2021). DrugBank [Dataset]. http://identifiers.org/re3data:r3d100010544
    Dataset updated
    Apr 12, 2021
    Description

    The DrugBank database is a bioinformatics and chemoinformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. This collection references drug information.

  9. Drug-Food Interactions Dataset

    • kaggle.com
    zip
    Updated Sep 26, 2025
    Cite
    Shayan Husain (2025). Drug-Food Interactions Dataset [Dataset]. https://www.kaggle.com/datasets/shayanhusain/drug-food-interactions-dataset
    Available download formats: zip (58289 bytes)
    Dataset updated
    Sep 26, 2025
    Authors
    Shayan Husain
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    This dataset provides a crucial resource for healthcare professionals, researchers, patients, and developers by cataloging known food, beverage, and herbal supplement interactions for a wide range of pharmaceutical drugs. Sourced from DrugBank 6.0 (2024), a leading bioinformatics and cheminformatics database, this information is vital for understanding how diet can affect drug efficacy, absorption, and safety.

    Proper management of drug-food interactions is essential for maximizing therapeutic benefits and minimizing adverse effects. This dataset can be used for:

    • Clinical Decision Support: Informing doctors, pharmacists, and patients about potential dietary modifications during treatment.
    • Research: Analyzing patterns in drug interactions, particularly with common items like grapefruit, alcohol, and St. John's Wort.
    • Application Development: Powering features in mobile health apps, electronic health records (EHRs), and medication management tools.
    • Public Health Education: Raising awareness about the importance of considering diet as part of a medication regimen.

    Source: The data is extracted from DrugBank 6.0. The official citation is: Knox C, Wilson M, Klinger CM, et al. DrugBank 6.0: the DrugBank Knowledgebase for 2024. Nucleic Acids Res. 2024 Jan 5;52(D1):D1265-D1275. doi: 10.1093/nar/gkad976.

  10. FDA-Approved Drugs & Therapeutics

    • kaggle.com
    zip
    Updated Jan 23, 2023
    Cite
    The Devastator (2023). FDA-Approved Drugs & Therapeutics [Dataset]. https://www.kaggle.com/datasets/thedevastator/fda-approved-drugs-therapeutics
    Available download formats: zip (2006218 bytes)
    Dataset updated
    Jan 23, 2023
    Authors
    The Devastator
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    FDA-Approved Drugs & Therapeutics

    Exploring Human Drug & Biological Therapies

    By Health [source]

    About this dataset

    This dataset contains a wealth of information about FDA-approved human drugs and biological therapeutic products. Whether you are studying the effects of drugs, exploring new treatment methods, or researching potential side effects, this database holds detailed insights into the approved medicines available to individuals today. From brand names to generic prescriptions to over-the-counter products, you can access a variety of important details such as reviews, labels, approval letters and patient information. Gain a comprehensive understanding of the drug products approved since 1939 to develop safer and more effective treatments for patients going forward.

    How to use the dataset

    This dataset contains information about nearly all of the FDA-approved brand name and generic prescription drugs, as well as biological therapeutic products. It is important to note that most information is available for drug products approved since 1998, meaning that drugs approved before then may have less comprehensive data associated with them.

    To get started using this dataset, you should begin by familiarizing yourself with the available columns in the dataset:

    • Drug Name: The name of the drug (brand name or generic).
    • Active Ingredient(s): A list of active ingredients present in each drug product.
    • Dosage form: The physical form and route a patient takes a specific drug product (e.g., tablet taken orally).
    • Approval Description: A summary of key features and benefits related to the approval process for each product.
    • Route(s): The manner or way by which a medication has been formulated to be absorbed or introduced into an organism's system (e.g., oral ingestion, injection).

    Next, you will want to understand what type of queries can be run on this data set so that you can effectively search for specific items to analyze within your project goals:

    • You can search through column headers or specific terms to find information related to your query, such as active ingredients, dosage forms or routes used by different products.
    • You can use simple comparison operators such as “=”, “<” and “>” to find ranges between certain values.
    • You can use Boolean operators such as “AND” and “OR” within SQL statements to combine two conditions.
    • You can search multiple columns simultaneously using a combination of LIKE commands coupled with wildcard characters.
    • Lastly, you can build subqueries upon which more complicated queries are applied, depending on your research objectives (these advanced scripts often incorporate functions like SUM(), AVG() etc.).
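    The query types listed above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration under stated assumptions: the table name `drugs`, the snake_case column names, the `approval_year` column, and the three placeholder rows are all hypothetical and are not taken from the dataset, whose actual schema should be checked first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drugs (drug_name TEXT, active_ingredient TEXT, "
    "dosage_form TEXT, route TEXT, approval_year INTEGER)"
)
# Placeholder rows for illustration only; not taken from the dataset.
conn.executemany(
    "INSERT INTO drugs VALUES (?, ?, ?, ?, ?)",
    [
        ("ExampleDrug A", "ingredient-x", "TABLET", "ORAL", 1999),
        ("ExampleDrug B", "ingredient-y", "INJECTION", "INTRAVENOUS", 2005),
        ("ExampleDrug C", "ingredient-x", "CAPSULE", "ORAL", 2012),
    ],
)

# Comparison operators combined with AND.
oral_since_2000 = conn.execute(
    "SELECT drug_name FROM drugs WHERE route = ? AND approval_year > ? "
    "ORDER BY drug_name",
    ("ORAL", 2000),
).fetchall()

# LIKE with the % wildcard across two columns, combined with OR.
matches = conn.execute(
    "SELECT drug_name FROM drugs "
    "WHERE dosage_form LIKE ? OR active_ingredient LIKE ? ORDER BY drug_name",
    ("%TAB%", "%-y"),
).fetchall()

# Subquery with an aggregate: drugs approved later than the average year.
later_than_avg = conn.execute(
    "SELECT drug_name FROM drugs "
    "WHERE approval_year > (SELECT AVG(approval_year) FROM drugs) "
    "ORDER BY drug_name"
).fetchall()

print(oral_since_2000, matches, later_than_avg)
conn.close()
```

    Passing values as `?` parameters rather than formatting them into the SQL string keeps the queries safe if the search terms come from user input.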

    Research Ideas

    • Developing a tool to help patients identify potential interactions between different drugs they are taking by cross-referencing this dataset with the patient's records.
    • Developing an AI/machine learning model which evaluates all approved drugs and their effects on disease, helping physicians determine the best treatment options for their patients.
    • Building an online marketplace, sponsored by health care organizations or private companies, where customers can compare prices and availability of FDA approved drugs before buying them online or in stores

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Open Database License (ODbL) v1.0

    You are free to:
    • Share: copy and redistribute the material in any medium or format.
    • Adapt: remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit: provide a link to the license, and indicate if changes were made.
    • ShareAlike: distribute your contributions under the same license as the original.
    • Keep intact: all notices that refer to this license, including copyright notices.
    • No additional restrictions: you may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.


  11. MedlinePlus

    • neuinfo.org
    • rrid.site
    • +2 more
    Updated May 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). MedlinePlus [Dataset]. http://identifiers.org/RRID:SCR_006512
    Dataset updated
    May 3, 2025
    Description

    Database of authoritative health information about diseases, conditions, and wellness issues that offers reliable, up-to-date health information for free. It contains the latest treatments, information on drugs and supplements, the meanings of words, and medical videos and illustrations. Links to the latest topic or disease specific medical research or clinical trials are also offered.

    • MedlinePlus pages contain carefully selected links to Web resources with health information on over 900 topics.
      • The MedlinePlus health topic pages include links to current news on the topic and related information. You can also find preformulated searches of the MEDLINE/PubMed database, which allow you to find references to latest health professional articles on your topic.
    • The A.D.A.M. medical encyclopedia brings health consumers an extensive library of medical images and videos, as well as over 4,000 articles about diseases, tests, symptoms, injuries, and surgeries.
    • The Merriam-Webster medical dictionary allows you to look up definitions and spellings of medical words.
    • Drug and supplement information is available from the American Society of Health-System Pharmacists (ASHP) via AHFS Consumer Medication Information, and Natural Medicines Comprehensive Database Consumer Version.
      • AHFS Consumer Medication Information provides extensive information about more than 1,000 brand name and generic prescription and over-the-counter drugs, including side effects, precautions and storage for each drug.
      • Natural Medicines Comprehensive Database Consumer Version is an evidence-based collection of information on alternative treatments. MedlinePlus has 100 monographs on herbs and supplements.
    • Interactive tutorials from the Patient Education Institute explain over 165 procedures and conditions in easy-to-read language.

    An XML file for the MedlinePlus Health Topics is available at http://www.nlm.nih.gov/medlineplus/xmldescription.html. The ontology is available through Bioportal at http://bioportal.bioontology.org/ontologies/MEDLINEPLUS

  12. Curated Drug-Drug Interactions Database - Drug

    • bioregistry.io
    Updated Oct 11, 2021
    Cite
    (2021). Curated Drug-Drug Interactions Database - Drug [Dataset]. https://bioregistry.io/ddinter.drug
    Dataset updated
    Oct 11, 2021
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    DDInter is a comprehensive, professional, and open-access database specific to drug-drug interactions. It provides abundant annotations for each DDI association, including mechanism description, risk levels, management strategies, alternative medications, etc., to improve clinical decision-making and patient safety.

  13. Multum MediSource Lexicon

    • bioregistry.io
    Updated May 27, 2021
    Cite
    (2021). Multum MediSource Lexicon [Dataset]. https://bioregistry.io/registry/mmsl
    Dataset updated
    May 27, 2021
    Description

    The Lexicon is a foundational database with comprehensive drug product and disease nomenclature information. It includes drug names, drug product information, disease names, coding systems such as ICD-9-CM and NDC, generic names, brand names and common abbreviations. A comprehensive list of standard or customized disease names and ICD-9 codes is also included.

  14. Historical data from the National Drug Code Directory

    • figshare.com
    zip
    Updated Jun 1, 2023
    Cite
    Mark Howison; Ted Lawless; John Ucles (2023). Historical data from the National Drug Code Directory [Dataset]. http://doi.org/10.6084/m9.figshare.6128225.v1
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mark Howison; Ted Lawless; John Ucles
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This dataset contains text file snapshots from the National Drug Code Directory during the years 2000-2018, as available in the Internet Archive (web.archive.org) on April 11, 2018. The files span several database and formatting changes, but together they provide a more comprehensive list of National Drug Codes than is available in the most recent database snapshot (https://www.fda.gov/Drugs/InformationOnDrugs/ucm142438.htm).

  15. Potential Drug Target Database

    • scicrunch.org
    • rrid.site
    Updated Oct 26, 2025
    Cite
    (2025). Potential Drug Target Database [Dataset]. http://identifiers.org/RRID:SCR_007069
    Dataset updated
    Oct 26, 2025
    Description

    The Potential Drug Target Database (PDTD) is a dual-function database that associates an informatics database with a structural database of known and potential drug targets. PDTD is a comprehensive, web-accessible database of drug targets, and focuses on those drug targets with known 3D structures. PDTD contains 1207 entries covering 841 known and potential drug targets with structures from the Protein Data Bank (PDB). Drug targets in PDTD are categorized into 15 and 13 types according to two criteria: therapeutic areas and biochemical criteria. The database supports extensive searching by PDB ID, target name, category, and related disease.

  16. KSA Drug Database (Metadata, PILs & SPCs) - AR/EN

    • kaggle.com
    zip
    Updated Oct 23, 2025
    Cite
    Meshal Falah (2025). KSA Drug Database (Metadata, PILs & SPCs) - AR/EN [Dataset]. https://www.kaggle.com/datasets/meshalfalah/ksa-drug-database-metadata-pils-and-spcs-aren/discussion
    Explore at:
    Available download formats: zip (125041521 bytes)
    Dataset updated
    Oct 23, 2025
    Authors
    Meshal Falah
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Area covered
    Saudi Arabia
    Description

    Saudi Arabia Drug Database (Metadata, PILs & SPCs) - AR/EN

    🟢 Overview

    This is a comprehensive database of registered pharmaceutical products in the Kingdom of Saudi Arabia, collected from the official public portal of the Saudi Food and Drug Authority (SFDA).

    This dataset is uniquely bilingual (Arabic / English) and provides rich, structured metadata (JSON). This makes it a valuable resource for researchers, students, Natural Language Processing (NLP) specialists, and data scientists interested in the healthcare and pharmaceutical informatics sectors in the Middle East.

    🔑 Key Features

    • Rich Metadata: Each drug includes detailed structured data (see "Data Structure" below), such as official price, trade and generic names, legal classification, manufacturer, agent, and storage conditions.
    • Bilingual (AR/EN): Provides the "Patient Information Leaflet" (PIL) in both Arabic and English, opening significant opportunities for bilingual NLP research.
    • Specialized Leaflets (SPCs): Contains the "Summary of Product Characteristics" (SPC), the technical leaflet aimed at healthcare professionals, which provides in-depth technical data.
    • Processor-Ready Format (JSON): The data is organized in a JSON format, making it easy to parse and process programmatically.
    • Comprehensive: The vast majority of drug records contain the full set of metadata and all three associated leaflets.

    🗂️ Data Structure

    The dataset is provided as a single .zip archive which contains 563 individual JSON files.

    • Each JSON file contains a list of 15 drug records.
    • Each drug record is an object containing its metadata and the three leaflet texts.

    Example Single Drug Record

    Each drug record contains a Drug Data object (the metadata) and three keys for the leaflets:

    ```json
    {
     "Drug Data": {
      "Registration Number": "0202256789",
      "Register Year": "2025",
      "Trade Name": "Brevie",
      "Generic Name": "BRIVARACETAM",
      "Strength": "50",
      "Strength Unit": "mg",
      "Administration Route": "Oral use",
      "Pharmaceutical Form": "Film-coated tablet",
      "Package Size": "60",
      "Packages Types": "Blister",
      "Legal Classification": "Prescription",
      "Product Control": "Uncontrolled",
      "Drug Type": "Generic",
      "ShelfLife in Months": "36",
      "Storage Conditions": "do not store above 30°c",
      "Public price (SAR)": "266.05",
      "Manufacture": "MSN LABORATORIES PRIVATE LIMITED",
      "الوكيل": "SUDAIR PHARMA COMPANY",
      "Marketing Company": "SUDAIR PHARMA COMPANY"
     },
     "Patient Information Leaflet (PIL) in English": "[...English leaflet text...]",
     "Patient Information Leaflet (PIL) in Arabic": "[...Arabic leaflet text...]",
     "Summary of Product Characteristics (SPC)": "[...Healthcare professional leaflet text...]"
    }
    ```
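The layout described above (one .zip archive, each JSON file holding a list of records) can be read with the Python standard library alone. The following is a minimal sketch, not the author's own loader: the member name `page_001.json` and the helper names are hypothetical, and the demo archive is built in memory only so the example runs without the real download.

```python
import io
import json
import zipfile
from typing import Iterator


def iter_drug_records(archive: zipfile.ZipFile) -> Iterator[dict]:
    """Yield every drug record from every .json member of the archive."""
    for name in sorted(archive.namelist()):
        if not name.lower().endswith(".json"):
            continue
        with archive.open(name) as handle:
            records = json.load(handle)  # each file is described as a list of records
        for record in records:
            yield record


def build_demo_archive() -> zipfile.ZipFile:
    """Build a tiny in-memory archive mimicking the described layout (hypothetical file name)."""
    record = {
        "Drug Data": {"Trade Name": "Brevie", "Generic Name": "BRIVARACETAM",
                      "Strength": "50", "Strength Unit": "mg"},
        "Patient Information Leaflet (PIL) in English": "[...]",
        "Patient Information Leaflet (PIL) in Arabic": "[...]",
        "Summary of Product Characteristics (SPC)": "[...]",
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("page_001.json", json.dumps([record], ensure_ascii=False))
    buffer.seek(0)
    return zipfile.ZipFile(buffer)


demo_records = list(iter_drug_records(build_demo_archive()))
demo_names = [r["Drug Data"]["Trade Name"] for r in demo_records]
print(demo_names)
```

With the real download, `zipfile.ZipFile("path/to/archive.zip")` would replace `build_demo_archive()`; the actual archive path and member names are not stated in the description.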
    🔗 Data Collection Code

    The full code used to collect and structure this dataset is publicly available on GitHub:

    👉 Data Collection Repository: https://github.com/MQushaym/web-scraping-data-collection

    This repository contains the web scraping and data processing scripts used to compile and clean the dataset.
    
    🎯 Potential Use Cases

    AI Agents & RAG (Retrieval-Augmented Generation):

    • (Highly Recommended) Building a specialized AI Agent (like a GPT or LLM assistant) that answers complex questions about Saudi-registered drugs.
    • This dataset acts as a perfect "Knowledge Base" for RAG. The agent can retrieve specific leaflets (PILs/SPCs) or structured metadata (like price, storage, manufacturer) to provide accurate, verifiable, and context-aware answers.
    • Developing advanced Q&A systems for both patients ("Can I take this drug with X?") and professionals ("What are the contraindications for this drug?").

    Natural Language Processing (NLP):

    • Building specialized medical terminology translation models (Ar/En).
    • Named Entity Recognition (NER) to identify side effects, active ingredients, and dosages from the leaflet texts.
    • Text summarization of the long SPC and PIL documents.

    Data Analysis & Health Informatics:

    • Analyzing drug pricing in relation to manufacturers or drug type (Generic/Innovator).
    • Constructing knowledge graphs (KGs) that link drugs, ingredients, manufacturers, and legal classifications.
    • Studying storage conditions in relation to pharmaceutical forms.
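As one illustration of the pricing analysis mentioned in the list, the sketch below averages the "Public price (SAR)" field per "Drug Type". It assumes records shaped like the example record, with prices stored as strings. Apart from the 266.05 value taken from that example, the sample prices are made up, and `mean_price_by_drug_type` is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean


def mean_price_by_drug_type(records):
    """Return the mean public price per drug type, skipping unparseable prices."""
    prices = defaultdict(list)
    for record in records:
        meta = record.get("Drug Data", {})
        try:
            price = float(meta["Public price (SAR)"])  # prices are stored as strings
        except (KeyError, TypeError, ValueError):
            continue  # missing or malformed price: leave the record out
        prices[meta.get("Drug Type", "Unknown")].append(price)
    return {drug_type: round(mean(values), 2) for drug_type, values in prices.items()}


# Illustrative records: only the 266.05 price comes from the example record.
sample_records = [
    {"Drug Data": {"Drug Type": "Generic", "Public price (SAR)": "266.05"}},
    {"Drug Data": {"Drug Type": "Generic", "Public price (SAR)": "133.95"}},
    {"Drug Data": {"Drug Type": "Innovator", "Public price (SAR)": "300.00"}},
    {"Drug Data": {"Drug Type": "Innovator", "Public price (SAR)": ""}},
]
price_summary = mean_price_by_drug_type(sample_records)
print(price_summary)
```

Records with a missing or empty price are skipped rather than counted as zero, so they do not pull the averages down.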
    
    📄 License & Citation

    This dataset is made available under the CC BY-NC 4.0 (Attribution-NonCommercial 4.0) license.

    This means you are free to use it for academic and research purposes as long as you provide attribution (citation) and do not use it for commercial purposes.

    When using this dataset, please cite as follows:

    Data collected and structured by: Meshal AL-Qushaym
    Dataset: KS...
    
  17. Pharmacology Complete List of Ligand Molecules

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). Pharmacology Complete List of Ligand Molecules [Dataset]. https://www.johnsnowlabs.com/marketplace/pharmacology-complete-list-of-ligand-molecules/
    Explore at:
    Available download formats: csv
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Area covered
    N/A
    Description

    This dataset contains the complete list of ligand molecules from the Guide to PHARMACOLOGY, an online, open-access portal to pharmacological information on all the human targets of prescription drugs. The portal is a collaboration between the International Union of Basic and Clinical Pharmacology (IUPHAR) and the British Pharmacological Society (BPS).

  18. Data from: PubChem Compound TOC: Drug and Medication Information

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Jul 3, 2023
    Cite
    Preston London; Taniya Rainge (2023). PubChem Compound TOC: Drug and Medication Information [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8102885
    Explore at:
    Dataset updated
    Jul 3, 2023
    Authors
    Preston London; Taniya Rainge
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    ABSTRACT: PubChem, a widely used chemical information resource, has undergone notable transformations in the last two years. Over 120 data sources have been incorporated into PubChem, enriching its data repository. Key highlights of the updates include the integration of Google Patents data, which significantly expanded the PubChem Patent collection's coverage. Additionally, new data collections for Cell Line and Taxonomy were introduced, offering convenient access to chemical information based on specific cell lines and taxa. The bioassay data model was updated to enhance its functionality. Moreover, PubChem's programmatic access protocols, PUG-REST and PUG-View, received enhancements, including support for target-centric data download and the 'standardize' option for returning standardized chemical structures. Furthermore, PubChemRDF underwent a substantial update. This paper presents a comprehensive overview of these transformative changes.

    Instruction: The data was cleaned by removing duplicates. Working with a colleague, we identified and removed redundant columns and renamed certain columns for clarity.

    Inspiration: The dataset was uploaded to UBRITE for the "DGR_DEPOT" summer 2023 team project.

    Acknowledgements: Sunghwan Kim, Jie Chen, Tiejun Cheng, Asta Gindulyte, Jia He, Siqian He, Qingliang Li, Benjamin A Shoemaker, Paul A Thiessen, Bo Yu, Leonid Zaslavsky, Jian Zhang, Evan E Bolton

    Kim, Sunghwan et al. “PubChem 2023 update.” Nucleic Acids Res. vol. 51,D1 (2023): D1373-D1380. doi:10.1093/nar/gkac956

    UBRITE LAST UPDATED July 1, 2023

  19. Psychedelic Drug Database

    • kaggle.com
    zip
    Updated Dec 3, 2023
    Cite
    The Devastator (2023). Psychedelic Drug Database [Dataset]. https://www.kaggle.com/datasets/thedevastator/psychedelic-drug-database
    Explore at:
    Available download formats: zip (119090 bytes)
    Dataset updated
    Dec 3, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Psychedelic Drug Database

    Psychotropic and psychedelic drug database with molecular descriptors

    By Juan Jose [source]

    About this dataset

    This dataset is a comprehensive database of psychotropic and psychedelic drugs, focusing on their molecular descriptors. The data was sourced from the PubChem Database, which is a widely-used resource for chemical information. The main objective of this project is to create an easily accessible and centralized database specifically for psychedelic compounds.

    To achieve this, the dataset includes information on identified psychedelic compounds obtained from the PubChem Database. Additionally, molecular descriptors for these compounds were generated using the KNIME Analytics Platform and RDKit module. These molecular descriptors provide important characteristics and properties of each compound, making it easier to perform quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) analyses.

    By providing access to such data, researchers and scientists can have a valuable resource for studying psychoactive substances in a more efficient manner. This database offers consolidated and accurate information about various psychotropic drugs, aiding in research related to their effects, mechanisms of action, toxicity profiles, and potential therapeutic uses.

    External resources used in this project include the PubChem Project website, the KNIME Analytics Platform, and the RDKit software tools. Together, these resources make the dataset a repository that can support both basic research and applications in drug design or development targeting psychoactive substances.

    The columns within this dataset provide detailed information about each compound's molecular descriptors derived from its chemical structure. This diverse set of characteristics enables researchers to compare different compounds based on their structural features or predict certain properties using computational models.

    Overall, this database of psychotropic and psychedelic drugs supports a better understanding of these substances' pharmacological activities and can make drug discovery more efficient through predictive modeling approaches such as QSAR/QSPR analysis.

    How to use the dataset

    Understanding the Columns

    • Compound Name: The name or identifier of each compound in the database.
    • Molecular Formula: The chemical formula representing the number and types of atoms in a compound.
    • Molecular Weight: The mass of a molecule, calculated as the sum of atomic weights.
    • Canonical SMILES: A simplified molecular representation using standardised notation for atoms and bonds.
    • Isomeric SMILES: A more specific molecular representation that includes information about stereochemistry (the spatial arrangement of atoms).
    • Columns 6-10: Additional columns may be included with specific molecular descriptors depending on how they were generated.
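To make the link between the Molecular Formula and Molecular Weight columns concrete, here is a minimal sketch that sums standard atomic weights over a simple formula string. It is not how the dataset's values were produced (the description attributes the descriptors to KNIME and RDKit). The parser handles only flat formulas without parentheses or charges, and the weight table covers only a few elements.

```python
import re

# Standard atomic weights for a handful of elements (rounded to three decimals).
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "P": 30.974}


def molecular_weight(formula: str) -> float:
    """Sum atomic weights for a flat formula such as 'C12H16N2'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing count means a single atom of that element.
        total += ATOMIC_WEIGHTS[symbol] * (int(count) if count else 1)
    return total


# N,N-Dimethyltryptamine (DMT) has the molecular formula C12H16N2.
dmt_weight = molecular_weight("C12H16N2")
print(round(dmt_weight, 2))
```

An element missing from `ATOMIC_WEIGHTS` raises a `KeyError`, which is a deliberate choice here so that unsupported formulas fail loudly instead of returning a silently wrong weight.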

    Accessing Additional Information

    To delve deeper into any given compound in this database, make use of external resources such as The PubChem Project. This comprehensive resource provides additional data on each compound including chemical properties, biological activities, safety information, and much more.

    Performing QSAR or QSPR Analysis

    One potential application for this dataset is Quantitative Structure-Activity Relationship (QSAR) or Quantitative Structure-Property Relationship (QSPR) analysis. These approaches involve studying the relationship between a set of chemical properties (molecular descriptors) and an observed activity/property value for a set of compounds.

    To perform QSAR/QSPR analysis using this dataset:

    • Import these data into your preferred analytics platform such as KNIME Analytics Platform.
    • Use the molecular descriptors provided in the dataset as independent variables.
    • Obtain an activity/property dataset as your dependent variable (e.g., biological activity, toxicity, physical property).
    • Apply appropriate machine learning or statistical modeling techniques to build a model that predicts the activity/property based on the molecular descriptors.
    • Evaluate and validate your model using suitable methods (e.g., cross-validation, external test set).
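The steps above can be sketched in plain Python with a single descriptor, ordinary least squares, and leave-one-out cross-validation. This is a toy illustration, not a recommended QSAR workflow: the descriptor and activity values are made up, the function names are hypothetical, and a real analysis would use many descriptors and a dedicated modeling library.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def leave_one_out_rmse(xs, ys):
    """Refit with each compound held out once and return the RMSE of the held-out predictions."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        squared_errors.append((slope * xs[i] + intercept - ys[i]) ** 2)
    return mean(squared_errors) ** 0.5


# Made-up values: one descriptor (independent variable) and one activity (dependent variable).
descriptor = [150.0, 200.0, 250.0, 300.0, 350.0]
activity = [1.6, 2.1, 2.4, 3.1, 3.4]

slope, intercept = fit_line(descriptor, activity)
cv_rmse = leave_one_out_rmse(descriptor, activity)
print(slope, intercept, cv_rmse)
```

Leave-one-out is used here only because the toy set is tiny; with more compounds, k-fold cross-validation or an external test set, as the last step suggests, is usually the more practical check.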

    Precautions and Ethical Considerations

    While this database provides valuable information for research purposes, it is essential to handle psychedelic substances with caution and adhere to legal and ethical considerations.

    • Leg...
  20. DrugCentral (Full PostgreSQL Database)

    • figshare.com
    txt
    Updated Jan 19, 2017
    Cite
    Adam Brown (2017). DrugCentral (Full PostgreSQL Database) [Dataset]. http://doi.org/10.6084/m9.figshare.3811608.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jan 19, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Adam Brown
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This figshare dataset contains the relevant tables from the DrugCentral PostgreSQL database download, accessed November 16, 2016.

Cite
Shayan Husain (2025). Comprehensive A-Z Pharmaceutical Drug Database [Dataset]. https://www.kaggle.com/datasets/shayanhusain/comprehensive-a-z-pharmaceutical-drug-database

Comprehensive A-Z Pharmaceutical Drug Database

Drug Classes, Indications, Dosage, Side Effects, Interactions, and More.

Explore at:
Available download formats: zip (43473 bytes)
Dataset updated
Sep 22, 2025
Authors
Shayan Husain
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

This dataset provides a comprehensive, structured overview of hundreds of commonly used pharmaceutical drugs, listed alphabetically by generic name. It serves as a valuable resource for healthcare students, professionals, data analysts, and anyone interested in pharmacology.

Compiled from reputable sources like the FDA Prescribing Information, Lexicomp, and Micromedex, each entry includes detailed information on drug properties, safety, and usage. This dataset is ideal for educational purposes, data analysis projects, and as a reference for building healthcare applications.

Key Features (Columns):

Generic Name: The common name of the drug.

Drug Class: The pharmacological category (e.g., SSRI, Beta-Blocker, Statin).

Indications: The medical conditions the drug is used to treat.

Dosage Form: The physical form of the drug (e.g., Tablet, Capsule, Injection, Cream).

Strength: The potency of the drug (e.g., 500 mg, 0.1%).

Route of Administration: How the drug is administered (e.g., Oral, Topical, Intravenous).

Side Effects: Common adverse reactions associated with the drug.

Contraindications: Conditions or factors that serve as a reason to not use the drug.

Interaction warnings & Precautions: Important information on how the drug interacts with others and key safety measures.

Storage Conditions: Recommended storage instructions (e.g., Room Temperature, Refrigerate).

Reference: The primary source(s) of the information.

Availability: Whether the drug is typically available by prescription or over-the-counter (OTC).

Potential Use Cases:

Educational Tool: For students of medicine, pharmacy, and nursing to learn about drug properties.

Data Analysis & Visualization: Analyze the distribution of drug classes, common side effects, or storage requirements.

Drug Interaction Checker (Basic Foundation): Use as a base dataset to build a simple drug interaction screening tool.

Clinical Reference Application: Populate a mobile or web app with essential drug information.

Natural Language Processing (NLP): Train models to extract drug information from text or to classify drugs based on their descriptions.

File(s):

drugs_from_a_to_z.csv (The Excel data converted to a CSV for broader compatibility)
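As a starting point for the "Data Analysis & Visualization" use case, the sketch below counts how many drugs fall into each drug class. It assumes the CSV header uses the column names exactly as listed under Key Features (for example `Drug Class`), which the description does not confirm. The three sample rows are illustrative stand-ins, not rows copied from the file.

```python
import csv
import io
from collections import Counter


def drug_class_counts(csv_file) -> Counter:
    """Count rows per 'Drug Class' value, ignoring rows where the field is empty."""
    reader = csv.DictReader(csv_file)
    return Counter(
        row["Drug Class"].strip()
        for row in reader
        if row.get("Drug Class")
    )


# Illustrative stand-in for drugs_from_a_to_z.csv (assumed header names).
sample_csv = io.StringIO(
    "Generic Name,Drug Class,Availability\n"
    "Atorvastatin,Statin,Prescription\n"
    "Simvastatin,Statin,Prescription\n"
    "Sertraline,SSRI,Prescription\n"
)
counts = drug_class_counts(sample_csv)
print(counts.most_common())
```

For the real file, passing `open("drugs_from_a_to_z.csv", newline="", encoding="utf-8")` in place of `sample_csv` should work, though the file's actual encoding is an assumption.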

Acknowledgements:

This dataset synthesizes information from publicly available drug monographs and prescribing information from sources including the U.S. Food and Drug Administration (FDA), Lexicomp, and Micromedex.
