100+ datasets found

Data from: Drug Database
biomedsyn.com
Updated Jan 18, 2026
Cite
(2026). Drug Database [Dataset]. https://www.biomedsyn.com/formList?pageTitle=Learn+More&type=10
Explore at:
Dataset updated
Jan 18, 2026
Description
Comprehensive drug information database
Druggable Genome Comprehensive Drug Targets
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). Druggable Genome Comprehensive Drug Targets [Dataset]. https://www.johnsnowlabs.com/marketplace/druggable-genome-comprehensive-drug-targets/
Explore at:
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Area covered
N/A
Description
This dataset, Druggable Genome Comprehensive Drug Targets, is a selection of supplementary data from "The Druggable Genome: Evaluation of Drug Targets in Clinical Trials Suggests Major Shifts in Molecular Class and Indication" (2013) [PMID:24016212]. The comprehensive list includes 461 targets of approved drugs.
Comprehensive A-Z Pharmaceutical Drug Database
kaggle.com
zip
Updated Sep 22, 2025
Cite
Shayan Husain (2025). Comprehensive A-Z Pharmaceutical Drug Database [Dataset]. https://www.kaggle.com/datasets/shayanhusain/comprehensive-a-z-pharmaceutical-drug-database
Explore at:
Available download formats: zip (43473 bytes)
Dataset updated
Sep 22, 2025
Authors
Shayan Husain
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset provides a comprehensive, structured overview of hundreds of commonly used pharmaceutical drugs, listed alphabetically by generic name. It serves as a valuable resource for healthcare students, professionals, data analysts, and anyone interested in pharmacology.

Compiled from reputable sources like the FDA Prescribing Information, Lexicomp, and Micromedex, each entry includes detailed information on drug properties, safety, and usage. This dataset is ideal for educational purposes, data analysis projects, and as a reference for building healthcare applications.

Key Features (Columns):

Generic Name: The common name of the drug.

Drug Class: The pharmacological category (e.g., SSRI, Beta-Blocker, Statin).

Indications: The medical conditions the drug is used to treat.

Dosage Form: The physical form of the drug (e.g., Tablet, Capsule, Injection, Cream).

Strength: The potency of the drug (e.g., 500 mg, 0.1%).

Route of Administration: How the drug is administered (e.g., Oral, Topical, Intravenous).

Side Effects: Common adverse reactions associated with the drug.

Contraindications: Conditions or factors that serve as a reason to not use the drug.

Interaction warnings & Precautions: Important information on how the drug interacts with others and key safety measures.

Storage Conditions: Recommended storage instructions (e.g., Room Temperature, Refrigerate).

Reference: The primary source(s) of the information.

Availability: Whether the drug is typically available by prescription or over-the-counter (OTC).

Potential Use Cases:

Educational Tool: For students of medicine, pharmacy, and nursing to learn about drug properties.

Data Analysis & Visualization: Analyze the distribution of drug classes, common side effects, or storage requirements.

Drug Interaction Checker (Basic Foundation): Use as a base dataset to build a simple drug interaction screening tool.

Clinical Reference Application: Populate a mobile or web app with essential drug information.

Natural Language Processing (NLP): Train models to extract drug information from text or to classify drugs based on their descriptions.

File(s):

drugs_from_a_to_z.csv (The Excel data converted to a CSV for broader compatibility)
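
As a rough illustration of the "Data Analysis & Visualization" use case, the sketch below counts how many rows fall into each drug class. It assumes the CSV header uses the column names listed above verbatim (in particular `Drug Class`), which has not been checked against the actual file; the sample rows are made-up placeholders, and `count_drug_classes` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter


def count_drug_classes(csv_file):
    """Count rows per value of the "Drug Class" column (assumed header name)."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        # Missing or empty cells are skipped rather than counted.
        drug_class = (row.get("Drug Class") or "").strip()
        if drug_class:
            counts[drug_class] += 1
    return counts


# Tiny in-memory stand-in for drugs_from_a_to_z.csv (placeholder rows, not real data).
SAMPLE = io.StringIO(
    "Generic Name,Drug Class,Availability\n"
    "drug_a,Statin,Prescription\n"
    "drug_b,SSRI,Prescription\n"
    "drug_c,Statin,Prescription\n"
)

sample_counts = count_drug_classes(SAMPLE)
for name, n in sample_counts.most_common():
    print(name, n)
```

To run it on the real file, pass an open file handle instead of `SAMPLE`, for example `open("drugs_from_a_to_z.csv", newline="", encoding="utf-8")`; the encoding is also an assumption.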

Acknowledgements:

This dataset synthesizes information from publicly available drug monographs and prescribing information from sources including the U.S. Food and Drug Administration (FDA), Lexicomp, and Micromedex.
Drug Targets and Drug Lists Data Package
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). Drug Targets and Drug Lists Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/drug-targets-and-drug-lists-data-package/
Explore at:
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Description
This data package contains information on approved, researched and proven drug targets and drug lists.
DrugBank - Open Data Drug and Drug Target Database
researchdata.edu.au
Updated May 2, 2013
Cite
QFAB (2013). DrugBank - Open Data Drug and Drug Target Database [Dataset]. https://researchdata.edu.au/drugbank-open-drug-target-database/14044
Explore at:
Dataset updated
May 2, 2013
Dataset provided by
QFAB
Description
The DrugBank database is a unique bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. The database contains 6712 drug entries including 1448 FDA-approved small molecule drugs, 131 FDA-approved biotech (protein/peptide) drugs, 85 nutraceuticals and 5080 experimental drugs. Additionally, 4227 non-redundant protein (i.e. drug target/enzyme/transporter/carrier) sequences are linked to these drug entries. Each DrugCard entry contains more than 150 data fields with half of the information being devoted to drug/chemical data and the other half devoted to drug target or protein data. DrugBank is supported by David Wishart, Departments of Computing Science & Biological Sciences, University of Alberta. DrugBank is also supported by The Metabolomics Innovation Centre, a Genome Canada-funded core facility serving the scientific community and industry with world-class expertise and cutting-edge technologies in metabolomics.
Comprehensive Drug Self-administration and Discrimination Bibliographic...
rrid.site
dknet.org
+2 more
Updated Dec 23, 2025
Cite
(2025). Comprehensive Drug Self-administration and Discrimination Bibliographic Databases [Dataset]. http://identifiers.org/RRID:SCR_000707
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_000707
Dataset updated
Dec 23, 2025
Description
Database of bibliographic details of over 9,000 references published between 1951 and the present day, including abstracts, journal articles, book chapters and books. It replaces the two former separate websites for Ian Stolerman's drug discrimination database and Dick Meisch's drug self-administration database. Lists of standardized keywords are used to index the citations. Most of the keywords are generic drug names but they also include methodological terms, species studied and drug classes. This index makes it possible to selectively retrieve references according to the drugs used as the training stimuli, drugs used as test stimuli, drugs used as pretreatments, species, etc. by entering your own terms or by using our comprehensive lists of search terms.

Drug Discrimination

Drug Discrimination is widely recognized as one of the major methods for studying the behavioral and neuropharmacological effects of drugs and plays an important role in drug discovery and investigations of drug abuse. In Drug Discrimination studies, effects of drugs serve as discriminative stimuli that indicate how reinforcers (e.g. food pellets) can be obtained. For example, animals can be trained to press one of two levers to obtain food after receiving injections of a drug, and to press the other lever to obtain food after injections of the vehicle. After the discrimination has been learned, the animal starts pressing the appropriate lever according to whether it has received the training drug or vehicle; accuracy is very good in most experiments (90% or more correct). Discriminative stimulus effects of drugs are readily distinguished from the effects of food alone by collecting data in brief test sessions where responses are not differentially reinforced. Thus, trained subjects can be used to determine whether test substances are identified as like or unlike the drug used for training.

Drug Self-administration

Drug Self-administration methodology is central to the experimental analysis of drug abuse and dependence (addiction). It constitutes a key technique in numerous investigations of drug intake and its neurobiological basis and has even been described by some as the gold standard among methods in the area. Self-administration occurs when, after a behavioral act or chain of acts, a feedback loop results in the introduction of a drug or drugs into a human or infra-human subject. The drug is usually conceptualized as serving the role of a positive reinforcer within a framework of operant conditioning. For example, animals can be given the opportunity to press a lever to obtain an infusion of a drug through a chronically-indwelling venous catheter. If the available dose of the drug serves as a positive reinforcer then the rate of lever-pressing will increase and a sustained pattern of responding at a high rate may develop. Reinforcing effects of drugs are distinguishable from other actions such as increases in general activity by means of one or more control procedures. Trained subjects can be used to investigate the behavioral and neuropharmacological basis of drug-taking and drug-seeking behaviors and the reinstatement of these behaviors in subjects with a previous history of drug intake (relapse models). Other applications include evaluating novel compounds for liability to produce abuse and dependence and for their value in the treatment of drug dependence and addiction.

The bibliography is updated about four times per year.

Drug Labels & Side Effects Dataset | 1400+ Records

kaggle.com

zip

Updated Aug 2, 2025

Cite

Pratyush Puri (2025). Drug Labels & Side Effects Dataset | 1400+ Records [Dataset]. https://www.kaggle.com/datasets/pratyushpuri/drug-labels-and-side-effects-dataset-1400-records

Explore at:

Available download formats: zip (51886 bytes)

Dataset updated

Aug 2, 2025

Authors

Pratyush Puri

License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

Description

Drug Labels and Side Effects Dataset

Dataset Overview

This comprehensive pharmaceutical synthetic dataset contains 1,393 records of synthetic drug information with 15 columns, designed for data science projects focusing on healthcare analytics, drug safety analysis, and pharmaceutical research. The dataset simulates real-world pharmaceutical data with appropriate variety and realistic constraints for machine learning applications.

Dataset Specifications

Attribute	Value
Total Records	1,393
Total Columns	15
File Format	CSV
Data Types	Mixed (intentional for data cleaning practice)
Domain	Pharmaceutical/Healthcare
Use Case	ML Training, Data Analysis, Healthcare Research

Column Specifications

Categorical Features

Column Name	Data Type	Unique Values	Description	Example Values
`drug_name`	Object	1,283 unique	Pharmaceutical drug names with realistic naming patterns	"Loxozepam32", "Amoxparin43", "Virazepam10"
`manufacturer`	Object	10 unique	Major pharmaceutical companies	Pfizer Inc., AstraZeneca, Johnson & Johnson
`drug_class`	Object	10 unique	Therapeutic drug classifications	Antibiotic, Analgesic, Antidepressant, Vaccine
`indications`	Object	10 unique	Medical conditions the drug treats	"Pain relief", "Bacterial infections", "Depression treatment"
`side_effects`	Object	434 unique	Combination of side effects (1-3 per drug)	"Nausea, Dizziness", "Headache, Fatigue, Rash"
`administration_route`	Object	7 unique	Method of drug delivery	Oral, Intravenous, Topical, Inhalation, Sublingual
`contraindications`	Object	10 unique	Medical warnings for drug usage	"Pregnancy", "Heart disease", "Liver disease"
`warnings`	Object	10 unique	Safety instructions and precautions	"Take with food", "Avoid alcohol", "Monitor blood pressure"
`batch_number`	Object	1,393 unique	Manufacturing batch identifiers	"xr691zv", "Ye266vU", "Rm082yX"
`expiry_date`	Object	782 unique	Drug expiration dates (YYYY-MM-DD)	"2025-12-13", "2027-03-09", "2026-10-06"
`side_effect_severity`	Object	3 unique	Severity classification	Mild, Moderate, Severe
`approval_status`	Object	3 unique	Regulatory approval status	Approved, Pending, Rejected

Numerical Features

Column Name	Data Type	Range	Mean	Std Dev	Description
`approval_year`	Float/String*	1990-2024	2006.7	10.0	FDA/regulatory approval year
`dosage_mg`	Float/String*	10-990 mg	499.7	290.0	Medication strength in milligrams
`price_usd`	Float/String*	$2.32-$499.24	$251.12	$144.81	Drug price in US dollars

*Intentionally stored as mixed types for data cleaning practice
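
Because the three numerical columns are stored as mixed float/string values, a cleaning step has to coerce them before any analysis. The sketch below is one possible approach using only the standard library. The exact string formats in the file are not documented here, so the handling of currency symbols, unit suffixes, and thousands separators is an assumption, and `to_float` and `clean_row` are hypothetical helper names.

```python
import re

# Matches the first integer or decimal number in a string, e.g. "251.12" in "$251.12".
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")

# Column names taken from the "Numerical Features" table above.
NUMERIC_COLUMNS = ("approval_year", "dosage_mg", "price_usd")


def to_float(value):
    """Return the first number found in value as a float, or None if there is none."""
    if value is None:
        return None
    if isinstance(value, (int, float)):
        return float(value)
    # Assumption: commas only appear as thousands separators.
    text = str(value).replace(",", "")
    match = _NUMBER.search(text)
    return float(match.group()) if match else None


def clean_row(row):
    """Return a copy of row with the numeric columns coerced to float or None."""
    cleaned = dict(row)
    for column in NUMERIC_COLUMNS:
        cleaned[column] = to_float(row.get(column))
    return cleaned


example = clean_row(
    {"drug_name": "Loxozepam32", "approval_year": "2006", "dosage_mg": "500 mg", "price_usd": "$251.12"}
)
print(example)
```

Returning `None` for unparseable cells keeps bad values visible, so they can be counted or imputed later instead of being silently turned into zeros.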

Key Statistics

Manufacturer Distribution

Manufacturer	Count	Percentage
Pfizer Inc.	170	12.2%
AstraZeneca	~140	~10.0%
Merck & Co.	~140	~10.0%
Johnson & Johnson	~140	~10.0%
GlaxoSmithKline	~140	~10.0%
Others	~623	~44.8%

Drug Class Distribution

Drug Class	Count	Most Common
Anti-inflammatory	154	✓
Antibiotic	~140
Antidepressant	~140
Antiviral	~140
Vaccine	~140
Others	~679

Side Effect Severity

Severity	Count	Percentage
Severe	488	35.0%
Moderate	~453	~32.5%
Mild	~452	~32.5%

Potential Use Cases

1. Machine Learning Applications

Drug Approval Prediction: Predict approval likelihood based on drug characteristics
Price Prediction: Estimate drug pricing using features like class, manufacturer, dosage
Side Effect Classification: Classify severity based on drug properties
Market Success Analysis: Analyze factors contributing to drug market performance

2. Data Engineering Projects

ETL Pipeline Development: Practice data cleaning and transformation
Data Quality Assessment: Implement data validation and quality checks
Database Design: Create normalized pharmaceutical database schema
Real-time Processing: Stream processing for drug monitoring systems

3. Business Intelligence

Pharmaceutical Market Analysis: Manufacturer market share and competitive analysis
Drug Safety Analytics: Side effect patterns and safety profile analysis
Regulatory Compliance: Approval trends and regulatory timeline analysis
Pricing Strategy: Competitive pricing analysis across drug classes

Recommended Next Steps

Data Cleaning Pipeline: Implement comprehe...

DrugBank Database Data Package
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). DrugBank Database Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/drugbank-database-data-package/
Explore at:
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Description
DrugBank Vocabulary contains information on DrugBank identifiers, names, and synonyms to permit easy linking and integration into any type of project. DrugBank is a richly annotated resource that combines detailed drug data with comprehensive drug target and drug action information. DrugBank is widely used to facilitate in silico drug target discovery, drug design, drug docking or screening, drug metabolism prediction, drug interaction prediction and general pharmaceutical education.

KSA Drug Database (Metadata, PILs & SPCs) - AR/EN

kaggle.com

zip

Updated Oct 23, 2025

Cite

Meshal Falah (2025). KSA Drug Database (Metadata, PILs & SPCs) - AR/EN [Dataset]. https://www.kaggle.com/datasets/meshalfalah/ksa-drug-database-metadata-pils-and-spcs-aren

Explore at:

Available download formats: zip (125041521 bytes)

Dataset updated

Oct 23, 2025

Authors

Meshal Falah

License

Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically

Area covered

Saudi Arabia

Description

Saudi Arabia Drug Database (Metadata, PILs & SPCs) - AR/EN

🟢 Overview

This is a comprehensive database of registered pharmaceutical products in the Kingdom of Saudi Arabia, collected from the official public portal of the Saudi Food and Drug Authority (SFDA).

This dataset is uniquely bilingual (Arabic / English) and provides rich, structured metadata (JSON). This makes it a valuable resource for researchers, students, Natural Language Processing (NLP) specialists, and data scientists interested in the healthcare and pharmaceutical informatics sectors in the Middle East.

🔑 Key Features

Rich Metadata: Each drug includes detailed structured data (see "Data Structure" below), such as official price, trade and generic names, legal classification, manufacturer, agent, and storage conditions.
Bilingual (AR/EN): Provides the "Patient Information Leaflet" (PIL) in both Arabic and English, opening significant opportunities for bilingual NLP research.
Specialized Leaflets (SPCs): Contains the "Summary of Product Characteristics" (SPC), the technical leaflet aimed at healthcare professionals, which provides in-depth technical data.
Processor-Ready Format (JSON): The data is organized in a JSON format, making it easy to parse and process programmatically.
Comprehensive: The vast majority of drug records contain the full set of metadata and all three associated leaflets.

🗂️ Data Structure

The dataset is provided as a single .zip archive which contains 563 individual JSON files.

Each JSON file contains a list of 15 drug records.
Each drug record is an object containing its metadata and the three leaflet texts.

Example Single Drug Record

Each drug record contains a Drug Data object (the metadata) and three keys for the leaflets:

```json
{
  "Drug Data": {
    "Registration Number": "0202256789",
    "Register Year": "2025",
    "Trade Name": "Brevie",
    "Generic Name": "BRIVARACETAM",
    "Strength": "50",
    "Strength Unit": "mg",
    "Administration Route": "Oral use",
    "Pharmaceutical Form": "Film-coated tablet",
    "Package Size": "60",
    "Packages Types": "Blister",
    "Legal Classification": "Prescription",
    "Product Control": "Uncontrolled",
    "Drug Type": "Generic",
    "ShelfLife in Months": "36",
    "Storage Conditions": "do not store above 30°c",
    "Public price (SAR)": "266.05",
    "Manufacture": "MSN LABORATORIES PRIVATE LIMITED",
    "الوكيل": "SUDAIR PHARMA COMPANY",
    "Marketing Company": "SUDAIR PHARMA COMPANY"
  },
  "Patient Information Leaflet (PIL) in English": "[...English leaflet text...]",
  "Patient Information Leaflet (PIL) in Arabic": "[...Arabic leaflet text...]",
  "Summary of Product Characteristics (SPC)": "[...Healthcare professional leaflet text...]"
}
```
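
Given the structure described above (a .zip archive of JSON files, each holding a list of drug records), the records can be streamed without unpacking the archive to disk. The following is a minimal sketch under that assumption; the member name `part_001.json` and the helper names `iter_drug_records` and `public_price` are hypothetical, and the demo archive is built in memory from a trimmed copy of the example record.

```python
import io
import json
import zipfile


def iter_drug_records(zip_source):
    """Yield every drug record from every .json member of the archive."""
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".json"):
                continue
            with archive.open(name) as handle:
                # Each file is assumed to hold a JSON list of record objects.
                records = json.load(handle)
            for record in records:
                yield record


def public_price(record):
    """Return the public price in SAR as a float, or None if missing or unparseable."""
    raw = record.get("Drug Data", {}).get("Public price (SAR)")
    try:
        return float(raw)
    except (TypeError, ValueError):
        return None


# Build a one-file archive in memory so the sketch runs without the real download.
_buffer = io.BytesIO()
with zipfile.ZipFile(_buffer, "w") as demo:
    demo.writestr(
        "part_001.json",
        json.dumps([{"Drug Data": {"Generic Name": "BRIVARACETAM", "Public price (SAR)": "266.05"}}]),
    )
_buffer.seek(0)

demo_records = list(iter_drug_records(_buffer))
print(demo_records[0]["Drug Data"]["Generic Name"], public_price(demo_records[0]))
```

For the real dataset, pass the path of the downloaded archive to `iter_drug_records` in place of the in-memory buffer. Prices are stored as strings in the example record, which is why `public_price` converts them explicitly.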
## 🔗 Data Collection Code

The full code used to collect and structure this dataset is publicly available on GitHub:

👉 **[Data Collection Repository](https://github.com/MQushaym/web-scraping-data-collection)**

This repository contains the web scraping and data processing scripts used to compile and clean the dataset.


-----

## 🎯 Potential Use Cases

 * **AI Agents & RAG (Retrieval-Augmented Generation):**

   * **(Highly Recommended)** Building a specialized AI Agent (like a GPT or LLM assistant) that answers complex questions about Saudi-registered drugs.
   * This dataset acts as a perfect "Knowledge Base" for RAG. The agent can retrieve specific leaflets (PILs/SPCs) or structured metadata (like price, storage, manufacturer) to provide accurate, verifiable, and context-aware answers.
   * Developing advanced Q&A systems for both patients ("Can I take this drug with X?") and professionals ("What are the contraindications for this drug?").

 * **Natural Language Processing (NLP):**

   * Building specialized medical terminology translation models (Ar/En).
   * Named Entity Recognition (NER) to identify side effects, active ingredients, and dosages from the leaflet texts.
   * Text summarization of the long SPC and PIL documents.

 * **Data Analysis & Health Informatics:**

   * Analyzing drug pricing in relation to manufacturers or drug type (Generic/Innovator).
   * Constructing knowledge graphs (KGs) that link drugs, ingredients, manufacturers, and legal classifications.
   * Studying storage conditions in relation to pharmaceutical forms.

-----

## 📄 License & Citation

This dataset is made available under the **CC BY-NC 4.0 (Attribution-NonCommercial 4.0)** license.

This means you are free to use it for **academic and research purposes** as long as you provide **attribution (citation)** and do not use it for commercial purposes.

When using this dataset, please cite as follows:

> **Data collected and structured by:** Meshal AL-Qushaym
> **Dataset:** KS...

DrugBank
bioregistry.io
Updated Apr 12, 2021
+ more versions
Cite
(2021). DrugBank [Dataset]. http://identifiers.org/re3data:r3d100010544
Explore at:
Unique identifier
https://identifiers.org/wikidata:P715 https://identifiers.org/re3data:r3d100010544
Dataset updated
Apr 12, 2021
Description
The DrugBank database is a bioinformatics and chemoinformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. This collection references drug information.
Comprehensive Drug Information Dataset
kaggle.com
zip
Updated Aug 19, 2023
Cite
Anoop Johny (2023). Comprehensive Drug Information Dataset [Dataset]. https://www.kaggle.com/anoopjohny/comprehensive-drug-information-dataset
Explore at:
Available download formats: zip (9733 bytes)
Dataset updated
Aug 19, 2023
Authors
Anoop Johny
License
Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically
Description
The "Pharmaceutical Product Data Repository" is a comprehensive dataset containing detailed information about a wide range of pharmaceutical drugs. This dataset encompasses various attributes related to each drug, including drug names, generic names, drug classes, indications, dosage forms, strengths, routes of administration, mechanisms of action, side effects, contraindications, interactions, warnings, precautions, pregnancy categories, storage conditions, manufacturers, approval dates, availability status (prescription or over-the-counter), National Drug Code (NDC) numbers, and prices.

With a diverse collection of over 200 drug entries, the dataset provides valuable insights into the pharmaceutical landscape, making it a valuable resource for research, analysis, and applications related to healthcare, pharmacology, and medical informatics. Researchers, healthcare professionals, and data enthusiasts can leverage this dataset to gain a deeper understanding of drug attributes, potential interactions, and safety considerations.

Please note that the data in this dataset is entirely fictional and for illustrative purposes only. It does not reflect real-world drug information or attributes. Users are advised to exercise caution and not use this dataset for any practical or clinical applications.
MedGuide Global Drug Information Dataset
drug-scribe-ai.lovable.app
json-ld
Updated Jan 7, 2026
Cite
MedGuide Global Medical Team (2026). MedGuide Global Drug Information Dataset [Dataset]. https://drug-scribe-ai.lovable.app/
Explore at:
Available download formats: json-ld
Dataset updated
Jan 7, 2026
Dataset provided by
MedGuide Global
Authors
MedGuide Global Medical Team
License
https://www.medguideglobal.com/terms
Time period covered
2024 - Present
Area covered
Worldwide
Description
Comprehensive FDA-verified pharmaceutical database containing drug information, interactions, safety data, and medical guidance
MedlinePlus
dknet.org
rrid.site
+2 more
Updated Jan 29, 2022
Cite
(2022). MedlinePlus [Dataset]. http://identifiers.org/RRID:SCR_006512
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006512 https://identifiers.org/RRID:SCR_006512/resolver
Dataset updated
Jan 29, 2022
Description
Database of authoritative health information about diseases, conditions, and wellness issues that offers reliable, up-to-date health information for free. It contains the latest treatments, information on drugs and supplements, the meanings of words, and medical videos and illustrations. Links to the latest topic or disease specific medical research or clinical trials are also offered.

* MedlinePlus pages contain carefully selected links to Web resources with health information on over 900 topics.
  * The MedlinePlus health topic pages include links to current news on the topic and related information. You can also find preformulated searches of the MEDLINE/PubMed database, which allow you to find references to latest health professional articles on your topic.
* The A.D.A.M. medical encyclopedia brings health consumers an extensive library of medical images and videos, as well as over 4,000 articles about diseases, tests, symptoms, injuries, and surgeries.
* The Merriam-Webster medical dictionary allows you to look up definitions and spellings of medical words.
* Drug and supplement information is available from the American Society of Health-System Pharmacists (ASHP) via AHFS Consumer Medication Information, and Natural Medicines Comprehensive Database Consumer Version.
  * AHFS Consumer Medication Information provides extensive information about more than 1,000 brand name and generic prescription and over-the-counter drugs, including side effects, precautions and storage for each drug.
  * Natural Medicines Comprehensive Database Consumer Version is an evidence-based collection of information on alternative treatments. MedlinePlus has 100 monographs on herbs and supplements.
* Interactive tutorials from the Patient Education Institute explain over 165 procedures and conditions in easy-to-read language.

An XML File for the MedlinePlus Health Topics is available, http://www.nlm.nih.gov/medlineplus/xmldescription.html.
The ontology is available through Bioportal, http://bioportal.bioontology.org/ontologies/MEDLINEPLUS
FDA-Approved Drugs & Therapeutics
kaggle.com
zip
Updated Jan 23, 2023
Cite
The Devastator (2023). FDA-Approved Drugs & Therapeutics [Dataset]. https://www.kaggle.com/datasets/thedevastator/fda-approved-drugs-therapeutics
Explore at:
Available download formats: zip (2006218 bytes)
Dataset updated
Jan 23, 2023
Authors
The Devastator
License
Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically
Description
FDA-Approved Drugs & Therapeutics

Exploring Human Drug & Biological Therapies

By Health [source]

About this dataset

This dataset contains a wealth of information about FDA-approved human drugs and biological therapeutic products. Whether you are studying the effects of drugs, exploring new treatment methods, or researching potential side effects, this database holds detailed insights into the approved medicines available to individuals today. From brand names to generic prescriptions to over-the-counter products, you can access a variety of important details such as reviews, labels, approval letters and patient information. Gain a comprehensive understanding of the drug products approved since 1939 to develop safer and more effective treatments for patients going forward

How to use the dataset

This dataset contains information about nearly all of the FDA-approved brand name and generic prescription drugs, as well as biological therapeutic products. It is important to note that most information is available for drug products approved since 1998, meaning that drugs approved before then may have less comprehensive data associated with them.

To get started using this dataset, you should begin by familiarizing yourself with the available columns in the dataset:

- Drug Name: The name of the drug (brand name or generic).
- Active Ingredient(s): A list of active ingredients present in each drug product.
- Dosage form: The physical form in which a patient takes a specific drug product, together with its route (e.g., tablet taken orally).
- Approval Description: A summary of key features and benefits related to the approval process for each product.
- Route(s): The manner or way by which a medication has been formulated to be absorbed or introduced into an organism's system (e.g., oral ingestion, injection).

Next, you will want to understand what type of queries can be run on this data set so that you can effectively search for specific items to analyze within your project goals:

• You can search through column headers or specific terms to find information related to your query, such as active ingredients, dosage forms or routes used by different products.
• You can use simple comparison operators such as "=", "<" and ">" to select ranges of values.
• You can use Boolean operators such as "AND" and "OR" within SQL statements to combine two conditions.
• You can search multiple columns simultaneously by combining LIKE conditions with wildcard characters.
• Lastly, you can build subqueries on which more complicated queries are based, depending on your research objectives (these advanced scripts often incorporate functions like SUM(), AVG() etc.).
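
The query patterns listed above can be tried out with Python's built-in `sqlite3` module. The sketch below is illustrative only: the table name `drugs`, its column names, the `approval_year` column, and the three placeholder rows are all assumptions and do not reflect the dataset's actual schema or contents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drugs ("
    "drug_name TEXT, active_ingredient TEXT, dosage_form TEXT, route TEXT, approval_year INTEGER)"
)
conn.executemany(
    "INSERT INTO drugs VALUES (?, ?, ?, ?, ?)",
    [
        ("Alpha", "ingredient_a", "TABLET", "ORAL", 1995),
        ("Beta", "ingredient_b", "INJECTABLE", "INTRAVENOUS", 2005),
        ("Gamma", "ingredient_a", "TABLET, EXTENDED RELEASE", "ORAL", 2015),
    ],
)

# Comparison operators ("=", ">") combined with AND.
oral_since_2000 = conn.execute(
    "SELECT drug_name FROM drugs WHERE route = ? AND approval_year > ? ORDER BY drug_name",
    ("ORAL", 2000),
).fetchall()

# LIKE with the % wildcard on two columns, combined with OR.
tablet_or_iv = conn.execute(
    "SELECT drug_name FROM drugs WHERE dosage_form LIKE ? OR route LIKE ? ORDER BY drug_name",
    ("TABLET%", "INTRA%"),
).fetchall()

# Subquery with an aggregate: drugs approved later than the average approval year.
newer_than_average = conn.execute(
    "SELECT drug_name FROM drugs "
    "WHERE approval_year > (SELECT AVG(approval_year) FROM drugs) ORDER BY drug_name"
).fetchall()

print(oral_since_2000, tablet_or_iv, newer_than_average)
```

Passing values as `?` parameters rather than formatting them into the SQL string keeps the queries safe when the search terms come from user input.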

Research Ideas

Developing a tool to help patients identify potential interactions between different drugs they are taking by cross-referencing this dataset with the patient's records.

Developing an AI/machine learning model which evaluates all approved drugs and their effects on disease, helping physicians determine the best treatment options for their patients.

Building an online marketplace, sponsored by health care organizations or private companies, where customers can compare prices and availability of FDA approved drugs before buying them online or in stores

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: Open Database License (ODbL) v1.0

You are free to:

- Share: copy and redistribute the material in any medium or format.
- Adapt: remix, transform, and build upon the material for any purpose, even commercially.

You must:

- Give appropriate credit: Provide a link to the license, and indicate if changes were made.
- ShareAlike: You must distribute your contributions under the same license as the original.
- Keep intact: all notices that refer to this license, including copyright notices.
- No Derivatives: If you remix, transform, or build upon the material, you may not distribute the modified material.
- No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.

Multum MediSource Lexicon
bioregistry.io
Updated May 27, 2021
Cite
(2021). Multum MediSource Lexicon [Dataset]. https://bioregistry.io/registry/mmsl
Explore at:
Dataset updated
May 27, 2021
Description
The Lexicon is a foundational database with comprehensive drug product and disease nomenclature information. It includes drug names, drug product information, disease names, coding systems such as ICD-9-CM and NDC, generic names, brand names and common abbreviations. A comprehensive list of standard or customized disease names and ICD-9 codes is also included.
Potential Drug Target Database
rrid.site
scicrunch.org
Updated Jan 29, 2022
Cite
(2022). Potential Drug Target Database [Dataset]. http://identifiers.org/RRID:SCR_007069
Unique identifier
https://identifiers.org/RRID:SCR_007069
Dataset updated
Jan 29, 2022
Description
PDTD is a dual-function database that links an informatics database to a structural database of known and potential drug targets. It is a comprehensive, web-accessible database of drug targets that focuses on targets with known 3D structures. PDTD contains 1207 entries covering 841 known and potential drug targets with structures from the Protein Data Bank (PDB). The drug targets are categorized into 15 and 13 types according to two criteria: therapeutic areas and biochemical criteria. The database supports extensive searching by PDB ID, target name, target category, and related disease.
Indian Medicine Data
kaggle.com
zip
Updated Aug 20, 2023
Cite
Mohneesh_Sreegirisetty (2023). Indian Medicine Data [Dataset]. https://www.kaggle.com/datasets/mohneesh7/indian-medicine-data
Available download formats: zip (18681848 bytes)
Dataset updated
Aug 20, 2023
Authors
Mohneesh_Sreegirisetty
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
India Medicine Database

The dataset consists of 8 columns:
- sub_category: The medical category that defines the domain in which the medicine is applied.
- product_name: The name of the product, as available in the Indian market.
- salt_composition: The chemical composition of the drug.
- product_price: The previous price of the product. Treat this as a reference only, as prices are highly volatile in the health market.
- product_manufactured: The pharmaceutical company responsible for producing the medicine/drug.
- medicine_desc: Comprehensive overview and detailed description of the specific product.
- side_effects: Potential adverse effects associated with the drug/medicine.
- drug_interactions: Interactions and effects when combining this specific medicine with other drugs.

There are a few missing values in the dataset, but most of the information is available for each affected row, so I have left them as is.
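
Because the missing values were left in place, it may help to count them per column before analysis. The following is a minimal sketch using only the Python standard library. The column names come from the description above; the two sample rows, the `SAMPLE_CSV` string, and the helper names `load_rows` and `count_missing` are hypothetical stand-ins, since the actual file name and contents inside the zip archive are not stated here.

```python
import csv
import io

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "sub_category", "product_name", "salt_composition", "product_price",
    "product_manufactured", "medicine_desc", "side_effects", "drug_interactions",
]

# Hypothetical two-row sample standing in for the real CSV file.
SAMPLE_CSV = """sub_category,product_name,salt_composition,product_price,product_manufactured,medicine_desc,side_effects,drug_interactions
Example Category,Example Tablet A,Example Salt (10mg),12.50,Example Pharma,Example description,Nausea,
Example Category,Example Tablet B,,8.00,Example Pharma,Example description,,
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, checking the expected header."""
    reader = csv.DictReader(io.StringIO(text))
    absent = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if absent:
        raise ValueError(f"missing expected columns: {absent}")
    return list(reader)


def count_missing(rows, columns):
    """Return {column: number of rows where the value is empty or absent}."""
    missing = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            value = row.get(column)
            if value is None or value.strip() == "":
                missing[column] += 1
    return missing


rows = load_rows(SAMPLE_CSV)
missing_counts = count_missing(rows, EXPECTED_COLUMNS)
for column, count in missing_counts.items():
    print(f"{column}: {count} missing")
```

To run this against the real data, replace `SAMPLE_CSV` with the text of the extracted CSV file; the per-column counts then show whether any column is too sparse to use.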
Drug-Drug Interactions
kaggle.com
zip
Updated Aug 31, 2024
Cite
MGhobashy (2024). Drug-Drug Interactions [Dataset]. https://www.kaggle.com/datasets/mghobashy/drug-drug-interactions
Available download formats: zip (1923486 bytes)
Dataset updated
Aug 31, 2024
Authors
MGhobashy
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This dataset provides a comprehensive collection of drug-drug interactions (DDIs) intended for research in predicting and understanding complex interaction relationships between drugs. It is sourced from the DrugBank database and is designed to support multi-task learning approaches in bioinformatics and pharmacology.

Feature Details:
- Drug 1: Name of the first drug in the interaction.
- Drug 2: Name of the second drug in the interaction.
- Interaction Description: Detailed description of the interaction between the two drugs.

Source: The dataset is derived from the datasets provided by the team at TDCommons.
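
With only three features, one simple way to use such a table is a pairwise lookup over a list of drug names. The sketch below is illustrative only: the feature names come from the description above, while the sample rows, the placeholder drug names, and the helpers `build_index` and `find_interactions` are hypothetical. It also assumes that an interaction recorded for (Drug 1, Drug 2) applies regardless of order, which the description does not confirm.

```python
from collections import defaultdict

# Hypothetical rows using the three documented feature names.
SAMPLE_ROWS = [
    {"Drug 1": "Drug A", "Drug 2": "Drug B",
     "Interaction Description": "Placeholder description for A and B."},
    {"Drug 1": "Drug B", "Drug 2": "Drug C",
     "Interaction Description": "Placeholder description for B and C."},
]


def build_index(rows):
    """Index descriptions by an unordered, lower-cased pair of drug names."""
    index = defaultdict(list)
    for row in rows:
        # frozenset makes the key order-independent (an assumption, see above).
        key = frozenset((row["Drug 1"].strip().lower(),
                         row["Drug 2"].strip().lower()))
        index[key].append(row["Interaction Description"])
    return index


def find_interactions(index, drug_names):
    """Return (first, second, description) for every recorded pair in drug_names."""
    names = sorted({name.strip().lower() for name in drug_names})
    found = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            for description in index.get(frozenset((first, second)), []):
                found.append((first, second, description))
    return found


index = build_index(SAMPLE_ROWS)
results = find_interactions(index, ["Drug C", "drug a", "Drug B"])
for first, second, description in results:
    print(f"{first} + {second}: {description}")
```

Lower-casing and trimming the names is a crude normalization; real drug names would likely need mapping to a shared identifier before pairs can be matched reliably.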
Additional file 1: of Toward a comprehensive drug ontology: extraction of...
springernature.figshare.com
xlsx
Updated May 30, 2023
Cite
Mark Sharp (2023). Additional file 1: of Toward a comprehensive drug ontology: extraction of drug-indication relations from diverse information sources [Dataset]. http://doi.org/10.6084/m9.figshare.c.3661856_D1.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.c.3661856_D1.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare: http://figshare.com/
Authors
Mark Sharp
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Drug-Indication Database non-proprietary subset. (XLSX 61806 kb)
DataSheet1_Data Sources for Drug Utilization Research in Brazil—DUR-BRA...
frontiersin.figshare.com
datasetcatalog.nlm.nih.gov
xlsx
Updated Jun 15, 2023
Cite
Lisiane Freitas Leal; Claudia Garcia Serpa Osorio-de-Castro; Luiz Júpiter Carneiro de Souza; Felipe Ferre; Daniel Marques Mota; Marcia Ito; Monique Elseviers; Elisangela da Costa Lima; Ivan Ricardo Zimmernan; Izabela Fulone; Monica Da Luz Carvalho-Soares; Luciane Cruz Lopes (2023). DataSheet1_Data Sources for Drug Utilization Research in Brazil—DUR-BRA Study.xlsx [Dataset]. http://doi.org/10.3389/fphar.2021.789872.s001
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fphar.2021.789872.s001
Dataset updated
Jun 15, 2023
Dataset provided by
Frontiers Media: http://www.frontiersin.org/
Authors
Lisiane Freitas Leal; Claudia Garcia Serpa Osorio-de-Castro; Luiz Júpiter Carneiro de Souza; Felipe Ferre; Daniel Marques Mota; Marcia Ito; Monique Elseviers; Elisangela da Costa Lima; Ivan Ricardo Zimmernan; Izabela Fulone; Monica Da Luz Carvalho-Soares; Luciane Cruz Lopes
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Brazil
Description
Background: In Brazil, studies that map electronic healthcare databases in order to assess their suitability for use in pharmacoepidemiologic research are lacking. We aimed to identify, catalogue, and characterize Brazilian data sources for Drug Utilization Research (DUR).

Methods: The present study is part of the project entitled, “Publicly Available Data Sources for Drug Utilization Research in Latin American (LatAm) Countries.” A network of Brazilian health experts was assembled to map secondary administrative data from healthcare organizations that might provide information related to medication use. A multi-phase approach including internet search of institutional government websites, traditional bibliographic databases, and experts’ input was used for mapping the data sources. The reviewers searched, screened and selected the data sources independently; disagreements were resolved by consensus. Data sources were grouped into the following categories: 1) automated databases; 2) Electronic Medical Records (EMR); 3) national surveys or datasets; 4) adverse event reporting systems; and 5) others. Each data source was characterized by accessibility, geographic granularity, setting, type of data (aggregate or individual-level), and years of coverage. We also searched for publications related to each data source.

Results: A total of 62 data sources were identified and screened; 38 met the eligibility criteria for inclusion and were fully characterized. We grouped 23 (60%) as automated databases, four (11%) as adverse event reporting systems, four (11%) as EMRs, three (8%) as national surveys or datasets, and four (11%) as other types. Eighteen (47%) were classified as publicly and conveniently accessible online, providing information at national level. Most of them offered more than 5 years of comprehensive data coverage, and presented data at both the individual and aggregated levels. No information about population coverage was found. Drug coding is not uniform; each data source has its own coding system, depending on the purpose of the data. At least one scientific publication was found for each publicly available data source.

Conclusions: There are several types of data sources for DUR in Brazil, but a uniform system for drug classification and data quality evaluation does not exist. The extent of population covered by year is unknown. Our comprehensive and structured inventory reveals a need for full characterization of these data sources.

