License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
The dataset consists of 8 columns:
- sub_category: The medical category that defines the domain in which the medicine is applied.
- product_name: The name of the product, as available in the Indian market.
- salt_composition: The chemical composition of the drug.
- product_price: The previous price of the product. Treat this as a reference only, as prices in the health market are highly volatile.
- product_manufactured: The pharmaceutical company responsible for producing the medicine/drug.
- medicine_desc: Comprehensive overview and detailed description of the specific product.
- side_effects: Potential adverse effects associated with the drug/medicine.
- drug_interactions: Interactions and effects when combining this specific medicine with other drugs.
There are a few missing values in the dataset, but most of the information is available for the affected rows, so I have left them as is.
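Because some values are missing, it can help to count the empty cells per column before any analysis. The following is a minimal sketch using only the Python standard library. The column names come from the description above, while the two sample rows are invented placeholders and not real records from the dataset.

```python
import csv
import io

# Column names are taken from the dataset description above.
COLUMNS = [
    "sub_category", "product_name", "salt_composition", "product_price",
    "product_manufactured", "medicine_desc", "side_effects", "drug_interactions",
]

# Hypothetical sample rows, invented for illustration only.
SAMPLE_CSV = """sub_category,product_name,salt_composition,product_price,product_manufactured,medicine_desc,side_effects,drug_interactions
Example Category,Example Tablet A,Example Salt (10mg),100,Example Pharma,Example description,Nausea,
Example Category,Example Syrup B,,55,Example Labs,Example description,,
"""


def count_missing(rows, columns):
    """Return {column: number of rows where the value is empty or absent}."""
    missing = {name: 0 for name in columns}
    for row in rows:
        for name in columns:
            value = row.get(name)
            if value is None or value.strip() == "":
                missing[name] += 1
    return missing


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
missing_counts = count_missing(rows, COLUMNS)
print(missing_counts)
```

To apply this to the real file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle for the downloaded CSV; the file name depends on the download and is not given here.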
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
License information was derived automatically
The file contains all the medicines we were able to find during our research, with details such as composition, type of medicine, market availability, pricing, and more.
The data consist of medicines from various pharmaceutical companies including:
* Prices of medicines are recorded as of November 2022.
* The is_discontinued column indicates the availability of each medicine as reported in November 2022.
We have published another dataset with information about drug side effects, substitutes, and usage. Visit it here: https://www.kaggle.com/datasets/shudhanshusingh/250k-medicines-usage-side-effects-and-substitutes
The dataset will be updated on a yearly basis.
Announcement: I have released a new dataset on real estate properties. If you are interested, check it out here: https://www.kaggle.com/datasets/shudhanshusingh/real-estate-properties-dataset
If you like it, please give it an upvote :)
License: http://data.europa.eu/eli/dec/2011/833/oj
This search allows you to find herbal substances that are designated for assessment by the European Medicines Agency's Committee on Herbal Medicinal Products (HMPC). Search results can be exported in Excel format.
Each substance is at a different stage of assessment, and the documents associated with it depend on where it is in the assessment process. The HMPC's conclusions on the herbal substance at the end of the assessment process can be found in the final European Union herbal monograph and may also be found in the European Union list entry.
License: https://creativecommons.org/publicdomain/zero/1.0/
RxNorm is a US-specific medical terminology that contains all medications available on the US market. Source: https://en.wikipedia.org/wiki/RxNorm
RxNorm provides normalized names for clinical drugs and links its names to many of the drug vocabularies commonly used in pharmacy management and drug interaction software, including those of First Databank, Micromedex, Gold Standard Drug Database, and Multum. By providing links between these vocabularies, RxNorm can mediate messages between systems not using the same software and vocabulary. Source: https://www.nlm.nih.gov/research/umls/rxnorm/
RxNorm was created by the U.S. National Library of Medicine (NLM) to provide a normalized naming system for clinical drugs, defined as the combination of {ingredient + strength + dose form}. In addition to the naming system, the RxNorm dataset also provides structured information such as brand names, ingredients, drug classes, and so on, for each clinical drug. Typical uses of RxNorm include navigating between names and codes among different drug vocabularies and using information in RxNorm to assist with health information exchange/medication reconciliation, e-prescribing, drug analytics, formulary development, and other functions.
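The {ingredient + strength + dose form} definition above can be pictured as a small record type. This is only an illustrative sketch: the class name, the field names, and the space-joined name format are assumptions made for this example, and the authoritative names are the strings stored in the RxNorm files themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClinicalDrug:
    """Hypothetical record mirroring the three-part definition in the text."""
    ingredient: str
    strength: str
    dose_form: str

    def display_name(self) -> str:
        # Assumed format: the three components joined by single spaces.
        return f"{self.ingredient} {self.strength} {self.dose_form}"


# Example values chosen for illustration only.
drug = ClinicalDrug(ingredient="Phenylephrine", strength="10 MG", dose_form="Oral Tablet")
print(drug.display_name())
```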
This public dataset includes multiple data files originally released in RxNorm Rich Release Format (RXNRRF) that are loaded into BigQuery tables. The data is updated and archived on a monthly basis.
The following tables are included in the RxNorm dataset:
RXNCONSO contains concept and source information
RXNREL contains information regarding relationships between entities
RXNSAT contains attribute information
RXNSTY contains semantic information
RXNSAB contains source info
RXNCUI contains retired rxcui codes
RXNATOMARCHIVE contains archived data
RXNCUICHANGES contains concept changes
Update Frequency: Monthly
Fork this kernel to get started with this dataset.
https://www.nlm.nih.gov/research/umls/rxnorm/
https://bigquery.cloud.google.com/dataset/bigquery-public-data:nlm_rxnorm
https://cloud.google.com/bigquery/public-data/rxnorm
Dataset Source: Unified Medical Language System RxNorm. The dataset is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset. This dataset uses publicly available data from the U.S. National Library of Medicine (NLM), National Institutes of Health, Department of Health and Human Services; NLM is not responsible for the dataset, does not endorse or recommend this or any other dataset.
What are the RXCUI codes for the ingredients of a list of drugs?
Which ingredients have the most variety of dose forms?
In what dose forms is the drug phenylephrine found?
What are the ingredients of the drug labeled with the generic code number 072718?
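As a rough illustration of the first question, the sketch below looks up concept identifiers for a list of ingredient names. It runs against an in-memory SQLite table rather than BigQuery so that it is self-contained. The column names (`rxcui`, `str`, `tty`, `sab`) and the term types `IN`, `SCD`, and `DF` follow the RXNCONSO file layout, but it is an assumption that the BigQuery table exposes the same columns. The RXCUI values in the sample rows are made-up placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rxnconso (rxcui TEXT, str TEXT, tty TEXT, sab TEXT)")

# Placeholder rows: the RXCUI values are invented for this example.
conn.executemany(
    "INSERT INTO rxnconso VALUES (?, ?, ?, ?)",
    [
        ("1001", "phenylephrine", "IN", "RXNORM"),
        ("1002", "phenylephrine 10 MG Oral Tablet", "SCD", "RXNORM"),
        ("1003", "ibuprofen", "IN", "RXNORM"),
        ("1004", "Oral Tablet", "DF", "RXNORM"),
    ],
)


def ingredient_rxcuis(connection, names):
    """Map each matching ingredient name (TTY = 'IN') to its RXCUI."""
    placeholders = ", ".join("?" for _ in names)
    query = (
        "SELECT str, rxcui FROM rxnconso "
        "WHERE sab = 'RXNORM' AND tty = 'IN' "
        f"AND lower(str) IN ({placeholders}) ORDER BY str"
    )
    lowered = [name.lower() for name in names]
    return dict(connection.execute(query, lowered).fetchall())


result = ingredient_rxcuis(conn, ["Phenylephrine", "Ibuprofen"])
print(result)
```

The other questions, such as dose forms per ingredient, additionally need the relationship table (RXNREL) and are not covered by this sketch.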
DrugBank Vocabulary contains information on DrugBank identifiers, names, and synonyms to permit easy linking and integration into any type of project. DrugBank is a richly annotated resource that combines detailed drug data with comprehensive drug target and drug action information. DrugBank is widely used to facilitate in silico drug target discovery, drug design, drug docking or screening, drug metabolism prediction, drug interaction prediction and general pharmaceutical education.
License: MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
The "Drug Pharma New Dataset" is a comprehensive and up-to-date collection of pharmaceutical drugs registered by the Drug Administration of Bangladesh (DGDA). This dataset spans five major drug categories: Allopathic, Unani, Ayurvedic, Homeopathic, and Herbal. It serves as a valuable resource for researchers, data analysts, and anyone interested in the pharmaceutical industry, offering a detailed overview of the variety of drugs registered for medical use.
Source: DGDA http://dgdagov.info/index.php/registered-products/ayurvedic
Dataset Breakdown 📊:
Allopathic: 36,254 entries 💉
Unani: 8,460 entries 🌿
Ayurvedic: 5,262 entries 🌱
Homeopathic: 2,580 entries 💧
Herbal: 1,028 entries 🌸
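The breakdown above can be turned into a total and per-category shares with a few lines of arithmetic. The counts are copied from the list; the variable names are chosen for this sketch.

```python
# Entry counts per drug category, as listed in the breakdown above.
entries = {
    "Allopathic": 36254,
    "Unani": 8460,
    "Ayurvedic": 5262,
    "Homeopathic": 2580,
    "Herbal": 1028,
}

total = sum(entries.values())
shares = {name: count / total for name, count in entries.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"Total entries: {total}")
```

The five categories sum to 53,584 entries, and Allopathic drugs account for roughly two thirds of them.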
Columns in the Dataset 📝:
The dataset contains the following columns to provide detailed drug information:
SL: Serial number for each product 📍
Name of the Manufacturer: The company or manufacturer producing the drug 🏭
Brand Name: The commercial brand name under which the drug is sold 🏷️
Generic Name: The official name of the drug's active ingredient 🏥
Strength: The concentration of the active ingredient(s) in the drug 💪
Dosages Description: The form in which the drug is administered (e.g., tablet, lotion, powder) 💊💧
Use For: The medical use or indication of the drug (e.g., pain relief, antibiotics, etc.) 🩺
DAR: The drug’s registration code, ensuring it’s officially approved for use 🆔
Type: The type of drug (e.g., Allopathic, Ayurvedic, etc.) 💡
Up-to-Date Drug Information 🕒: This dataset contains latest data on drugs registered and approved for use in Bangladesh. The information is continuously updated to reflect new drugs, changes in drug classifications, and updated manufacturer details.
High Data Integrity ✅: The data comes from a trusted and official source — the Drug Administration of Bangladesh (DGDA). This guarantees accuracy and consistency, making it highly reliable for analysis, research, and pharmaceutical studies.
Comprehensive Coverage 🗺️: By incorporating multiple types of drugs, this dataset covers both modern pharmaceutical drugs and traditional medicines, giving a well-rounded view of the pharmaceutical industry. It includes information for over-the-counter (OTC) drugs, prescription medicines, as well as herbal supplements.
Usage & Applications 🌍: The Drug Pharma New Dataset can be leveraged in several fields and for multiple applications:
Pharmaceutical Research 🔬:
New Drug Development: Researchers can use the dataset to identify trends, gaps in the market, and areas for innovation in the pharmaceutical industry. By analyzing drug classifications, strengths, dosages, and usage patterns, pharmaceutical companies can identify areas for new drug development and research.
Pharmacovigilance: The dataset can be used in studying the safety and effectiveness of different drugs, monitoring adverse drug reactions (ADR), and identifying drugs that require more attention or changes in dosage recommendations.
Market Analysis & Pharmaceutical Industry 📈:
Product Trends: Analyze the popularity of specific drug types (Allopathic, Ayurvedic, etc.) and understand market trends in pharmaceutical consumption. This helps manufacturers and marketers make data-driven decisions on drug production, marketing strategies, and customer targeting.
Competitive Analysis: With drug manufacturer names included, this dataset allows for a competitive analysis by comparing the market share of different manufacturers and tracking new market entrants.
Drug Classification & Insights ⚖️:
Drug Categorization: The dataset’s categorization of drugs by type (Allopathic, Unani, Ayurvedic, etc.) allows for detailed classification and comparison of the different therapeutic approaches in modern and traditional medicine.
Therapeutic Use Analysis: Study the medicinal use of each drug type and identify the most common therapeutic applications (e.g., pain relief, treatment of infections). This is useful for healthcare professionals, policy makers, and regulatory bodies to better understand the most widely used treatments.
Medical Database Creation 💻:
The dataset can be used to create comprehensive medical databases or drug repositories for hospitals, pharmacies, or pharmaceutical companies. It can help healthcare professionals quickly access important drug-related information such as dosages, brand names, and generic alternatives.
Government & Regulatory Purposes 🏛️:
Regulatory Compliance: Regulatory agencies can use this dataset to monitor which drugs are officially registered and ensure that only approved drugs are sold in the market. The DAR (Drug Approval Registration) codes are especially useful for this purpose. Polic...
License: https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
MIMIC-III is a large, freely available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The database includes information such as demographics, vital sign measurements made at the bedside (~1 data point per hour), laboratory test results, procedures, medications, caregiver notes, imaging reports, and mortality (including post-hospital discharge). MIMIC supports a diverse range of analytic studies spanning epidemiology, clinical decision-rule improvement, and electronic tool development. It is notable for three factors: it is freely available to researchers worldwide; it encompasses a diverse and very large population of ICU patients; and it contains highly granular data, including vital signs, laboratory results, and medications.
Medi-Span pharmacy reference database
RxNorm provides normalized names for clinical drugs and links its names to many of the drug vocabularies commonly used in pharmacy management and drug interaction software, including those of First Databank, Micromedex, Gold Standard, and Multum. By providing links between these vocabularies, RxNorm can mediate messages between systems not using the same software and vocabulary. Technical documentation at http://www.nlm.nih.gov/research/umls/rxnorm/docs/index.html
DailyMed provides health information providers and the public with a standard, comprehensive, up-to-date, look-up and download resource of medication content and labeling as found in medication package inserts, also known as Structured Product Labeling (SPL).
List showing the name of product, name of registration certificate holder, Hong Kong registration number (Permit No) and active ingredient(s) of each registered pharmaceutical product.
License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset provides comprehensive information on various medications, including their composition, uses, side effects, manufacturer details, and user reviews. It aims to assist healthcare professionals and patients in making informed decisions about medications.
Classification: Categorizing medicines based on their usage or effectiveness.
Segmentation Analysis: Analyzing different groups of medications based on reviews and side effects.
Recommendation Systems: Developing models to recommend medications based on user profiles and preferences.
The Drug Listing Act of 1972 requires registered drug establishments to provide the Food and Drug Administration (FDA) with a current list of all drugs manufactured, prepared, propagated, compounded, or processed by them for commercial distribution. (See Section 510 of the Federal Food, Drug, and Cosmetic Act (Act) (21 U.S.C. § 360)). Drug products are identified and reported using a unique, three-segment number, called the National Drug Code (NDC), which serves as a universal product identifier for drugs. FDA publishes the listed NDC numbers and the information submitted as part of the listing information in the NDC Directory, which is updated daily.
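The three-segment structure of an NDC can be illustrated with a small parser. This is a sketch under stated assumptions: it expects the hyphenated display form, names the segments labeler, product, and package (the usual FDA terms, which the text above does not spell out), and treats each segment as digits only. It does not check segment lengths, and the sample code is made up.

```python
from typing import NamedTuple


class NDC(NamedTuple):
    """The three segments of a hyphenated National Drug Code."""
    labeler: str
    product: str
    package: str


def parse_ndc(text: str) -> NDC:
    """Split a hyphenated NDC string into its three segments.

    Raises ValueError unless there are exactly three numeric segments.
    """
    parts = text.strip().split("-")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"expected three hyphen-separated numeric segments: {text!r}")
    return NDC(*parts)


# Made-up code used purely to show the shape of the result.
example = parse_ndc("12345-678-90")
print(example)
```

Segments are kept as strings so that leading zeros survive.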
👂💉 EHRSHOT is a dataset for benchmarking the few-shot performance of foundation models for clinical prediction tasks. EHRSHOT contains de-identified structured data (e.g., diagnosis and procedure codes, medications, lab values) from the electronic health records (EHRs) of 6,739 Stanford Medicine patients and includes 15 prediction tasks. Unlike MIMIC-III/IV and other popular EHR datasets, EHRSHOT is longitudinal and includes data beyond ICU and emergency department patients.
⚡️ Quickstart
1. To recreate the original EHRSHOT paper, download the EHRSHOT_ASSETS.zip file from the "Files" tab
2. To work with OMOP CDM formatted data, download all the tables in the "Tables" tab
⚙️ Please see the "Methodology" section below for details on the dataset and downloadable files.
1. 📖 Overview
EHRSHOT is a benchmark for evaluating models on few-shot learning for patient classification tasks. The dataset contains:
2. 💽 Dataset
EHRSHOT is sourced from Stanford’s STARR-OMOP database.
We provide two versions of the dataset:
To access the raw data, please see the "Tables" and "Files" tabs above:
3. 💽 Data Files and Formats
We provide EHRSHOT in two file formats:
Within the "Tables" tab...
1. EHRSHOT-OMOP
* Dataset Version: EHRSHOT-OMOP
* Notes: Contains all OMOP CDM tables for the EHRSHOT patients. Note that this dataset is slightly different than the original EHRSHOT dataset, as these tables contain the full OMOP schema rather than a filtered subset.
Within the "Files" tab...
1. EHRSHOT_ASSETS.zip
* Dataset Version: EHRSHOT-Original
* Data Format: FEMR 0.1.16
* Notes: The original EHRSHOT dataset as detailed in the paper. Also includes model weights.
2. EHRSHOT_MEDS.zip
* Dataset Version: EHRSHOT-Original
* Data Format: MEDS 0.3.3
* Notes: The original EHRSHOT dataset as detailed in the paper. It does not include any models.
3. EHRSHOT_OMOP_MEDS.zip
* Dataset Version: EHRSHOT-OMOP
* Data Format: MEDS 0.3.3 + MEDS-ETL 0.3.8
* Notes: Converts the dataset from EHRSHOT-OMOP into MEDS format via the `meds_etl_omop` command from MEDS-ETL.
4. EHRSHOT_OMOP_MEDS_Reader.zip
* Dataset Version: EHRSHOT-OMOP
* Data Format: MEDS Reader 0.1.9 + MEDS 0.3.3 + MEDS-ETL 0.3.8
* Notes: Same data as EHRSHOT_OMOP_MEDS.zip, but converted into a MEDS-Reader database for faster reads.
4. 🤖 Model
We also release the full weights of **CLMBR-T-base**, a 141M parameter clinical foundation model pretrained on the structured EHR data of 2.57M patients. Please download from https://huggingface.co/StanfordShahLab/clmbr-t-base
5. 🧑‍💻 Code
Please see our GitHub repo to obtain code for loading the dataset and running a set of pretrained baseline models: https://github.com/som-shahlab/ehrshot-benchmark/
**NOTE: You must authenticate to Redivis using your formal affiliation's email address. If you use Gmail or another personal email address, you will not be granted access.**
Access to the EHRSHOT dataset requires the following:
The Global Unique Device Identification Database (GUDID) contains key device identification information submitted to the FDA about medical devices that have Unique Device Identifiers (UDI). Unique device identification is a system being established by the
License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Global Essential Medicines Database
In June 2017, we searched the WHO Essential Medicines and Health Products Information Portal, an online repository that contains hundreds of publications on medicines and health products related to WHO priorities, and a full section dedicated to national essential medicines lists (EMLs). A WHO information specialist actively searched for updated versions of national EMLs, including national formularies, reimbursement lists, and lists based on standard treatment guidelines.
We included all national EMLs that were posted on the WHO’s NEMLs Repository irrespective of publication date and language. When we found more than one national EML from the same country, we used the most recent. We excluded documents that were not EMLs, such as prescribing guidelines. We also included the 20th edition of the WHO Model EML (2017) in this database.
From each EML we abstracted medicines using International Nonproprietary Names (INNs). For medicines whose names were not in English we used the Anatomical Therapeutic Chemical (ATC) classification system, if available, or translated the names with the help of Google Translate. We listed each medicine individually, whether it was part of a combination product or not. We treated as the same medicine bases and their salts (e.g. promethazine hydrochloride and promethazine) as well as different compounds of the same vitamin or mineral (e.g. ferrous fumarate and ferrous sulfate). We excluded diagnostic agents, antiseptics, disinfectants, and saline solutions.
In this database "1" and "0" indicate the presence or absence of the medicine respectively on an EML.
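The 1/0 encoding lends itself to simple counting. The sketch below assumes a layout with one row per medicine and one column per EML, which the description does not state explicitly. The medicine and list names are hypothetical placeholders, not values from the database.

```python
# Hypothetical excerpt: 1 means the medicine is on that list, 0 means it is not.
presence = {
    "medicine_a": {"List 1": 1, "List 2": 1, "List 3": 0},
    "medicine_b": {"List 1": 0, "List 2": 1, "List 3": 0},
}


def lists_per_medicine(matrix):
    """Count how many lists include each medicine."""
    return {medicine: sum(flags.values()) for medicine, flags in matrix.items()}


def medicines_on_list(matrix, list_name):
    """Return the sorted medicines marked 1 for the given list."""
    return sorted(m for m, flags in matrix.items() if flags.get(list_name) == 1)


counts = lists_per_medicine(presence)
print(counts)
print(medicines_on_list(presence, "List 2"))
```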
License: https://clue.io/terms
Provided are annotations for 6,125 drug and tool compounds (2,369 FDA-approved drugs, 1,619 drugs that reached phases 1-3 of clinical development, 96 compounds that were previously approved but withdrawn from use, and 2,041 preclinical or tool compounds). Annotations include compound name, chemical structure, clinical trial status, mechanism of action, protein targets, disease areas, approved indications (where applicable), purity of the purchased sample, and vendor ID.
License: http://data.europa.eu/eli/dec/2011/833/oj
The EU Veterinary Medicinal Product Database is intended to be a source of information on all medicinal products for veterinary use that have been authorised in the European Union and the European Economic Area. The database is hosted by the European Medicines Agency.
PubMed is a free resource supporting the search and retrieval of biomedical and life sciences literature with the aim of improving health, both globally and personally. The PubMed database contains citations and abstracts of biomedical literature. It does not include full text journal articles; however, links to the full text are often present when available from other sources, such as the publisher's website or PubMed Central (PMC). See the PubMed User Guide for more information: https://pubmed.ncbi.nlm.nih.gov/help/
License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset provides a comprehensive collection of drug-drug interactions (DDIs) intended for research in predicting and understanding complex interaction relationships between drugs. It is sourced from the DrugBank database and is designed to support multi-task learning approaches in the domain of bioinformatics and pharmacology.
Feature Details:
Drug 1: Name of the first drug in the interaction.
Drug 2: Name of the second drug in the interaction.
Interaction Description: Detailed description of the interaction between the two drugs.
Source: The dataset is derived from the datasets provided by the team at TDCommons.
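With the three fields described above, a common first step is an index that returns the interaction descriptions for a pair of drugs. The sketch below makes two assumptions: the field names are used verbatim as dictionary keys, and a pair is treated as unordered, which may not hold if the descriptions are worded directionally. The records are placeholders, not rows from the dataset.

```python
from collections import defaultdict

# Placeholder records using the three field names from the description.
records = [
    {"Drug 1": "Drug A", "Drug 2": "Drug B",
     "Interaction Description": "Placeholder description 1."},
    {"Drug 1": "Drug C", "Drug 2": "Drug A",
     "Interaction Description": "Placeholder description 2."},
]


def build_index(rows):
    """Group interaction descriptions by unordered, case-insensitive drug pair."""
    index = defaultdict(list)
    for row in rows:
        pair = frozenset((row["Drug 1"].lower(), row["Drug 2"].lower()))
        index[pair].append(row["Interaction Description"])
    return index


def lookup(index, drug_x, drug_y):
    """Return all descriptions recorded for the two drugs, in either order."""
    return index.get(frozenset((drug_x.lower(), drug_y.lower())), [])


ddi_index = build_index(records)
print(lookup(ddi_index, "drug b", "Drug A"))
```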