The Computational Photography Project for Pill Identification (C3PI) was sunset in 2018. No new images will be added to the collection. Identifiers for pills will not be updated. Images and metadata are for research and development purposes only.
The Computational Photography Project for Pill Identification (C3PI) created the RxIMAGE database of freely available high-quality digital images of prescription pills and associated data for use in conducting computer vision research in text- and image-based search and retrieval. Photographs of pills for the RxIMAGE database were taken under laboratory lighting conditions, from a camera directly above the front and the back faces of the pill, at high resolution, and using specialized digital macro-photography techniques. Image segmentation algorithms were then applied to create the JPEG images in the database.
Historical information about the project is available in the NLM archive at https://wayback.archive-it.org/7867/20190423182937/https:/lhncbc.nlm.nih.gov/project/c3pi-computational-photography-project-pill-identification.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains images of pills inside a medication bottle, photographed from a top-down view. It comprises 13,955 images covering 20 distinct national drug codes (NDCs). The images were used to build an image classification model that predicts the NDC of the medication seen in the image.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes 2000 mobile-captured drug pack images. It was used as part of research on drug label data extraction, which is concerned with extracting drug data from packaging so that it can be stored by patients in countries with no electronic medical records. The images in the dataset have the following characteristics:
1. A total of 2000 images of drug packs were captured at different angles.
2. Backgrounds are made of paper, cloth, and plastic.
3. No transparent, reflective, or patterned backgrounds were used.
4. 140 images were captured with flash on.
5. 1860 images were captured in different lighting conditions and without the use of a flash.
6. Only drug packs were captured. Nothing was captured without its pack.
7. Captured drug packages contain tablets, capsules, syrups, creams, ampoules, gels, drops, ointments, and other types.
8. 166 drug packages were captured.
9. Images were taken using 7 devices:
• Huawei P30 Lite
• Huawei Cun-L21
• Samsung A50
• Samsung A30s
• Oppo A9 2020
• Realme 6
• iPhone XS Max
10. The images are in the following resolutions:
• 3264 x 2448 pixels, 72 dpi
• 1920 x 1080 pixels, 72 dpi
• 2336 x 1080 pixels, 72 dpi
• 4224 x 5632 pixels, 96 dpi
• 6944 x 9280 pixels, 72 dpi
• 899 x 1599 pixels, 96 dpi
• 3024 x 4032 pixels, 72 dpi
11. Some drug packs have handwriting.
12. Some images contain shadows and flash burns.
13. Some images were resized or compressed.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Medical Transcription Assistant: This computer vision model can be used in transcription services, helping transcribe handwritten prescriptions into digital text. Doctors, pharmacists, and healthcare professionals can use such transcriptions for digital record-keeping, data analysis, sharing medical information, and patient follow-ups.
Medicine Inventory Management: The model can help pharmacies automate their drug inventory management. By identifying medicine names in prescriptions, the software can update inventory data in real time, helping keep stock records current and stock levels adequate.
Drug Interaction Analysis: The model can be applied in an application that identifies potential drug interactions for a given patient's multiple prescriptions. By recognizing the names of medicines, it could cross-check them with a database of known drug interactions, alerting the pharmacist or patient about potential risks.
Telemedicine Applications: This model can be useful in telemedicine scenarios where patients send images of their prescriptions. It can analyze the prescription, identify the drug names, and forward the information to online pharmacies for home deliveries or to doctors/nurses for tele-consultations.
Pharma Market Research: Companies can use this model to analyze prescriptions and identify the most commonly prescribed drugs, aiding market research and trend analysis in the pharmaceutical industry.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data to accompany the manuscript "Deep learning models for lipid-nanoparticle-based drug delivery"

Abstract: Large-scale time-lapse microscopy experiments are useful to understand delivery and expression in RNA-based therapeutics. The resulting data has high dimensionality and high (but sparse) information content, making it challenging and costly to store and process. Early prediction of experimental outcome enables intelligent data management and decision making. We start from time-lapse data of HepG2 cells exposed to lipid-nanoparticles loaded with mRNA for expression of green fluorescent protein (GFP). We hypothesize that it is possible to predict if a cell will express GFP or not based on cell morphology at time-points prior to GFP expression. Here we present results on per-cell classification (GFP expression/no GFP expression) and regression (level of GFP expression) using three different approaches. In the first approach we use a convolutional neural network extracting per-cell features at each time point. We then utilize the same features combined with: a long short-term memory (LSTM) network encoding temporal dynamics (approach 2); and time-series feature extraction using the Python package tsfresh followed by principal component analysis and gradient boosting machines (approach 3), to reach a final classification or regression result. Application of the three approaches to a previously unanalyzed test set of cells showed good predictive performance of all three approaches but that accounting for the temporal dynamics via LSTMs or tsfresh led to significantly improved performance. The predictions made by the LSTM and tsfresh applications were not significantly different. The results highlight the benefit of accounting for temporal dynamics when studying drug delivery using high content imaging.

Python code: https://github.com/pharmbio/phil_LNP_modelling
This resource was retired on January 28, 2021 and is no longer updated. These data remain available to support research and development efforts. Pillbox's final image library is available at https://ftp.nlm.nih.gov/projects/pillbox/pillbox_production_images_full_202008.zip. For more information on Pillbox's retirement visit https://www.nlm.nih.gov/pubs/techbull/ja20/ja20_pillbox_discontinue.html. Pillbox contains metadata for oral solid dosage form medications, derived from FDA drug labeling, including physical characteristics, active and inactive ingredients, National Drug Codes, information about firms marketing those products, selected information from RxNorm, and links to images provided by the National Library of Medicine.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
## Overview
Drug Name Detector is a dataset for object detection tasks - it contains Drug Name annotations for 1,307 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [Public Domain license](https://creativecommons.org/publicdomain/zero/1.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Drugs Classification is a dataset for classification tasks - it contains Pill annotations for 5,404 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
1. Title of Dataset
Cebulka (Polish dark web cryptomarket and image board) messages data.
2. Data Collectors
Haitao Shi (The University of Edinburgh, UK); Patrycja Cheba (Jagiellonian University); Leszek Świeca (Kazimierz Wielki University in Bydgoszcz, Poland).
3. Funding Information
The dataset is part of the research supported by the Polish National Science Centre (Narodowe Centrum Nauki) grant 2021/43/B/HS6/00710.
Project title: “Rhizomatic networks, circulation of meanings and contents, and offline contexts of online drug trade” (2022-2025; PLN 956 620; funding institution: Polish National Science Centre [NCN], call: OPUS 22; Principal Investigator: Piotr Siuda [Kazimierz Wielki University in Bydgoszcz, Poland]).
4. Data Source
Polish dark web cryptomarket and image board called Cebulka (http://cebulka7uxchnbpvmqapg5pfos4ngaxglsktzvha7a5rigndghvadeyd.onion/index.php).
5. Purpose
This dataset was developed within the abovementioned project. The project focuses on studying internet behavior concerning disruptive actions, particularly emphasizing the online narcotics market in Poland. The research seeks to (1) investigate how the open internet, including social media, is used in the drug trade; (2) outline the significance of darknet platforms in the distribution of drugs; and (3) explore the complex exchange of content related to the drug trade between the surface web and the darknet, along with understanding meanings constructed within the drug subculture.
Within this context, Cebulka is identified as a critical digital venue in Poland’s dark web illicit substances scene. Besides serving as a marketplace, it plays a crucial role in shaping the narratives and discussions prevalent in the drug subculture. The dataset has proved to be a valuable tool for performing the analyses needed to achieve the project’s objectives.
6. Data Description
The data was collected in three periods, i.e., in January 2023, June 2023, and January 2024.
The dataset comprises a sample of messages posted on Cebulka from its inception until January 2024 (including all the messages with drug advertisements). These messages include the initial posts that start each thread and the subsequent posts (replies) within those threads. The dataset is organized into two directories. The “cebulka_adverts” directory contains posts related to drug advertisements (both advertisements and comments). In contrast, the “cebulka_community” directory holds a sample of posts from other parts of the cryptomarket, i.e., those not related directly to trading drugs but rather focusing on discussing illicit substances. The dataset consists of 16,842 posts.
7. Data Cleaning, Processing, and Anonymization
The data has been cleaned and processed using regular expressions in Python. Additionally, all personal information was removed through regular expressions. The data has been hashed to exclude all identifiers related to instant messaging apps and email addresses. Furthermore, all usernames appearing in messages have been eliminated.
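The anonymization step described above can be sketched roughly as follows. This is a minimal illustration only: the regular expressions, the salt, the truncated SHA-256 token format, and the `anonymize` function name are all assumptions, since the actual patterns and hashing scheme used for the dataset are not published here.

```python
import hashlib
import re

# Hypothetical patterns: the expressions actually used for the dataset are not given.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
MENTION_RE = re.compile(r"@\w+")


def _digest(match, salt):
    """Replace a matched identifier with a salted, truncated SHA-256 token."""
    raw = (salt + match.group(0).lower()).encode("utf-8")
    return "<id:" + hashlib.sha256(raw).hexdigest()[:12] + ">"


def anonymize(text, salt="project-secret"):
    """Hash e-mail addresses and drop @-style usernames from a message."""
    # Hash e-mail addresses first so that their "@" is not mistaken for a mention.
    hashed = EMAIL_RE.sub(lambda m: _digest(m, salt), text)
    # Usernames are removed outright rather than hashed.
    return MENTION_RE.sub("<user>", hashed)


message = "pisz do @kowalski lub jan.k@example.com"
print(anonymize(message))
```

Hashing with a secret salt keeps repeated identifiers linkable within the corpus while making simple dictionary lookups harder. As the ethics statement notes, automatic removal may miss identifiers, so a manual review is still needed.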
8. File Formats and Variables/Fields
The dataset consists of the following files:
9. Ethics Statement
A set of data handling policies aimed at ensuring safety and ethics has been outlined in the following paper:
Harviainen, J.T., Haasio, A., Ruokolainen, T., Hassan, L., Siuda, P., Hamari, J. (2021). Information Protection in Dark Web Drug Markets Research [in:] Proceedings of the 54th Hawaii International Conference on System Sciences, HICSS 2021, Grand Hyatt Kauai, Hawaii, USA, 4-8 January 2021, Maui, Hawaii, (ed.) Tung X. Bui, Honolulu, HI, pp. 4673-4680.
The primary safeguard was the early-stage hashing of usernames and identifiers from the messages, utilizing automated systems for irreversible hashing. Recognizing that automatic name removal might not catch all identifiers, the data underwent manual review to ensure compliance with research ethics and thorough anonymization.
Images from the History of Medicine (IHM) in NLM Digital Collections provides online access to images from the historical collections of the U.S. National Library of Medicine. IHM includes image files of a wide variety of visual media including fine art, photographs, engravings, and posters that illustrate the social and historical aspects of medicine dating from the 15th to 21st century.
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
A. R. Mia, M. A. -A. -S. Chowdhury, A. A. Mamun, A. M. Ruddra and N. T. Tanny, "**A Deep Neural Network Approach with Pioneering Local Dataset to Recognize Doctor's Handwritten Prescription in Bangladesh**," 2024 International Conference on Advances in Computing, Communication, Electrical, and Smart Systems (iCACCESS), Dhaka, Bangladesh, 2024.
This dataset was created by extracting and processing prescription images to support educational research and experimentation in machine learning models, particularly for text recognition and classification in healthcare contexts.
To transform the prescription images into a structured dataset suitable for machine learning, a specialized word detection algorithm was employed. The algorithm segmented the prescription images into individual words, converting the data into a format that facilitates accurate recognition by ML models.
Beklo, Maxima, Leptic, Esoral, Omastin, Esonix, Canazole, Fixal, Progut, Diflu, Montair, Flexilax, Maxpro, Vifas, Conaz, Fexofast, Fenadin, Telfast, Dinafex, Ritch, Renova, Flugal, Axodin, Sergel, Nexum, Opton, Nexcap, Fexo, Montex, Exium, Lumona, Napa, Azithrocin, Atrizin, Monas, Nidazyl, Metsina, Baclon, Rozith, Bicozin, Ace, Amodis, Alatrol, Napa Extend, Rivotril, Montene, Filmet, Aceta, Tamen, Bacmax, Disopan, Rhinil, Flamyd, Metro, Zithrin, Candinil, Lucan-R, Backtone, Bacaid, Etizin, Az, Romycin, Azyth, Cetisoft, Dancel, Tridosil, Nizoder, Ketoral, Ketocon, Ketotab, Ketozol, Denixil, Provair, Odmon, Baclofen, MKast, Trilock, Flexibac.
These classes represent commonly prescribed pharmaceutical names likely to appear in handwritten prescriptions.
This dataset is ideal for:
⚠️ Note: This dataset is free to use for educational and research purposes only.
An application for navigating RxNorm drugs. This application displays relations among drug entities in RxNorm and provides additional information about RxNorm drugs, including drug classes, pill images and drug-drug interactions. RxNav is supported by several drug APIs.
These are the pill images used in the prescription drug use items within the 2019 NSDUH Questionnaire.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Medicine Images is a dataset for object detection tasks - it contains Medicine annotations for 2,627 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: The current COVID-19 pandemic has highlighted the need for new and fast methods to identify novel or repurposed therapeutic drugs. Here we present a method for untargeted phenotypic drug screening of virus-infected cells, combining Cell Painting with antibody-based detection of viral infection in a single assay. We designed an image analysis pipeline for segmentation and classification of virus-infected and non-infected cells, followed by extraction of morphological properties. We show that the methodology can successfully capture virus-induced phenotypic signatures of MRC-5 human lung fibroblasts infected with Human coronavirus 229E (CoV-229E). Moreover, we demonstrate that our method can be used in phenotypic drug screening using a panel of nine host- and virus-targeting antivirals. Treatment with effective antiviral compounds reversed the morphological profile of the host cells towards a non-infected state. The method can be used in drug discovery for morphological profiling of novel antiviral compounds on both infected and non-infected cells.

Screen description: The images are of MRC-5 human lung fibroblasts infected with Human coronavirus 229E (CoV-229E) and treated with a panel of nine host- and virus-targeting antivirals. Cells are labelled with five labels that characterise seven cellular components (from the "Cell Painting" assay) as well as with a Coronavirus pan monoclonal antibody combined with a secondary antibody. This experiment consists of 5 plates. Each plate has 60 wells, and 9 fields of view per well.

Each field was imaged in five channels (detection wavelengths), and each channel is stored as a separate, grayscale image file in TIFF format. The channel names (w1-w5) correspond to the following stains:
- w1 = Hoechst 33342 (HOECHST)
- w2 = Coronavirus pan Monoclonal Antibody (FIPV3-70) + Goat Anti-Mouse IgG H&L secondary antibody (MITO)
- w3 = Wheat Germ Agglutinin/Alexa Fluor 555 + Phalloidin/Alexa Fluor 568 (PHAandWGA)
- w4 = SYTO 14 green (SYTO)
- w5 = Concanavalin A/Alexa Fluor 488 (CONC)

Organization of files:
1) Raw image data:
- MRC5_HCoV229_Plate1.tar.gz
- MRC5_HCoV229_Plate2.tar.gz
- MRC5_Plate3.tar.gz
- MRC5_Plate4.tar.gz
- MRC5_HCoV229_Plate5.tar.gz
2) Image analysis pipelines (CellProfiler 4.0.7):
CellProfiler project with a subset of images to try out the analysis pipeline:
- Example_PipelineAndData.tar.gz
Quality control, illumination correction and feature extraction pipelines:
- AnalysisPipelines.tar.gz
3) Extracted feature data:
- features_MRC5_HCoV229_Plate1.tar.gz
- features_MRC5_HCoV229_Plate2.tar.gz
- features_MRC5_Plate3.tar.gz
- features_MRC5_Plate4.tar.gz
- features_MRC5_HCoV229_Plate5.tar.gz

Metadata:
The file “Metadata_MRC5_HCoV229E_plate1-5.csv” contains the metadata in CSV format, with the following fields:
- Plate_id: corresponds to the experimental plate
- Well: well allocation in the 96-well plate
- virus: "virus +" when cells are exposed to virus, and "virus -" for non-infected controls
- Compound: name of compound
- Dose [μM]: dose of compound

For full information, see the manuscript to which this data is linked.
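As a rough illustration of how the metadata file could be used, the sketch below groups wells by compound and infection status and derives the image count implied by the screen description. It assumes the CSV header uses exactly the field names listed above and that every field of view was imaged in all five channels. The function names, the plate and well identifiers, and the `compound_A` rows are hypothetical placeholders, not values from the dataset.

```python
import csv
import io
from collections import defaultdict

# Figures taken from the screen description: 5 plates, 60 wells per plate,
# 9 fields of view per well, 5 channels per field.
PLATES, WELLS_PER_PLATE, FIELDS_PER_WELL, CHANNELS = 5, 60, 9, 5


def expected_image_files():
    """Number of TIFF files implied if every field has all five channels."""
    return PLATES * WELLS_PER_PLATE * FIELDS_PER_WELL * CHANNELS


def wells_by_compound(csv_file):
    """Map (Compound, virus) to the list of (Plate_id, Well) pairs in the metadata."""
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        groups[(row["Compound"], row["virus"])].append((row["Plate_id"], row["Well"]))
    return dict(groups)


# Hypothetical sample rows standing in for Metadata_MRC5_HCoV229E_plate1-5.csv.
sample = io.StringIO(
    "Plate_id,Well,virus,Compound,Dose [μM]\n"
    "P1,B02,virus +,compound_A,1\n"
    "P1,B03,virus -,compound_A,1\n"
    "P2,B02,virus +,compound_A,10\n"
)
print(wells_by_compound(sample))
print(expected_image_files())
```

For the real file, the same function can be called on `open("Metadata_MRC5_HCoV229E_plate1-5.csv", newline="", encoding="utf-8")`; the encoding is an assumption and may need adjusting.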
https://choosealicense.com/licenses/agpl-3.0/
Ultralytics Medical-pills Dataset
Introduction
Ultralytics medical-pills detection dataset is a proof-of-concept (POC) dataset, carefully curated to demonstrate the potential of AI in pharmaceutical applications. It contains labeled images specifically designed to train computer vision models for identifying medical-pills.
Sample Images and Annotations
Here are some examples of images from the dataset, along with their corresponding annotations in a… See the full description on the dataset page: https://huggingface.co/datasets/Ultralytics/Medical-pills.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Here are a few use cases for this project:
Pharmaceutical Search Assistance: The model can be used as a search assistance tool within online pharmacy stores or pharmaceutical databases, allowing customers to upload an image of the medication they are looking for and the system would identify if it's Tylenol, Benq or neither.
Medication Identification: The model can be utilized in medical facilities to help health workers identify medication via a mobile application or a desktop software where one only needs to scan the boxes of medication.
Telemedicine and E-Health: The model can be implemented in telemedicine software to enable patients to scan their medication during a consultation, helping the doctor to quickly identify what types of drugs they are currently taking.
Drug Recognition Training: It can be used in a training program for pharmacists, nurses or EMTs, improving their ability to recognize different types of common medications.
Smart Home Health Assistant: The model can be integrated into smart home assistants or devices such as Google Home, Alexa to assist elderly or visually impaired individuals in ensuring they are taking the correct medication at home.
Database of authoritative health information about diseases, conditions, and wellness issues that offers reliable, up-to-date health information for free. It contains the latest treatments, information on drugs and supplements, the meanings of words, and medical videos and illustrations. Links to the latest topic or disease-specific medical research or clinical trials are also offered.
* MedlinePlus pages contain carefully selected links to Web resources with health information on over 900 topics.
** The MedlinePlus health topic pages include links to current news on the topic and related information. You can also find preformulated searches of the MEDLINE/PubMed database, which allow you to find references to latest health professional articles on your topic.
* The A.D.A.M. medical encyclopedia brings health consumers an extensive library of medical images and videos, as well as over 4,000 articles about diseases, tests, symptoms, injuries, and surgeries.
* The Merriam-Webster medical dictionary allows you to look up definitions and spellings of medical words.
* Drug and supplement information is available from the American Society of Health-System Pharmacists (ASHP) via AHFS Consumer Medication Information, and Natural Medicines Comprehensive Database Consumer Version.
** AHFS Consumer Medication Information provides extensive information about more than 1,000 brand name and generic prescription and over-the-counter drugs, including side effects, precautions and storage for each drug.
** Natural Medicines Comprehensive Database Consumer Version is an evidence-based collection of information on alternative treatments. MedlinePlus has 100 monographs on herbs and supplements.
* Interactive tutorials from the Patient Education Institute explain over 165 procedures and conditions in easy-to-read language.
An XML file for the MedlinePlus Health Topics is available at http://www.nlm.nih.gov/medlineplus/xmldescription.html.
The ontology is available through Bioportal, http://bioportal.bioontology.org/ontologies/MEDLINEPLUS
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
According to the World Drug Report 2020, cocaine and ecstasy are the most consumed stimulant drugs, with an estimated 19 and 27 million users in 2018. Unsurprisingly, large efforts are being made to design fast and cost-effective analytical methods to track and monitor the distribution networks of these synthetic drugs. Here we share two datasets of ecstasy pills seized in the north-east of Switzerland between 2010 and 2011. The first contains 621 forensic-grade images of pills, while the second consists of 486 mIR spectra. Although the two sets do not cover the same seizure, both provide high-quality data with orthogonal information for evaluating clustering and dimension reduction methods.
Explore our dataset of 6,000+ high-quality nuclear medicine scintigraphy exams in DICOM format for AI healthcare development.