100+ datasets found

Medical-QA-RS
huggingface.co
Updated Apr 27, 2025
Cite
Sushant Dagaji Desale (2025). Medical-QA-RS [Dataset]. https://huggingface.co/datasets/MrMaxMind99/Medical-QA-RS
Dataset updated
Apr 27, 2025
Authors
Sushant Dagaji Desale
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
MrMaxMind99/Medical-QA-RS dataset hosted on Hugging Face and contributed by the HF Datasets community
medical-qa-datasets
huggingface.co
Updated Nov 14, 2023
Cite
Lavita AI (2023). medical-qa-datasets [Dataset]. https://huggingface.co/datasets/lavita/medical-qa-datasets
Dataset updated
Nov 14, 2023
Dataset authored and provided by
Lavita AI
Description
The all-processed dataset is a concatenation of the medical-meadow-* and chatdoctor_healthcaremagic datasets. The term "Chat Doctor" is replaced by "chatbot" in the chatdoctor_healthcaremagic dataset. Similar to the literature, the medical_meadow_cord19 dataset is subsampled to 50,000 samples. truthful-qa-* is a benchmark dataset for evaluating the truthfulness of models in text generation, which is used in the Llama 2 paper. Within this dataset, there are 55 and 16 questions related to Health and… See the full description on the dataset page: https://huggingface.co/datasets/lavita/medical-qa-datasets.
Data from: EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge...
physionet.org
Updated Jan 11, 2024
Cite
Konstantin Kotschenreuther (2024). EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge Summaries for Enhanced Medical Information Retrieval Systems [Dataset]. http://doi.org/10.13026/25fx-f706
Unique identifier
https://doi.org/10.13026/25fx-f706
Dataset updated
Jan 11, 2024
Authors
Konstantin Kotschenreuther
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
This dataset was designed and created to enable advancements in healthcare-focused large language models, particularly in the context of retrieval-augmented clinical question-answering capabilities. Developed using a self-constructed pipeline based on the 13-billion-parameter Meta Llama 2 model, this dataset encompasses 21,466 medical discharge summaries extracted from the MIMIC-IV-Note dataset, with 156,599 synthetically generated question-and-answer pairs, a subset of which was verified for accuracy by a physician. These pairs were generated by providing the model with a discharge summary and instructing it to generate question-and-answer pairs based on the contextual information present in the summaries.

This work aims to generate data in support of the development of compact large language models capable of efficiently extracting information from medical notes and discharge summaries, thus enabling potential improvements for real-time decision-making processes in clinical settings. Additionally, the dataset is accompanied by code that facilitates question-and-answer pair generation from any medical or non-medical text.

Despite the robustness of the presented dataset, it has certain limitations. The generation process was confined to a maximum context length of 6,000 input tokens, owing to hardware constraints. Because the question-and-answer pairs were generated by a large language model, they may carry an underlying bias or lack diversity and complexity. Future iterations should focus on rectifying these issues, possibly through diversified training and expanded verification procedures as well as the employment of more powerful large language models.
medical-qa-shared-task-v1-toy-eval
huggingface.co
Updated Sep 3, 2023
Cite
Lavita AI (2023). medical-qa-shared-task-v1-toy-eval [Dataset]. https://huggingface.co/datasets/lavita/medical-qa-shared-task-v1-toy-eval
Dataset updated
Sep 3, 2023
Dataset authored and provided by
Lavita AI
Description
Dataset Card for "medical-qa-shared-task-v1-toy-eval"

More Information needed
MEDQA-USMLE QA JSON Only
kaggle.com
Updated Oct 24, 2023
Cite
Nithin Dhananjayan (2023). MEDQA-USMLE QA JSON Only [Dataset]. https://www.kaggle.com/datasets/evidence/medqa-usmle-qa-json-only
Dataset updated
Oct 24, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nithin Dhananjayan
Description
The current dataset is a subset and reformatting of a more raw dataset. The focus here is only on US questions and answers split into dev, train, and test sets in separate json files. This format ought to be easier to use. This notebook captures how the conversion was done.

The rawer dataset was pulled from paperswithcode, which in turn took it from "A Large-scale Open Domain Question Answering Dataset from Medical Exams".

The dataset is collected from the professional medical board exams. It covers three languages: English, simplified Chinese, and traditional Chinese, and contains 12,723, 34,251, and 14,123 questions for the three languages, respectively.

This is under the MIT License

MIT License (As given on github)

Copyright (c) 2022 Di Jin

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Test dataset of ChatGPT in medical field
scidb.cn
Updated Mar 3, 2023
Cite
robin shen (2023). Test dataset of ChatGPT in medical field [Dataset]. http://doi.org/10.57760/sciencedb.o00130.00001
Unique identifier
https://doi.org/10.57760/sciencedb.o00130.00001
Dataset updated
Mar 3, 2023
Dataset provided by
Science Data Bank
Authors
robin shen
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The researcher tests the QA capability of ChatGPT in the medical field from the following aspects:
1. Test its reserve of medical knowledge
2. Check its ability to read and understand medical literature
3. Test its ability to assist diagnosis after reading case data
4. Test its error-correction ability for case data
5. Test its ability to standardize medical terms
6. Test its ability to evaluate experts
7. Check its ability to evaluate medical institutions

The conclusions are:
ChatGPT has great potential in medical and health care applications, and may directly replace human beings or even professionals at a certain level in some fields;
The researcher preliminarily believes that ChatGPT has basic medical knowledge and the ability to hold multi-round dialogue, and its ability to understand Chinese is not weak;
ChatGPT has the ability to read, understand and correct cases;
ChatGPT has the ability of information extraction and terminology standardization, and is quite excellent at both;
ChatGPT has the ability to reason over medical knowledge;
ChatGPT has the ability of continuous learning. After continuous training, its level improved significantly;
ChatGPT does not have the ability to academically evaluate Chinese medical talents, and the results are not ideal;
ChatGPT does not have the ability to academically evaluate Chinese medical institutions, and the results are not ideal;
ChatGPT is an epoch-making product, which can become a useful assistant for medical diagnosis and treatment, knowledge service, literature reading, review and paper writing.
MedRedQA
data.csiro.au
researchdata.edu.au
Updated May 1, 2024
Cite
Vincent Nguyen; Sarvnaz Karimi; Maciek Rybinski; Zhenchang Xing (2024). MedRedQA [Dataset]. http://doi.org/10.25919/yn7x-9148
Unique identifier
https://doi.org/10.25919/yn7x-9148
Dataset updated
May 1, 2024
Dataset provided by
CSIRO (http://www.csiro.au/)
Authors
Vincent Nguyen; Sarvnaz Karimi; Maciek Rybinski; Zhenchang Xing
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Time period covered
Jul 10, 2013 - Apr 2, 2022
Dataset funded by
CSIRO (http://www.csiro.au/)
Australian National University
Description
A large non-factoid English consumer Question Answering (QA) dataset containing 51,000 pairs of consumer questions and their corresponding expert answers. This dataset is useful for benchmarking or training systems on more difficult real-world questions and responses, which may contain spelling or formatting errors, or lexical gaps between consumer and expert vocabularies.

By downloading this dataset, you agree to have obtained ethics approval from your institution.

Lineage: We collected data from posts and comments to subreddit /r/askdocs, published between July 10, 2013, and April 2, 2022, totalling 600,000 submissions (original posts) and 1,700,000 comments (replies). We generated question-answer pairs by taking the highest scoring answer from a verified medical expert to a Reddit question. Questions with only images are removed, and all links and author names are removed.

We provide two separate datasets in this collection, with the following schemas.

MedRedQA - Reddit medical question-and-answer pairs from /r/askdocs. CSV format.
i. The poster's question (Body)
ii. Title of the post
iii. The filtered answer from a verified physician comment (Response)
iv. Occupation indicated for verification status
v. Any PMCIDs found in the post

MedRedQA+PubMed - PubMed-enriched subset of MedRedQA. JSON format.
i. Question: the user's original question. This is equivalent to the Body field in MedRedQA.
ii. Document: the abstract of the PubMed document (if it exists and contains an abstract) for that particular post. Note: this does not necessarily mean the answer references this document, but at least one other verified physician in the responses has mentioned that particular document.
iii. The filtered response. This is equivalent to the Response field in MedRedQA.
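
The pairing rule described in the lineage note (keep the highest scoring answer from a verified medical expert for each question) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the `Comment` class, its `post_id`, `score` and `verified_occupation` fields, and the `best_verified_answers` helper are hypothetical names; only Title, Body and Response come from the schema above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Comment:
    # Hypothetical record layout; the real Reddit dump is not described in the listing.
    post_id: str
    text: str
    score: int
    verified_occupation: Optional[str]  # None when the commenter is not verified


def best_verified_answers(posts, comments):
    """Pair each post with its highest-scoring comment from a verified expert."""
    best = {}
    for c in comments:
        if c.verified_occupation is None:
            continue  # skip unverified commenters
        current = best.get(c.post_id)
        if current is None or c.score > current.score:
            best[c.post_id] = c

    rows = []
    for post_id, post in posts.items():
        answer = best.get(post_id)
        if answer is None:
            continue  # no verified answer, so no QA pair is produced
        rows.append({
            "Title": post["Title"],
            "Body": post["Body"],
            "Response": answer.text,
            "Occupation": answer.verified_occupation,
        })
    return rows


# Tiny made-up sample, purely for illustration.
posts = {
    "p1": {"Title": "Rash on arm", "Body": "I have had a rash for a week."},
    "p2": {"Title": "Headache", "Body": "Headache every morning."},
}
comments = [
    Comment("p1", "A", 10, None),
    Comment("p1", "B", 4, "Physician"),
    Comment("p1", "C", 2, "Nurse"),
    Comment("p2", "D", 7, None),
]
rows = best_verified_answers(posts, comments)
```

In this sample only the first post yields a pair, because the second post has no comment from a verified account; the higher-scoring unverified reply to the first post is ignored.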
merged-medical-qa
huggingface.co
Updated Jun 26, 2025
Cite
Emmanuel Micaiah Afriyie (2025). merged-medical-qa [Dataset]. https://huggingface.co/datasets/Faithality/merged-medical-qa
Dataset updated
Jun 26, 2025
Authors
Emmanuel Micaiah Afriyie
Description
Faithality/merged-medical-qa dataset hosted on Hugging Face and contributed by the HF Datasets community
medical-qa
huggingface.co
Updated May 11, 2024
Cite
Intelligence and Database System Lab (2024). medical-qa [Dataset]. https://huggingface.co/datasets/TUDB-Labs/medical-qa
Dataset updated
May 11, 2024
Dataset authored and provided by
Intelligence and Database System Lab
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset Card for Dataset Name

This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

Dataset Details

Dataset Description

Curated by: [More Information Needed]
Funded by [optional]: [More Information Needed]
Shared by [optional]: [More Information Needed]
Language(s) (NLP): [More Information Needed]
License: [More Information Needed]

Dataset Sources [optional]

Repository: [More… See the full description on the dataset page: https://huggingface.co/datasets/TUDB-Labs/medical-qa.
Patient Doctor Q&A TR 321179
zenodo.org
csv
Updated Jul 23, 2024
Cite
Muhammed Kayra Bulut (2024). Patient Doctor Q&A TR 321179 [Dataset]. http://doi.org/10.5281/zenodo.12798934
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.12798934
Dataset updated
Jul 23, 2024
Dataset provided by
Muhammed Kayra Bulut
Authors
Muhammed Kayra Bulut
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Time period covered
Jul 18, 2024
Description
# Patient Doctor Q&A TR 321179 Dataset
The Patient Doctor Q&A TR 321179 dataset is a combined and shuffled version of the [**Patient Doctor Q&A TR 19583**](https://www.kaggle.com/datasets/kaayra2000/patient-doctor-qa-dataset-tr), [**Patient Doctor Q&A TR 167732**](https://www.kaggle.com/datasets/kaayra2000/patient-doctor-q-and-a-tr-167732), [**Patient Doctor Q&A TR 5695**](https://www.kaggle.com/datasets/kaayra2000/patient-doctor-q-and-a-translated-from-id-to-tr), and [**Patient Doctor Q&A TR 95588**](https://www.kaggle.com/datasets/kaayra2000/patient-doctor-q-and-a-tr-95588) datasets.

## Main Features:
* Content: Patient questions and doctor answers covering various medical topics.
* Structure: Contains 2 columns: Question, Answer.
* Language: Turkish.
## Potential Uses:
* Medical research
* Natural Language Processing (NLP)
* Medical education
## Limitations:
* Data privacy concerns
* Variability in answer quality
* Potential biases
## General Assessment:
The Patient Doctor Q&A TR 321179 dataset is a valuable resource for understanding real-world medical communication and information exchange. This dataset, translated into Turkish, is an important resource for medical research and education, and can be used to analyze communication between patients and doctors. However, limitations such as data privacy and variability in answer quality should be considered.

This dataset offers researchers and educators the opportunity to conduct more in-depth analyses and apply natural language processing techniques using Turkish medical communication data.
medical-qa-formatted
huggingface.co
Updated Mar 15, 2020
Cite
Juliabot AI System (2020). medical-qa-formatted [Dataset]. https://huggingface.co/datasets/Juliabot/medical-qa-formatted
Dataset updated
Mar 15, 2020
Authors
Juliabot AI System
Description
Juliabot/medical-qa-formatted dataset hosted on Hugging Face and contributed by the HF Datasets community
huatuo-encyclopedia-qa
opendatalab.com
zip
Updated Dec 15, 2023
Cite
Beijing Academy of Artificial Intelligence (2023). huatuo-encyclopedia-qa [Dataset]. https://opendatalab.com/OpenDataLab/huatuo-encyclopedia-qa
Available download formats: zip
Dataset updated
Dec 15, 2023
Dataset provided by
Shenzhen Institute of Big Data
Chinese University of Hong Kong, Shenzhen
Beijing Academy of Artificial Intelligence
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
This dataset contains a total of 364,420 medical QA items, some of which phrase their questions in multiple ways. We extract medical QA pairs from plain texts (e.g., medical encyclopedias and medical articles). We collected 8,699 encyclopedia entries for diseases and 2,736 encyclopedia entries for medicines from Chinese Wikipedia. Moreover, we crawled 226,432 high-quality medical articles from the Qianwen Health website.
Learning to Ask Like a Physician: a Discharge Summary Clinical Questions...
physionet.org
Updated Jul 28, 2022
Cite
Eric Lehman (2022). Learning to Ask Like a Physician: a Discharge Summary Clinical Questions (DiSCQ) Dataset [Dataset]. http://doi.org/10.13026/7v8e-h745
Unique identifier
https://doi.org/10.13026/7v8e-h745
Dataset updated
Jul 28, 2022
Authors
Eric Lehman
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
Existing question answering (QA) datasets derived from electronic health records (EHR) are artificially generated and consequently fail to capture realistic physician information needs. We present Discharge Summary Clinical Questions (DiSCQ), a newly curated question dataset composed of 2,000+ questions paired with the snippets of text (triggers) that prompted each question. The questions are generated by medical experts from 100+ MIMIC-III, version 1.4, discharge summaries. These discharge summaries overlap with the n2c2 challenge, so they are filled in with surrogate PHI. We analyze this dataset to characterize the types of information sought by medical experts. We also train baseline models for trigger detection and question generation (QG), paired with unsupervised answer retrieval over EHRs. Our baseline model is able to generate high quality questions in over 62% of cases when prompted with human selected triggers. We release this dataset (and a link to all code to reproduce baseline model results) to facilitate further research into realistic clinical QA and QG.
Medical-QA-Mistral7B-Finetuning
huggingface.co
Updated Feb 21, 2024
Cite
Balakrishna Masanam (2024). Medical-QA-Mistral7B-Finetuning [Dataset]. https://huggingface.co/datasets/bala1524/Medical-QA-Mistral7B-Finetuning
Dataset updated
Feb 21, 2024
Authors
Balakrishna Masanam
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
bala1524/Medical-QA-Mistral7B-Finetuning dataset hosted on Hugging Face and contributed by the HF Datasets community
Medical-QA-dataset
huggingface.co
Updated Jun 8, 2025
Cite
Genius Shrestha (2025). Medical-QA-dataset [Dataset]. https://huggingface.co/datasets/Starlord1010/Medical-QA-dataset
Dataset updated
Jun 8, 2025
Authors
Genius Shrestha
Description
Starlord1010/Medical-QA-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Number of Visitors to Health Centers by Health Center
data.gov.qa
csv, excel, json
Updated May 7, 2025
Cite
(2025). Number of Visitors to Health Centers by Health Center [Dataset]. https://www.data.gov.qa/explore/dataset/number-of-visitors-to-health-centers-by-health-center/
Available download formats: csv, excel, json
Dataset updated
May 7, 2025
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset presents the number of visitors to public health centers in Qatar. It is categorized by the name of the health center and helps evaluate patient load, service demand, and regional distribution of healthcare access across the country.
Trade Data for Optical and Medical Instruments
data.qa
csv, excel, json
Updated May 28, 2025
Cite
(2025). Trade Data for Optical and Medical Instruments [Dataset]. https://www.data.qa/explore/dataset/trade-data-for-optical-and-medical-instruments/
Available download formats: csv, json, excel
Dataset updated
May 28, 2025
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset contains trade data on optical, medical, and precision instruments, including imports and re-exports. It supports analysis of Qatar’s scientific and medical equipment market.
Daily QA Check Device Report
datainsightsmarket.com
doc, pdf, ppt
Updated Dec 29, 2024
Cite
Data Insights Market (2024). Daily QA Check Device Report [Dataset]. https://www.datainsightsmarket.com/reports/daily-qa-check-device-603497
Available download formats: doc, ppt, pdf
Dataset updated
Dec 29, 2024
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
Market Overview
According to market research, the global Daily QA Check Device market is projected to reach a significant X million value by 2033, expanding at a robust CAGR of X%. This expansion is attributed to various driving factors, including the increasing demand for quality assurance in healthcare, food production, and education. The growing adoption of artificial intelligence (AI) and IoT technologies in these sectors has also contributed to the market's growth.

Market Segmentation
The Daily QA Check Device market can be segmented based on application, type, and region. By application, the market caters to educational institutions, food production industry, medical institutions, and others. Medical institutions dominate the market due to the stringent regulations for maintaining the accuracy and reliability of medical equipment. By type, the market is classified into basic, intelligent, and professional types. Professional-type devices offer advanced features and automation, leading to their popularity in hospitals and research labs. Regionally, North America holds the largest market share, followed by Europe and Asia Pacific. The Asia Pacific region is expected to witness substantial growth due to the rising demand for quality assurance in emerging economies. Key players in the industry include Guangzhou Raydose Software Technology LLC, Sichuan Jingwei Food Testing Technology, and Shenzhen Ruikang'an Technology Development.
Dermatology Question-Answer Dataset: Skin Disease
kaggle.com
Updated Nov 11, 2023
Cite
Muhammad Areeb Khan (2023). Dermatology Question-Answer Dataset: Skin Disease [Dataset]. https://www.kaggle.com/datasets/muhammadareebkhan/skin-disease-medical-text-data-for-fine-tuning
Dataset updated
Nov 11, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Muhammad Areeb Khan
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
This dataset is a comprehensive compilation of questions related to dermatology, spanning inquiries about various skin diseases, their symptoms, recommended medications, and available treatment modalities. Each question is paired with a concise and informative response, making it an ideal resource for training and fine-tuning language models in the field of dermatological healthcare. The dataset is designed to facilitate the development of advanced medical chat-bots and language models tailored to dermatology, providing valuable insights into skin health-related inquiries.

Please Explore the Work Here: https://github.com/Mreeb/llama2-Fine-tuning-On-Custom-Medical_data/tree/master
Data from: Behavioral Health Workforce: Quality Assurance Practices in...
catalog.data.gov
data.virginia.gov
Updated Jul 31, 2025
Cite
Substance Abuse and Mental Health Services Administration (2025). Behavioral Health Workforce: Quality Assurance Practices in Mental Health Treatment Facilities [Dataset]. https://catalog.data.gov/dataset/behavioral-health-workforce-quality-assurance-practices-in-mental-health-treatment-facilit
Dataset updated
Jul 31, 2025
Dataset provided by
Substance Abuse and Mental Health Services Administration (https://www.samhsa.gov/)
Description
This report examines the number, percentage, and characteristics of specialty mental health treatment facilities in the United States that use three quality assurance practices related to the behavioral health workforce as part of their standard operating procedures.