https://choosealicense.com/licenses/cc0-1.0/https://choosealicense.com/licenses/cc0-1.0/
The PMC Open Access Subset includes more than 3.4 million journal articles and preprints that are made available under license terms that allow reuse.
Not all articles in PMC are available for text mining and other reuse, many have copyright protection, however articles in the PMC Open Access Subset are made available under Creative Commons or similar licenses that generally allow more liberal redistribution and reuse than a traditional copyrighted work.
The PMC Open Access Subset is one part of the PMC Article Datasets
Dataset Summary
A daily-updated dataset of PubMed abstracts, collected via PubMed’s API and published on Hugging Face Datasets.Each snapshot is versioned by date (e.g., 2025-03-27) so users can track historical changes or use a consistent snapshot for reproducibility.
Updated daily Each version tagged by date Abstract-only dataset (no full text)
Dataset Structure
Column Type Description
pmid string Unique PubMed identifier
abstract string Abstract text… See the full description on the dataset page: https://huggingface.co/datasets/uiyunkim-hub/pubmed-abstract.
Not seeing a result you expected?
Learn how you can add new datasets to our index.
https://choosealicense.com/licenses/cc0-1.0/https://choosealicense.com/licenses/cc0-1.0/
The PMC Open Access Subset includes more than 3.4 million journal articles and preprints that are made available under license terms that allow reuse.
Not all articles in PMC are available for text mining and other reuse, many have copyright protection, however articles in the PMC Open Access Subset are made available under Creative Commons or similar licenses that generally allow more liberal redistribution and reuse than a traditional copyrighted work.
The PMC Open Access Subset is one part of the PMC Article Datasets