72 datasets found

pubmed-pmc-sr-filtered
huggingface.co
Cite
shuai wang, pubmed-pmc-sr-filtered [Dataset]. https://huggingface.co/datasets/wshuai190/pubmed-pmc-sr-filtered
Explore at:
Authors
shuai wang
Description
wshuai190/pubmed-pmc-sr-filtered

Dataset Description

This dataset contains medical literature data for training Boolean query generation models. The data includes PubMed articles with their associated metadata, references, and result section PMIDs.

Dataset Structure

Data Fields

pmid: PubMed ID of the article

pmc-id: PMC ID (if available)

title: Article title

max-date: Maximum publication date

references-pmids: List of PMIDs referenced in the article…

See the full description on the dataset page: https://huggingface.co/datasets/wshuai190/pubmed-pmc-sr-filtered.
PubMed Datasets
brightdata.com
.json, .csv, .xlsx
Updated Nov 19, 2023
Cite
Bright Data (2023). PubMed Datasets [Dataset]. https://brightdata.com/products/datasets/pubmed
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Nov 19, 2023
Dataset authored and provided by
Bright Data, https://brightdata.com/
License
https://brightdata.com/license
Area covered
Worldwide
Description
Unlock valuable biomedical knowledge with our comprehensive PubMed Dataset, designed for researchers, analysts, and healthcare professionals to track medical advancements, explore drug discoveries, and analyze scientific literature.

Dataset Features

Scientific Articles & Abstracts: Access structured data from PubMed, including article titles, abstracts, authors, publication dates, and journal sources.

Medical Research & Clinical Studies: Retrieve data on clinical trials, drug research, disease studies, and healthcare innovations.

Keywords & MeSH Terms: Extract key medical subject headings (MeSH) and keywords to categorize and analyze research topics.

Publication & Citation Data: Track citation counts, journal impact factors, and author affiliations for academic and industry research.

Customizable Subsets for Specific Needs

Our PubMed Dataset is fully customizable, allowing you to filter data based on publication date, research category, keywords, or specific journals. Whether you need broad coverage for medical research or focused data for pharmaceutical analysis, we tailor the dataset to your needs.

Popular Use Cases

Pharmaceutical Research & Drug Development: Analyze clinical trial data, drug efficacy studies, and emerging treatments.

Medical & Healthcare Intelligence: Track disease outbreaks, healthcare trends, and advancements in medical technology.

AI & Machine Learning Applications: Use structured biomedical data to train AI models for predictive analytics, medical diagnosis, and literature summarization.

Academic & Scientific Research: Access a vast collection of peer-reviewed studies for literature reviews, meta-analyses, and academic publishing.

Regulatory & Compliance Monitoring: Stay updated on medical regulations, FDA approvals, and healthcare policy changes.

Whether you're conducting medical research, analyzing healthcare trends, or developing AI-driven solutions, our PubMed Dataset provides the structured data you need. Get started today and customize your dataset to fit your research objectives.
pubmed-filtered
huggingface.co
Updated Jan 10, 2012
Cite
Giulia Dal Cin (2012). pubmed-filtered [Dataset]. https://huggingface.co/datasets/giuliadc/pubmed-filtered
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jan 10, 2012
Authors
Giulia Dal Cin
Description
Original data from: https://github.com/armancohan/long-summarization. The first 3000 rows of the test split of the original dataset were processed and filtered as follows.

In the original dataset, some sentences appear several times in the same article, even if they're only contained once in the original research paper. For this reason, all dataset rows where the same sentence appeared more than once were removed. In the original dataset, every sentence is a separate string, and these strings… See the full description on the dataset page: https://huggingface.co/datasets/giuliadc/pubmed-filtered.
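The duplicate-sentence filter described above can be sketched roughly as follows. This is a minimal illustration, not the dataset author's code: the field name `article_text` and the toy rows are assumptions made for the example.

```python
from collections import Counter


def has_repeated_sentence(sentences):
    """Return True if any sentence string occurs more than once in the list."""
    counts = Counter(s.strip() for s in sentences)
    return any(n > 1 for n in counts.values())


def drop_rows_with_repeats(rows, field="article_text"):
    """Keep only rows whose sentence list contains no duplicated sentence."""
    return [row for row in rows if not has_repeated_sentence(row[field])]


# Hypothetical toy rows; the field name "article_text" is assumed.
rows = [
    {"id": 1, "article_text": ["Background .", "Methods .", "Results ."]},
    {"id": 2, "article_text": ["Background .", "Background .", "Results ."]},
]
kept = drop_rows_with_repeats(rows)
print([row["id"] for row in kept])
```

Rows are dropped whole rather than deduplicated in place, which matches the wording "all dataset rows where the same sentence appeared more than once were removed".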
dsir-pile-13m-filtered-for-pubmed-central
huggingface.co
Updated Dec 31, 2017
+ more versions
Cite
Timaeus (2017). dsir-pile-13m-filtered-for-pubmed-central [Dataset]. https://huggingface.co/datasets/timaeus/dsir-pile-13m-filtered-for-pubmed-central
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 31, 2017
Dataset authored and provided by
Timaeus
Description
timaeus/dsir-pile-13m-filtered-for-pubmed-central dataset hosted on Hugging Face and contributed by the HF Datasets community
Results from total and filtered searches in PubMed
zenodo.org
Updated Aug 7, 2025
Cite
Viet-Thi Tran (2025). Results from total and filtered searches in PubMed [Dataset]. http://doi.org/10.5281/zenodo.16758566
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.16758566
Dataset updated
Aug 7, 2025
Dataset provided by
Zenodo, http://zenodo.org/
Authors
Viet-Thi Tran
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Using large language models to directly screen electronic databases as an alternative to traditional search strategies in systematic reviews: the example of the Cochrane Highly Sensitive Search

The enclosed files correspond to 1) all studies published in MEDLINE between September 1st and September 30th, 2024, retrieved using the sole keyword "diabetes"; and 2) studies published in MEDLINE between September 1st and September 30th, 2024, retrieved using the keyword "diabetes" together with the Cochrane Highly Sensitive Search.

The code used to process the data is provided as supplementary material in the publication.
Data from: PubMed's Core Clinical Journals Filter: Redesigned for...
figshare.com
txt
Updated Jul 12, 2023
Cite
Michele Klein-Fedyshin; Andrea M. Ketchum (2023). PubMed's Core Clinical Journals Filter: Redesigned for Contemporary Clinical Impact and Utility [Dataset]. http://doi.org/10.6084/m9.figshare.21979832.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.21979832.v1
Dataset updated
Jul 12, 2023
Dataset provided by
Figshare, http://figshare.com/
Authors
Michele Klein-Fedyshin; Andrea M. Ketchum
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Medical journal usage counts across 814 clinical locations in the U.S. and Canada from 2009 - 2015.
Data from: Searching for LINCS to Stress: Using Text Mining to Automate...
figshare.com
xlsx
Updated May 13, 2024
+ more versions
Cite
Bryant A. Chambers; Danilo Basili; Laura Word; Nancy Baker; Alistair Middleton; Richard S. Judson; Imran Shah (2024). Searching for LINCS to Stress: Using Text Mining to Automate Reference Chemical Curation [Dataset]. http://doi.org/10.1021/acs.chemrestox.3c00335.s008
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/acs.chemrestox.3c00335.s008
Dataset updated
May 13, 2024
Dataset provided by
ACS Publications
Authors
Bryant A. Chambers; Danilo Basili; Laura Word; Nancy Baker; Alistair Middleton; Richard S. Judson; Imran Shah
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Adaptive stress response pathways (SRPs) restore cellular homeostasis following perturbation but may activate terminal outcomes like apoptosis, autophagy, or cellular senescence if disruption exceeds critical thresholds. Because SRPs hold the key to vital cellular tipping points, they are targeted for therapeutic interventions and assessed as biomarkers of toxicity. Hence, we are developing a public database of chemicals that perturb SRPs to enable new data-driven tools to improve public health. Here, we report on the automated text-mining pipeline we used to build and curate the first version of this database. We started with 100 reference SRP chemicals gathered from published biomarker studies to bootstrap the database. Second, we used information retrieval to find co-occurrences of reference chemicals with SRP terms in PubMed abstracts and determined pairwise mutual information thresholds to filter biologically relevant relationships. Third, we applied these thresholds to find 1206 putative SRP perturbagens within thousands of substances in the Library of Integrated Network-Based Cellular Signatures (LINCS). To assign SRP activity to LINCS chemicals, domain experts had to manually review at least three publications for each of 1206 chemicals out of 181,805 total abstracts. To accomplish this efficiently, we implemented a machine learning approach to predict SRP classifications from texts to prioritize abstracts. In 5-fold cross-validation testing with a corpus derived from the 100 reference chemicals, artificial neural networks performed the best (F1-macro = 0.678) and prioritized 2479/181,805 abstracts for expert review, which resulted in 457 chemicals annotated with SRP activities. An independent analysis of enriched mechanisms of action and chemical use class supported the text-mined chemical associations (p < 0.05): heat shock inducers were linked with HSP90 and DNA damage inducers to topoisomerase inhibition. 
This database will enable novel applications of LINCS data to evaluate SRP activities and to further develop tools for biomedical information extraction from the literature.
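The mutual-information filtering step in the description above can be illustrated with a small sketch. It assumes pointwise mutual information (base 2) estimated from abstract counts, which is one common formulation; the description does not state the exact estimator or thresholds, and all counts, names, and the threshold below are hypothetical.

```python
import math


def pmi(n_xy, n_x, n_y, n_total):
    """Pointwise mutual information (base 2) estimated from document counts.

    n_xy: abstracts mentioning both the chemical and the SRP term
    n_x, n_y: abstracts mentioning the chemical / the term
    n_total: total number of abstracts considered
    """
    if n_xy == 0:
        return float("-inf")
    p_xy = n_xy / n_total
    p_x = n_x / n_total
    p_y = n_y / n_total
    return math.log2(p_xy / (p_x * p_y))


def filter_pairs(cooccurrence, chem_counts, term_counts, n_total, threshold):
    """Keep (chemical, term) pairs whose PMI is at or above the threshold."""
    kept = []
    for (chem, term), n_xy in cooccurrence.items():
        score = pmi(n_xy, chem_counts[chem], term_counts[term], n_total)
        if score >= threshold:
            kept.append((chem, term))
    return sorted(kept)


# Hypothetical toy counts, not taken from the dataset.
chem_counts = {"chem_A": 10, "chem_B": 100}
term_counts = {"heat shock": 20}
cooccurrence = {("chem_A", "heat shock"): 8, ("chem_B", "heat shock"): 2}
kept = filter_pairs(cooccurrence, chem_counts, term_counts, 1000, threshold=1.0)
print(kept)
```

In this toy example chem_B co-occurs with the term about as often as chance would predict (PMI near 0), so only the chem_A pair passes the threshold.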
STS Model of the PubMed Literature
figshare.com
zip
Updated May 30, 2023
Cite
Kevin Boyack; Caleb Smith; Richard Klavans (2023). STS Model of the PubMed Literature [Dataset]. http://doi.org/10.6084/m9.figshare.12743639.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.12743639.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare, http://figshare.com/
Authors
Kevin Boyack; Caleb Smith; Richard Klavans
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The PubMed model contains over 18 million PubMed documents (1996-2019) clustered into 28,743 clusters for use in research planning, portfolio analysis, systematic review, etc. This repository contains the PMID-to-cluster listing, an Excel workbook that characterizes each cluster with metadata and cluster-level indicators, and a Tableau workbook containing those same data plus a visual map and filters that can be used to explore the landscape and analyze cluster-level information. Model created by SciTech Strategies, Inc. Details can be found in the accompanying article published in Scientific Data at https://www.nature.com/articles/s41597-020-00749-y (or https://rdcu.be/ca4kv).
Uftir_curated Dataset
universe.roboflow.com
zip
Updated Sep 28, 2023
Cite
uFTIR Particles (2023). Uftir_curated Dataset [Dataset]. https://universe.roboflow.com/uftir-particles/uftir_curated/dataset/6
Explore at:
Available download formats: zip
Dataset updated
Sep 28, 2023
Dataset authored and provided by
uFTIR Particles
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Particle Polygons
Description
Micro-FTIR Filter Images for Particle Detection

This dataset consists of annotated images of filters containing particles. The primary objective of this dataset is to serve as training and validation data for developing a particle detection model using computer vision techniques. More specifically, this dataset can be used to train an image segmentation model that can be used with GEPARD (https://pubmed.ncbi.nlm.nih.gov/32436395/) in order to perform efficient particle detection and analysis using Micro-FTIR microscope.

Two kinds of samples are used in our case:

Normal filters, with a low amount of particles and a clear view of the filter

Saturated filters, where the particles cover almost all the filter

In the first case, particles were annotated easily, as they are clearly visible over the filter. In the second scenario, the most distinguishable particles in the image have been annotated.

Note

In the case of saturated filters, the correct method would be to collect a spectral image of the entire filter using an FPA detector or similar and then use tools (e.g. sIMPle) to analyse this image. However, in our scenario such a detector was not available, and a semi-random, operator-dependent method had to be used in order to select particles or points for scanning.
BIOGRID CURATED DATA FOR PUBLICATION: Human VPAC1 receptor selectivity...
thebiogrid.org
zip
Updated Oct 4, 2002
Cite
BioGRID Project (2002). BIOGRID CURATED DATA FOR PUBLICATION: Human VPAC1 receptor selectivity filter. Identification of a critical domain for restricting secretin binding. [Dataset]. https://thebiogrid.org/8250/publication/human-vpac1-receptor-selectivity-filter-identification-of-a-critical-domain-for-restricting-secretin-binding.html
Explore at:
Available download formats: zip
Dataset updated
Oct 4, 2002
Dataset authored and provided by
BioGRID Project
License
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Description
Protein-Protein, Genetic, and Chemical Interactions for Du K (2002):Human VPAC1 receptor selectivity filter. Identification of a critical domain for restricting secretin binding. curated by BioGRID (https://thebiogrid.org); ABSTRACT: The human VPAC1 receptor for vasoactive intestinal peptide (VIP) and pituitary adenylate cyclase activating peptide (PACAP) belongs to the class II family of G protein coupled receptors with seven transmembrane segments. It recognizes several VIP-related peptides and displays a very low affinity for secretin despite >70% homology between VIP and secretin. Conversely, the human secretin receptor has high affinity for secretin but low affinity for VIP. We took advantage of this reversed selectivity to identify a domain of the VPAC1 receptor responsible for selectivity toward secretin by constructing human VPAC1-secretin receptor chimeras. A first set of chimeras consisted of exchanging the entire N-terminal ectodomain or large parts of this domain. They were constructed by overlap PCR, transfected in COS-7 cells, and their ligand selectivity, expressed as the ratio of EC(50) for secretin/EC(50) for VIP (referred to as S/V), in stimulating cAMP production was measured. Two very informative chimeras respectively referred to as S144V and S123V were obtained by replacing the entire ectodomain or only the first 123 amino acids of the VPAC1 receptor by the corresponding sequences of the secretin receptor. Whereas S144V no longer discriminated between VIP and secretin (S/V = 1.2), S123V discriminated between the two peptides (S/V = 300) in the same manner as the wild-type VPAC1 receptor. The motif responsible for discrimination was determined by introducing small blocks or individual amino acids of secretin receptor in the 123-144 sequence of the S123V chimera. 
The data obtained from 14 new chimeras sustained that two nonadjacent pairs of amino acids, Gln(135) Thr(136) and Gly(140) Ser(141) in the C-terminal end of the N-terminal VPAC1 receptor ectodomain constitute a selective filter that strongly restricts access of secretin to the VPAC1 receptor.
dsir-pile-100k-filtered-for-pubmed-central
huggingface.co
+ more versions
Cite
Timaeus, dsir-pile-100k-filtered-for-pubmed-central [Dataset]. https://huggingface.co/datasets/timaeus/dsir-pile-100k-filtered-for-pubmed-central
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset authored and provided by
Timaeus
Description
timaeus/dsir-pile-100k-filtered-for-pubmed-central dataset hosted on Hugging Face and contributed by the HF Datasets community
Data for: comprehensive search filters for retrieving publications on...
service.tib.eu
Updated Nov 14, 2025
+ more versions
Cite
(2025). Data for: comprehensive search filters for retrieving publications on non-human primates for literature reviews (filternhp) - Vdataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/goe-doi-10-25625-utt4sn
Explore at:
Dataset updated
Nov 14, 2025
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This dataset supports filterNHP, an R package and web-based application for generating search filters to query scientific bibliographic sources (PubMed, PsycINFO, Web of Science) for non-human primate related publications. filterNHP can be found at: https://filterNHP.dpz.eu.
Data from: Citation network data sets for 'Oxytocin – a social peptide?...
nde-dev.biothings.io
Updated Jun 5, 2022
+ more versions
Cite
Leng, Rhodri Ivor (2022). Citation network data sets for 'Oxytocin – a social peptide? Deconstructing the evidence' [Dataset]. https://nde-dev.biothings.io/resources?id=zenodo_5578956
Explore at:
Dataset updated
Jun 5, 2022
Dataset authored and provided by
Leng, Rhodri Ivor
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction

This note describes the data sets used for all analyses contained in the manuscript ‘Oxytocin – a social peptide?’ [1], which is currently under review.

Data Collection

The data sets described here were originally retrieved from Web of Science (WoS) Core Collection via the University of Edinburgh’s library subscription [2]. The aim of the original study for which these data were gathered was to survey peer-reviewed primary studies on oxytocin and social behaviour. To capture relevant papers, we used the following query:

TI = (“oxytocin” OR “pitocin” OR “syntocinon”) AND TS = (“social*” OR “pro$social” OR “anti$social”)

The final search was performed on 13 September 2021. This returned a total of 2,747 records, of which 2,049 were classified by WoS as ‘articles’. Given our interest in primary studies only – articles reporting original data – we excluded all other document types. We further excluded all articles sub-classified as ‘book chapters’ or as ‘proceeding papers’ in order to limit our analysis to primary studies published in peer-reviewed academic journals. This reduced the set to 1,977 articles. All of these were published in the English language, and no further language refinements were necessary.

All available metadata on these 1,977 articles was exported as plain text ‘flat’ format files in four batches, which we later merged together via Notepad++. Upon manual examination, we discovered examples of papers classified as ‘articles’ by WoS that were, in fact, reviews. To further filter our results, we searched all available PMIDs in PubMed (1,903 had associated PMIDs - ~96% of set). We then filtered results to identify all records classified as ‘review’, ‘systematic review’, or ‘meta-analysis’, identifying 75 records [3]. After examining a sample and agreeing with the PubMed classification, we removed these from our dataset, leaving a total of 1,902 articles.

From these data, we constructed two datasets by parsing out relevant reference data via the Sci2 Tool [4]. First, we constructed a ‘node-attribute-list’ by linking unique reference strings (‘Cite Me As’ column in WoS data files) to unique identifiers; we then parsed into this dataset information on the identity of a paper, including the title of the article, all authors, journal of publication, year of publication, total citations as recorded from WoS, and WoS accession number. Second, we constructed an ‘edge-list’ that records the citations from a citing paper in the ‘Source’ column and identifies the cited paper in the ‘Target’ column, using the unique identifiers as described previously to link these data to the node-attribute-list.

We then constructed a network in which papers are nodes and citation links are directed edges between them. We used Gephi Version 0.9.2 [5] to manually clean these data by merging duplicate references that are caused by different reference formats or by referencing errors. To do this, we needed to retain all retrieved records (1,902) together with all of their references, whether or not these were included in our original search. In total, this produced a network of 46,633 nodes (unique reference strings) and 112,520 edges (citation links). Thus, the average reference list size of these articles is ~59 references. The mean indegree (within network citations) is 2.4 (median is 1) for the entire network, reflecting a great diversity in referencing choices among our 1,902 articles.

After merging duplicates, we restricted the network to include only articles fully retrieved (1,902), and retained only those that were connected together by citation links in a large interconnected network (i.e. the largest component). In total, 1,892 (99.5%) of our initial set were connected together via citation links, meaning a total of ten papers were removed from the following analysis – and these were neither connected to the largest component, nor did they form connections with one another (i.e. these were ‘isolates’).
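The restriction to the largest component can be sketched as below. This is an illustrative sketch only: the authors report doing this step in Gephi, and the code assumes that "connected" means weakly connected, i.e. citation direction is ignored when tracing components. The node and edge names are hypothetical.

```python
from collections import defaultdict, deque


def largest_component(nodes, edges):
    """Return the largest weakly connected component among `nodes`.

    Edges are (source, target) pairs; edges touching a node outside
    `nodes` (e.g. a reference that was not fully retrieved) are ignored.
    """
    node_set = set(nodes)
    adjacency = defaultdict(set)
    for source, target in edges:
        if source in node_set and target in node_set:
            adjacency[source].add(target)
            adjacency[target].add(source)  # ignore direction

    seen = set()
    best = set()
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        if len(component) > len(best):
            best = component
    return best


# Hypothetical toy network: "d" cites only a paper outside the retrieved
# set ("x"), and "e" has no links at all, so both end up as isolates.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("d", "x")]
largest = largest_component(nodes, edges)
isolates = sorted(set(nodes) - largest)
print(sorted(largest), isolates)
```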

This left us with a network of 1,892 nodes connected together by 26,019 edges. It is this network that is described by the ‘node-attribute-list’ and ‘edge-list’ provided here. This network has a mean in-degree of 13.76 (median in-degree of 4). By restricting our analysis in this way, we lose 44,741 unique references (96%) and 86,501 citations (77%) from the full network, but retain a set of articles tightly knitted together, all of which have been fully retrieved due to possessing certain terms related to oxytocin AND social behaviour in their title, abstract, or associated keywords.
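The counts quoted above can be cross-checked with simple arithmetic. The snippet below only re-derives figures already stated in the text (nodes and edges lost, their percentages, and the ~59 average reference list size); the variable names are illustrative.

```python
# Figures as reported in the description.
full_nodes, full_edges = 46_633, 112_520   # full network
kept_nodes, kept_edges = 1_892, 26_019     # largest component of retrieved articles
retrieved_articles = 1_902

lost_nodes = full_nodes - kept_nodes
lost_edges = full_edges - kept_edges

print(lost_nodes, round(100 * lost_nodes / full_nodes))  # unique references lost, %
print(lost_edges, round(100 * lost_edges / full_edges))  # citations lost, %
print(round(full_edges / retrieved_articles))            # average reference list size
```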

Before moving on, we calculated indegree for all nodes in this network – this counts the number of citations to a given paper from other papers within this network – and have included this in the node-attribute-list. We further clustered this network via modularity maximisation via the Leiden algorithm [6]. We set the algorithm to resolution 1, and allowed the algorithm to run over 100 iterations and 100 restarts. This gave Q=0.43 and identified seven clusters, which we describe in detail within the body of the paper. We have included cluster membership as an attribute in the node-attribute-list.

Data description

We include here two datasets: (i) ‘OTSOC-node-attribute-list.csv’ consists of the attributes of 1,892 primary articles retrieved from WoS that include terms indicating a focus on oxytocin and social behaviour; (ii) ‘OTSOC-edge-list.csv’ records the citations between these papers. Together, these can be imported into a range of different software for network analysis; however, we have formatted these for ease of upload into Gephi 0.9.2. Below, we detail their contents:

‘OTSOC-node-attribute-list.csv’ is a comma-separated values file that contains all node attributes for the citation network (n=1,892) analysed in the paper. The columns refer to:

Id, the unique identifier

Label, the reference string of the paper to which the attributes in this row correspond. This is taken from the ‘Cite Me As’ column from the original WoS download. The reference string is in the following format: last name of first author, publication year, journal, volume, start page, and DOI (if available).

Wos_id, unique Web of Science (WoS) accession number. These can be used to query WoS to find further data on all papers via the ‘UT= ’ field tag.

Title, paper title.

Authors, all named authors.

Journal, journal of publication.

Pub_year, year of publication.

Wos_citations, total number of citations recorded by WoS Core Collection to a given paper as of 13 September 2021

Indegree, the number of within network citations to a given paper, calculated for the network shown in Figure 1 of the manuscript.

Cluster, provides the cluster membership number as discussed within the manuscript (Figure 1). This was established via modularity maximisation via the Leiden algorithm (Res 1; Q=0.43|7 clusters)

‘OTSOC-edge-list.csv’ is a comma-separated values file that contains all citation links between the 1,892 articles (n=26,019). The columns refer to:

Source, the unique identifier of the citing paper.

Target, the unique identifier of the cited paper.

Type, edges are ‘Directed’, and this column tells Gephi to regard all edges as such.

Syr_date, this contains the date of publication of the citing paper.

Tyr_date, this contains the date of publication of the cited paper.
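As a rough illustration of how the Indegree attribute relates to the edge list, the sketch below counts within-network citations per cited paper. The column names Source, Target, and Type follow the description above; the exact header capitalisation in the real file and the toy identifiers are assumptions, and an in-memory string stands in for ‘OTSOC-edge-list.csv’.

```python
import csv
import io
from collections import Counter

# Hypothetical three-edge excerpt in the layout described above.
edge_csv = """Source,Target,Type
1,2,Directed
3,2,Directed
3,1,Directed
"""


def indegree_from_edges(handle):
    """Count how many times each paper appears as the cited (Target) paper."""
    counts = Counter()
    for row in csv.DictReader(handle):
        counts[row["Target"]] += 1
    return counts


indegree = indegree_from_edges(io.StringIO(edge_csv))
print(indegree["2"], indegree["1"], indegree["3"])
```

To run this against the actual file, the `io.StringIO` object would be replaced by an opened file handle; papers that are never cited within the network simply have a count of zero.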

Software recommended for analysis

Gephi version 0.9.2 was used for the visualisations within the manuscript, and both files can be read into Gephi without modification.

Notes

[1] Leng, G., Leng, R. I., Ludwig, M. (Submitted). Oxytocin – a social peptide? Deconstructing the evidence.

[2] Edinburgh University’s subscription to Web of Science covers the following databases: (i) Science Citation Index Expanded, 1900-present; (ii) Social Sciences Citation Index, 1900-present; (iii) Arts & Humanities Citation Index, 1975-present; (iv) Conference Proceedings Citation Index- Science, 1990-present; (v) Conference Proceedings Citation Index- Social Science & Humanities, 1990-present; (vi) Book Citation Index– Science, 2005-present; (vii) Book Citation Index– Social Sciences & Humanities, 2005-present; (viii) Emerging Sources Citation Index, 2015-present.

[3] For those interested, the following PMIDs were identified as ‘articles’ by WoS, but as ‘reviews’ by PubMed: ‘34502097’ ‘33400920’ ‘32060678’ ‘31925983’ ‘31734142’ ‘30496762’ ‘30253045’ ‘29660735’ ‘29518698’ ‘29065361’ ‘29048602’ ‘28867943’ ‘28586471’ ‘28301323’ ‘27974283’ ‘27626613’ ‘27603523’ ‘27603327’ ‘27513442’ ‘27273834’ ‘27071789’ ‘26940141’ ‘26932552’ ‘26895254’ ‘26869847’ ‘26788924’ ‘26581735’ ‘26548910’ ‘26317636’ ‘26121678’ ‘26094200’ ‘25997760’ ‘25631363’ ‘25526824’ ‘25446893’ ‘25153535’ ‘25092245’ ‘25086828’ ‘24946432’ ‘24637261’ ‘24588761’ ‘24508579’ ‘24486356’ ‘24462936’ ‘24239932’ ‘24239931’ ‘24231551’ ‘24216134’ ‘23955310’ ‘23856187’ ‘23686025’ ‘23589638’ ‘23575742’ ‘23469841’ ‘23055480’ ‘22981649’ ‘22406388’ ‘22373652’ ‘22141469’ ‘21960250’ ‘21881219’ ‘21802859’ ‘21714746’ ‘21618004’ ‘21150165’ ‘20435805’ ‘20173685’ ‘19840865’ ‘19546570’ ‘19309413’ ‘15288368’ ‘12359512’ ‘9401603’ ‘9213136’ ‘7630585’

[4] Sci2 Team. (2009). Science of Science (Sci2) Tool. Indiana University and SciTech Strategies. Stable URL: https://sci2.cns.iu.edu

[5] Bastian, M., Heymann, S., & Jacomy, M. (2009).
Data for "RegulaTome: a corpus of typed, directed, and signed relations...
resodate.org
Updated Apr 23, 2024
Cite
Katerina Nastou (2024). Data for "RegulaTome: a corpus of typed, directed, and signed relations between biomedical entities in the scientific literature" [Dataset]. https://resodate.org/resources/aHR0cHM6Ly96ZW5vZG8ub3JnL3JlY29yZHMvMTA4MDgzMzA=
Explore at:
Dataset updated
Apr 23, 2024
Dataset provided by
Zenodo
Authors
Katerina Nastou
Description
RegulaTome corpus: this file contains the RegulaTome corpus in BRAT format. The directory "splits" has the corpus split based on the train/dev/test used for the training of the relation extraction system.

RegulaTome annodoc: the annotation guidelines along with the annotation configuration files for BRAT are provided in annodoc+config.tar.gz. The online version of the annotation documentation can be found here: https://katnastou.github.io/regulatome-annodoc/

The tagger software can be found here: https://github.com/larsjuhljensen/tagger. The command used to run tagger before large-scale execution of the RE system is:

gzip -cd ls -1 pmc/*.en.merged.filtered.tsv.gz ls -1r pubmed/*.tsv.gz | cat dictionary/excluded_documents.txt - | tagger/tagcorpus --threads=16 --autodetect --types=dictionary/curated_types.tsv --entities=dictionary/all_entities.tsv --names=dictionary/all_names_textmining.tsv --groups=dictionary/all_groups.tsv --stopwords=dictionary/all_global.tsv --local-stopwords=dictionary/all_local.tsv --type-pairs=dictionary/all_type_pairs.tsv --out-matches=all_matches.tsv

Input documents for large-scale execution, which is done on entire PubMed (as of March 2024) and PMC Open Access (as of November 2023) articles in BioC format. The files are converted to a tab-delimited format to be compatible with the RE system input (see below).

Input dictionary files: all the files necessary to execute the command above are available in tagger_dictionary_files.tar.gz

Tagger output: we filter the results of the tagger run down to gene/protein hits, and documents with more than 1 hit (since we are doing relation extraction) before feeding it to our RE system. The filtered output is available in tagger_matches_ggp_only_gt_1_hit.tsv.gz

Relation extraction system input: combined_input_for_re.tar.gz: these are the directories with all the .ann and .txt files used as input for the large-scale execution of the relation extraction pipeline. The files are generated from the tagger tsv output (see above, tagger_matches_ggp_only_gt_1_hit.tsv.gz) using the tagger2standoff.py script from the string-db-tools repository.

Relation extraction models: the Transformer-based model used for large-scale relation extraction and prediction on the test set is at relation_extraction_multi-label-best_model.tar.gz. The pre-trained RoBERTa model on PubMed and PMC and MIMIC-III with a BPE Vocab learned from PubMed (RoBERTa-large-PM-M3-Voc), which is used by our system is available here.

Relation extraction system output: the tab-delimited outputs of the relation extraction system are found at large_scale_relation_extraction_results.tar.gz. ATTENTION: this file is approximately 1TB in size, so make sure you have enough space to download it on your machine. The relation extraction system output files have 86 columns: PMID, Entity BRAT ID1, Entity BRAT ID2, and scores per class produced by the relation extraction model. Each file has a header to denote which score is in which column.
2014-2019 Systematic Reviews and Meta-Analyses Data: Evidence-Based...
data.mendeley.com
Updated Jul 28, 2021
+ more versions
Cite
Toluwase Asubiaro (2021). 2014-2019 Systematic Reviews and Meta-Analyses Data: Evidence-Based Biomedical Publications in MEDLINE with authors from Sub-Saharan Africa [Dataset]. http://doi.org/10.17632/xkry6rjtjg.3
Explore at:
Unique identifier
https://doi.org/10.17632/xkry6rjtjg.3
Dataset updated
Jul 28, 2021
Authors
Toluwase Asubiaro
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Africa, Sub-Saharan Africa
Description
Bibliographic data of biomedical systematic reviews and meta-analysis studies published between 2014 and 2019, where at least one author is affiliated with an institution in Sub-Saharan Africa, was retrieved from MEDLINE via the PubMed search engine. All forty-six (46) countries in Sub-Saharan Africa were included in the search query as affiliation. The search strategy is described in four steps:

Step #1: Nigeria[Affiliation] OR South Africa[Affiliation] OR Ghana[Affiliation] OR Tanzania[Affiliation] OR Kenya[Affiliation] OR Rwanda[Affiliation] OR Botswana[Affiliation] OR Cameroun[Affiliation] OR Senegal[Affiliation] OR Angola[Affiliation] OR Uganda[Affiliation] OR Mali[Affiliation] OR Sierra Leone[Affiliation] OR Ivory Coast[Affiliation] OR Ethiopia[Affiliation] OR Lesotho[Affiliation] OR Zambia[Affiliation] OR Zimbabwe[Affiliation] OR Namibia[Affiliation] OR Guinea[Affiliation] OR Mauritius[Affiliation] OR Mozambique[Affiliation] OR Niger[Affiliation] OR Seychelles[Affiliation] OR Burkina Faso[Affiliation] OR Burundi[Affiliation] OR Cape Verde[Affiliation] OR Cameroon[Affiliation] OR Central African Republic[Affiliation] OR Chad[Affiliation] OR Comoros[Affiliation] OR Democratic Republic of Congo[Affiliation] OR DR Congo[Affiliation] OR Djibouti[Affiliation] OR Cote D'ivoire[Affiliation] OR Congo[Affiliation] OR Equatorial Guinea[Affiliation] OR Eritrea[Affiliation] OR Gabon[Affiliation] OR Guinea-Bissau[Affiliation] OR Madagascar[Affiliation] OR Congo Republic[Affiliation] OR Sao Tome and Principe[Affiliation] OR Swaziland[Affiliation] OR Togo[Affiliation] OR Benin[Affiliation] OR Liberia[Affiliation] OR Namibia[Affiliation] OR Gambia[Affiliation] OR (Cent Afr Republ[Affiliation]) OR (Equat Guinea[Affiliation]) OR (Papua N Guinea[Affiliation]) OR (Sao Tome E Prin[Affiliation]) OR Principe[Affiliation] OR Sao Tome E Principe[Affiliation]

Step #2: The filter was set to Meta-Analysis[ptyp] OR systematic[sb]

Step #3: Text word search systematic review[Text Word] OR meta-analysis[Text Word] OR meta analysis[Text Word]

Step #4: Set publication date to: "2014/01/01"[PDAT] : "2019/12/31"[PDAT]

The search, conducted on April 2nd, 2020, returned 3,171 results. The bibliographic data collected from PubMed were cleaned: duplicates were removed, and articles that were not meta-analyses or systematic reviews were excluded. MEDLINE is an authoritative, specialized database for indexing biomedical publications. Query: (Step #1) AND (Step #2 OR Step #3) AND (Step #4)
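The way the four steps combine into one query can be sketched in a few lines of Python. This is an illustration only: the country list is shortened to three entries, and the helper name `or_join` is hypothetical.

```python
def or_join(terms):
    """Join search terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


# Step 1: affiliation terms (shortened here; the full list is given in Step #1).
countries = ["Nigeria", "South Africa", "Ghana"]
step1 = or_join(f"{c}[Affiliation]" for c in countries)

# Step 2: publication-type and subset filters.
step2 = or_join(["Meta-Analysis[ptyp]", "systematic[sb]"])

# Step 3: text-word search.
step3 = or_join(
    ["systematic review[Text Word]", "meta-analysis[Text Word]", "meta analysis[Text Word]"]
)

# Step 4: publication date range.
step4 = '("2014/01/01"[PDAT] : "2019/12/31"[PDAT])'

# Query: (Step #1) AND (Step #2 OR Step #3) AND (Step #4)
query = f"{step1} AND ({step2} OR {step3}) AND {step4}"
```

Wrapping each step in its own parentheses keeps the AND/OR precedence explicit.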
Data from: NeuroScape
zenodo.org
zip
Updated Mar 6, 2025
Cite
Mario Senden (2025). NeuroScape [Dataset]. http://doi.org/10.5281/zenodo.14865161
Explore at:
zip (available download formats)
Unique identifier
https://doi.org/10.5281/zenodo.14865161
Dataset updated
Mar 6, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Mario Senden
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
NeuroScape: A Curated Dataset of Neuroscientific Articles from 1999 to 2023

Description

This dataset comprises a collection of neuroscientific articles published between January 1, 1999, and December 31, 2023. The compilation includes information on articles and research domain clusters in multiple formats, including CSV, GraphML, and HDF5.

Scope and Selection Criteria

Source Journals: The articles in this dataset were selectively retrieved from journals ranked in the first and second quartile (Q1 and Q2) in the field of neuroscience according to the SCImago Journal Rank. Additionally, articles from Q1 multidisciplinary journals such as Nature, Science, and PLOS One were included.

Search Methodology: PubMed searches were conducted for each year using the journal name and publication year as query terms. All articles returned from these searches were initially included.

Discipline Classification: A neural network classifier was employed to filter articles specifically related to neuroscience. Articles that did not meet the classifier's threshold were excluded.

Non-Exhaustiveness: This dataset does not encompass all neuroscientific articles published in the given period. Articles without abstracts or key metadata were omitted, and classification errors may have led to the exclusion of some relevant publications.

Changelog

Version 1.0.1 (Latest)

Fixed incorrect cluster citation graph: The previous version had an incorrect cluster_citation_density.graphml file. This has now been corrected.

Directory Structure

.
├── Code
│   ├── notebooks
│   │   ├── keyword_search.ipynb
│   │   ├── exploring_clusters.ipynb
│   │   ├── loading_article_shards.ipynb
│   │   ├── traversing_article_graph.ipynb
│   │   ├── discipline_classification.ipynb
│   │   └── from_generic_to_domain_embedding.ipynb
│   ├── requirements.txt
│   └── src
│       ├── data_types.py
│       └── utils.py
└── Data
    ├── CSV
    │   ├── neuroscience_articles_1999-2023.csv
    │   ├── neuroscience_clusters_1999-2023.csv
    │   └── neuroscience_dimensions_1999-2023.csv
    ├── Graphs
    │   ├── cluster_citation_density.graphml
    │   └── article_similarity.graphml
    ├── HDF5
    │   ├── DomainEmbeddings
    │   │   └── 2037 shard_#SHARD_ID.h5 files containing 200 articles
    │   └── VoyageAIEmbeddings
    │       ├── Large_02_Instruct
    │       │   └── 2037 shard_#SHARD_ID.h5 files containing 200 articles
    │       └── Lite_02_Instruct
    │           └── 2037 shard_#SHARD_ID.h5 files containing 200 articles
    └── Models
        ├── discipline_classification_model.pth
        └── domain_embedding_model.pth

Code

The Code folder contains minimal example code to help users get started with the dataset. It includes:

Jupyter Notebooks demonstrating how to work with the data through minimal usage examples.

Python Scripts with basic utilities for handling the dataset.

These examples provide a simple foundation for working with the dataset. More advanced analysis and demonstrations are covered in the accompanying publication.

CSV Files

Neuroscience Articles (neuroscience_articles_1999-2023.csv)

This file contains metadata on neuroscientific articles from 1999 to 2023.

Variables:

Pmid: PubMed ID (unique identifier).

Doi: Digital Object Identifier.

Type: Article type (Review or Research).

Title: Article title.

Year: Year of publication.

Month: Month of publication.

Age: Age of the article as of January 3, 2025.

Citations: Total number of citations.

Citation Rate: Citations divided by article age.

Cluster ID: The research cluster the article belongs to (neuroscience_clusters_1999-2023.csv).

Journal: The journal where the article was published.

Disciplines: Disciplines covered by the journal as classified by SCImago. The article does NOT necessarily qualify for all listed disciplines.

Abstract: The abstract of the article.
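The Citation Rate variable is defined above as citations divided by article age, which can be reproduced from the other two columns. The sketch below uses made-up rows and assumes the CSV header spells the columns exactly as in the variable list (`Pmid`, `Citations`, `Age`); the unit of Age is not stated in the description, so the result is in citations per whatever unit the file uses.

```python
import csv
import io


def citation_rate(citations, age):
    """Citations divided by article age; None when age is zero or missing."""
    if not age:
        return None
    return citations / age


# Made-up rows; column names are assumed to match the real CSV header.
sample = io.StringIO(
    "Pmid,Citations,Age\n"
    "111,30,6\n"
    "222,5,0\n"
)
rates = {
    row["Pmid"]: citation_rate(int(row["Citations"]), float(row["Age"]))
    for row in csv.DictReader(sample)
}
```

Returning `None` for a zero age is a design choice of this sketch to avoid a division error; the dataset itself may handle that case differently.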

Neuroscience Clusters (neuroscience_clusters_1999-2023.csv)

Clusters of related articles based on research themes.

Variables:

Cluster ID: Unique identifier for the cluster.

Title: Title of the research cluster.

Size: Number of articles in the cluster.

Year First Article: Year of the earliest article in the cluster.

MCR Research: Median citation rate for research articles.

MCR Review: Median citation rate for review articles.

Reference Krackhardt: Measure of internal vs. external references.

Citation Krackhardt: Measure of internal vs. external citations.

Most Cited Cluster: Cluster most frequently cited by articles in this cluster.

Most Citing Cluster: Cluster that cites this cluster the most.

Keywords: Keywords describing the cluster.

Description: A summary of the research in the cluster.

Focus: Whether the cluster is focused on content or methodology.

Most Similar Cluster: Cluster most semantically similar to this one.

Similarity: Cosine similarity score with the most similar cluster.

Distinguishing Features: Key features distinguishing the cluster from its similar cluster.

Open Questions: Outstanding research questions within the cluster.

Dimensions: Evaluation of dimensions including appliedness, modality, spatiotemporal scale, cognitive complexity, species focus, theoretical engagement, theory scope, methodological approach, and interdisciplinarity.

Trends: Emerging or declining trends between January 2021 and December 2023.
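The two Krackhardt variables are described only as measures of internal versus external links. A plausible reading is the standard Krackhardt E-I index, (E - I) / (E + I), sketched below. Whether the dataset uses exactly this formula and sign convention is an assumption, and the function name `ei_index` is hypothetical.

```python
def ei_index(internal, external):
    """Krackhardt E-I index: (E - I) / (E + I), ranging from -1 to +1.

    -1 means all links stay inside the cluster, +1 means all links leave it.
    """
    total = internal + external
    if total == 0:
        raise ValueError("E-I index is undefined without any links")
    return (external - internal) / total


# Made-up counts: 75 references inside the cluster, 25 to other clusters.
inward_looking = ei_index(internal=75, external=25)
```

Under this convention a negative value indicates a cluster that mostly references (or is cited by) its own articles.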

Neuroscience Dimensions (neuroscience_dimensions_1999-2023.csv)

Provides various research dimensions assessed for each cluster. Each dimension comes with specific binarized categories.

Key Variables:

Appliedness: Fundamental, translational, or clinical focus.

Modality: Auditory, visual, olfactory, gustatory, somatosensory.

Spatiotemporal Scale: Focus on molecular, cellular, system-level neuroscience.

Cognitive Complexity: Simple vs. complex cognitive processes.

Species: Human, non-human primate, rodent, etc.

Theory Engagement: Data-driven vs hypothesis-driven research.

Theory Scope: Scope of theoretical frameworks utilized by the cluster.

Methodological Approach: Experimental, observational, computational, meta-analytic.

Interdisciplinarity: Low to very high.

HDF5 Files

The HDF5 directory contains two sets of embeddings for the abstracts of articles. All folders contain 2037 HDF5 shard files, each holding about 200 articles (using a custom-defined article file type).

Article Datatypes:

pmid, doi, title, type, journal, year, age, citationcount, citationrate, abstract: Corresponds directly with the CSV data.

embedding: Text embedding of the article's abstract. There are two versions.

out_links: List of PubMed IDs for articles in the dataset that are cited by this article (references).

in_links: List of PubMed IDs for articles in the dataset that cite this article (citations).

Please note that abstracts of articles in the subfolders of HDF5/VoyageAIEmbeddings have been embedded using Voyage AI's voyage-lite-02-instruct and voyage-large-02-instruct models, respectively. Those in the folder HDF5/DomainEmbeddings are voyage-large-02-instruct embeddings that have subsequently been transformed into a domain-specific, lower-dimensional embedding using a custom neural network (domain_embedding_model.pth).
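The out_links and in_links fields describe the same citation edges from opposite ends, so one can be derived from the other. The sketch below works on a plain dictionary of made-up PubMed IDs rather than on the HDF5 shards, whose internal layout is not described here; `invert_links` is a hypothetical helper.

```python
from collections import defaultdict


def invert_links(out_links):
    """Build in_links (who cites me) from out_links (whom I cite)."""
    in_links = defaultdict(list)
    for citing_pmid, cited_pmids in out_links.items():
        for cited_pmid in cited_pmids:
            # Only keep links to articles that are themselves in the dataset.
            if cited_pmid in out_links:
                in_links[cited_pmid].append(citing_pmid)
    return {pmid: sorted(in_links[pmid]) for pmid in out_links}


# Made-up PubMed IDs for illustration.
out_links = {101: [102, 103], 102: [103], 103: []}
in_links = invert_links(out_links)
```

A check of this kind can also serve as a consistency test after loading shards: every entry in one field should have a counterpart in the other.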

Graph-Based Data

Article Similarity Graph (article_similarity.graphml)

A graph representation of article similarity based on cosine similarity between abstract embeddings (using the domain-specific embedding resulting from domain_embedding_model.pth).

Vertices: Each article is a node with pmid (PubMed ID) as an attribute.

Edges: The top 50 nearest neighbor articles (by cosine similarity) form
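The nearest-neighbour idea behind the edges can be sketched with a brute-force search over a handful of vectors. This is an illustration under stated assumptions, not the procedure used to build the graph: the embeddings are made up, `top_k_neighbours` is a hypothetical helper, and a pairwise loop like this would not scale to the full dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def top_k_neighbours(embeddings, k):
    """Map each article ID to its k most similar other articles, best first."""
    neighbours = {}
    for pmid, vec in embeddings.items():
        scored = [
            (cosine(vec, other_vec), other)
            for other, other_vec in embeddings.items()
            if other != pmid
        ]
        scored.sort(reverse=True)
        neighbours[pmid] = [other for _, other in scored[:k]]
    return neighbours


# Made-up 2-D embeddings; the description above uses k = 50.
embeddings = {1: [1.0, 0.0], 2: [0.9, 0.1], 3: [0.0, 1.0]}
edges = top_k_neighbours(embeddings, k=1)
```

Note that nearest-neighbour relations are not symmetric: article 3 points to article 2 here, but article 2 points to article 1.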
Database (PubMed): retracted publications of systematic reviews and...
figshare.com
zip
Updated Jun 6, 2023
Cite
Jorge H Ramirez (2023). Database (PubMed): retracted publications of systematic reviews and meta-analysis (1983 - 2013) [Dataset]. http://doi.org/10.6084/m9.figshare.1216653.v1
Explore at:
zip (available download formats)
Unique identifier
https://doi.org/10.6084/m9.figshare.1216653.v1
Dataset updated
Jun 6, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Jorge H Ramirez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
PubMed (search date: 24/10/2014) | Search query: "retracted publication"[Publication Type] | Filter: systematic reviews | 48 results. Google spreadsheet in the URL below.
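A search of this kind can also be expressed as a request URL for NCBI's E-utilities ESearch service. The sketch below only builds the URL and makes no network call. Two things are assumptions not found in the description: the use of E-utilities at all, and the mapping of the "systematic reviews" filter to the `systematic[sb]` subset tag. A search run today would not necessarily return the 48 results reported for 2014.

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (not mentioned in the description; added for illustration).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query term from the description; the filter is assumed to correspond to systematic[sb].
term = '"retracted publication"[Publication Type] AND systematic[sb]'

# retmax caps the number of returned IDs; retmode selects the response format.
params = {"db": "pubmed", "term": term, "retmax": 100, "retmode": "json"}
url = f"{ESEARCH}?{urlencode(params)}"
```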
BIOGRID CURATED DATA FOR PUBLICATION: Interaction Between SARS-CoV-2 Spike...
thebiogrid.org
zip
Updated Sep 1, 2024
Cite
BioGRID Project (2024). BIOGRID CURATED DATA FOR PUBLICATION: Interaction Between SARS-CoV-2 Spike Protein S1 Subunit and Oyster Heat Shock Protein 70. [Dataset]. https://thebiogrid.org/253343/publication/interaction-between-sars-cov-2-spike-protein-s1-subunit-and-oyster-heat-shock-protein-70.html
Explore at:
zip (available download formats)
Dataset updated
Sep 1, 2024
Dataset authored and provided by
BioGRID Project
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Protein-Protein, Genetic, and Chemical Interactions for Li J (2024): Interaction Between SARS-CoV-2 Spike Protein S1 Subunit and Oyster Heat Shock Protein 70. curated by BioGRID (https://thebiogrid.org); ABSTRACT: There is growing evidence that severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) contaminates the marine environment and is bioaccumulated in filter-feeding shellfish. Previous study shows the Pacific oyster tissues can bioaccumulate the SARS-CoV-2, and the oyster heat shock protein 70 (oHSP70) may play as the primary attachment receptor to bind SARS-CoV-2's recombinant spike protein S1 subunit (rS1). However, detailed information about the interaction between rS1 and oHSP70 is still unknown. In this study, we confirmed that the affinity of recombinant oHSP70 (roHSP70) for rS1 (KD = 20.4 nM) is comparable to the receptor-binding affinity of rACE2 for rS1 (KD = 16.7 nM) by surface plasmon resonance (SPR)-based Biacore and further validated by enzyme-linked immunosorbent assay (ELISA). Three truncated proteins (roHSP70-N/C/M) and five mutated proteins (p.I229del, p.D457del, p.V491_K495del, p.K556I, and p.?roHSP70) were constructed according to the molecular docking results. All three truncated proteins have significantly lower affinity for rS1 than the full-length roHSP70, indicating that all three segments of roHSP70 are involved in binding to rS1. Further, the results of SPR and ELISA showed that all five mutant proteins had significantly lower affinity for rS1 than roHSP70, suggesting that amino acids at these sites are involved in binding to rS1. This study provides a preliminary theoretical basis for the bioaccumulation of SARS-CoV-2 in oyster tissues or using roHSP70 as the capture unit to selectively enrich virus particles for detection.
BIOGRID CURATED DATA FOR PUBLICATION: Using lidocaine and benzocaine to link...
thebiogrid.org
zip
Updated Aug 10, 2009
Cite
BioGRID Project (2009). BIOGRID CURATED DATA FOR PUBLICATION: Using lidocaine and benzocaine to link sodium channel molecular conformations to state-dependent antiarrhythmic drug affinity. [Dataset]. https://thebiogrid.org/180062/publication/using-lidocaine-and-benzocaine-to-link-sodium-channel-molecular-conformations-to-state-dependent-antiarrhythmic-drug-affinity.html
Explore at:
zip (available download formats)
Dataset updated
Aug 10, 2009
Dataset authored and provided by
BioGRID Project
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Protein-Protein, Genetic, and Chemical Interactions for Hanck DA (2009): Using lidocaine and benzocaine to link sodium channel molecular conformations to state-dependent antiarrhythmic drug affinity. curated by BioGRID (https://thebiogrid.org); ABSTRACT: Lidocaine and other antiarrhythmic drugs bind in the inner pore of voltage-gated Na channels and affect gating use-dependently. A phenylalanine in domain IV, S6 (Phe1759 in Na(V)1.5), modeled to face the inner pore just below the selectivity filter, is critical in use-dependent drug block. Measurement of gating currents and concentration-dependent availability curves to determine the role of Phe1759 in coupling of drug binding to the gating changes. The measurements showed that replacement of Phe1759 with a nonaromatic residue permits clear separation of action of lidocaine and benzocaine into 2 components that can be related to channel conformations. One component represents the drug acting as a voltage-independent, low-affinity blocker of closed channels (designated as lipophilic block), and the second represents high-affinity, voltage-dependent block of open/inactivated channels linked to stabilization of the S4s in domains III and IV (designated as voltage-sensor inhibition) by Phe1759. A homology model for how lidocaine and benzocaine bind in the closed and open/inactivated channel conformation is proposed. These 2 components, lipophilic block and voltage-sensor inhibition, can explain the differences in estimates between tonic and open-state/inactivated-state affinities, and they identify how differences in affinity for the 2 binding conformations can control use-dependence, the hallmark of successful antiarrhythmic drugs.
Additional file 1 of Disclosing ambiguous gene aliases by automatic...
springernature.figshare.com
xls
Updated May 31, 2023
Cite
Roney S Coimbra; Dana E Vanderwall; Guilherme C Oliveira (2023). Additional file 1 of Disclosing ambiguous gene aliases by automatic literature profiling [Dataset]. http://doi.org/10.6084/m9.figshare.14438102.v1
Explore at:
xls (available download formats)
Unique identifier
https://doi.org/10.6084/m9.figshare.14438102.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Roney S Coimbra; Dana E Vanderwall; Guilherme C Oliveira
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 1: EntrezGene official symbols with PubMed abstracts and their aliases classified by the algorithm. Description of data: 73 randomly chosen official gene symbols that produced text corpora of PubMed abstracts and their aliases. Aliases were classified by the algorithm as “synonyms”, “ambiguous”, “aliases with PubMed abstract but not passing the filters”, or “aliases without PubMed abstracts”. (XLS 42 KB)

