3 datasets found

hallmarks_of_cancer
huggingface.co
Updated Apr 4, 2023
Cite
BigScience Biomedical Datasets (2023). hallmarks_of_cancer [Dataset]. https://huggingface.co/datasets/bigbio/hallmarks_of_cancer
Dataset updated
Apr 4, 2023
Dataset authored and provided by
BigScience Biomedical Datasets
License
https://choosealicense.com/licenses/gpl-3.0/
Description
The Hallmarks of Cancer (HOC) Corpus consists of 1852 PubMed publication abstracts manually annotated by experts according to a taxonomy. The taxonomy consists of 37 classes in a hierarchy. Zero or more class labels are assigned to each sentence in the corpus. The labels are found under the "labels" directory, while the tokenized text can be found under the "text" directory. The filenames are the corresponding PubMed IDs (PMID).
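The directory layout described above suggests that each abstract can be matched to its annotations by filename. The following is a minimal sketch of that pairing, not the dataset authors' loader. It assumes a local copy of the corpus whose root contains the "text" and "labels" directories; the function name `pair_by_pmid` is hypothetical, and because the description does not state the file extension or the format inside each file, the sketch only matches files by PMID and does not parse them.

```python
from pathlib import Path


def pair_by_pmid(root):
    """Map each PMID to its (text file, labels file) pair.

    Assumption: `root` contains the "text" and "labels" directories, and each
    file's stem (its name without the extension) is a PubMed ID.
    """
    root = Path(root)
    # Index both directories by file stem, i.e. by PMID.
    text_files = {p.stem: p for p in (root / "text").iterdir() if p.is_file()}
    label_files = {p.stem: p for p in (root / "labels").iterdir() if p.is_file()}
    # Keep only PMIDs that have both a text file and a labels file.
    shared = sorted(text_files.keys() & label_files.keys())
    return {pmid: (text_files[pmid], label_files[pmid]) for pmid in shared}
```

Matching on the stem rather than the full filename keeps the sketch independent of whatever extension the corpus files actually use.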
HoC
huggingface.co
Updated Feb 22, 2023
Cite
yanis labrak (2023). HoC [Dataset]. https://huggingface.co/datasets/qanastek/HoC
Dataset updated
Feb 22, 2023
Authors
yanis labrak
Description
The Hallmarks of Cancer Corpus for text classification

The Hallmarks of Cancer (HOC) Corpus consists of 1852 PubMed publication abstracts manually annotated by experts according to a taxonomy. The taxonomy consists of 37 classes in a hierarchy. Zero or more class labels are assigned to each sentence in the corpus. The labels are found under the "labels" directory, while the tokenized text can be found under the "text" directory. The filenames are the corresponding PubMed IDs (PMID).

In addition to the HOC corpus, we also have the Cancer Hallmarks Analytics Tool, which classifies all of PubMed according to the HoC taxonomy.
HOC (Hallmarks of Cancer)
opendatalab.com
zip
Updated Jul 1, 2024
Cite
Karolinska Institutet (2024). HOC (Hallmarks of Cancer) [Dataset]. https://opendatalab.com/OpenDataLab/HOC
Available download formats: zip (3193597 bytes)
Dataset updated
Jul 1, 2024
Dataset provided by
University of Cambridge
Karolinska Institutet
License
https://choosealicense.com/licenses/gpl-3.0/
Description
The Hallmarks of Cancer (HOC) corpus consists of 1852 PubMed publication abstracts manually annotated by experts according to the Hallmarks of Cancer taxonomy. The taxonomy consists of 37 classes in a hierarchy. Zero or more class labels are assigned to each sentence in the corpus. The labels are found under the “labels” directory, while the tokenized text can be found under the “text” directory. The filenames are the corresponding PubMed IDs (PMID).
19 scholarly articles cite the bigbio/hallmarks_of_cancer dataset (per Google Scholar).
