https://choosealicense.com/licenses/unknown/
Dataset Card for "ag_news"
Dataset Summary
AG is a collection of more than 1 million news articles. The articles were gathered from more than 2,000 news sources by ComeToMyHead over more than 1 year of activity. ComeToMyHead is an academic news search engine that has been running since July 2004. The dataset is provided by the academic community for research purposes in data mining (clustering, classification, etc.), information retrieval (ranking, search, etc.)… See the full description on the dataset page: https://huggingface.co/datasets/wangrongsheng/ag_news.
SetFit/ag_news dataset hosted on Hugging Face and contributed by the HF Datasets community
AG is a collection of more than 1 million news articles. The articles were gathered from more than 2,000 news sources by ComeToMyHead over more than 1 year of activity. ComeToMyHead is an academic news search engine that has been running since July 2004. The dataset is provided by the academic community for research purposes in data mining (clustering, classification, etc.), information retrieval (ranking, search, etc.), XML, data compression, data streaming, and any other non-commercial activity. For more information, please refer to http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html .
The AG's news topic classification dataset is constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the dataset above. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).
The AG's news topic classification dataset is constructed by choosing the 4 largest classes from the original corpus. Each class contains 30,000 training samples and 1,900 testing samples, for a total of 120,000 training samples and 7,600 testing samples.
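The totals quoted above follow directly from the per-class figures. A minimal sketch of that arithmetic, using illustrative constant names that are not part of any dataset tooling:

```python
# Per-class figures quoted in the dataset description.
NUM_CLASSES = 4
TRAIN_PER_CLASS = 30_000
TEST_PER_CLASS = 1_900

# Totals are the per-class counts multiplied by the number of classes.
train_total = NUM_CLASSES * TRAIN_PER_CLASS
test_total = NUM_CLASSES * TEST_PER_CLASS

print(train_total)  # 120000
print(test_total)   # 7600
```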
To use this dataset:
import tensorflow_datasets as tfds
# Load the training split of the AG News subset.
ds = tfds.load('ag_news_subset', split='train')
# Print the first four examples.
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
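Once examples are loaded, a quick sanity check is to count how many fall into each class. The sketch below assumes each example is a mapping with an integer "label" field alongside "title" and "description", and assumes the label order shown in LABEL_NAMES; both are assumptions to verify against the dataset's own metadata. The records in `sample` are hypothetical stand-ins so the snippet runs without downloading anything.

```python
from collections import Counter
from typing import Iterable, Mapping

# Assumed label order (not stated in the text above); check the dataset metadata.
LABEL_NAMES = ["World", "Sports", "Business", "Sci/Tech"]


def count_labels(examples: Iterable[Mapping[str, object]]) -> Counter:
    """Count how many examples carry each integer label."""
    return Counter(int(ex["label"]) for ex in examples)


# Hypothetical stand-in records shaped like the assumed example format.
sample = [
    {"title": "t1", "description": "d1", "label": 0},
    {"title": "t2", "description": "d2", "label": 1},
    {"title": "t3", "description": "d3", "label": 0},
    {"title": "t4", "description": "d4", "label": 3},
]

counts = count_labels(sample)
for idx in sorted(counts):
    print(LABEL_NAMES[idx], counts[idx])
```

With the real data, passing an iterable of examples converted to plain Python or NumPy values (for instance via tfds.as_numpy) should work the same way, provided the field names match the assumption above.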
contemmcm/ag_news dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw result files used for tables and figures in Hubness Reduction Improves Sentence-BERT Semantic Spaces (DOI: coming)
For more info see: https://github.com/bemigini/hubness-reduction-sentence-bert
autoevaluate/autoeval-eval-ag_news-default-5b1609-64790145529 dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset was created by Pooja_S
Dataset Card for AutoTrain Evaluator
This repository contains model predictions generated by AutoTrain for the following task and dataset:
Task: Summarization
Model: AleBurzio/long-t5-base-govreport
Dataset: ag_news
Config: default
Split: test
To run new evaluation jobs, visit Hugging Face's automatic model evaluator.
Contributions
Thanks to @AdinaY for evaluating this model.
autoevaluate/autoeval-staging-eval-project-ag_news-22fb867e-11605544 dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/other/
This dataset was created using GPT-4o and other public datasets. We therefore follow the OpenAI API terms of use and the license of each public dataset.
Public datasets:
abisee/cnn_dailymail
fancyzhx/ag_news
JulesBelveze/tldr_news
HuggingFaceH4/instruction-dataset
Dataset Card for AutoTrain Evaluator
This repository contains model predictions generated by AutoTrain for the following task and dataset:
Task: Multi-class Text Classification
Model: nateraw/bert-base-uncased-ag-news
Dataset: ag_news
To run new evaluation jobs, visit Hugging Face's automatic evaluation service.
Contributions
Thanks to @abhishek for evaluating this model.