3 datasets found
  1. Yahoo! Answers Topic Classification

    • kaggle.com
    zip
    Updated Jun 30, 2022
    Cite
    Bhavik Ardeshna (2022). Yahoo! Answers Topic Classification [Dataset]. https://www.kaggle.com/datasets/bhavikardeshna/yahoo-email-classification
    Available download formats: zip (324007831 bytes)
    Dataset updated: Jun 30, 2022
    Authors
    Bhavik Ardeshna
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Yahoo! Answers topic classification dataset is constructed using the 10 largest main categories. Each class contains 140,000 training samples and 6,000 testing samples, so the dataset has 1,400,000 training samples and 60,000 testing samples in total. From all the answers and other meta-information, we used only the best answer content and the main category information.

    • Society & Culture
    • Science & Mathematics
    • Health
    • Education & Reference
    • Computers & Internet
    • Sports
    • Business & Finance
    • Entertainment & Music
    • Family & Relationships
    • Politics & Government

    The Yahoo! Answers topic classification dataset was constructed by Xiang Zhang (xiang.zhang@nyu.edu) from the above dataset. It is used as a text classification benchmark in the following paper: Xiang Zhang, Junbo Zhao, and Yann LeCun. Character-level Convolutional Networks for Text Classification. Advances in Neural Information Processing Systems 28 (NIPS 2015).
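
The split sizes quoted in the description follow directly from the per-class counts. The sketch below is only an illustration of that arithmetic, not part of the dataset itself. It assumes that labels are integers 1 to 10 in the order the categories are listed above, which the description does not state, and the helper names `split_totals` and `is_balanced` are hypothetical.

```python
from collections import Counter

# Category names in the order the description lists them. Treating this
# order as the label order (1..10) is an assumption, not stated in the text.
CATEGORIES = [
    "Society & Culture",
    "Science & Mathematics",
    "Health",
    "Education & Reference",
    "Computers & Internet",
    "Sports",
    "Business & Finance",
    "Entertainment & Music",
    "Family & Relationships",
    "Politics & Government",
]

# Per-class sample counts as given in the description.
TRAIN_PER_CLASS = 140_000
TEST_PER_CLASS = 6_000


def split_totals(num_classes=len(CATEGORIES)):
    """Return (train_total, test_total) for a class-balanced split."""
    return num_classes * TRAIN_PER_CLASS, num_classes * TEST_PER_CLASS


def is_balanced(labels, expected_per_class):
    """Check that each label 1..10 occurs exactly expected_per_class times
    and that no label outside that range appears."""
    counts = Counter(labels)
    valid = range(1, len(CATEGORIES) + 1)
    no_unknown = all(label in valid for label in counts)
    return no_unknown and all(counts[i] == expected_per_class for i in valid)


train_total, test_total = split_totals()
print(train_total, test_total)  # 1400000 60000
```

With a downloaded copy, `is_balanced(train_labels, TRAIN_PER_CLASS)` could serve as a quick sanity check that the file matches the description, provided the labels really are encoded as assumed here.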

  2. yahoo-answers-qa

    • opendatalab.com
    zip
    Updated Jan 10, 2024
    Cite
    (2024). yahoo-answers-qa [Dataset]. https://opendatalab.com/OpenDataLab/yahoo-answers-qa
    Available download formats: zip
    Dataset updated: Jan 10, 2024
    Description

    The Yahoo! Answers topic classification dataset is constructed using the 10 largest main categories. Each class contains 140,000 training samples and 6,000 testing samples, so the dataset has 1,400,000 training samples and 60,000 testing samples in total. From all the answers and other meta-information, we used only the best answer content and the main category information.

  3. yahoo-answers-topics

    • opendatalab.com
    zip
    Updated Jan 9, 2024
    Cite
    (2024). yahoo-answers-topics [Dataset]. https://opendatalab.com/OpenDataLab/yahoo-answers-topics
    Available download formats: zip
    Dataset updated: Jan 9, 2024
    Description

    The Yahoo! Answers topic classification dataset is constructed using the 10 largest main categories. Each class contains 140,000 training samples and 6,000 testing samples, so the dataset has 1,400,000 training samples and 60,000 testing samples in total. From all the answers and other meta-information, we used only the best answer content and the main category information.



Yahoo! Answers Topic Classification

The Yahoo! dataset is constructed using the 10 largest main categories.

50 scholarly articles cite this dataset.
