Dataset Card for English quotes
I-Dataset Summary
english_quotes is a dataset of all the quotes retrieved from Goodreads quotes. This dataset can be used for multi-label text classification and text generation. The content of each quote is in English and concerns the domain of datasets for NLP and beyond.
II-Supported Tasks and Leaderboards
Multi-label text classification: The dataset can be used to train a model for text classification, which consists of… See the full description on the dataset page: https://huggingface.co/datasets/Abirate/english_quotes.
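As a rough illustration of what a multi-label setup involves, the sketch below turns per-quote tag lists into multi-hot target vectors. The records are placeholders, not rows from the dataset, and the `tags` field name and the helper names `build_label_index` and `to_multi_hot` are assumptions made for this example.

```python
from typing import Dict, List

# Placeholder records. The "tags" field name is an assumption; the
# quote and author strings are stand-ins, not real dataset rows.
records = [
    {"quote": "first placeholder quote", "author": "Author A", "tags": ["life", "humor"]},
    {"quote": "second placeholder quote", "author": "Author B", "tags": ["love"]},
    {"quote": "third placeholder quote", "author": "Author C", "tags": ["life", "love"]},
]


def build_label_index(rows: List[dict]) -> Dict[str, int]:
    """Map every distinct tag to a column index, in sorted order."""
    labels = sorted({tag for row in rows for tag in row["tags"]})
    return {label: i for i, label in enumerate(labels)}


def to_multi_hot(tags: List[str], label_index: Dict[str, int]) -> List[int]:
    """Encode a tag list as a 0/1 vector with one slot per known tag."""
    vector = [0] * len(label_index)
    for tag in tags:
        vector[label_index[tag]] = 1
    return vector


label_index = build_label_index(records)
targets = [to_multi_hot(row["tags"], label_index) for row in records]
```

Each quote can switch on several slots at once, which is what separates multi-label from single-label classification.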
adeo/english-quotes-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Vijay J0shi
Released under MIT
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a paraphrase of https://huggingface.co/datasets/Abirate/english_quotes using the google/gemma-2-2b-it model. The license follows the original dataset's Creative Commons Attribution 4.0 International License. Paraphrasing was conducted using text2dataset.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
There aren't any large, public datasets of quotes to be found online (at the time of writing), so I decided to create my own by parsing and cleaning up a Wikiquote data dump. To create your own dataset with different languages and cutoff lengths, check out my GitHub repository.
quotes-100-en.json
A JSON file containing English quotes shorter than 100 characters, scraped from Wikiquote.
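The length cutoff can be sketched as a simple filter. This is not the author's script: the layout of quotes-100-en.json is not shown here, so the example assumes a flat JSON array of strings and uses placeholder data, and `filter_by_length` is a hypothetical helper name.

```python
import json
from typing import List


def filter_by_length(quotes: List[str], max_chars: int = 100) -> List[str]:
    """Keep only quotes strictly shorter than max_chars characters."""
    return [quote for quote in quotes if len(quote) < max_chars]


# Placeholder input standing in for a parsed file. A flat array of
# strings is an assumption about the file layout.
sample_json = json.dumps(["short placeholder quote", "x" * 100, "y" * 99])
quotes = json.loads(sample_json)

short_quotes = filter_by_length(quotes)
```

Changing `max_chars` gives a different cutoff length, in the spirit of the configurable cutoffs mentioned above.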
Huge thanks to all the contributors to Wikiquote, and the Wikimedia Foundation.
Analysis and interpretation of quotes from important historical figures.
This dataset is the same as Abirate/english_quotes, but with the author names and quote text sanitized to avoid weird characters.

from ftfy import fix_encoding
from datasets import load_dataset

def correct_encoding(examples):
    quote = examples["quote"]
    author = examples["author"]
    # remove trailing comma from authors and fix encoding
    author = author.rstrip(",")
    author = fix_encoding(author)
    examples["author"] = author
    # fix encoding
    quote = fix_encoding(quote)
    …

See the full description on the dataset page: https://huggingface.co/datasets/tengomucho/english_quotes_sanitized.
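To show the kind of damage this cleanup targets without depending on ftfy, here is a standard-library sketch. It is a deliberately simpler technique than `fix_encoding`: it only undoes the common case of UTF-8 bytes misread as Windows-1252, and it leaves anything else untouched. The helper names `naive_fix_mojibake` and `sanitize_record` and the sample record are hypothetical.

```python
def naive_fix_mojibake(text: str) -> str:
    """Undo UTF-8 text that was wrongly decoded as Windows-1252.

    This is a rough stand-in for ftfy, not a replacement: if the
    round trip fails, the text is returned unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except UnicodeError:
        # Text was not cp1252-encodable, or the bytes were not valid
        # UTF-8, so it was probably not this kind of mojibake.
        return text


def sanitize_record(record: dict) -> dict:
    """Strip trailing commas from the author and repair both fields."""
    cleaned = dict(record)
    cleaned["author"] = naive_fix_mojibake(record["author"].rstrip(","))
    cleaned["quote"] = naive_fix_mojibake(record["quote"])
    return cleaned


# Placeholder record with a trailing comma and a mis-decoded "é".
sample = {"quote": "CafÃ© placeholder quote", "author": "Author A,"}
cleaned = sanitize_record(sample)
```

ftfy handles many more encoding mix-ups than this single round trip, so the library remains the better choice for real data.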
This dataset was created using the english-quotes dataset and the SQL Console: Query
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a translation of https://huggingface.co/datasets/Abirate/english_quotes into Japanese using the llm-jp/llm-jp-3-3.7b-instruct model. The license follows the original dataset's Creative Commons Attribution 4.0 International License. The translation was performed using text2dataset.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Used tahamajs/medicine_ds_persian for the .parquet file.
Used Alijafarixcs2/persian-it-llama2-2k for the .parquet file.
Used Abirate/english_quotes for the .jsonl file.
Several Markdown (.md) files have been added to the dataset. Language: English.
The biggest update is coming.
PyTorch update
Uploaded some images to the path pytorch_directml_cpu_optimized_low_end_pc.
17 months left. (The time was changed and postponed, sorry!)