100+ datasets found
  1. programming-jokes-dataset

    • huggingface.co
    Updated Feb 7, 2025
    + more versions
    Cite
    Rayhana Rafiai (2025). programming-jokes-dataset [Dataset]. https://huggingface.co/datasets/rayhanti/programming-jokes-dataset
    Dataset updated
    Feb 7, 2025
    Authors
    Rayhana Rafiai
    Description

    rayhanti/programming-jokes-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  2. short_jokes

    • huggingface.co
    Updated Feb 22, 2024
    + more versions
    Cite
    yuvraj sharma (2024). short_jokes [Dataset]. https://huggingface.co/datasets/ysharma/short_jokes
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 22, 2024
    Authors
    yuvraj sharma
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Context: Generating humor is a complex task in machine learning; it requires models to understand the deep semantic meaning of a joke in order to generate new ones. Such problems are difficult to solve for a number of reasons, one of which is the lack of a database that provides an extensive list of jokes. To address this, a large corpus of over 0.2 million jokes has been collected by scraping several websites containing funny and short jokes. You can visit the Github… See the full description on the dataset page: https://huggingface.co/datasets/ysharma/short_jokes.

  3. Jester (Jokes) Dataset

    • paperswithcode.com
    Cite
    Kenneth Y. Goldberg; Theresa Roeder; Dhruv Gupta; Chris Perkins, Jester (Jokes) Dataset [Dataset]. https://paperswithcode.com/dataset/jester
    Authors
    Kenneth Y. Goldberg; Theresa Roeder; Dhruv Gupta; Chris Perkins
    Description

    6.5 million anonymous ratings of jokes by users of the Jester Joke Recommender System.

  4. Corpus of daily jokes from the 24ur.com portal Šale24 1.0 - Dataset - B2FIND...

    • b2find.dkrz.de
    Updated Jan 15, 2025
    + more versions
    Cite
    (2025). Corpus of daily jokes from the 24ur.com portal Šale24 1.0 - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/c29f095d-fa29-59aa-b494-c85caa0622c4
    Dataset updated
    Jan 15, 2025
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This is a corpus of 1915 "jokes of the day" ("šala dneva") published by the Slovenian news portal 24ur.com. The jokes were scraped from the portal's archive on September 18th, 2024. The initial list is lightly curated: shorter texts found in the original collection were removed from the corpus, since they appear to be illustration captions without the accompanying illustrations.

    Readers of the news portal vote on the jokes with thumbs-up and thumbs-down buttons. The voting results are included as metadata with each joke. Several jokes have been published more than once. Each joke (distinguished by exact text match) is identified by a hash of its text and carries a list of voting results for every instance of its publication.

    The normalised_text field contains text with punctuation corrections. For now, this is limited to replacing '' (two consecutive apostrophes, U+0027) with " (a single straight/dumb/vertical quotation mark, U+0022). The former is consistently used in place of the latter in the original corpus.

    Based on the name ("Šala dneva", i.e. "Joke of the day") and the observed frequency of posting during September 2024, we assume each entry corresponds to one day, counting backwards from the day of data collection. Each voting event has an associated estimated publication date calculated under this assumption.

    The jokes are linguistically annotated with CLASSLA-Stanza (https://github.com/clarinsi/classla), using the models for standard Slovenian.

    The JSONL file contains entries representing individual jokes, each containing:
    - a hash of the original joke text used for duplicate identification (key: hash)
    - original scraped text (key: original_text)
    - normalised text (key: normalised_text)
    - linguistically annotated normalised text in CoNLL-U format (key: processed_text)
    - a list of vote objects containing joke vote metadata (key: votes)
      - votes for (key: votes.for)
      - votes against (key: votes.against)
      - estimated dates of joke publication and voting (key: estimated_date)
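
    The normalisation rule, the date-estimation assumption, and the record layout described above can be sketched in a few lines of Python. This is only an illustration, not the corpus authors' code: the sample record is made up, the placement of `for`, `against`, and `estimated_date` inside each vote object is an assumption drawn from the key list, and treating index 0 as the most recent entry is likewise assumed.

```python
import json
from datetime import date, timedelta

# Scrape date stated in the corpus description.
COLLECTION_DATE = date(2024, 9, 18)


def normalise_text(text: str) -> str:
    """Replace two consecutive apostrophes (U+0027) with one quotation mark (U+0022)."""
    return text.replace("''", '"')


def estimate_date(entry_index: int, collection_date: date = COLLECTION_DATE) -> date:
    """Assume one entry per day, counting backwards from the collection date.

    Index 0 is taken to be the most recent entry (an assumption of this sketch).
    """
    return collection_date - timedelta(days=entry_index)


def vote_totals(joke: dict) -> tuple[int, int]:
    """Sum votes for and against across every publication of a joke."""
    votes = joke.get("votes", [])
    return (sum(v["for"] for v in votes), sum(v["against"] for v in votes))


# A made-up JSONL line for illustration only; it is not taken from the corpus.
sample_line = json.dumps({
    "hash": "0000",
    "original_text": "''Primer,'' je rekel.",
    "votes": [
        {"for": 10, "against": 2, "estimated_date": "2024-09-18"},
        {"for": 4, "against": 1, "estimated_date": "2023-01-05"},
    ],
})

joke = json.loads(sample_line)
joke["normalised_text"] = normalise_text(joke["original_text"])

print(joke["normalised_text"])  # "Primer," je rekel.
print(vote_totals(joke))        # (14, 3)
print(estimate_date(3))         # 2024-09-15
```

    Reading the real file would follow the same pattern, calling `json.loads` on each line; the actual nesting of the vote keys should be checked against the data before relying on `vote_totals`.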

  5. Books called Best teenage jokes

    • workwithdata.com
    Updated Oct 8, 2024
    + more versions
    Cite
    Work With Data (2024). Books called Best teenage jokes [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Best+teenage+jokes
    Dataset updated
    Oct 8, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books and is filtered where the book is Best teenage jokes, featuring 7 columns including author, BNB id, book, book publisher, and ISBN. The preview is ordered by publication date (descending).

  6. Offense Classification Jokes

    • kaggle.com
    Updated Oct 10, 2024
    Cite
    Avneet Singh (2024). Offense Classification Jokes [Dataset]. https://www.kaggle.com/datasets/avneets2103/offense-classification-jokes/code
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 10, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Avneet Singh
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Avneet Singh

    Released under Apache 2.0

    Contents

  7. Aruba: Festive, carnival or other entertainment articles, including...

    • app.indexbox.io
    Updated Jan 4, 2025
    Cite
    IndexBox AI Platform (2025). Aruba: Festive, carnival or other entertainment articles, including conjuring tricks and novelty jokes 2007-2024 [Dataset]. https://app.indexbox.io/table/9505/533/
    Dataset updated
    Jan 4, 2025
    Dataset provided by
    IndexBox
    Authors
    IndexBox AI Platform
    License

    Attribution-NoDerivs 3.0 (CC BY-ND 3.0): https://creativecommons.org/licenses/by-nd/3.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2007 - Dec 31, 2024
    Area covered
    Aruba
    Description

    Statistics illustrating the consumption, production, prices, and trade of festive, carnival or other entertainment articles, including conjuring tricks and novelty jokes, in Aruba from 2007 to 2024.

  8. Subjects of Space facts & jokes book

    • workwithdata.com
    Updated May 22, 2024
    Cite
    Work With Data (2024). Subjects of Space facts & jokes book [Dataset]. https://www.workwithdata.com/datasets/book-subjects?f=1&fcol0=book&fop0=%3D&fval0=Space+facts+%26+jokes+book
    Dataset updated
    May 22, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book subjects and is filtered where the book is Space facts & jokes book, featuring 10 columns including authors, average publication date, book publishers, book subject, and books. The preview is ordered by number of books (descending).

  9. jokes-pizza

    • huggingface.co
    + more versions
    Cite
    jokes-pizza [Dataset]. https://huggingface.co/datasets/Ayush-Singh/jokes-pizza
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Authors
    Ayush Singh
    Description

    Ayush-Singh/jokes-pizza dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. joke_explaination

    • huggingface.co
    Updated Aug 19, 2023
    Cite
    joke_explaination [Dataset]. https://huggingface.co/datasets/theblackcat102/joke_explaination
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 19, 2023
    Authors
    theblackcat102
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for Dataset Name

      Dataset Summary

    Corpus for testing whether your LLM can explain a joke well. This is a rather small dataset; pointers to larger ones would be welcome.

      Languages

    English

      Dataset Structure

      Data Fields

    url: link to the explanation

    joke: the original joke

    explaination: the explanation of the joke

      Data Splits

    Since it's so small, there are no splits… See the full description on the dataset page: https://huggingface.co/datasets/theblackcat102/joke_explaination.

  11. adult-jokes.net - Historical whois Lookup

    • whoisdatacenter.com
    Cite
    AllHeart Web Inc, adult-jokes.net - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/index.php/domain/adult-jokes.net/
    Available download formats: csv
    Dataset authored and provided by
    AllHeart Web Inc
    License

    https://whoisdatacenter.com/index.php/terms-of-use/

    Time period covered
    Mar 15, 1985 - Feb 20, 2025
    Description

    Explore the historical Whois records related to adult-jokes.net (Domain). Get insights into ownership history and changes over time.

  12. jokes-new

    • huggingface.co
    Updated Jan 6, 2025
    + more versions
    Cite
    Ayush Singh (2025). jokes-new [Dataset]. https://huggingface.co/datasets/Ayush-Singh/jokes-new
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 6, 2025
    Authors
    Ayush Singh
    Description

    Ayush-Singh/jokes-new dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. Martinique: Festive, carnival or other entertainment articles, including...

    • app.indexbox.io
    Updated Mar 17, 2021
    Cite
    IndexBox AI Platform (2021). Martinique: Festive, carnival or other entertainment articles, including conjuring tricks and novelty jokes 2007-2024 [Dataset]. https://app.indexbox.io/table/9505/474/
    Dataset updated
    Mar 17, 2021
    Dataset provided by
    IndexBox
    Authors
    IndexBox AI Platform
    License

    Attribution-NoDerivs 3.0 (CC BY-ND 3.0): https://creativecommons.org/licenses/by-nd/3.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2007 - Dec 31, 2024
    Area covered
    Martinique
    Description

    Statistics illustrating the consumption, production, prices, and trade of festive, carnival or other entertainment articles, including conjuring tricks and novelty jokes, in Martinique from 2007 to 2024.

  14. Funny Story

    • search.datacite.org
    Updated May 13, 1938
    + more versions
    Cite
    Daniel Hishon; Brigid Browne; John Browne (1938). Funny Story [Dataset]. http://doi.org/10.7925/drs1.duchas_5180537
    Dataset updated
    May 13, 1938
    Dataset provided by
    DataCite (https://www.datacite.org/)
    National Folklore Collection, University College Dublin
    Authors
    Daniel Hishon; Brigid Browne; John Browne
    License

    http://n2t.net/ark:/87925/h1cc0xm5

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Story collected by Brigid Browne, a student at Dromina, Ráth Luirc school (Dromina, Co. Cork) from informant John Browne.

  15. Books called Really, really gross jokes, riddles, and tongue twisters

    • workwithdata.com
    Updated Jul 18, 2024
    + more versions
    Cite
    Work With Data (2024). Books called Really, really gross jokes, riddles, and tongue twisters [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Really%2C+really+gross+jokes%2C+riddles%2C+and+tongue+twisters
    Dataset updated
    Jul 18, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books and is filtered where the book is Really, really gross jokes, riddles, and tongue twisters, featuring 7 columns including author, BNB id, book, book publisher, and ISBN. The preview is ordered by publication date (descending).

  16. Funny Stories

    • search.datacite.org
    Updated 2017
    + more versions
    Cite
    Emerentia; Betty Feely (2017). Funny Stories [Dataset]. http://doi.org/10.7925/drs1.duchas_4650595
    Dataset updated
    2017
    Dataset provided by
    DataCite (https://www.datacite.org/)
    National Folklore Collection, University College Dublin
    Authors
    Emerentia; Betty Feely
    License

    http://n2t.net/ark:/87925/h1cc0xm5

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Story collected by Betty Feely, a student at An Clochar, Cara Droma Ruisc school (Carrick-on-Shannon, Co. Leitrim) (no informant identified).

  17. During the fight for Independence in this country ...

    • search.datacite.org
    Updated 2017
    Cite
    Ml. Ó Gamhna; John Leslie (2017). During the fight for Independence in this country ... [Dataset]. http://doi.org/10.7925/drs1.duchas_5121691
    Dataset updated
    2017
    Dataset provided by
    DataCite (https://www.datacite.org/)
    National Folklore Collection, University College Dublin
    Authors
    Ml. Ó Gamhna; John Leslie
    License

    http://n2t.net/ark:/87925/h1cc0xm5

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Story collected by a student at Lismacaffry school (Lismacaffry, Co. Westmeath) from informant John Leslie.

  18. alpaca-bulgarian-jokes

    • huggingface.co
    Updated Nov 4, 2024
    + more versions
    Cite
    Nikola (2024). alpaca-bulgarian-jokes [Dataset]. https://huggingface.co/datasets/vislupus/alpaca-bulgarian-jokes
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 4, 2024
    Authors
    Nikola
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Area covered
    Bulgaria
    Description

    Bulgarian Jokes Dataset

      Overview
    

    The Bulgarian Jokes Dataset is a collection of Bulgarian-language jokes gathered and prepared for use in training and fine-tuning natural language processing (NLP) models. This dataset is designed to help researchers and developers build models capable of understanding and generating humorous content in Bulgarian.

      Dataset Structure
    

    The dataset is structured in a format suitable for NLP training and fine-tuning tasks… See the full description on the dataset page: https://huggingface.co/datasets/vislupus/alpaca-bulgarian-jokes.

  19. jokes-plus.com - Historical whois Lookup

    • whoisdatacenter.com
    + more versions
    Cite
    AllHeart Web Inc, jokes-plus.com - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/jokes-plus.com/
    Available download formats: csv
    Dataset authored and provided by
    AllHeart Web Inc
    License

    https://whoisdatacenter.com/terms-of-use/

    Time period covered
    Mar 15, 1985 - Mar 26, 2025
    Description

    Explore the historical Whois records related to jokes-plus.com (Domain). Get insights into ownership history and changes over time.

  20. Silly jokes

    • workwithdata.com
    Updated May 3, 2024
    Cite
    Work With Data (2024). Silly jokes [Dataset]. https://www.workwithdata.com/object/silly-jokes-book-by-claire-fletcher-0000
    Dataset updated
    May 3, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Silly jokes is a book. It was written by Claire Fletcher and published by Helen Exley Gift books in 2010.
