100+ datasets found
  1. twt-kaggle-data

    • huggingface.co
    Updated Dec 8, 2023
    Cite
    megha manoj (2023). twt-kaggle-data [Dataset]. https://huggingface.co/datasets/mochi-skz/twt-kaggle-data
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 8, 2023
    Authors
    megha manoj
    Description

    mochi-skz/twt-kaggle-data dataset hosted on Hugging Face and contributed by the HF Datasets community

  2. test-dataset-kaggle

    • huggingface.co
    Updated Feb 15, 2024
    Cite
    Gholamreza Dar (2024). test-dataset-kaggle [Dataset]. https://huggingface.co/datasets/Gholamreza/test-dataset-kaggle
    Dataset updated
    Feb 15, 2024
    Authors
    Gholamreza Dar
    Description

    Gholamreza/test-dataset-kaggle dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. data

    • huggingface.co
    Updated Jul 27, 2025
    + more versions
    Cite
    Kaggle MAP (2025). data [Dataset]. https://huggingface.co/datasets/kaggle-map/data
    Dataset updated
    Jul 27, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kaggle MAP
    Description

    kaggle-map/data dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. TLC Taxi Zone

    • kaggle.com
    Updated Jun 25, 2024
    Cite
    Mohammad Reza Mashoufi (2024). TLC Taxi Zone [Dataset]. https://www.kaggle.com/mohammadrezamashoufi/tlc-taxi-zone/discussion
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mohammad Reza Mashoufi
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Mohammad Reza Mashoufi

    Released under MIT

    Contents

  5. plant-kaggle-seg-data

    • huggingface.co
    Updated May 29, 2024
    + more versions
    Cite
    Jung (2024). plant-kaggle-seg-data [Dataset]. https://huggingface.co/datasets/Juliekyungyoon/plant-kaggle-seg-data
    Dataset updated
    May 29, 2024
    Authors
    Jung
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Juliekyungyoon/plant-kaggle-seg-data dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. video example for ppe red zone

    • kaggle.com
    Updated Sep 10, 2023
    Cite
    HinePo (2023). video example for ppe red zone [Dataset]. https://www.kaggle.com/datasets/hinepo/video-example-for-ppe-red-zone/suggestions
    Dataset updated
    Sep 10, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    HinePo
    Description

    Dataset

    This dataset was created by HinePo

    Contents

  7. kaggle-toxic-annotated

    • huggingface.co
    Cite
    Thomas capelle, kaggle-toxic-annotated [Dataset]. https://huggingface.co/datasets/tcapelle/kaggle-toxic-annotated
    Authors
    Thomas capelle
    Description

    Kaggle toxic dataset annotated with gpt-4o-mini with the same prompt used to annotate Toxic-Commons Celadon

  8. nyc_taxi_zones_shape

    • kaggle.com
    Updated Jun 18, 2024
    + more versions
    Cite
    ahmadreza rostamani (2024). nyc_taxi_zones_shape [Dataset]. https://www.kaggle.com/ahmadrezarostamani/nyc-taxi-zones-shape/discussion
    Dataset updated
    Jun 18, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ahmadreza rostamani
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Area covered
    New York
    Description

    Dataset

    This dataset was created by ahmadreza rostamani

    Released under Apache 2.0

    Contents

  9. kaggle-entity-annotated-corpus-ner-dataset

    • huggingface.co
    Updated Jul 10, 2022
    Cite
    Rafael Arias Calles (2022). kaggle-entity-annotated-corpus-ner-dataset [Dataset]. https://huggingface.co/datasets/rjac/kaggle-entity-annotated-corpus-ner-dataset
    Dataset updated
    Jul 10, 2022
    Authors
    Rafael Arias Calles
    License

    https://choosealicense.com/licenses/odbl/

    Description

    Date: 2022-07-10
    Files: ner_dataset.csv
    Source: Kaggle entity annotated corpus
    Notes: The dataset only contains the tokens and NER tag labels. Labels are uppercase.

      About Dataset

    from Kaggle Datasets

      Context

    Annotated Corpus for Named Entity Recognition using the GMB (Groningen Meaning Bank) corpus for entity classification, with enhanced and popular features by Natural Language Processing applied to the data set. Tip: Use Pandas Dataframe to load dataset if using Python for… See the full description on the dataset page: https://huggingface.co/datasets/rjac/kaggle-entity-annotated-corpus-ner-dataset.

  10. kaggle-mbti

    • huggingface.co
    Updated Jul 24, 2024
    Cite
    Jing Jie Tan (2024). kaggle-mbti [Dataset]. http://doi.org/10.57967/hf/3955
    Dataset updated
    Jul 24, 2024
    Authors
    Jing Jie Tan
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Personality Dataset

    Essays: https://huggingface.co/datasets/jingjietan/essays-big5
    MBTI: https://huggingface.co/datasets/jingjietan/kaggle-mbti
    Pandora: https://huggingface.co/datasets/jingjietan/pandora-big5

    Please contact jingjietan.com for another dataset.

    Cite:

    @software{jingjietan-apr-dataset,
      author = {Jing Jie, Tan},
      title = {Personality Kaggle Dataset Splitting},
      url = {https://huggingface.co/datasets/jingjietan/kaggle-mbti},
      version = {1.0.0},
      year = {2024}
    }

  11. puerto_OSM_data

    • kaggle.com
    Updated Mar 22, 2020
    Cite
    Inhoi (2020). puerto_OSM_data [Dataset]. https://www.kaggle.com/inhoii/puerto-road/metadata
    Dataset updated
    Mar 22, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Inhoi
    Description

    Dataset

    This dataset was created by Inhoi

    Contents

  12. issues-kaggle-notebooks

    • huggingface.co
    Updated Aug 12, 2025
    Cite
    Hugging Face Smol Models Research (2025). issues-kaggle-notebooks [Dataset]. https://huggingface.co/datasets/HuggingFaceTB/issues-kaggle-notebooks
    Dataset updated
    Aug 12, 2025
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Hugging Face Smol Models Research
    Description

    GitHub Issues & Kaggle Notebooks

      Description
    

    GitHub Issues & Kaggle Notebooks is a collection of two code datasets intended for language model training, sourced from GitHub issues and from notebooks on the Kaggle platform. These datasets are a modified part of the StarCoder2 model training corpus, specifically the bigcode/StarCoder2-Extras dataset. We reformat the samples to remove StarCoder2's special tokens and use natural text to delimit comments in issues and display… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceTB/issues-kaggle-notebooks.

  13. Online Sales Dataset - Popular Marketplace Data

    • kaggle.com
    Updated May 25, 2024
    Cite
    ShreyanshVerma27 (2024). Online Sales Dataset - Popular Marketplace Data [Dataset]. https://www.kaggle.com/datasets/shreyanshverma27/online-sales-dataset-popular-marketplace-data
    Dataset updated
    May 25, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ShreyanshVerma27
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides a comprehensive overview of online sales transactions across different product categories. Each row represents a single transaction with detailed information such as the order ID, date, category, product name, quantity sold, unit price, total price, region, and payment method.

    Columns:

    • Order ID: Unique identifier for each sales order.
    • Date: Date of the sales transaction.
    • Category: Broad category of the product sold (e.g., Electronics, Home Appliances, Clothing, Books, Beauty Products, Sports).
    • Product Name: Specific name or model of the product sold.
    • Quantity: Number of units of the product sold in the transaction.
    • Unit Price: Price of one unit of the product.
    • Total Price: Total revenue generated from the sales transaction (Quantity * Unit Price).
    • Region: Geographic region where the transaction occurred (e.g., North America, Europe, Asia).
    • Payment Method: Method used for payment (e.g., Credit Card, PayPal, Debit Card).

    Insights:

    1. Analyze sales trends over time to identify seasonal patterns or growth opportunities.
    2. Explore the popularity of different product categories across regions.
    3. Investigate the impact of payment methods on sales volume or revenue.
    4. Identify top-selling products within each category to optimize inventory and marketing strategies.
    5. Evaluate the performance of specific products or categories in different regions to tailor marketing campaigns accordingly.
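
The column definitions above imply a simple consistency check (Total Price should equal Quantity * Unit Price) and a few straightforward aggregations. The following is a minimal sketch using only Python's standard library. It assumes the data is a CSV whose headers match the column names listed above; the sample rows and the helper names (`load_rows`, `inconsistent_orders`, `revenue_by`) are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; headers are assumed to match the listed column names.
SAMPLE_CSV = """Order ID,Date,Category,Product Name,Quantity,Unit Price,Total Price,Region,Payment Method
1,2024-01-01,Electronics,Example Phone,2,500.00,1000.00,North America,Credit Card
2,2024-01-02,Books,Example Novel,3,10.00,30.00,Europe,PayPal
3,2024-01-03,Electronics,Example Laptop,1,1200.00,1200.00,Asia,Debit Card
"""


def load_rows(text):
    """Parse CSV text into dicts, converting the numeric columns."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["Quantity"] = int(row["Quantity"])
        row["Unit Price"] = float(row["Unit Price"])
        row["Total Price"] = float(row["Total Price"])
        rows.append(row)
    return rows


def inconsistent_orders(rows, tolerance=0.01):
    """Return Order IDs where Total Price differs from Quantity * Unit Price."""
    return [
        r["Order ID"]
        for r in rows
        if abs(r["Quantity"] * r["Unit Price"] - r["Total Price"]) > tolerance
    ]


def revenue_by(rows, column):
    """Sum Total Price grouped by a column such as Category or Region."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[column]] += r["Total Price"]
    return dict(totals)


rows = load_rows(SAMPLE_CSV)
bad = inconsistent_orders(rows)
by_category = revenue_by(rows, "Category")
by_region = revenue_by(rows, "Region")
```

Grouping by "Payment Method" or by the month of "Date" with the same `revenue_by` pattern would cover the other questions in the Insights list; the real file may need different parsing if its headers or number formats differ from the ones assumed here.
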
  14. kinyarwada-kaggle

    • huggingface.co
    Updated Jul 3, 2025
    Cite
    Babs Technologies (2025). kinyarwada-kaggle [Dataset]. https://huggingface.co/datasets/babs/kinyarwada-kaggle
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Babs Technologies
    Description

    babs/kinyarwada-kaggle dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. kaggle

    • huggingface.co
    Updated Feb 2, 2024
    Cite
    Ahmad Khan (2024). kaggle [Dataset]. https://huggingface.co/datasets/ahmadkhan1022/kaggle
    Dataset updated
    Feb 2, 2024
    Dataset authored and provided by
    Ahmad Khan
    License

    https://choosealicense.com/licenses/pddl/

    Description

    Dataset Card for MergedDataset

      Dataset Summary

      Supported Tasks and Leaderboards

    [More Information Needed]

      Languages

    [More Information Needed]

      Dataset Structure

      Data Instances

    [More Information Needed]

      Data Fields

    [More Information Needed]

      Data Splits

    [More Information Needed]

      Dataset Creation

      Curation Rationale

    [More Information Needed]

      Source Data

      Initial Data… See the full description on the dataset page: https://huggingface.co/datasets/ahmadkhan1022/kaggle.
    
  16. Kaggle-Mental-Health-Survey-Data

    • huggingface.co
    Updated Jul 21, 2024
    + more versions
    Cite
    shanti flagg (2024). Kaggle-Mental-Health-Survey-Data [Dataset]. https://huggingface.co/datasets/sflagg/Kaggle-Mental-Health-Survey-Data
    Dataset updated
    Jul 21, 2024
    Authors
    shanti flagg
    Description

    sflagg/Kaggle-Mental-Health-Survey-Data dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. kaggle-native

    • huggingface.co
    Updated Nov 4, 2024
    + more versions
    Cite
    Gayani Nanayakkara (2024). kaggle-native [Dataset]. https://huggingface.co/datasets/gayanin/kaggle-native
    Dataset updated
    Nov 4, 2024
    Authors
    Gayani Nanayakkara
    Description

    gayanin/kaggle-native dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. World_Timezone_dataset

    • kaggle.com
    zip
    Updated Mar 18, 2021
    Cite
    Adnan Dodmani (2021). World_Timezone_dataset [Dataset]. https://www.kaggle.com/adnandodmani/world-timezone-dataset
    Available download formats: zip (1648 bytes)
    Dataset updated
    Mar 18, 2021
    Authors
    Adnan Dodmani
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    World
    Description

    Context

    The dataset contains the time zones for 90 countries.

    Content

    Column description:

    Source Timezone - Contains information such as the name of the country, its time zone, and associated information. We will be working mostly with this column.

    Acknowledgements

    1. Before solving the problem, make sure you explore other libraries for analysis. Conventional libraries will not help.

    Inspiration

    I have uploaded this dataset because we often run into problems when working with datetimes and time zones. Please make use of this dataset and learn from it, and share it with your peers to grow our community.
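
As a rough illustration of the datetime and time zone handling mentioned above, the sketch below uses only Python's standard library. The `Country, UTC+HH:MM` string format, the sample value, and the helper names (`parse_entry`, `convert`) are assumptions for illustration; the actual layout of the Source Timezone column is not described here and may differ.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed pattern for an entry such as "India, UTC+05:30" (hypothetical format).
ENTRY = re.compile(
    r"^(?P<country>.+?),\s*UTC(?P<sign>[+-])(?P<h>\d{2}):(?P<m>\d{2})$"
)


def parse_entry(text):
    """Split an assumed 'Country, UTC+HH:MM' string into (country, tzinfo)."""
    match = ENTRY.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {text!r}")
    offset = timedelta(hours=int(match["h"]), minutes=int(match["m"]))
    if match["sign"] == "-":
        offset = -offset
    return match["country"], timezone(offset)


def convert(moment, target):
    """Convert a timezone-aware datetime to the target fixed-offset zone."""
    return moment.astimezone(target)


country, tz = parse_entry("India, UTC+05:30")
noon_utc = datetime(2021, 3, 18, 12, 0, tzinfo=timezone.utc)
local = convert(noon_utc, tz)
```

Fixed offsets like these ignore daylight saving time. For named zones with DST rules, the standard-library `zoneinfo` module (Python 3.9 and later) is one option, provided time zone data is available on the system.
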

  19. kaggle-native-v8-vocab-noised

    • huggingface.co
    Updated Sep 23, 2024
    + more versions
    Cite
    Gayani Nanayakkara (2024). kaggle-native-v8-vocab-noised [Dataset]. https://huggingface.co/datasets/gayanin/kaggle-native-v8-vocab-noised
    Dataset updated
    Sep 23, 2024
    Authors
    Gayani Nanayakkara
    Description

    gayanin/kaggle-native-v8-vocab-noised dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. aime_filtered

    • huggingface.co
    Updated Aug 29, 2024
    + more versions
    Cite
    Kaggle winners (2024). aime_filtered [Dataset]. https://huggingface.co/datasets/kaggle-aimo/aime_filtered
    Dataset updated
    Aug 29, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kaggle winners
    Description

    kaggle-aimo/aime_filtered dataset hosted on Hugging Face and contributed by the HF Datasets community
