61 datasets found
  1. issues-kaggle-notebooks

    • huggingface.co
    Cite
    Hugging Face Smol Models Research, issues-kaggle-notebooks [Dataset]. https://huggingface.co/datasets/HuggingFaceTB/issues-kaggle-notebooks
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Hugging Face Smol Models Research
    Description

    GitHub Issues & Kaggle Notebooks

      Description
    

    GitHub Issues & Kaggle Notebooks is a collection of two code datasets intended for language model training, sourced from GitHub issues and from notebooks on the Kaggle platform. These datasets are a modified part of the StarCoder2 model training corpus, specifically the bigcode/StarCoder2-Extras dataset. We reformat the samples to remove StarCoder2's special tokens and use natural text to delimit comments in issues and display… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceTB/issues-kaggle-notebooks.

  2. kaggle-recipe-categorized-chunk-8

    • huggingface.co
    Updated Sep 11, 2024
    Cite
    Jeff Schmitz (2024). kaggle-recipe-categorized-chunk-8 [Dataset]. https://huggingface.co/datasets/Schmitz005/kaggle-recipe-categorized-chunk-8
    Dataset updated
    Sep 11, 2024
    Authors
    Jeff Schmitz
    Description

    Schmitz005/kaggle-recipe-categorized-chunk-8 dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. Damaged Roads Alvaro Basily Kaggle Dataset

    • universe.roboflow.com
    zip
    Updated Dec 10, 2022
    Cite
    Final Project (2022). Damaged Roads Alvaro Basily Kaggle Dataset [Dataset]. https://universe.roboflow.com/final-project-vs0cw/damaged-roads-alvaro-basily-kaggle
    Available download formats: zip
    Dataset updated
    Dec 10, 2022
    Dataset authored and provided by
    Final Project
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Variables measured
    Damaged Roads Bounding Boxes
    Description

    Damaged Roads Alvaro Basily Kaggle

    ## Overview
    
    Damaged Roads Alvaro Basily Kaggle is a dataset for object detection tasks - it contains Damaged Roads annotations for 3,321 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [Public Domain license](https://creativecommons.org/publicdomain/zero/1.0/).
    
  4. Meta Kaggle Code

    • kaggle.com
    Updated Jun 5, 2025
    Cite
    Kaggle (2025). Meta Kaggle Code [Dataset]. https://www.kaggle.com/datasets/kaggle/meta-kaggle-code/code
    Dataset updated
    Jun 5, 2025
    Dataset authored and provided by
    Kaggle (http://kaggle.com/)
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Explore our public notebook content!

    Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.

    Why we’re releasing this dataset

    By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.

    Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.

    The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!

    Sensitive data

    While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.

    Joining with Meta Kaggle

    The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.
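
As a rough illustration of the join described above, the sketch below matches code file names against rows of a KernelVersions table using only the standard library. The CSV excerpt, the column names `Id` and `TotalVotes`, and the file names are hypothetical stand-ins chosen for this example; check the real KernelVersions csv for its actual schema.

```python
import csv
import io

# Hypothetical excerpt of the KernelVersions csv; the column names
# "Id" and "TotalVotes" are assumptions made for this sketch.
KERNEL_VERSIONS_CSV = """Id,TotalVotes
123456001,4
123456002,0
123456003,17
"""

# Hypothetical code file names; the stem is assumed to be the KernelVersions id.
code_files = ["123456001.ipynb", "123456003.py", "999999999.r"]


def index_by_id(csv_text):
    """Map each KernelVersions id (as int) to its full CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {int(row["Id"]): row for row in reader}


def join_files(file_names, versions):
    """Pair each code file with its KernelVersions row, skipping unmatched files."""
    joined = {}
    for name in file_names:
        version_id = int(name.split(".", 1)[0])
        if version_id in versions:
            joined[name] = versions[version_id]
    return joined


versions = index_by_id(KERNEL_VERSIONS_CSV)
matched = join_files(code_files, versions)
for name, row in matched.items():
    print(name, row["TotalVotes"])
```

Files without a matching row are skipped here; whether that is the right policy depends on the analysis.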

    File organization

    The files are organized into a two-level directory structure. Each top-level folder contains up to 1 million files; for example, folder 123 contains all versions from 123,000,000 to 123,999,999. Each subfolder contains up to 1 thousand files; for example, 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have far fewer than 1 thousand files because private and interactive sessions are not included.
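
The folder layout above amounts to simple integer arithmetic on the version id. A minimal sketch follows; the function name `version_folder` is hypothetical, and it returns numbers rather than path strings because the description does not say whether folder names are zero-padded.

```python
def version_folder(version_id):
    """Return (top_level, sub_folder) numbers for a KernelVersions id."""
    if version_id < 0:
        raise ValueError("version_id must be non-negative")
    top_level = version_id // 1_000_000         # millions part of the id
    sub_folder = (version_id // 1_000) % 1_000  # thousands part within that million
    return top_level, sub_folder


print(version_folder(123_456_789))
```

For the id 123,456,789 this yields folder 123 and subfolder 456, matching the ranges given in the description.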

    The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays

    Questions / Comments

    We love feedback! Let us know in the Discussion tab.

    Happy Kaggling!

  5. Oracle Database metrics

    • kaggle.com
    Updated Aug 20, 2020
    Cite
    Timerkhanov Yuriy (2020). Oracle Database metrics [Dataset]. https://www.kaggle.com/datasets/timerkhanovyuriy/oracle-database-metrics
    Dataset updated
    Aug 20, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Timerkhanov Yuriy
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Timerkhanov Yuriy

    Released under CC0: Public Domain

    Contents

  6. Airport Luggage Dataset

    • universe.roboflow.com
    Updated Jan 22, 2023
    Cite
    Roboflow Madi (2023). Airport Luggage Dataset [Dataset]. https://universe.roboflow.com/roboflow-madi/airport-luggage
    Dataset updated
    Jan 22, 2023
    Dataset provided by
    Roboflow
    Authors
    Roboflow Madi
    Variables measured
    Luggage Bounding Boxes
    Description

    Some images were collected from Kaggle: https://www.kaggle.com/datasets/dataclusterlabs/suitcaseluggage-dataset

    More images were also collected from the following Roboflow Universe projects:

    • https://universe.roboflow.com/ali-ahmad-kyfzj/baggage-rvbtb
    • https://universe.roboflow.com/luggage-7rqr6/luggage-kcuiy

  7. Kaggle methane laser meaurement: array animation

    • ecat.ga.gov.au
    Updated Jan 1, 2016
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Commonwealth of Australia (Geoscience Australia) (2016). Kaggle methane laser meaurement: array animation [Dataset]. https://ecat.ga.gov.au/geonetwork/srv/api/records/29e3e457-cd26-7979-e053-10a3070a8952
    Dataset updated
    Jan 1, 2016
    Dataset provided by
    Geoscience Australia (http://ga.gov.au/)
    Area covered
    Pacific Ocean, North Pacific Ocean
    Description

    Animation for Kaggle showing a plume moving across an array of methane laser measurement paths

  8. Kaggle methane laser measurements - fan animation

    • ecat.ga.gov.au
    Updated Jan 1, 2016
    Cite
    Commonwealth of Australia (Geoscience Australia) (2016). Kaggle methane laser measurements - fan animation [Dataset]. https://ecat.ga.gov.au/geonetwork/srv/api/records/29e3e457-cd27-7979-e053-10a3070a8952
    Dataset updated
    Jan 1, 2016
    Dataset provided by
    Geoscience Australia (http://ga.gov.au/)
    Description

    Animation for Kaggle showing laser path measurements of methane over a plume of methane gas. Reflectors arranged in a fan configuration.

  9. Fraud Detection - Financial transactions

    • find.data.gov.scot
    csv
    Updated Mar 14, 2018
    Cite
    Deloitte Datathon 2018 (uSmart) (2018). Fraud Detection - Financial transactions [Dataset]. https://find.data.gov.scot/datasets/39167
    Available download formats: csv (470.6714 MB)
    Dataset updated
    Mar 14, 2018
    Dataset provided by
    Deloitte (https://deloitte.com/)
    Description

    Synthetic transactional data with labels for fraud detection. For more information, see: https://www.kaggle.com/ntnu-testimon/paysim1/version/2

  10. Data from: San Francisco Open Data

    • kaggle.com
    Updated Mar 20, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    DataSF (2019). San Francisco Open Data [Dataset]. https://www.kaggle.com/datasets/datasf/san-francisco
    Dataset updated
    Mar 20, 2019
    Dataset authored and provided by
    DataSF
    Description

    Context

    DataSF seeks to transform the way that the City of San Francisco works -- through the use of data.

    https://datasf.org/about/

    Content

    This dataset contains the following tables: ['311_service_requests', 'bikeshare_stations', 'bikeshare_status', 'bikeshare_trips', 'film_locations', 'sffd_service_calls', 'sfpd_incidents', 'street_trees']

    • This data includes all San Francisco 311 service requests from July 2008 to the present, and is updated daily. 311 is a non-emergency number that provides access to non-emergency municipal services.
    • This data includes fire unit responses to calls from April 2000 to present and is updated daily. Data contains the call number, incident number, address, unit identifier, call type, and disposition. Relevant time intervals are also included. Because this dataset is based on responses, and most calls involved multiple fire units, there are multiple records for each call number. Addresses are associated with a block number, intersection or call box.
    • This data includes incidents from the San Francisco Police Department (SFPD) Crime Incident Reporting system, from January 2003 until the present (2 weeks ago from current date). The dataset is updated daily. Please note: the SFPD has implemented a new system for tracking crime. This dataset is still sourced from the old system, which is in the process of being retired (a multi-year process).
    • This data includes a list of San Francisco Department of Public Works maintained street trees including: planting date, species, and location. Data includes 1955 to present.

    This dataset is deprecated and not being updated.

    Fork this kernel to get started with this dataset.

    Acknowledgements

    http://datasf.org/

    Dataset Source: SF OpenData. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://sfgov.org/ - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    Banner Photo by @meric from Unsplash.

    Inspiration

    Which neighborhoods have the highest proportion of offensive graffiti?

    Which complaint is most likely to be made using Twitter and in which neighborhood?

    What are the most complained about Muni stops in San Francisco?

    What are the top 10 incident types that the San Francisco Fire Department responds to?

    How many medical incidents and structure fires are there in each neighborhood?

    What’s the average response time for each type of dispatched vehicle?

    Which category of police incidents have historically been the most common in San Francisco?

    What were the most common police incidents in the category of LARCENY/THEFT in 2016?

    Which non-criminal incidents saw the biggest reporting change from 2015 to 2016?

    What is the average tree diameter?

    What is the highest number of a particular species of tree planted in a single year?

    Which San Francisco locations feature the largest number of trees?

  11. Gender Dataset

    • universe.roboflow.com
    zip
    Updated Sep 22, 2023
    Cite
    Seeed Studio (2023). Gender Dataset [Dataset]. https://universe.roboflow.com/seeed-studio-e2fso/gender-8vbxd/dataset/1
    Available download formats: zip
    Dataset updated
    Sep 22, 2023
    Dataset authored and provided by
    Seeed Studio
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Female Male
    Description
  12. docornot

    • huggingface.co
    Updated May 5, 2024
    Cite
    mozilla (2024). docornot [Dataset]. https://huggingface.co/datasets/Mozilla/docornot
    Dataset updated
    May 5, 2024
    Dataset provided by
    Mozilla (http://mozilla.org/)
    Authors
    mozilla
    License

    https://choosealicense.com/licenses/other/

    Description

    The DocOrNot dataset contains 50% of images that are pictures, and 50% that are documents. It was built using 8k images from each one of these sources:

    • RVL CDIP (Small) - https://www.kaggle.com/datasets/uditamin/rvl-cdip-small - license: https://www.industrydocuments.ucsf.edu/help/copyright/
    • Flickr8k - https://www.kaggle.com/datasets/adityajn105/flickr8k - license: https://creativecommons.org/publicdomain/zero/1.0/

    It can be used to train a model and classify an image as being a picture or a… See the full description on the dataset page: https://huggingface.co/datasets/Mozilla/docornot.

  13. RSNA Intracranial Hemorrhage Detection

    • registry.opendata.aws
    Updated Aug 1, 2024
    Cite
    Radiological Society of North America (https://www.rsna.org/) (2024). RSNA Intracranial Hemorrhage Detection [Dataset]. https://registry.opendata.aws/rsna-intracranial-hemorrhage-detection/
    Dataset updated
    Aug 1, 2024
    Dataset provided by
    Radiological Society of North America
    Description

    RSNA assembled this dataset in 2019 for the RSNA Intracranial Hemorrhage Detection AI Challenge (https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection/). De-identified head CT studies were provided by four research institutions. A group of over 60 volunteer expert radiologists recruited by RSNA and the American Society of Neuroradiology labeled over 25,000 exams for the presence and subtype classification of acute intracranial hemorrhage.

  14. potholes, cracks and openmanholes (Road Hazards)

    • kaggle.com
    Updated Feb 23, 2025
    Cite
    Sabid Rahman (2025). potholes, cracks and openmanholes (Road Hazards) [Dataset]. http://doi.org/10.34740/kaggle/dsv/10834063
    Dataset updated
    Feb 23, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sabid Rahman
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    This dataset contains 2,700 images focused on detecting potholes, cracks, and open manholes on roads. It has been augmented to enhance the variety and robustness of the data. The images are organized into training and validation sets, with three distinct categories:

    • Potholes: class 0
    • Cracks: class 1
    • Open Manholes: class 2

    Included in the Dataset:

    • Bounding Box Annotations in YOLO Format (.txt files)
      • Format: YOLOv8 & YOLO11 compatible
      • Purpose: Ready for training YOLO-based object detection models
    • Folder Structure, organized into:
      • train/ folder
      • valid/ folder
      • Class-specific folders
      • An all_classes/ folder for combined access
      • Benefit: Easy access for training, validation, and augmentation tasks
    • Dual Format Support
      • COCO JSON Annotations Included
      • Compatible with models like Faster R-CNN
      • Enables flexibility across different object detection frameworks
    • Use Cases Targeted
      • Model training
      • Model testing
      • Custom data augmentation
      • Specific focus: Road safety and infrastructure detection
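
To make the YOLO annotation format concrete, here is a minimal sketch that converts one label line into a pixel-space bounding box. It assumes the standard YOLO detection layout (`class x_center y_center width height`, all coordinates normalized to the image size) and uses the class ids listed above; the function name, the sample line, and the image size are hypothetical.

```python
# Class ids as listed in the dataset description.
CLASS_NAMES = {0: "Potholes", 1: "Cracks", 2: "Open Manholes"}


def yolo_to_pixels(line, image_width, image_height):
    """Convert one YOLO label line to (class_name, (x_min, y_min, x_max, y_max))."""
    class_id, x_center, y_center, width, height = line.split()
    x_center, y_center = float(x_center), float(y_center)
    width, height = float(width), float(height)
    # Normalized center/size -> absolute corner coordinates in pixels.
    x_min = (x_center - width / 2) * image_width
    y_min = (y_center - height / 2) * image_height
    x_max = (x_center + width / 2) * image_width
    y_max = (y_center + height / 2) * image_height
    return CLASS_NAMES[int(class_id)], (x_min, y_min, x_max, y_max)


# Hypothetical label line for a 640x480 image.
label, box = yolo_to_pixels("1 0.5 0.5 0.25 0.5", 640, 480)
print(label, box)
```

Corner coordinates like these are closer to what COCO-style tools expect, although COCO itself stores boxes as `[x_min, y_min, width, height]`.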

    Here's a clear breakdown of the folder structure:

    [Image: screenshot of the folder structure]

  15. Car Crash or Collision Prediction Dataset

    • kaggle.com
    Updated Aug 28, 2024
    Cite
    Md. Fahim Bin Amin (2024). Car Crash or Collision Prediction Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/9268756
    Dataset updated
    Aug 28, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Md. Fahim Bin Amin
    Description

    Car Crash or Collision Prediction Dataset (Use ONLY the Compressed folder)

    Source and Description

    All of the images are from 100K Dashcam videos. It is collected from the BDD100K dataset.

    The images are separated from the videos within 5-second intervals as individual frames.

    Data Count

    This dataset contains 10,000 images.

    The annotation has been provided in the xlsx file as well.

    Classes

    The dataset contains 2 classes. They are given below:

    Class Representation | Class
    y | Collision/Accident
    n | No Collision/No Accident

  16. Health Nutrition and Population Statistics

    • datacatalog1.worldbank.org
    • kaggle.com
    databank, utf-8
    Cite
    HealthStats, World Bank Group, Health Nutrition and Population Statistics [Dataset]. https://datacatalog1.worldbank.org/search/dataset/0037652/Health-Nutrition-and-Population-Statistics
    Available download formats: utf-8, databank
    Dataset provided by
    World Bank Group (http://www.worldbank.org/)
    World Bank (http://worldbank.org/)
    License

    https://datacatalog1.worldbank.org/public-licenses?fragment=cc

    Description

    Health Nutrition and Population Statistics database provides key health, nutrition and population statistics gathered from a variety of international and national sources. Themes include global surgery, health financing, HIV/AIDS, immunization, infectious diseases, medical resources and usage, noncommunicable diseases, nutrition, population dynamics, reproductive health, universal health coverage, and water and sanitation.

  17. Loan Prediction with 3 Problem Statement

    • kaggle.com
    Updated Sep 3, 2022
    Cite
    Yashpal (2022). Loan Prediction with 3 Problem Statement [Dataset]. https://www.kaggle.com/datasets/yashpaloswal/loan-prediction-with-3-problem-statement
    Dataset updated
    Sep 3, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yashpal
    Description

    The data contains client loan records and whether each loan was approved. The main goal is to predict loan approval on the test data using a model built from the training data.

  18. Kenya - Food Prices

    • data.wu.ac.at
    • cloud.csiss.gmu.edu
    csv
    Updated Oct 4, 2018
    Cite
    WFP - World Food Programme (2018). Kenya - Food Prices [Dataset]. https://data.wu.ac.at/schema/data_humdata_org/ZTBkM2ZiYTYtZjlhMi00NWQ3LWI5NDktMTQwYzQ1NTE5N2Zm
    Available download formats: csv (523604.0), csv (126113.0)
    Dataset updated
    Oct 4, 2018
    Dataset provided by
    World Food Programme (http://da.wfp.org/)
    Description

    This dataset contains Food Prices data for Kenya. Food prices data comes from the World Food Programme and covers foods such as maize, rice, beans, fish, and sugar for 76 countries and some 1,500 markets. It is updated weekly but consists largely of monthly data. The data goes back as far as 1992 for a few countries, although many countries started reporting from 2003 or thereafter.

  19. 🛍️ Fashion Retail Sales Dataset

    • kaggle.com
    Updated Apr 1, 2025
    Cite
    Atharva Soundankar (2025). 🛍️ Fashion Retail Sales Dataset [Dataset]. https://www.kaggle.com/datasets/atharvasoundankar/fashion-retail-sales
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Atharva Soundankar
    Description

    📜 Dataset Overview

    This dataset contains 3,400 records of fashion retail sales, capturing various details about customer purchases, including item details, purchase amounts, ratings, and payment methods. It is useful for analyzing customer buying behavior, product popularity, and payment preferences.

    📂 Dataset Details

    Column Name | Data Type | Non-Null Count | Description
    Customer Reference ID | Integer | 3,400 | A unique identifier for each customer.
    Item Purchased | String | 3,400 | The name of the fashion item purchased.
    Purchase Amount (USD) | Float | 2,750 | The purchase price of the item in USD (650 missing values).
    Date Purchase | String | 3,400 | The date on which the purchase was made (format: DD-MM-YYYY).
    Review Rating | Float | 3,076 | The customer review rating (scale: 1 to 5, 324 missing values).
    Payment Method | String | 3,400 | The payment method used (e.g., Credit Card, Cash).

    🔍 Key Insights

    • The dataset contains 3,400 transactions.
    • Missing values are present in:
      • Purchase Amount (USD): 650 missing values
      • Review Rating: 324 missing values
    • Payment Method includes multiple categories, allowing analysis of payment trends.
    • Date Purchase is in DD-MM-YYYY format, which can be useful for time-series analysis.
    • The dataset can help analyze sales trends, customer preferences, and payment behaviors in the fashion retail industry.
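
The points above about missing values and the DD-MM-YYYY date format can be sketched with the standard library. The column names come from the table in the description; the sample rows are invented for illustration, and the sketch assumes missing values appear as empty fields, which may not match how the real file encodes them.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows; only the column names come from the description.
SAMPLE = """Customer Reference ID,Item Purchased,Purchase Amount (USD),Date Purchase,Review Rating,Payment Method
4018,Handbag,4619.0,05-02-2023,,Credit Card
4115,Tunic,,11-07-2023,2.0,Credit Card
4019,Tank Top,97.0,23-03-2023,4.1,Cash
"""


def load_rows(csv_text):
    """Parse rows, turning empty numeric fields into None and dates into date objects."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        amount = raw["Purchase Amount (USD)"]
        rating = raw["Review Rating"]
        rows.append({
            "item": raw["Item Purchased"],
            "amount": float(amount) if amount else None,
            # DD-MM-YYYY, as stated in the dataset description.
            "date": datetime.strptime(raw["Date Purchase"], "%d-%m-%Y").date(),
            "rating": float(rating) if rating else None,
            "payment": raw["Payment Method"],
        })
    return rows


rows = load_rows(SAMPLE)
missing_amount = sum(r["amount"] is None for r in rows)
missing_rating = sum(r["rating"] is None for r in rows)
print(missing_amount, missing_rating, rows[0]["date"].isoformat())
```

Parsing the dates up front matters because DD-MM-YYYY strings do not sort chronologically as plain text.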

    📊 Potential Use Cases

    • Sales Analysis: Understanding which fashion items are selling the most.
    • Customer Insights: Analyzing purchase behaviors and spending patterns.
    • Trend Forecasting: Identifying seasonal trends in fashion retail.
    • Payment Method Preferences: Understanding how customers prefer to pay.
  20. Pakistan - Food Prices

    • data.wu.ac.at
    • data.humdata.org
    • +2more
    csv
    Updated Oct 4, 2018
    Cite
    WFP - World Food Programme (2018). Pakistan - Food Prices [Dataset]. https://data.wu.ac.at/schema/data_humdata_org/MTI1NDU1ZmYtYzhhOC00ZjRiLTkxOTAtYmYwZTg0NTU5ZGM3
    Available download formats: csv (171690.0), csv (709048.0)
    Dataset updated
    Oct 4, 2018
    Dataset provided by
    World Food Programme (http://da.wfp.org/)
    Description

    This dataset contains Food Prices data for Pakistan. Food prices data comes from the World Food Programme and covers foods such as maize, rice, beans, fish, and sugar for 76 countries and some 1,500 markets. It is updated weekly but consists largely of monthly data. The data goes back as far as 1992 for a few countries, although many countries started reporting from 2003 or thereafter.
