100+ datasets found

issues-kaggle-notebooks
huggingface.co
Updated Aug 12, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Hugging Face Smol Models Research (2025). issues-kaggle-notebooks [Dataset]. https://huggingface.co/datasets/HuggingFaceTB/issues-kaggle-notebooks
Explore at:
Dataset updated
Aug 12, 2025
Dataset provided by
Hugging Facehttps://huggingface.co/
Authors
Hugging Face Smol Models Research
Description
GitHub Issues & Kaggle Notebooks

Description

GitHub Issues & Kaggle Notebooks is a collection of two code datasets intended for language models training, they are sourced from GitHub issues and notebooks in Kaggle platform. These datasets are a modified part of the StarCoder2 model training corpus, precisely the bigcode/StarCoder2-Extras dataset. We reformat the samples to remove StarCoder2's special tokens and use natural text to delimit comments in issues and display… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceTB/issues-kaggle-notebooks.
Meta Kaggle Code
kaggle.com
zip
Updated Nov 27, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Kaggle (2025). Meta Kaggle Code [Dataset]. https://www.kaggle.com/datasets/kaggle/meta-kaggle-code/code
Explore at:
zip(167219625372 bytes)Available download formats
Dataset updated
Nov 27, 2025
Dataset authored and provided by
Kagglehttp://kaggle.com/
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Explore our public notebook content!

Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebooks versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.

Why we’re releasing this dataset

By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.

Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.

The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!

Sensitive data

While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.

Joining with Meta Kaggle

The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.

File organization

The files are organized into a two-level directory structure. Each top level folder contains up to 1 million files, e.g. - folder 123 contains all versions from 123,000,000 to 123,999,999. Each sub folder contains up to 1 thousand files, e.g. - 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions.

The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays

Questions / Comments

We love feedback! Let us know in the Discussion tab.

Happy Kaggling!
Top Kaggle Notebooks dataset: R
kaggle.com
zip
Updated Apr 24, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Md. Eyasen Arafat (2024). Top Kaggle Notebooks dataset: R [Dataset]. https://www.kaggle.com/datasets/mdearafat/top-kaggle-notebooks-dataset-r
Explore at:
zip(37357 bytes)Available download formats
Dataset updated
Apr 24, 2024
Authors
Md. Eyasen Arafat
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
Source: Kaggle Content: Information about R notebooks Ranking: Top 500 (criteria, OUTPUTS: Visualizations) Programming Language: R Last Update: April 23, 2024, at 7:32 AM GMT+6

This dataset can be useful for exploring popular R notebooks on Kaggle, finding inspiration for your own projects, and learning from other data scientists. By looking at the notebooks with high upvotes, views, and medals, you can get an idea of what topics are trending and what makes a successful Kaggle Notebook.
Data from: KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle
zenodo.org
dataon.kisti.re.kr
+1more
bin, bz2, pdf
Updated Jul 19, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Luigi Quaranta; Fabio Calefato; Fabio Calefato; Filippo Lanubile; Filippo Lanubile; Luigi Quaranta (2024). KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle [Dataset]. http://doi.org/10.5281/zenodo.4468523
Explore at:
bz2, pdf, binAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.4468523
Dataset updated
Jul 19, 2024
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Luigi Quaranta; Fabio Calefato; Fabio Calefato; Filippo Lanubile; Filippo Lanubile; Luigi Quaranta
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
KGTorrent is a dataset of Python Jupyter notebooks from the Kaggle platform.

The dataset is accompanied by a MySQL database containing metadata about the notebooks and the activity of Kaggle users on the platform. The information to build the MySQL database has been derived from Meta Kaggle, a publicly available dataset containing Kaggle metadata.

In this package, we share the complete KGTorrent dataset (consisting of the dataset itself plus its companion database), as well as the specific version of Meta Kaggle used to build the database.

More specifically, the package comprises the following three compressed archives:

KGT_dataset.tar.bz2, the dataset of Jupyter notebooks;

KGTorrent_dump_10-2020.sql.tar.bz2, the dump of the MySQL companion database;

MetaKaggle27Oct2020.tar.bz2, a copy of the Meta Kaggle version used to build the database.

Moreover, we include KGTorrent_logical_schema.pdf, the logical schema of the KGTorrent MySQL database.
kaggle-notebooks-edu-v0
huggingface.co
Updated May 31, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jupyter Agent (2025). kaggle-notebooks-edu-v0 [Dataset]. https://huggingface.co/datasets/jupyter-agent/kaggle-notebooks-edu-v0
Explore at:
Dataset updated
May 31, 2025
Dataset provided by
Project Jupyterhttps://jupyter.org/
Authors
Jupyter Agent
Description
Kaggle Notebooks LLM Filtered

Model: meta-llama/Meta-Llama-3.1-70B-Instruct Sample: 12,400 Source dataset: data-agents/kaggle-notebooks Prompt:

Below is an extract from a Jupyter notebook. Evaluate whether it has a high analysis value and could help a data scientist.

The notebooks are formatted with the following tokens:

START

Here comes markdown content

Here comes python code

Here comes code output

More… See the full description on the dataset page: https://huggingface.co/datasets/jupyter-agent/kaggle-notebooks-edu-v0.
Kaggle Notebooks Ranking
kaggle.com
zip
Updated Jan 21, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Vivo Vinco (2022). Kaggle Notebooks Ranking [Dataset]. https://www.kaggle.com/datasets/vivovinco/kaggle-notebooks-ranking
Explore at:
zip(44706 bytes)Available download formats
Dataset updated
Jan 21, 2022
Authors
Vivo Vinco
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Context

This dataset contains Kaggle ranking of notebooks.

Content

+3000 rows and 8 columns. Columns' description are listed below.

Rank : Rank of the user

Tier : Grandmaster, Master or Expert

Username : Name of the user

Join Date : Year of join

Gold Medals : Number of gold medals

Silver Medals : Number of silver medals

Bronze Medals : Number of bronze medals

Points : Total points

Acknowledgements

Data from Kaggle. Image from Wikiwand.

If you're reading this, please upvote.
h
kaggle-notebooks-conversations-hq
huggingface.co
Updated Mar 7, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Computer Intelligence Project (2025). kaggle-notebooks-conversations-hq [Dataset]. https://huggingface.co/datasets/bigcomputer/kaggle-notebooks-conversations-hq
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 7, 2025
Dataset authored and provided by
Computer Intelligence Project
Description
bigcomputer/kaggle-notebooks-conversations-hq dataset hosted on Hugging Face and contributed by the HF Datasets community

Arcade Natural Language to Code Challenge

kaggle.com

zip

Updated Feb 22, 2023

Facebook

Twitter

Click to copy link

Link copied

Cite

Google AI (2023). Arcade Natural Language to Code Challenge [Dataset]. https://www.kaggle.com/datasets/googleai/arcade-nl2code-dataset

Explore at:

zip(3921922 bytes)Available download formats

Dataset updated

Feb 22, 2023

Dataset authored and provided by

Google AI

Description

Arcade: Natural Language to Code Generation in Interactive Computing Notebooks

Arcade is a collection of natural language to code problems on interactive data science notebooks. Each problem features an NL intent as problem specification, a reference code solution, and preceding notebook context (Markdown or code cells). Arcade can be used to evaluate the accuracies of code large language models in generating data science programs given natural language instructions. Please read our paper for more details.

Note👉 This Kaggle dataset only contains the dataset files of Arcade. Refer to our main Github repository for detailed instructions to use this dataset.

Folder Structure

Below is the structure of its content:

└── ./
  ├── existing_tasks # Problems derived from existing data science notebooks on Github/
  │  ├── metadata.json # Metadata by `build_existing_tasks_split.py` to reproduce this split.
  │  ├── artifacts/ # Folder that stores dependent ML datasets to execute the problems, created by running `build_existing_tasks_split.py`
  │  └── derived_datasets/ # Folder for preprocessed datasets used for prompting experiments.
  ├── new_tasks/
  │  ├── dataset.json # Original, unprepossessed dataset
  │  ├── kaggle_dataset_provenance.csv # Metadata of the Kaggle datasets used to build this split.
  │  ├── artifacts/ # Folder that stores dependent ML Kaggle datasets to execute the problems, created by running `build_new_tasks_split.py`
  │  └── derived_datasets/ # Folder for preprocessed datasets used for prompting experiments.
  └── checksums.txt # Table of MD5 checksums of dataset files.

Dataset File Structure

All the dataset '*.json' files follow the same structure. Each dataset file is a Json-serialized list of Episodes. Each episode corresponds to a notebook annotated with NL-to-code problems. The structure of an episode is documented below:

{
  "notebook_name": "Name of the notebook.",
  "work_dir": "Path to the dependent data artifacts (e.g., ML datasets) to execute the notebook.",
  "annotator": "Anonymized annotator Id."
  "turns": [
    # A list of natural language to code examples using the current notebook context.
    {
      "input": "Prompt to a code generation model.",
      "turn": {
        "intent": {
          "value": "Annotated NL intent for the current turn.",
          "is_cell_intent": "Metadata used for the existing tasks split to indicate if the code solution is only part of an existing code cell.",
          "cell_idx": "Index of the intent Markdown cell.",
          "line_span": "Line span of the intent.",
          "not_sure": "Annotation confidence.",
          "output_variables": "List of variable names denoting the output. If None, use the output of the last line of code as the output of the problem.",
        },
        "code": {
          "value": "Reference code solution.",
          "cell_idx": "Cell index of the code cell containing the solution.",
          "num_lines": "Number of lines in the reference solution.",
          "line_span": "Line span.",
        },
        "code_context": "Context code (all code cells before this problem) that need to be executed before executing the reference/predicted programs.",
        "delta_code_context": "Delta context code between the last problem in this notebook and the current problem, useful for incremental execution.",
        "metadata": {
          "annotator_id": "Annotator Id",
          "num_code_lines": "Metadata, please ignore.",
          "utterance_without_output_spec": "Annotated NL intent without output specification. Refer to the paper for details.",
        },
      },
      "notebook": "Field intended to store the Json-serialized Jupyter notebook. Not used for now since the notebook can be reconstructed from other metadata in this file.",
      "metadata": {
        # A dict of metadata of this turn.
        "context_cells": [ # A list of context cells before the problem.
          {
            "cell_type": "code|markdown",
            "source": "Cell content."
          },
        ],
        "delta_cell_num": "Number of preceding context cells between the prior turn and the current turn.",
        # The following fields only occur in datasets inlined with schema descriptions.
        "context_cell_num": "Number of context cells in the prompt after inserting schema descriptions and left-truncation.",
        "inten...

No Data Sources
kaggle.com
zip
Updated Apr 12, 2017
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Kaggle (2017). No Data Sources [Dataset]. https://www.kaggle.com/kaggle/no-data-sources
Explore at:
zip(159 bytes)Available download formats
Dataset updated
Apr 12, 2017
Dataset authored and provided by
Kagglehttp://kaggle.com/
Description
This isn't a dataset, it is a collection of kernels written on Kaggle that use no data at all.
h
kaggle-notebooks-outputs-filtered-25
huggingface.co
Updated Dec 28, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Computer Intelligence Project (2024). kaggle-notebooks-outputs-filtered-25 [Dataset]. https://huggingface.co/datasets/bigcomputer/kaggle-notebooks-outputs-filtered-25
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 28, 2024
Dataset authored and provided by
Computer Intelligence Project
Description
bigcomputer/kaggle-notebooks-outputs-filtered-25 dataset hosted on Hugging Face and contributed by the HF Datasets community
customCSS
kaggle.com
zip
Updated Apr 6, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
black_panther (2021). customCSS [Dataset]. https://www.kaggle.com/harshitkhandelwal/customcss
Explore at:
zip(1791 bytes)Available download formats
Dataset updated
Apr 6, 2021
Authors
black_panther
Description
These files can be used for adding custom effects to the notebook either in kaggle or locally
Kaggle Notebook User Rankings
kaggle.com
zip
Updated Aug 17, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
James Trotman (2021). Kaggle Notebook User Rankings [Dataset]. https://www.kaggle.com/jtrotman/kaggle-notebook-user-rankings
Explore at:
zip(66237 bytes)Available download formats
Dataset updated
Aug 17, 2021
Authors
James Trotman
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
Created just for this: How To Compute Notebook Author Rankings

But feel free to re-use it if you wish!
kaggle-notebook-requirements
kaggle.com
zip
Updated Dec 6, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
yama (2022). kaggle-notebook-requirements [Dataset]. https://www.kaggle.com/datasets/askeeee/kagglenotebookrequirements
Explore at:
zip(9392 bytes)Available download formats
Dataset updated
Dec 6, 2022
Authors
yama
Description
Dataset

This dataset was created by yama

Contents
Images
kaggle.com
zip
Updated Apr 12, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Brenda N (2022). Images [Dataset]. https://www.kaggle.com/brendan45774/film-image
Explore at:
zip(1751971 bytes)Available download formats
Dataset updated
Apr 12, 2022
Authors
Brenda N
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

I used film image from this dataset for my EDA Tutorial Hollywood Movies to make a word cloud.

Bunny, meerkat, panda, quokka, and zebra is used for my HOG Features - Histogram of Oriented Gradients notebook. Please make sure to upvote them!
images
kaggle.com
zip
Updated Jan 21, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
George Zoto (2021). images [Dataset]. https://www.kaggle.com/georgezoto/images
Explore at:
zip(37083900 bytes)Available download formats
Dataset updated
Jan 21, 2021
Authors
George Zoto
Description
Context and Content

Images used in several of my Kaggle notebooks, each source is mentioned explicitly.
Assessing Computational Notebook Understandability through Code Metrics...
zenodo.org
zip
Updated Oct 12, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mojtaba Mostafavi; Mojtaba Mostafavi (2023). Assessing Computational Notebook Understandability through Code Metrics Analysis [Dataset]. http://doi.org/10.5281/zenodo.8435192
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.8435192
Dataset updated
Oct 12, 2023
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Mojtaba Mostafavi; Mojtaba Mostafavi
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Computational notebooks have become the primary coding environment for data scientists. Despite their popularity, research on the code quality of these notebooks is still in its infancy, and the code shared in these notebooks is often of poor quality. Considering the importance of maintenance and reusability, it is crucial to pay attention to the comprehension of the notebook code and identify the notebook metrics that play a significant role in their comprehension. The level of code comprehension is a qualitative variable closely associated with the user's opinion about the code. Previous studies have typically employed two approaches to measure it. One approach involves using limited questionnaire methods to review a small number of code pieces. Another approach relies solely on metadata, such as the number of likes and user votes for a project in the software repository. In our approach, we enhanced the measurement of the understandability level of notebook code by leveraging user comments within a software repository. As a case study, we started with 248,761 Kaggle Jupyter notebooks introduced in previous studies and their relevant metadata. To identify user comments associated with code comprehension within the notebooks, we utilized a fine-tuned DistillBERT transformer. We established a \emph{user comment based criterion} for measuring code understandability by considering the number of code understandability-related comments, the upvotes on those comments, the total views of the notebook, and the total upvotes received by the notebook. This criterion has proven to be more effective than alternative methods, making it the ground truth for evaluating the code comprehension of our notebook set. In addition, we collected a total of 34 metrics for 10,857 notebooks, categorized as script-based and notebook-based metrics. These metrics were utilized as features in our dataset. Using the Random Forest classifier, our predictive model achieved 85% accuracy in predicting code comprehension levels in computational notebooks, identifying developer expertise and markdown-based metrics as key factors.
Meta_Kaggle_Scripts_cleaned_dataset
kaggle.com
zip
Updated Jul 17, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sarvpreet Kaur (2025). Meta_Kaggle_Scripts_cleaned_dataset [Dataset]. https://www.kaggle.com/datasets/sarvpreetkaur22/meta-kaggle-scripts-cleaned-dataset/versions/1
Explore at:
zip(60101160 bytes)Available download formats
Dataset updated
Jul 17, 2025
Authors
Sarvpreet Kaur
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
📝 Description:

This is a filtered and cleaned version of Scripts.csv containing only relevant text columns for NLP modeling style analysis.

✅ Includes: ScriptId, UserId, ScriptTitle, ScriptMarkdown ✅ Null titles or markdowns removed ✅ Use for analyzing: • Modeling trends by region • Common techniques (e.g., LSTM, XGBoost, ResNet) • Word clouds and keyword frequency
Z
Data from: Dataset of paper "Why do Machine Learning Notebooks Crash?"
data-staging.niaid.nih.gov
nde-dev.biothings.io
Updated Mar 12, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Wang, Yiran; Meijer, Willem; López, José Antonio Hernández; Nilsson, Ulf; Varró, Dániel (2025). Dataset of paper "Why do Machine Learning Notebooks Crash?" [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_14070487
Explore at:
Dataset updated
Mar 12, 2025
Dataset provided by
Linköping University
Authors
Wang, Yiran; Meijer, Willem; López, José Antonio Hernández; Nilsson, Ulf; Varró, Dániel
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
All the related data of our paper "Why do Machine Learning Notebooks Crash?" includes:

GitHub and Kaggle notebooks that contain error outputs.

GitHub notebooks are from The Stack repository[1].

Kaggle notebooks are public notebooks on Kaggle platform from year 2023, downloaded via KGTorrent[2].

Identified programming language results of GitHub notebooks.

Identified ML library results from Kaggle notebooks.

Datasets of crashes from GitHub and Kaggle.

Clustering results of crashes from all crashes, and from GitHub and Kaggle respectively.

Sampled crashes and associated notebooks (organized by cluster id).

Manual labeling and reviewing results.

Reproducing results.

The related code repository can be found here.
models
kaggle.com
zip
Updated Nov 17, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
George Zoto (2021). models [Dataset]. https://www.kaggle.com/datasets/georgezoto/models
Explore at:
zip(720940582 bytes)Available download formats
Dataset updated
Nov 17, 2021
Authors
George Zoto
Description
Models hosted in this dataset

Petals_to_the_Metal.* models come from my popular notebook Computer Vision - Petals to the Metal🌻🌸🌹here In this notebook I take a step by step approach to: - Understand how TPUs work and how to use them ✅ - Explore transfer learning with 10+ models pretrained on either imagenet or noisy-student and evaluate their performance ✅ - Explore training large CNN models from scratch and evaluate their performance ✅ - Explore 10+ hyperparameter tuning methods and evaluate their performance ✅ - Explore 25+ combinations of models and tuning methods above and evaluate their performance ✅ - Ensemble models with loaded weights and evaluate their performance ✅ - Build a great looking vizualization that captures and highlights model + tuning performance ✅

Model naming convention - all trained in TensorFlow 2.2

The naming convention I follow in this dataset is: Kaggle Competition-[Tuning]-Model Name.[h5|tflite] For example:
Petals_to_the_Metal-DenseNet201.h5 is a saved Keras model for the Petals to the Metal Kaggle competition using a pretrained DenseNet201 model on imagenet. Petals_to_the_Metal-EfficientNetB7.h5 is a saved Keras model for the Petals to the Metal Kaggle competition using a pretrained EfficientNetB7 model on noisy-student. Petals_to_the_Metal-70K_images-trainable_True-DenseNet201.h5 is a saved Keras model for the Petals to the Metal Kaggle competition starting with a pretrained DenseNet201 model on imagenet but performing end to end training (trainable_True). It also uses 70K (5x more than the standard models) images from other flower datasets. Petals_to_the_Metal-70K_images-trainable_True-MobileNetV2.tflite is a TFLite model for the Petals to the Metal Kaggle competition converted from the Petals_to_the_Metal-70K_images-trainable_True-MobileNetV2.h5 model.

See my notebook above for more information.

Best 😀 George
kaggle-compatible diffvg install
kaggle.com
zip
Updated May 3, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Kevin Clark (2025). kaggle-compatible diffvg install [Dataset]. https://www.kaggle.com/datasets/kevlark/diffvg
Explore at:
zip(16963007 bytes)Available download formats
Dataset updated
May 3, 2025
Authors
Kevin Clark
Description
Dataset

This dataset was created by Kevin Clark

Contents

Facebook

Twitter

Click to copy link

Link copied

Cite

Hugging Face Smol Models Research (2025). issues-kaggle-notebooks [Dataset]. https://huggingface.co/datasets/HuggingFaceTB/issues-kaggle-notebooks

issues-kaggle-notebooks

HuggingFaceTB/issues-kaggle-notebooks

Explore at:

Dataset updated

Aug 12, 2025

Dataset provided by

Hugging Facehttps://huggingface.co/

Authors

Hugging Face Smol Models Research

Description

GitHub Issues & Kaggle Notebooks

  Description

GitHub Issues & Kaggle Notebooks is a collection of two code datasets intended for language models training, they are sourced from GitHub issues and notebooks in Kaggle platform. These datasets are a modified part of the StarCoder2 model training corpus, precisely the bigcode/StarCoder2-Extras dataset. We reformat the samples to remove StarCoder2's special tokens and use natural text to delimit comments in issues and display… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceTB/issues-kaggle-notebooks.

Clear search

Close search

Google apps

Main menu

issues-kaggle-notebooks

Meta Kaggle Code

Explore our public notebook content!

Why we’re releasing this dataset

Sensitive data

Joining with Meta Kaggle

File organization

Questions / Comments

Top Kaggle Notebooks dataset: R

Data from: KGTorrent: A Dataset of Python Jupyter Notebooks from Kaggle

kaggle-notebooks-edu-v0

Here comes markdown content

Here comes python code

Here comes code output

More… See the full description on the dataset page: https://huggingface.co/datasets/jupyter-agent/kaggle-notebooks-edu-v0.

Kaggle Notebooks Ranking

Context

Content

Acknowledgements

kaggle-notebooks-conversations-hq

Arcade Natural Language to Code Challenge

Arcade: Natural Language to Code Generation in Interactive Computing Notebooks

Folder Structure

Dataset File Structure

No Data Sources

kaggle-notebooks-outputs-filtered-25

customCSS

Kaggle Notebook User Rankings

kaggle-notebook-requirements

Dataset

Contents

Images

Context

images

Context and Content

Assessing Computational Notebook Understandability through Code Metrics...

Meta_Kaggle_Scripts_cleaned_dataset

📝 Description:

Data from: Dataset of paper "Why do Machine Learning Notebooks Crash?"

models

Models hosted in this dataset

Model naming convention - all trained in TensorFlow 2.2

kaggle-compatible diffvg install

Dataset

Contents

issues-kaggle-notebooks

HuggingFaceTB/issues-kaggle-notebooks