MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset enriches the Meta Kaggle dataset using Meta Kaggle Code to extract all imports (for both R and Python) and method calls (Python only) as lists, which are then added to the KernelVersions.csv file as the columns Imports and MethodCalls.
| Most Imported R Packages | Most Imported Python Packages |
|---|---|
We perform this extraction using the following three regex patterns:

```python
import re

PYTHON_IMPORT_REGEX = re.compile(r'(?:from\s+([a-zA-Z0-9_\.]+)\s+import|import\s+([a-zA-Z0-9_\.]+))')
# PYTHON_METHOD_REGEX is omitted by the author ("kaggle kinda breaks if I do lol").
R_IMPORT_REGEX = re.compile(r'(?:library|require)\((?:[\'"]?)([a-zA-Z0-9_.]+)(?:[\'"]?)\)')
```
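As a sketch of how the two surviving patterns can be applied, the helper functions and sample snippets below are illustrative only and are not part of the dataset's actual pipeline:

```python
import re

PYTHON_IMPORT_REGEX = re.compile(r'(?:from\s+([a-zA-Z0-9_\.]+)\s+import|import\s+([a-zA-Z0-9_\.]+))')
R_IMPORT_REGEX = re.compile(r'(?:library|require)\((?:[\'"]?)([a-zA-Z0-9_.]+)(?:[\'"]?)\)')

def python_imports(source: str) -> list[str]:
    # Each match has two groups (from-import form vs. plain import form);
    # exactly one of them is non-empty, so keep whichever matched.
    return [a or b for a, b in PYTHON_IMPORT_REGEX.findall(source)]

def r_imports(source: str) -> list[str]:
    # The R pattern has a single capture group: the package name.
    return R_IMPORT_REGEX.findall(source)

print(python_imports("import numpy as np\nfrom sklearn.linear_model import LinearRegression"))
# ['numpy', 'sklearn.linear_model']
print(r_imports('library(ggplot2)\nrequire("dplyr")'))
# ['ggplot2', 'dplyr']
```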
This dataset was created on 06-06-2025. Since the computation required for this process is very resource-intensive and cannot be run on a Kaggle kernel, it is not scheduled. A notebook demonstrating how to create this dataset and what insights it provides can be found here.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is an enriched version of the Code4ML dataset, a large-scale corpus of annotated Python code snippets, competition summaries, and data descriptions sourced from Kaggle. The initial release includes approximately 2.5 million snippets of machine learning code extracted from around 100,000 Jupyter notebooks. A portion of these snippets has been manually annotated by human assessors through a custom-built, user-friendly interface designed for this task.
The original dataset is organized into multiple CSV files, each containing structured data on different entities:
Table 1. code_blocks.csv structure
| Column | Description |
|---|---|
| code_blocks_index | Global index linking code blocks to markup_data.csv. |
| kernel_id | Identifier for the Kaggle Jupyter notebook from which the code block was extracted. |
| code_block_id | Position of the code block within the notebook. |
| code_block | The actual machine learning code snippet. |
Table 2. kernels_meta.csv structure
| Column | Description |
|---|---|
| kernel_id | Identifier for the Kaggle Jupyter notebook. |
| kaggle_score | Performance metric of the notebook. |
| kaggle_comments | Number of comments on the notebook. |
| kaggle_upvotes | Number of upvotes the notebook received. |
| kernel_link | URL to the notebook. |
| comp_name | Name of the associated Kaggle competition. |
Table 3. competitions_meta.csv structure
| Column | Description |
|---|---|
| comp_name | Name of the Kaggle competition. |
| description | Overview of the competition task. |
| data_type | Type of data used in the competition. |
| comp_type | Classification of the competition. |
| subtitle | Short description of the task. |
| EvaluationAlgorithmAbbreviation | Metric used for assessing competition submissions. |
| data_sources | Links to datasets used. |
| metric type | Class label for the assessment metric. |
Table 4. markup_data.csv structure
| Column | Description |
|---|---|
| code_block | Machine learning code block. |
| too_long | Flag indicating whether the block spans multiple semantic types. |
| marks | Confidence level of the annotation. |
| graph_vertex_id | ID of the semantic type. |
The dataset allows mapping between these tables. For example, code_blocks.csv can be joined to kernels_meta.csv via the kernel_id column, and kernels_meta.csv to competitions_meta.csv via comp_name. To maintain quality, kernels_meta.csv includes only notebooks with available Kaggle scores. In addition, data_with_preds.csv contains automatically classified code blocks, with a mapping back to code_blocks.csv via the code_blocks_index column.
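A minimal sketch of that mapping in pandas, using toy in-memory frames with the column names from the tables above (real usage would read the CSV files with pd.read_csv):

```python
import pandas as pd

# Toy frames mirroring the CSV schemas described above.
code_blocks = pd.DataFrame({
    "code_blocks_index": [0, 1],
    "kernel_id": [10, 11],
    "code_block": ["import pandas", "model.fit(X, y)"],
})
kernels = pd.DataFrame({
    "kernel_id": [10, 11],
    "comp_name": ["titanic", "digit-recognizer"],
    "kaggle_score": [0.78, 0.99],
})
competitions = pd.DataFrame({
    "comp_name": ["titanic", "digit-recognizer"],
    "data_type": ["tabular", "image"],
})

full = (code_blocks
        .merge(kernels, on="kernel_id", how="inner")       # code block -> notebook
        .merge(competitions, on="comp_name", how="left"))  # notebook -> competition
print(full[["code_blocks_index", "comp_name", "data_type"]])
```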
The updated Code4ML 2.0 corpus introduces kernels extracted from Meta Kaggle Code. These kernels correspond to the Kaggle competitions launched since 2020. The natural-language descriptions of the competitions are retrieved with the help of an LLM.
Notebooks in kernels_meta2.csv may not have a Kaggle score but include a leaderboard ranking (rank), providing additional context for evaluation.
competitions_meta_2.csv is enriched with data_cards describing the data used in the competitions.
The Code4ML 2.0 corpus is a versatile resource, enabling training and evaluation of models in areas such as:
This dataset was created by Alexander Ryzhkov
This dataset was created by Regi
The reason I did this was that I wanted to know whether there was a correlation between Kaggle's top kernels and datasets and their popularity (wanted to know how to get to the top, lol). I scraped the data using DataMiner.
top-kernels has:
top-datasets has:
This dataset was created by deeplearner
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
This dataset is designed to support research in anomaly detection for OS kernels, particularly in the context of power monitoring systems used in embedded environments. It simulates the interaction between system-level operations and power consumption behaviors, providing a rich set of features for training and evaluating hybrid models.
The dataset contains 1,000 records of simulated yet realistic system behavior, including:
System call sequences
Power usage logs (in watts)
CPU and memory utilization
Process identifiers and names
Timestamps
Labeled entries (Normal or Anomaly)
Anomalies are injected using fuzzy testing principles to simulate abnormal power spikes, syscall irregularities, or excessive resource usage, mimicking real-world kernel faults or malicious activity. This dataset enables the development of robust models that can learn complex, uncertain system behavior patterns for enhanced security and stability of embedded power monitoring applications.
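A rough sketch of how such labeled records might be generated; the field names, value ranges, and the 10% anomaly rate here are assumptions for illustration, not the dataset's actual schema:

```python
import random

def make_record(anomalous: bool) -> dict:
    """Build one synthetic record; anomalies get a fuzzed power spike."""
    power = random.uniform(2.0, 5.0)  # baseline draw in watts (assumed range)
    if anomalous:
        power *= random.uniform(3.0, 6.0)  # fuzz: abnormal power spike
    return {
        "syscalls": random.choices(["read", "write", "open", "ioctl"], k=5),
        "power_watts": round(power, 2),
        "cpu_pct": round(random.uniform(1, 95), 1),
        "label": "Anomaly" if anomalous else "Normal",
    }

# 1,000 records with roughly 10% injected anomalies.
records = [make_record(random.random() < 0.1) for _ in range(1000)]
```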
Collections of kernel submissions for the Kaggle survey competitions from 2017 to 2022. As this data was collected during the 2022 survey competition, it does not contain all the kernels for 2022.
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
This dataset contains the Python files with the snippets required for the Kaggle kernel https://www.kaggle.com/code/adeepak7/tensorflow-s-global-and-operation-level-seeds/
Since the kernel is about setting and re-setting global and operation-level seeds, the effect of these seeds could not be nullified in subsequent cells. Hence, the snippets are provided as separate Python files, and each file is executed independently in a separate cell.
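The distinction the kernel explores can be sketched with Python's standard random module rather than TensorFlow itself: a global seed fixes the shared stream for all subsequent calls, while an operation-level seed is imitated here by a dedicated generator instance that is reproducible independently of the global stream:

```python
import random

# Global-level seed: fixes the shared stream for all subsequent calls.
random.seed(42)
x = random.random()

# "Operation-level" seed, sketched as a dedicated generator instance:
# reproducible on its own, independent of the global stream.
g = random.Random(7)
y = g.random()

# Re-seeding reproduces both streams exactly.
random.seed(42)
assert random.random() == x          # global stream reproduced
assert random.Random(7).random() == y  # per-operation stream reproduced
```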
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Explore our public data on competitions, datasets, kernels (code / notebooks) and more. Meta Kaggle may not be the Rosetta Stone of data science, but we do think there's a lot to learn (and plenty of fun to be had) from this collection of rich data about Kaggle's community and activity.
Strategizing to become a Competitions Grandmaster? Wondering who, where, and what goes into a winning team? Choosing evaluation metrics for your next data science project? The kernels published using this data can help. We also hope they'll spark some lively Kaggler conversations and be a useful resource for the larger data science community.
This dataset is made available as CSV files through Kaggle Kernels. It contains tables on public activity from Competitions, Datasets, Kernels, Discussions, and more. The tables are updated daily.
Please note: This data is not a complete dump of our database. Rows, columns, and tables have been filtered out and transformed.
In August 2023, we released Meta Kaggle for Code, a companion to Meta Kaggle containing public, Apache 2.0 licensed notebook data. View the dataset and instructions for how to join it with Meta Kaggle here
We also updated the license on Meta Kaggle from CC-BY-NC-SA to Apache 2.0.
Please also note some quirks in the data: the UserId column in the ForumMessages table has values that do not exist in the Users table, and several tables carry denormalized Total columns. For example, the DatasetCount is not the total number of datasets with the Tag according to the DatasetTags table.
I create the database tables with the db_abd_create_tables.sql script and clean the data with the clean_data.py script. For each table, the script replaces missing values with NULL. Foreign keys are then added with the add_foreign_keys.sql script, and finally I recompute the Total columns in the database tables by running the update_totals.sql script.
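As an illustration of the kind of cleanup such scripts perform, the sketch below nulls out orphaned ForumMessages.UserId values in SQLite so that a foreign key can be added afterwards; the actual contents of clean_data.py and the SQL scripts are not reproduced here, so table contents and the exact strategy are assumptions:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Users (Id INTEGER PRIMARY KEY);
CREATE TABLE ForumMessages (Id INTEGER PRIMARY KEY, UserId INTEGER);
INSERT INTO Users VALUES (1), (2);
INSERT INTO ForumMessages VALUES (10, 1), (11, 99);  -- 99 has no Users row
""")

# Null out orphaned UserId values so a foreign key constraint can be added.
con.execute("""
UPDATE ForumMessages SET UserId = NULL
WHERE UserId NOT IN (SELECT Id FROM Users)
""")
orphans = con.execute(
    "SELECT COUNT(*) FROM ForumMessages WHERE UserId IS NULL").fetchone()[0]
print(orphans)  # 1
```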
This dataset was created by Ravi Bharathi
Released under Data files © Original Authors
(Image: kernel_dataclass.png, a diagram of the Kernel class and its fields.)
All in all, in the years 2017-2022, 1822 kernels used the Kaggle Survey datasets. We have ordered our data into several distinct datasets, each of which was useful in obtaining answers to our questions on at least one of the topics. The obtained datasets are briefly overviewed below.
notebooks.zip
Contains 1822 raw notebooks saved as either ipynb or Rmd. 58 notebooks could not be executed in either Python or R, so they were given the extension unknown_format.txt. The name of each file is the notebook_id as listed on kaggle.com and matches notebook_id in the file all_kernels.csv, which is described below. Among other things, this dataset was used to obtain a per-notebook list of imported libraries, as well as the questions addressed by each notebook.
all_kernels.csv
Each row of this dataset contains data about one of the 1822 kernels. The columns correspond to all the fields listed in the Kernel class image above. A more detailed overview of the columns can be found on the dataset's Kaggle page.
cleaned_kernels.csv
This is, in effect, the main dataset we used in our competition notebook. We took all_kernels.csv and removed from it 233 rows describing kernels that were just unchanged forks of other kernels.
all_questions.json
Contains all Kaggle Survey questions from the years 2017-2022. In the year 2017, the survey questions were unnumbered, so we numbered them ourselves, keeping the original order and using zero-based indexing. Surveys 2018-2022 have numbered questions, so the index was taken unchanged.
question_map.csv
Looking at survey questions over several years, one can note that certain questions repeat. For example, every year's survey contains a question "What is your age". All such repetitions are captured in this dataset. For each unique question, the question number and the survey year in which it appears are given. The question numbers are described above under all_questions.json. Certain questions are worded differently but are functionally identical; if such questions were joined, a note was added to alert other users of this dataset.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides comprehensive metadata on various Kaggle datasets, offering detailed information about the dataset owners, creators, usage statistics, licensing, and more. It can help researchers, data scientists, and Kaggle enthusiasts quickly analyze the key attributes of different datasets on Kaggle.
datasetUrl: The URL of the Kaggle dataset page. This directs you to the specific dataset's page on Kaggle.
ownerAvatarUrl: The URL of the dataset owner's profile avatar on Kaggle.
ownerName: The name of the dataset owner. This can be the individual or organization that created and maintains the dataset.
ownerUrl: A link to the Kaggle profile page of the dataset owner.
ownerUserId: The unique user ID of the dataset owner on Kaggle.
ownerTier: The ownership tier, such as "Tier 1" or "Tier 2," indicating the owner's status or level on Kaggle.
creatorName: The name of the dataset creator, which could be different from the owner.
creatorUrl: A link to the Kaggle profile page of the dataset creator.
creatorUserId: The unique user ID of the dataset creator.
scriptCount: The number of scripts (kernels) associated with this dataset.
scriptsUrl: A link to the scripts (kernels) page for the dataset, where you can explore related code.
forumUrl: The URL to the discussion forum for this dataset, where users can ask questions and share insights.
viewCount: The number of views the dataset page has received on Kaggle.
downloadCount: The number of times the dataset has been downloaded by users.
dateCreated: The date when the dataset was first created and uploaded to Kaggle.
dateUpdated: The date when the dataset was last updated or modified.
voteButton: The metadata for the dataset's vote button, showing how users interact with the dataset's quality ratings.
categories: The categories or tags associated with the dataset, helping users filter datasets based on topics of interest (e.g., "Healthcare," "Finance").
licenseName: The name of the license under which the dataset is shared (e.g., "CC0," "MIT License").
licenseShortName: A short form or abbreviation of the dataset's license name (e.g., "CC0" for Creative Commons Zero).
datasetSize: The size of the dataset in terms of storage, typically measured in MB or GB.
commonFileTypes: A list of common file types included in the dataset (e.g., .csv, .json, .xlsx).
downloadUrl: A direct link to download the dataset files.
newKernelNotebookUrl: A link to a new kernel or notebook related to this dataset, for those who wish to explore it programmatically.
newKernelScriptUrl: A link to a new script for running computations or processing data related to the dataset.
usabilityRating: A rating or score representing how usable the dataset is, based on user feedback.
firestorePath: A reference to the path in Firestore where this dataset's metadata is stored.
datasetSlug: A URL-friendly version of the dataset name, typically used in URLs.
rank: The dataset's rank based on certain metrics (e.g., downloads, votes, views).
datasource: The source or origin of the dataset (e.g., government data, private organizations).
medalUrl: A URL pointing to the dataset's medal or badge, indicating the dataset's quality or relevance.
hasHashLink: Indicates whether the dataset has a hash link for verifying data integrity.
ownerOrganizationId: The unique organization ID of the dataset's owner if the owner is an organization rather than an individual.
totalVotes: The total number of votes the dataset has received from users, reflecting its popularity or quality.
category_names: A comma-separated string of category names that represent the dataset's classification.
This dataset is a valuable resource for those who want to analyze Kaggle's ecosystem, discover high-quality datasets, and explore metadata in a structured way.
This isn't a dataset; it is a collection of kernels written on Kaggle that use no data at all.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Kaggle kernels don't have the pystacknet package, so we created a dataset containing it for the Petfinder competition.
Code from: https://github.com/h2oai/pystacknet
@bkkaggle (https://www.kaggle.com/bkkaggle) helped with creating the dataset
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
The dataset comprises wheat kernels belonging to three different varieties of wheat: Kama, Rosa, and Canadian, with 70 elements each. It can be used for classification and cluster analysis tasks.
To construct the data, seven geometric parameters of wheat kernels were measured; all of these parameters are real-valued and continuous.
This dataset was created by Maksim Filin
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created using the reference below:
https://archive.ics.uci.edu/dataset/1/abalone
We import the corresponding repository in a Kaggle kernel and populate the dataset from it. Users may load the dataset with a simple read_csv in pandas and proceed with their solution.
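For instance, loading the UCI abalone data with pandas might look like the sketch below; the inline sample stands in for the real file (on Kaggle the path would point into /kaggle/input/), and the column names are supplied because the UCI file ships without a header row:

```python
import io
import pandas as pd

# Column names per the UCI abalone description; the raw file has no header.
cols = ["Sex", "Length", "Diameter", "Height", "WholeWeight",
        "ShuckedWeight", "VisceraWeight", "ShellWeight", "Rings"]

# Two sample rows from the UCI file, standing in for the mounted dataset path.
sample = io.StringIO(
    "M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15\n"
    "F,0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9\n")

df = pd.read_csv(sample, header=None, names=cols)
print(df.shape)  # (2, 9)
```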
Best wishes!
This dataset was created by Justin Chae
This dataset was created by KlemenVodopivec