This dataset was created by Regi
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Enriching the Meta-Kaggle dataset using Meta Kaggle Code to extract all imports (for both R and Python) and method calls (Python only) as lists, which are then added to the KernelVersions.csv file as the columns Imports and MethodCalls.
[Charts: Most Imported R Packages and Most Imported Python Packages; the visualizations are shown on the dataset page]
We perform this extraction using the following three regex patterns:
PYTHON_IMPORT_REGEX = re.compile(r'(?:from\s+([a-zA-Z0-9_\.]+)\s+import|import\s+([a-zA-Z0-9_\.]+))')
PYTHON_METHOD_REGEX = (pattern omitted; the author notes that embedding it breaks Kaggle's renderer)
R_IMPORT_REGEX = re.compile(r'(?:library|require)\((?:[\'"]?)([a-zA-Z0-9_.]+)(?:[\'"]?)\)')
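As a minimal sketch of how these patterns might be applied (the helper function and sample snippet below are illustrative, not the author's actual pipeline):

```python
import re

# Pattern quoted verbatim from the description above.
PYTHON_IMPORT_REGEX = re.compile(r'(?:from\s+([a-zA-Z0-9_\.]+)\s+import|import\s+([a-zA-Z0-9_\.]+))')

def extract_python_imports(source: str) -> list:
    """Return the top-level package name for every import found in `source`."""
    imports = []
    for from_mod, import_mod in PYTHON_IMPORT_REGEX.findall(source):
        module = from_mod or import_mod  # exactly one group matches per hit
        imports.append(module.split('.')[0])
    return imports

snippet = "import numpy as np\nfrom sklearn.linear_model import LogisticRegression"
print(extract_python_imports(snippet))  # ['numpy', 'sklearn']
```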
This dataset was created on 06-06-2025. Since the computation required for this process is very resource-intensive and cannot be run on a Kaggle kernel, it is not scheduled. A notebook demonstrating how to create this dataset and what insights it provides can be found here.
This dataset was created by Alexander Ryzhkov
Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
License information was derived automatically
This dataset contains the Python files with the code snippets required for the Kaggle kernel: https://www.kaggle.com/code/adeepak7/tensorflow-s-global-and-operation-level-seeds/
Because the kernel demonstrates setting and re-setting global- and operation-level seeds, the effect of a seed set in one cell could not be nullified in the subsequent cells. The snippets are therefore provided as separate Python files, each executed independently in its own cell.
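For context, a minimal sketch of the two seed levels the kernel explores (standard TensorFlow 2 API; the specific seed values are illustrative):

```python
import tensorflow as tf

# Global-level seed: influences every op created after this call.
tf.random.set_seed(42)

# Operation-level seed: pins this op's random stream independently.
a = tf.random.uniform([1], seed=10)

# No operation-level seed here, so this op depends on the global seed alone.
b = tf.random.uniform([1])

print(a.numpy(), b.numpy())
```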
CC0 1.0 Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset is designed to support research in anomaly detection for OS kernels, particularly in the context of power monitoring systems used in embedded environments. It simulates the interaction between system-level operations and power consumption behaviors, providing a rich set of features for training and evaluating hybrid models.
The dataset contains 1,000 records of synthetic yet realistic system behavior, including:
- System call sequences
- Power usage logs (in watts)
- CPU and memory utilization
- Process identifiers and names
- Timestamps
- Labeled entries (Normal or Anomaly)
Anomalies are injected using fuzzy testing principles to simulate abnormal power spikes, syscall irregularities, or excessive resource usage, mimicking real-world kernel faults or malicious activity. This dataset enables the development of robust models that can learn complex, uncertain system behavior patterns for enhanced security and stability of embedded power monitoring applications.
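To illustrate the injection idea, here is a hypothetical sketch; the field names mirror the dataset columns, but the generator itself is not the authors' code:

```python
import random

def inject_anomaly(record: dict) -> dict:
    """Fuzz-style injection: randomly perturb one aspect of a normal record."""
    fault = random.choice(["power_spike", "syscall_burst", "resource_hog"])
    if fault == "power_spike":
        record["power_watts"] *= random.uniform(3, 10)             # abnormal power spike
    elif fault == "syscall_burst":
        record["syscalls"] = record["syscalls"] + ["ioctl"] * 100  # syscall irregularity
    else:
        record["cpu_percent"] = random.uniform(95.0, 100.0)        # excessive resource usage
    record["label"] = "Anomaly"
    return record

normal = {"power_watts": 2.5, "syscalls": ["read", "write"], "cpu_percent": 12.0, "label": "Normal"}
print(inject_anomaly(dict(normal)))
```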
The reason I did this was that I wanted to know whether there is a correlation between Kaggle's top Kernels and Datasets and their popularity (I wanted to know how to get to the top, lol). I scraped the data using DataMiner.
top-kernels has:
top-datasets has:
This dataset was created by Salil Gautam
This dataset was created by Maksim Filin
CC0 1.0 Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
The dataset comprises wheat kernels belonging to three different varieties of wheat: Kama, Rosa, and Canadian, with 70 samples of each. It can be used for classification and cluster analysis tasks.
To construct the data, seven geometric parameters of each wheat kernel were measured; all of these parameters are real-valued and continuous.
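Since the description highlights cluster analysis, here is a brief sketch of how the three varieties might be recovered with k-means; the file name and the variety column are assumptions about the layout:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical file/column names; the data has seven real-valued features
# and a variety label (Kama, Rosa, Canadian), 70 samples each.
seeds = pd.read_csv("seeds_dataset.csv")
X = StandardScaler().fit_transform(seeds.drop(columns=["variety"]))

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
seeds["cluster"] = kmeans.labels_

# Cross-tabulate recovered clusters against the true varieties.
print(pd.crosstab(seeds["variety"], seeds["cluster"]))
```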
CC0 1.0 Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset was created by Gautier
Released under CC0: Public Domain
This dataset was created by joeland209
haha
This dataset was created by deeplearner
This dataset was created by Justin Chae
This dataset was created by ZykoTsai
This dataset was created by Lavanya Shukla
[Image: kernel_dataclass.png, the Kernel class diagram whose fields are referenced below]
In total, 1,822 kernels used the Kaggle Survey datasets in the years 2017-2022. We organized our data into several distinct datasets, each of which was useful in answering our questions on at least one of the topics. The resulting datasets are briefly described below.
notebooks.zip
Contains the 1,822 raw notebooks, saved as either ipynb or Rmd. 58 notebooks could not be executed in either Python or R, so they were given the extension unknown_format.txt. The name of each file is the notebook_id as listed on kaggle.com and matches notebook_id in the file all_kernels.csv, described below. Among other things, this dataset was used to obtain a per-notebook list of imported libraries, as well as the questions addressed by each notebook.
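A sketch of how the archived files might be paired with their metadata rows; the notebook_id matching follows the description above, while the local paths are assumptions:

```python
import zipfile
from pathlib import Path

import pandas as pd

kernels = pd.read_csv("all_kernels.csv")

# Files are named <notebook_id>.<ext>, where ext is ipynb, Rmd,
# or unknown_format.txt for the 58 non-executable notebooks.
with zipfile.ZipFile("notebooks.zip") as zf:
    by_id = {Path(name).name.split(".")[0]: name for name in zf.namelist()}

kernels["notebook_file"] = kernels["notebook_id"].astype(str).map(by_id)
print(kernels[["notebook_id", "notebook_file"]].head())
```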
all_kernels.csv
Each row of this dataset contains data about one of the 1,822 kernels. The columns correspond to the fields shown in the Kernel class image above. A more detailed overview of the columns can be found on the dataset's Kaggle page.
cleaned_kernels.csv
This is, in effect, the main dataset we used in our competition notebook. We took all_kernels.csv and removed the 233 rows describing kernels that were merely unchanged forks of other kernels.
all_questions.json
Contains all Kaggle Survey questions from the years 2017-2022. In the 2017 survey, the questions were unnumbered, so we numbered them ourselves, keeping the original order and using zero-based indexing. The 2018-2022 surveys have numbered questions, so their numbering was taken unchanged.
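A hedged sketch of reading the questions file; the JSON layout assumed here (a year-to-question-list mapping) is an illustration, not a documented schema:

```python
import json

with open("all_questions.json") as f:
    questions = json.load(f)

# 2017 questions were numbered by the dataset authors, zero-based and in
# the original order; 2018-2022 keep Kaggle's own numbering.
for number, text in enumerate(questions["2017"]):
    print(number, text)
```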
question_map.csv
Looking at the survey questions over several years, one can note that certain questions repeat. For example, every year's survey contains a question like "What is your age?". All such repetitions are captured in this dataset: for each unique question, the question number and the survey year in which it appears are given. The question numbers follow the scheme described above for all_questions.json. Certain questions are worded differently but functionally identical; where such questions were joined, a note was added to alert users of this dataset.
This dataset was created by Abraham Anderson
Open Machine Learning Course mlcourse.ai is designed to perfectly balance theory and practice; therefore, each topic is followed by an assignment with a deadline in a week. You can also take part in several Kaggle Inclass competitions held during the course and write your own tutorials. The next session launches in September 2019. For more info, go to the mlcourse.ai main page.
Outline
This is the list of published articles on medium.com (English), habr.com (Russian), and jqr.com (Chinese). See the Kernels of this Dataset for the same material in English.
1. Exploratory Data Analysis with Pandas: uk, ru, cn, Kaggle Kernel
2. Visual Data Analysis with Python: uk, ru, cn, Kaggle Kernels: part1, part2
3. Classification, Decision Trees and k Nearest Neighbors: uk, ru, cn, Kaggle Kernel
4. Linear Classification and Regression: uk, ru, cn, Kaggle Kernels: part1, part2, part3, part4, part5
5. Bagging and Random Forest: uk, ru, cn, Kaggle Kernels: part1, part2, part3
6. Feature Engineering and Feature Selection: uk, ru, cn, Kaggle Kernel
7. Unsupervised Learning: Principal Component Analysis and Clustering: uk, ru, cn, Kaggle Kernel
8. Vowpal Wabbit: Learning with Gigabytes of Data: uk, ru, cn, Kaggle Kernel
9. Time Series Analysis with Python, part 1: uk, ru, cn. Predicting the future with Facebook Prophet, part 2: uk, cn. Kaggle Kernels: part1, part2
10. Gradient Boosting: uk, ru, cn, Kaggle Kernel
Assignments
Each topic is followed by an assignment. See the demo versions in this Dataset. Solutions will be discussed in the upcoming run of the course.
Kaggle competitions
1. Catch Me If You Can: Intruder Detection through Webpage Session Tracking. Kaggle Inclass
2. How good is your Medium article? Kaggle Inclass
Rating
Throughout the course we maintain a student rating. It takes into account credits scored in assignments and Kaggle competitions. Top students (according to the final rating) will be listed on a special Wiki page.
Community
Discussions between students are held in the #mlcourse_ai channel of the OpenDataScience Slack team. A registration form will be shared prior to the start of the new session.
Collaboration
You can publish Kernels using this Dataset, but please respect others' interests: don't share solutions to assignments or well-performing solutions to Kaggle Inclass competitions. If you notice any typos or errors in the course material, please open an Issue or make a pull request in the course repo. The course is free, but you can support the organizers by making a pledge on Patreon (monthly support) or a one-time payment on Ko-fi.
CC0 1.0 Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset was created by Darien Schettler
Released under CC0: Public Domain
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
This is an enriched version of the Code4ML dataset, a large-scale corpus of annotated Python code snippets, competition summaries, and data descriptions sourced from Kaggle. The initial release includes approximately 2.5 million snippets of machine learning code extracted from around 100,000 Jupyter notebooks. A portion of these snippets has been manually annotated by human assessors through a custom-built, user-friendly interface designed for this task.
The original dataset is organized into multiple CSV files, each containing structured data on different entities:
Table 1. code_blocks.csv structure
| Column | Description |
|---|---|
| code_blocks_index | Global index linking code blocks to markup_data.csv. |
| kernel_id | Identifier for the Kaggle Jupyter notebook from which the code block was extracted. |
| code_block_id | Position of the code block within the notebook. |
| code_block | The actual machine learning code snippet. |
Table 2. kernels_meta.csv structure
| Column | Description |
|---|---|
| kernel_id | Identifier for the Kaggle Jupyter notebook. |
| kaggle_score | Performance metric of the notebook. |
| kaggle_comments | Number of comments on the notebook. |
| kaggle_upvotes | Number of upvotes the notebook received. |
| kernel_link | URL to the notebook. |
| comp_name | Name of the associated Kaggle competition. |
Table 3. competitions_meta.csv structure
| Column | Description |
|---|---|
| comp_name | Name of the Kaggle competition. |
| description | Overview of the competition task. |
| data_type | Type of data used in the competition. |
| comp_type | Classification of the competition. |
| subtitle | Short description of the task. |
| EvaluationAlgorithmAbbreviation | Metric used for assessing competition submissions. |
| data_sources | Links to datasets used. |
| metric type | Class label for the assessment metric. |
Table 4. markup_data.csv structure
| Column | Description |
|---|---|
| code_block | Machine learning code block. |
| too_long | Flag indicating whether the block spans multiple semantic types. |
| marks | Confidence level of the annotation. |
| graph_vertex_id | ID of the semantic type. |
The dataset allows mapping between these tables. For example, code_blocks.csv can be linked to kernels_meta.csv via the kernel_id column, and kernels_meta.csv to competitions_meta.csv via comp_name. To maintain quality, kernels_meta.csv includes only notebooks with available Kaggle scores. In addition, data_with_preds.csv contains automatically classified code blocks, with a mapping back to code_blocks.csv via the code_blocks_index column.
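A minimal sketch of those joins in pandas, using the file and column names given in the tables above:

```python
import pandas as pd

code_blocks = pd.read_csv("code_blocks.csv")
kernels = pd.read_csv("kernels_meta.csv")
competitions = pd.read_csv("competitions_meta.csv")

# code block -> notebook metadata (kernel_id) -> competition metadata (comp_name)
merged = (code_blocks
          .merge(kernels, on="kernel_id")
          .merge(competitions, on="comp_name"))
print(merged.head())
```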
The updated Code4ML 2.0 corpus introduces kernels extracted from Meta Kaggle Code. These kernels correspond to Kaggle competitions launched since 2020. The natural-language descriptions of the competitions are retrieved with the help of an LLM.
Notebooks in kernels_meta2.csv may not have a Kaggle score but include a leaderboard ranking (rank), providing additional context for evaluation.
competitions_meta_2.csv is enriched with data_cards describing the data used in the competitions.
The Code4ML 2.0 corpus is a versatile resource, enabling training and evaluation of models in areas such as: