100+ datasets found
  1. Manoeuvring Kaggle Kernel and Data Environment

    • kaggle.com
    Updated Apr 11, 2019
    Cite
    Regi (2019). Manoeuvring Kaggle Kernel and Data Environment [Dataset]. https://www.kaggle.com/regivm/kernel/tasks
    Dataset updated
    Apr 11, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Regi
    Description

    Dataset

    This dataset was created by Regi

    Contents

  2. ExploRNA_input

    • kaggle.com
    Updated Jan 28, 2021
    Cite
    Fatih Ozturk (2021). ExploRNA_input [Dataset]. https://www.kaggle.com/datasets/fatihozturk/fixed-draw
    Dataset updated
    Jan 28, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Fatih Ozturk
    Description

    Dataset

    This dataset was created by Fatih Öztürk

    Contents

  3. ExploRNA_input

    • kaggle.com
    zip
    Updated Jan 22, 2021
    Cite
    Fatih Ozturk (2021). ExploRNA_input [Dataset]. https://www.kaggle.com/fatihozturk/fixed-draw
    Available download formats: zip (894773063 bytes)
    Dataset updated
    Jan 22, 2021
    Authors
    Fatih Ozturk
    Description

    Dataset

    This dataset was created by Fatih Ozturk

    Contents

  4. Titanic Dataset - cleaned

    • kaggle.com
    Updated Aug 9, 2019
    Cite
    WinstonSDodson (2019). Titanic Dataset - cleaned [Dataset]. https://www.kaggle.com/winstonsdodson/titanic-dataset-cleaned/kernels
    Dataset updated
    Aug 9, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    WinstonSDodson
    Description

    This is the classic Titanic dataset provided in the Kaggle competition and then cleaned in one of the most popular Kernels there. Please see the Kernel titled "A Data Science Framework: To Achieve 99% Accuracy" for a great lesson in data science. That Kernel gives a great explanation of the thinking behind this data cleaning, as well as a very professional demonstration of the technologies and skills needed to do it. It then provides an overview of many ML techniques, and it is copiously and meticulously documented with many useful citations.

    Of course, data cleaning is an essential skill in data science but I wanted to use this data for a study of other machine learning techniques. So, I found and used this set of data that is well known and cleaned to a benchmark accepted by many.

  5. ImagesForKernel

    • kaggle.com
    Updated Jul 9, 2020
    Cite
    deeplearner (2020). ImagesForKernel [Dataset]. https://www.kaggle.com/adarshpathak/imagesforkernel/tasks
    Dataset updated
    Jul 9, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    deeplearner
    Description

    Dataset

    This dataset was created by deeplearner

    Contents

  6. Haar Cascades for Face Detection

    • kaggle.com
    Updated Dec 21, 2019
    Cite
    Gabriel Preda (2019). Haar Cascades for Face Detection [Dataset]. https://www.kaggle.com/gpreda/haar-cascades-for-face-detection/code
    Dataset updated
    Dec 21, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Gabriel Preda
    Description

    Context

    The data is from: https://github.com/opencv/opencv/tree/master/data/haarcascades, included here for easy usage with OpenCV from Kaggle Kernels.

    Content

    Haar Cascades for Face, Profile, Eyes, Smile, Upper Body Detection.

    Acknowledgements

    The copyright belongs to the authors as mentioned in each file. Data source is from: https://github.com/opencv/opencv/tree/master/data/haarcascades

    Inspiration

    Use this data to extract a person's face, eyes, smile, profile face, and upper body from both still images and videos.

  7. The All in One Model Prediction Files

    • kaggle.com
    Updated Mar 28, 2018
    Cite
    Zafar (2018). The All in One Model Prediction Files [Dataset]. https://www.kaggle.com/datasets/fizzbuzz/the-all-in-one-model-prediction-files/code
    Dataset updated
    Mar 28, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Zafar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Zafar

    Released under CC0: Public Domain

    Contents

    Prediction files

  8. 4717-kernel-3

    • kaggle.com
    Updated Jun 7, 2023
    Cite
    YuanXu (2023). 4717-kernel-3 [Dataset]. https://www.kaggle.com/datasets/xuyuan/4717kernel3
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    YuanXu
    Description

    Dataset

    This dataset was created by YuanXu

    Contents

  9. CSV Files Used in My Kernel

    • kaggle.com
    Updated Jun 2, 2018
    Cite
    Darien Schettler (2018). CSV Files Used in My Kernel [Dataset]. https://www.kaggle.com/dschettler8845/csv-files-used-in-my-kernel/code
    Dataset updated
    Jun 2, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Darien Schettler
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Darien Schettler

    Released under CC0: Public Domain

    Contents

  10. PetFinder External Data

    • kaggle.com
    Updated Mar 22, 2019
    Cite
    luyaxin (2019). PetFinder External Data [Dataset]. https://www.kaggle.com/datasets/m7catsue/petfinder-external-data
    Dataset updated
    Mar 22, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    luyaxin
    Description

    Dataset

    This dataset was created by luyaxin

    Contents

  11. infer-kernel

    • kaggle.com
    Updated Oct 24, 2019
    Cite
    Maksim Filin (2019). infer-kernel [Dataset]. https://www.kaggle.com/xsardas/inferkernel/kernels
    Dataset updated
    Oct 24, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Maksim Filin
    Description

    Dataset

    This dataset was created by Maksim Filin

    Contents

  12. Sentiment bags of words

    • kaggle.com
    Updated Jun 6, 2020
    Cite
    My Ho63 (2020). Sentiment bags of words [Dataset]. https://www.kaggle.com/myho63/sentiment-bags-of-words/activity
    Dataset updated
    Jun 6, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    My Ho63
    Description

    Dataset

    This dataset was created by My Ho63

    Contents

  13. Regression Kernel

    • kaggle.com
    zip
    Updated Jan 24, 2019
    Cite
    Ogün Can Kaya (2019). Regression Kernel [Dataset]. https://www.kaggle.com/datasets/joousep/regression-kernel/discussion
    Available download formats: zip (11828 bytes)
    Dataset updated
    Jan 24, 2019
    Authors
    Ogün Can Kaya
    Description

    Dataset

    This dataset was created by Ogün Can Kaya

    Contents

  14. fastai2 wheels

    • kaggle.com
    Updated Jun 25, 2020
    Cite
    Vijayabhaskar J (2020). fastai2 wheels [Dataset]. https://www.kaggle.com/vijayabhaskar96/fastai2-wheels/kernels
    Dataset updated
    Jun 25, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Vijayabhaskar J
    Description

    Dataset

    This dataset was created by Vijayabhaskar J

    Contents

  15. Cat in dat: Kernels

    • kaggle.com
    zip
    Updated Sep 6, 2019
    Cite
    Pavel Prokhorov (2019). Cat in dat: Kernels [Dataset]. https://www.kaggle.com/datasets/pavelvpster/cat-in-dat-kernels
    Available download formats: zip (11470236 bytes)
    Dataset updated
    Sep 6, 2019
    Authors
    Pavel Prokhorov
    Description

    Context

    This dataset contains submissions (and scores) obtained from well-performing kernels published in the https://www.kaggle.com/c/cat-in-the-dat competition.

    Links are in 'kernels.csv' file.

    Regards to great authors!

  16. Model_kernel_howud

    • kaggle.com
    Updated Jan 6, 2020
    Cite
    HowYouDoin' (2020). Model_kernel_howud [Dataset]. https://www.kaggle.com/datasets/namaniitb/model-kernel-howud/suggestions
    Dataset updated
    Jan 6, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    HowYouDoin'
    Description

    Dataset

    This dataset was created by HowYouDoin'

    Contents

  17. unet_characters_pictures

    • kaggle.com
    zip
    Updated Sep 13, 2019
    Cite
    Ruslan Baynazarov (2019). unet_characters_pictures [Dataset]. https://www.kaggle.com/hocop1/unet-characters-pictures
    Available download formats: zip (346426 bytes)
    Dataset updated
    Sep 13, 2019
    Authors
    Ruslan Baynazarov
    Description
  18. sgemm-kernel

    • kaggle.com
    Updated Sep 5, 2020
    Cite
    Sujeeth Shetty (2020). sgemm-kernel [Dataset]. https://www.kaggle.com/isujeeth/sgemm/code
    Dataset updated
    Sep 5, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sujeeth Shetty
    Description

    Context

    SGEMM GPU kernel performance Dataset

    Content

    The dataset covers SGEMM GPU kernel performance and consists of 14 features and 241600 records. It measures the running time of a matrix-matrix product A*B = C, where all matrices have size 2048 x 2048, using a parameterizable SGEMM GPU kernel with 241600 possible parameter combinations. Of the 14 features, the first 10 are ordinal and can each take up to 4 different power-of-two values, and the last 4 are binary.
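
    As a rough illustration of the feature layout described above, the following Python sketch checks that a row of 14 feature values has the stated shape: 10 power-of-two values followed by 4 binary flags. The function names are hypothetical, the column order is an assumption, and the check does not enforce the "up to 4 different values" limit per feature.

    ```python
    def is_power_of_two(value):
        """Return True for positive integers that are exact powers of two."""
        return isinstance(value, int) and value > 0 and (value & (value - 1)) == 0


    def is_valid_sgemm_row(features):
        """Check the assumed layout: 10 power-of-two values, then 4 binary flags."""
        if len(features) != 14:
            return False
        ordinal, binary = features[:10], features[10:]
        return (all(is_power_of_two(v) for v in ordinal)
                and all(v in (0, 1) for v in binary))
    ```

    A check like this can catch misaligned columns after loading the data, before any model is fitted.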

    Acknowledgements

    https://archive.ics.uci.edu/ml/datasets/SGEMM+GPU+kernel+performance

  19. Pretrained PyTorch models

    • kaggle.com
    zip
    Updated Oct 6, 2017
    Cite
    Pedro Lima (2017). Pretrained PyTorch models [Dataset]. https://www.kaggle.com/pvlima/pretrained-pytorch-models
    Available download formats: zip (239593921 bytes)
    Dataset updated
    Oct 6, 2017
    Authors
    Pedro Lima
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Context

    An experiment applying the same strategy as Beluga's Keras dataset to PyTorch models. This dataset has the weights for several of the models included in PyTorch. To use these weights, they need to be copied when the kernel runs, like in this example.
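
    A minimal Python sketch of the copy step might look like the following. It is not the author's example: the function name, the `*.pth` pattern, and any directory paths passed in are assumptions for illustration, and the actual file names in the dataset may differ.

    ```python
    import shutil
    from pathlib import Path


    def copy_weights(src_dir, cache_dir, pattern="*.pth"):
        """Copy weight files matching `pattern` from src_dir into cache_dir.

        Both directories are supplied by the caller; nothing here is specific
        to Kaggle. Files already present in cache_dir are left untouched.
        """
        src = Path(src_dir)
        dst = Path(cache_dir)
        dst.mkdir(parents=True, exist_ok=True)  # create the cache folder if missing
        copied = []
        for weight_file in sorted(src.glob(pattern)):
            target = dst / weight_file.name
            if not target.exists():
                shutil.copy(weight_file, target)
            copied.append(target)
        return copied
    ```

    The caller would point `src_dir` at the read-only dataset folder and `cache_dir` at whatever writable location the model-loading code searches for weights.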

    Content

    PyTorch models included:

    • Inception-V3

    • ResNet18

    • ResNet50

    Acknowledgements

    Beluga's Keras dataset; PyTorch

  20. UCI ML Drug Review dataset

    • kaggle.com
    Updated Dec 13, 2018
    Cite
    Jessica Li (2018). UCI ML Drug Review dataset [Dataset]. https://www.kaggle.com/jessicali9530/kuc-hackathon-winter-2018/home
    Dataset updated
    Dec 13, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jessica Li
    Description

    This dataset was used for the Winter 2018 Kaggle University Club Hackathon and is now publicly available. See the Acknowledgments section for citation and licensing. Note: The types of data and recommendation-based solutions provided by the contestants are purely for NLP learning purposes. They are not suitable for real-world drug recommendation solutions.

    Welcome to the Kaggle University Club Hackathon!

    If you are interested in joining Kaggle University Club, please e-mail Jessica Li at lijessica@google.com

    This Hackathon is open to all undergraduate, master, and PhD students who are part of the Kaggle University Club program. The Hackathon provides students with a chance to build capacity via hands-on ML, learn from one another, and engage in a self-defined project that is meaningful to their careers.

    Teams must register via Google Form to be eligible for the Hackathon. The Hackathon starts on Monday, November 12, 2018 and ends on Monday, December 10, 2018. Teams have one month to work on a team submission. Teams must do all work within the Kernel editor and set Kernel(s) to public at all times.

    Prompt

    The freestyle format of hackathons has time and again stimulated groundbreaking and innovative data insights and technologies. The Kaggle University Club Hackathon recreates this environment virtually on our platform. We challenge you to build a meaningful project around the UCI Machine Learning - Drug Review Dataset. Teams are free to let their creativity run and propose methods to analyze this dataset and form interesting machine learning models.

    Machine learning has permeated nearly all fields and disciplines of study. One hot topic is using natural language processing and sentiment analysis to identify, extract, and make use of subjective information. The UCI ML Drug Review dataset provides patient reviews on specific drugs along with related conditions and a 10-star patient rating system reflecting overall patient satisfaction. The data was obtained by crawling online pharmaceutical review sites. This data was published in a study on sentiment analysis of drug experience over multiple facets, e.g. sentiments learned on specific aspects such as effectiveness and side effects (see the acknowledgments section to learn more).

    The sky's the limit here in terms of what your team can do! Teams are free to add supplementary datasets in conjunction with the drug review dataset in their Kernel. Discussion is highly encouraged within the forum and Slack so everyone can learn from their peers.

    Here are just a couple ideas as to what you could do with the data:

    • Classification: Can you predict the patient's condition based on the review?
    • Regression: Can you predict the rating of the drug based on the review?
    • Sentiment analysis: What elements of a review make it more helpful to others? Which patients tend to have more negative reviews? Can you determine if a review is positive, neutral, or negative?
    • Data visualizations: What kind of drugs are there? What sorts of conditions do these patients have?
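
    For the positive/neutral/negative idea above, one possible starting point is to derive a coarse label from the 10-star rating. The Python sketch below does this; the thresholds (4 and below negative, 7 and above positive) and the function name are assumptions for illustration, not something the dataset or the hackathon prescribes.

    ```python
    def rating_to_sentiment(rating, negative_max=4, positive_min=7):
        """Map a 1-10 star rating to a coarse sentiment label.

        The cut-offs are illustrative defaults; adjust them to suit the analysis.
        """
        if not 1 <= rating <= 10:
            raise ValueError("rating must be between 1 and 10")
        if rating <= negative_max:
            return "negative"
        if rating >= positive_min:
            return "positive"
        return "neutral"
    ```

    Labels produced this way could serve as targets for a review-text classifier, with the caveat that the choice of cut-offs changes the class balance.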

    Top Submissions

    There is no one correct answer to this Hackathon, and teams are free to define the direction of their own project. That being said, there are certain core elements generally found across all outstanding Kernels on the Kaggle platform. The best Kernels are:

    1. Complex: How many domains of analysis and topics does this Kernel cover? Does it attempt machine learning methods? Does the Kernel offer a variety of unique analyses and interesting conclusions or solutions?
    2. Original: What is the subject matter of this Kernel? Does it have a well-defined and interesting project scope, narrative or problem? Could the results make an impact? Is it thought provoking?
    3. Approachable: How easy is it to understand this Kernel? Are all thought processes clear? Is the code clean, with useful comments? Are visualizations and processes articulated and self-explanatory?

    Teams with top submissions have a chance to receive exclusive Kaggle University Club swag and be featured on our official blog and across social media.

    IMPORTANT: Teams must set all Kernels to public at all times. This is so we can track each team's progression, but more importantly it encourages collaboration, productive discussion, and healthy inspiration to all teams. It is not so that teams can simply copycat good ideas. If a team's Kernel isn't their own organic work, it will not be considered a top submission. Teams must come up with a project on their own.

    Submission Styling

    The final Kernel submission for the Hackathon must contain the following information:

    • All team members added as collaborators to the Kernel
    • Somewhere at the top of your Kernel, find a space to put down all team member names, university name, club name, and team name (as specified whe...