This dataset was created by Regi
This dataset was created by Fatih Öztürk
This dataset was created by Fatih Ozturk
This is the classic Titanic dataset provided in the Kaggle competition and then cleaned in one of the most popular Kernels there. Please see the Kernel titled "A Data Science Framework: To Achieve 99% Accuracy" for a great lesson in data science. This Kernel gives a great explanation of the thinking behind this data cleaning, as well as a very professional demonstration of the technologies and skills needed to do it. It then provides an overview of many ML techniques, and it is copiously and meticulously documented with many useful citations.
Of course, data cleaning is an essential skill in data science, but I wanted to use this data to study other machine learning techniques. So I found and used this well-known dataset, which has been cleaned to a benchmark accepted by many.
This dataset was created by deeplearner
The data is from: https://github.com/opencv/opencv/tree/master/data/haarcascades, included here for easy usage with OpenCV from Kaggle Kernels.
Haar Cascades for Face, Profile, Eyes, Smile, Upper Body Detection.
The copyright belongs to the authors as mentioned in each file. Data source is from: https://github.com/opencv/opencv/tree/master/data/haarcascades
Use this data to extract a person's face, eyes, smile, profile face, or upper body from both still images and videos.
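As a rough illustration of how one of these cascade files might be used on a still image, here is a minimal sketch. It assumes OpenCV's Python bindings (`cv2`) are installed and that the dataset is mounted at `/kaggle/input/haarcascades`; that directory name and the helper names `cascade_path` and `detect_faces` are hypothetical, not part of the dataset.

```python
from pathlib import Path

# Assumed mount point of this dataset inside a Kaggle Kernel (hypothetical name).
CASCADE_DIR = Path("/kaggle/input/haarcascades")


def cascade_path(name, base_dir=CASCADE_DIR):
    """Return the full path of a cascade file inside base_dir."""
    return str(Path(base_dir) / name)


def detect_faces(image_path, cascade_file="haarcascade_frontalface_default.xml"):
    """Return a list of (x, y, w, h) boxes for faces found in a still image."""
    import cv2  # imported lazily so the path helper works without OpenCV

    classifier = cv2.CascadeClassifier(cascade_path(cascade_file))
    if classifier.empty():
        raise FileNotFoundError(f"could not load cascade: {cascade_file}")

    image = cv2.imread(str(image_path))
    if image is None:
        raise FileNotFoundError(f"could not read image: {image_path}")

    # Haar cascades operate on single-channel (grayscale) images.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    boxes = classifier.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    return [tuple(int(v) for v in box) for box in boxes]
```

The `scaleFactor` and `minNeighbors` values are common starting points rather than tuned settings. For video, the same classifier could be applied frame by frame.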
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Zafar
Released under CC0: Public Domain
Prediction files
This dataset was created by YuanXu
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Darien Schettler
Released under CC0: Public Domain
This dataset was created by luyaxin
This dataset was created by Maksim Filin
This dataset was created by My Ho63
This dataset was created by Ogün Can Kaya
This dataset was created by Vijayabhaskar J
This dataset contains submissions (and scores) obtained from well-performing kernels published in the https://www.kaggle.com/c/cat-in-the-dat competition.
Links are in the 'kernels.csv' file.
Regards to the great authors!
This dataset was created by HowYouDoin'
Pictures for my kernel https://www.kaggle.com/hocop1/unet-character-detector
SGEMM GPU kernel performance Dataset
The SGEMM GPU kernel performance data set consists of 14 features and 241600 records. It measures the running time of a matrix-matrix product A*B = C, where all matrices have size 2048 x 2048, using a parameterizable SGEMM GPU kernel with 241600 possible parameter combinations. Of the 14 features, the first 10 are ordinal and can take up to 4 different power-of-two values, and the last 4 are binary.
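To put the record count in perspective, the short sketch below computes the largest grid the feature description alone would allow (4 values for each of the 10 ordinal features, 2 for each of the 4 binary ones) and compares it with the 241600 records. This is only an upper bound under the assumption that every ordinal feature takes exactly 4 values; the description says "up to 4", so the true grid may be smaller.

```python
# Upper bound on the parameter grid, assuming all 10 ordinal features
# take exactly 4 values and all 4 binary features take 2 values.
ORDINAL_FEATURES = 10
ORDINAL_LEVELS = 4
BINARY_FEATURES = 4
RECORDS = 241_600  # number of records stated in the description

upper_bound = ORDINAL_LEVELS ** ORDINAL_FEATURES * 2 ** BINARY_FEATURES
fraction = RECORDS / upper_bound

print(f"upper bound on combinations: {upper_bound:,}")
print(f"share covered by the records: {fraction:.2%}")
```

Under that assumption the records cover only a small share of the unconstrained grid.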
https://archive.ics.uci.edu/ml/datasets/SGEMM+GPU+kernel+performance
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
An experiment applying the same strategy as Beluga's Keras dataset to PyTorch models. This dataset has the weights for several of the models included in PyTorch. To use these weights, they need to be copied when the kernel runs, like in this example.
PyTorch models included:
Inception-V3
ResNet18
ResNet50
Beluga's Keras dataset PyTorch
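The copy step mentioned above could look roughly like the following sketch, which uses only the Python standard library. The source directory, the cache directory, the `*.pth` file pattern, and the helper name `copy_weights` are all assumptions for illustration; the directory PyTorch actually reads cached weights from depends on the installed version, so check it before relying on this.

```python
import shutil
from pathlib import Path


def copy_weights(src_dir, cache_dir, pattern="*.pth"):
    """Copy weight files matching pattern from src_dir into cache_dir.

    Files already present in cache_dir are left untouched.
    Returns the list of target paths, sorted by file name.
    """
    src = Path(src_dir)
    dst = Path(cache_dir)
    dst.mkdir(parents=True, exist_ok=True)

    targets = []
    for weight_file in sorted(src.glob(pattern)):
        target = dst / weight_file.name
        if not target.exists():
            shutil.copy(weight_file, target)
        targets.append(target)
    return targets


# Hypothetical usage inside a Kernel (both paths are assumptions):
# copy_weights("/kaggle/input/pytorch-pretrained-models",
#              Path.home() / ".cache" / "torch" / "hub" / "checkpoints")
```

Once the files are in the cache directory, model constructors that would normally download pretrained weights should be able to find them locally, provided the file names match what the library expects.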
If you are interested in joining Kaggle University Club, please e-mail Jessica Li at lijessica@google.com
This Hackathon is open to all undergraduate, master, and PhD students who are part of the Kaggle University Club program. The Hackathon provides students with a chance to build capacity via hands-on ML, learn from one another, and engage in a self-defined project that is meaningful to their careers.
Teams must register via Google Form to be eligible for the Hackathon. The Hackathon starts on Monday, November 12, 2018 and ends on Monday, December 10, 2018. Teams have one month to work on a team submission. Teams must do all work within the Kernel editor and set Kernel(s) to public at all times.
The freestyle format of hackathons has time and again stimulated groundbreaking and innovative data insights and technologies. The Kaggle University Club Hackathon recreates this environment virtually on our platform. We challenge you to build a meaningful project around the UCI Machine Learning - Drug Review Dataset. Teams are free to let their creativity run and propose methods to analyze this dataset and form interesting machine learning models.
Machine learning has permeated nearly all fields and disciplines of study. One hot topic is using natural language processing and sentiment analysis to identify, extract, and make use of subjective information. The UCI ML Drug Review dataset provides patient reviews on specific drugs along with related conditions and a 10-star patient rating system reflecting overall patient satisfaction. The data was obtained by crawling online pharmaceutical review sites. This data was published in a study on sentiment analysis of drug experience over multiple facets, e.g., sentiments learned on specific aspects such as effectiveness and side effects (see the acknowledgments section to learn more).
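One simple way to turn the 10-star rating into a sentiment target is to bucket it into classes. The sketch below is only an illustration: the function name and the cut-offs (4 and below negative, 7 and above positive) are assumptions, not something defined by the dataset or the Hackathon.

```python
def rating_to_sentiment(rating, negative_max=4, positive_min=7):
    """Map a 1-10 star rating to 'negative', 'neutral', or 'positive'.

    The thresholds are illustrative assumptions and can be adjusted.
    """
    if not 1 <= rating <= 10:
        raise ValueError(f"rating must be between 1 and 10, got {rating}")
    if rating <= negative_max:
        return "negative"
    if rating >= positive_min:
        return "positive"
    return "neutral"


# Example: label a handful of ratings.
labels = [rating_to_sentiment(r) for r in (1, 5, 9)]
```

Labels produced this way could serve as targets for a text classifier trained on the review text.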
The sky's the limit here in terms of what your team can do! Teams are free to add supplementary datasets in conjunction with the drug review dataset in their Kernel. Discussion is highly encouraged within the forum and Slack so everyone can learn from their peers.
Here are just a couple ideas as to what you could do with the data:
There is no one correct answer to this Hackathon, and teams are free to define the direction of their own project. That being said, there are certain core elements generally found across all outstanding Kernels on the Kaggle platform. The best Kernels are:
Teams with top submissions have a chance to receive exclusive Kaggle University Club swag and be featured on our official blog and across social media.
IMPORTANT: Teams must set all Kernels to public at all times. This is so we can track each team's progression, but more importantly it encourages collaboration, productive discussion, and healthy inspiration to all teams. It is not so that teams can simply copycat good ideas. If a team's Kernel isn't their own organic work, it will not be considered a top submission. Teams must come up with a project on their own.
The final Kernel submission for the Hackathon must contain the following information: