23 datasets found
  1. Cats&Dogs (Pickle)

    • kaggle.com
    Updated Feb 27, 2020
    Cite
    FLuzmano (2020). Cats&Dogs (Pickle) [Dataset]. https://www.kaggle.com/fariziluzman/catsdogs-pickle/activity
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 27, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    FLuzmano
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by FLuzmano

    Released under CC0: Public Domain

    Contents

    CNN

    For Google colab practice

  2. part1_dataSorted_Diversevul_llama2_dataset

    • huggingface.co
    Updated Mar 19, 2024
    Cite
    Atharva Prashant Pawar (2024). part1_dataSorted_Diversevul_llama2_dataset [Dataset]. https://huggingface.co/datasets/atharvapawar/part1_dataSorted_Diversevul_llama2_dataset
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 19, 2024
    Authors
    Atharva Prashant Pawar
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset: part1_dataSorted_Diversevul_llama2_dataset

      Dataset lines: 2768
      Kaggle Notebook (for dataset splitting): https://www.kaggle.com/code/mrappplg/securix-diversevul-dataset
      Google Colab Notebook: https://colab.research.google.com/drive/1z6fLQrcMSe1-AVMHp0dp6uDr4RtVIOzF?usp=sharing

  3. sports-classification

    • kaggle.com
    zip
    Updated Jul 22, 2019
    Cite
    Tam NguyenVan (2019). sports-classification [Dataset]. https://www.kaggle.com/prjnce0fpersja/sportsclassification
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Jul 22, 2019
    Authors
    Tam NguyenVan
    Description

    The datasets were made by Mr. Anubhav Maity (https://github.com/anubhavmaity).

    I use them to build an example that is trained on Google Colab.

  4. hagrid-sample-500k-384p

    • huggingface.co
    Updated Jul 3, 2023
    + more versions
    Cite
    Christian Mills (2023). hagrid-sample-500k-384p [Dataset]. https://huggingface.co/datasets/cj-mills/hagrid-sample-500k-384p
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 3, 2023
    Authors
    Christian Mills
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    This dataset contains 509,323 images from HaGRID (HAnd Gesture Recognition Image Dataset) downscaled to 384p. The original dataset is 716GB and contains 552,992 1080p images. I created this sample for a tutorial so readers can use the dataset in the free tiers of Google Colab and Kaggle Notebooks.

      Original Authors:
    

    Alexander Kapitanov Andrey Makhlyarchuk Karina Kvanchiani

      Original Dataset Links
    

    GitHub Kaggle Datasets Page

      Object Classes
    

    ['call'… See the full description on the dataset page: https://huggingface.co/datasets/cj-mills/hagrid-sample-500k-384p.

  5. Generated-images

    • kaggle.com
    Updated Jun 1, 2023
    Cite
    Antoine Bonnet (2023). Generated-images [Dataset]. https://www.kaggle.com/datasets/antoinebonnet2001/generated-images
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Antoine Bonnet
    Description

    This dataset was made during challenge V2 of the INF473V course at École Polytechnique. It consists of additional images for the dataset, generated with Stable Diffusion. Code used to generate them: https://colab.research.google.com/drive/1zicIWGK7hd-TH_8tNJ4kgxrrPeHsgZWv?usp=sharing

  6. gld20GB

    • kaggle.com
    Updated Sep 24, 2020
    Cite
    JkReddy (2020). gld20GB [Dataset]. https://www.kaggle.com/jkreddy/gld20gb
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 24, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    JkReddy
    Description

    Context

    It took a very long time (weeks) to make this dataset, which gave me extensive data engineering experience. I used both GitHub and GCP for storage, and both Kaggle and Colab to prepare the dataset. It would have been more useful to everyone had I done this much earlier.

    Content

    All images from the original set are included. To reduce the dataset size, all images have been resized to a minimum dimension of (224320) using the TensorFlow resize API.

    Acknowledgements

    I used Stack Overflow extensively to find the best solutions for many data engineering tasks. Thanks to all those who solved these issues earlier.

    Inspiration

    The original dataset, at 99GB, can't be used in Colab to train a custom model.

  7. Accident Detection Model Dataset

    • universe.roboflow.com
    zip
    Updated Apr 8, 2024
    Cite
    Accident detection model (2024). Accident Detection Model Dataset [Dataset]. https://universe.roboflow.com/accident-detection-model/accident-detection-model/model/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 8, 2024
    Dataset authored and provided by
    Accident detection model
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Accident Bounding Boxes
    Description

    Accident-Detection-Model

    Accident Detection Model is built using YOLOv8, Google Colab, Python, Roboflow, deep learning, OpenCV, machine learning, and artificial intelligence. It can detect an accident from a live camera feed, an image, or a video. The model is trained on a dataset of 3200+ images, which were annotated on Roboflow.

    Problem Statement

    • Road accidents are a major problem in India, with thousands of people losing their lives and many more suffering serious injuries every year.
    • According to the Ministry of Road Transport and Highways, India witnessed around 4.5 lakh road accidents in 2019, which resulted in the deaths of more than 1.5 lakh people.
    • The age range that is most severely hit by road accidents is 18 to 45 years old, which accounts for almost 67 percent of all accidental deaths.

    Accidents survey

    Survey image: https://user-images.githubusercontent.com/78155393/233774342-287492bb-26c1-4acf-bc2c-9462e97a03ca.png

    Literature Survey

    • Sreyan Ghosh (Mar 2019): the goal is to develop a system using a deep learning convolutional neural network trained to classify video frames as accident or non-accident.
    • Deeksha Gour (Sep 2019): uses computer vision technology, neural networks, deep learning, and various approaches and algorithms to detect objects.

    Research Gap

    • Lack of real-world data - We trained the model on more than 3200 images.
    • Large interpretability time and space needed - We use Google Colab to reduce the interpretability time and space required.
    • Outdated versions in previous works - We are using the latest version, YOLOv8.

    Proposed methodology

    • We are using YOLOv8 to train on our custom dataset of 3200+ images, collected from different platforms.
    • After training for 25 iterations, the model is ready to detect an accident with a significant probability.

    Model Set-up

    Preparing Custom dataset

    • We have collected 1200+ images from different sources like YouTube, Google images, Kaggle.com etc.
    • Then we annotated all of them individually with a tool called Roboflow.
    • During annotation we marked the images with no accident as NULL, and we drew a box around the site of the accident on the images that show one.
    • Then we divided the data set into train, val, test in the ratio of 8:1:1
    • At the final step we downloaded the dataset in yolov8 format.
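
The 8:1:1 train/val/test split described above can be sketched in a few lines of Python. This is only an illustration, not the authors' procedure (they split the data in Roboflow): the function name `split_dataset`, the fixed seed, and the placeholder file names are all assumptions.

```python
import random


def split_dataset(items, ratios=(8, 1, 1), seed=0):
    """Shuffle items and split them into train/val/test by the given ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed keeps the split reproducible
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # any remainder ends up in the test set
    return train, val, test


# Hypothetical file names standing in for the annotated images.
images = [f"img_{i:04d}.jpg" for i in range(1200)]
train, val, test = split_dataset(images)
print(len(train), len(val), len(test))  # 960 120 120
```

Shuffling before slicing matters here: images collected from the same source tend to sit next to each other, so an unshuffled split could put one source entirely in the test set.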

    Using Google Colab

    • We are using Google Colaboratory to code this model because Google Colab provides a GPU, which is faster than our local environments.
    • With Google Colab you can write and run Python code in Jupyter notebooks, which let you blend code, text, and visualisations in a single document.
    • Users can run individual code cells in Jupyter notebooks and quickly view the results, which is helpful for experimenting and debugging. Notebooks also support visualisations built with well-known frameworks like Matplotlib, Seaborn, and Plotly.
    • In Google Colab, we first changed the runtime from TPU to GPU.
    • We cross-checked it by running the command ‘!nvidia-smi’

    Coding

    • First, we installed YOLOv8 with the command ‘!pip install ultralytics==8.0.20’
    • Next, we checked the installation by importing it with ‘from ultralytics import YOLO’ and ‘from IPython.display import display, Image’
    • Then we connected and mounted our Google Drive account with ‘from google.colab import drive’ followed by ‘drive.mount('/content/drive')’
    • Then we changed to the project directory with ‘%cd /content/drive/MyDrive/Accident Detection model’ and started the training process with ‘!yolo task=detect mode=train model=yolov8s.pt data=data.yaml epochs=1 imgsz=640 plots=True’
    • After training, we validated and tested our model with ‘!yolo task=detect mode=val model=runs/detect/train/weights/best.pt data=data.yaml’ and ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt conf=0.25 source=data/test/images’
    • To get a result for any video or image, we ran ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt source="/content/drive/MyDrive/Accident-Detection-model/data/testing1.jpg/mp4"’
    • The results are stored in the runs/detect/predict folder.
      Hence our model is trained, validated and tested to be able to detect accidents on any video or image.
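
The `yolo` commands above all share one shape: a task, a mode, and a list of `key=value` options. As a rough sketch, a small helper can assemble them so the train, val, and predict calls stay consistent. The helper name `yolo_command` is hypothetical and not part of Ultralytics; the option values are copied from the commands quoted above.

```python
def yolo_command(task, mode, **options):
    """Build a `yolo` CLI string from a task, a mode, and key=value options."""
    parts = ["yolo", f"task={task}", f"mode={mode}"]
    parts += [f"{key}={value}" for key, value in options.items()]
    return " ".join(parts)


# Values taken from the commands listed in the description.
weights = "runs/detect/train/weights/best.pt"
train_cmd = yolo_command("detect", "train", model="yolov8s.pt",
                         data="data.yaml", epochs=1, imgsz=640, plots=True)
val_cmd = yolo_command("detect", "val", model=weights, data="data.yaml")
predict_cmd = yolo_command("detect", "predict", model=weights,
                           conf=0.25, source="data/test/images")

for cmd in (train_cmd, val_cmd, predict_cmd):
    print(cmd)
```

In a Colab cell the resulting string could then be run with IPython's variable expansion, for example `!{train_cmd}`.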

    Challenges I ran into

    I majorly ran into 3 problems while making this model

    • I had difficulty saving the results in a folder. Because YOLOv8 is the latest version, it is still under development. After reading some blogs and referring to Stack Overflow, I learned that the new v8 needs an extra option, ''save=true'', which let me save my results in a folder.
    • I was facing a problem on the CVAT website because I was not sure what
  8. NYC Jobs Dataset (Filtered Columns)

    • kaggle.com
    Updated Oct 5, 2022
    Cite
    Jeffery Mandrake (2022). NYC Jobs Dataset (Filtered Columns) [Dataset]. https://www.kaggle.com/datasets/jefferymandrake/nyc-jobs-filtered-cols
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 5, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jeffery Mandrake
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Area covered
    New York
    Description

    Use this dataset with Misra's Pandas tutorial: How to use the Pandas GroupBy function | Pandas tutorial

    The original dataset came from this site: https://data.cityofnewyork.us/City-Government/NYC-Jobs/kpav-sd4t/data

    I used Google Colab to filter the columns with the following Pandas commands. Here's a Colab Notebook you can use with the commands listed below: https://colab.research.google.com/drive/17Jpgeytc075CpqDnbQvVMfh9j-f4jM5l?usp=sharing

    Once the csv file is uploaded to Google Colab, use these commands to process the file.

    import pandas as pd

    # load the file and create a pandas dataframe
    df = pd.read_csv('/content/NYC_Jobs.csv')

    # keep only these columns
    df = df[['Job ID', 'Civil Service Title', 'Agency', 'Posting Type', 'Job Category', 'Salary Range From', 'Salary Range To']]

    # save the csv file without the index column
    df.to_csv('/content/NYC_Jobs_filtered_cols.csv', index=False)
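
For readers who want to see what the column filter does without installing pandas, here is a dependency-free sketch using only the standard `csv` module. It is an alternative to the pandas commands above, not the author's method; the function name `filter_columns` and the one-row sample (including its "Extra Column") are made up for illustration.

```python
import csv
import io

# Column names copied from the pandas commands in the description.
KEEP = ["Job ID", "Civil Service Title", "Agency", "Posting Type",
        "Job Category", "Salary Range From", "Salary Range To"]


def filter_columns(src, dst, keep=KEEP):
    """Copy only the columns listed in `keep` from CSV stream src to dst."""
    reader = csv.DictReader(src)
    # extrasaction="ignore" silently drops every column not in `keep`.
    writer = csv.DictWriter(dst, fieldnames=keep, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Tiny made-up sample; the real file has many more columns and rows.
sample = io.StringIO(
    "Job ID,Agency,Posting Type,Extra Column,Civil Service Title,"
    "Job Category,Salary Range From,Salary Range To\n"
    "1,DEPT A,Internal,ignored,ANALYST,Finance,50000,70000\n"
)
out = io.StringIO()
filter_columns(sample, out)
print(out.getvalue())
```

With real files, `src` and `dst` would be handles opened with `open(path, newline="")`, as the `csv` module documentation recommends.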

  9. scikit-learn v 0.22.2.post1

    • kaggle.com
    Updated Nov 6, 2021
    Cite
    Nhu Hoang (2021). scikit-learn v 0.22.2.post1 [Dataset]. https://www.kaggle.com/geninhu/scikitlearn-v-0222post1/code
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 6, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nhu Hoang
    Description

    scikit-learn - version 0.22.2.post1

    This is the default scikit-learn version on Google Colab as of November 2021. Different versions of sklearn produce different results (and mixing two versions in the same task will raise errors).

    Usage:

    !pip -q install ../input/sklearn-1-0/scikit_learn-0.22.2.post1-cp37-cp37m-manylinux1_x86_64.whl
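
Since the point of pinning is that both environments run the same version, it can help to verify what is actually installed after the `pip` step. The sketch below uses the standard `importlib.metadata` module (Python 3.8+); the helper names `installed_version` and `matches_expected` are illustrative, not part of scikit-learn.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_expected(package, expected):
    """True only when `package` is installed at exactly `expected`."""
    return installed_version(package) == expected


found = installed_version("scikit-learn")
print("scikit-learn:", found or "not installed")
print("matches 0.22.2.post1:", matches_expected("scikit-learn", "0.22.2.post1"))
```

Running the same check in both Colab and the Kaggle notebook before loading a saved model is a cheap way to catch a version mismatch early.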

  10. Robust Shelf Monitoring Dataset

    • universe.roboflow.com
    zip
    Updated Dec 14, 2022
    Cite
    Shelf Monitoring (2022). Robust Shelf Monitoring Dataset [Dataset]. https://universe.roboflow.com/shelf-monitoring/robust-shelf-monitoring/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 14, 2022
    Dataset authored and provided by
    Shelf Monitoring
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Stock Of Products In Shelf Bounding Boxes
    Description

    Robust Shelf Monitoring

    We aim to build a Robust Shelf Monitoring system that helps storekeepers maintain accurate inventory details, re-stock items efficiently and on time, and tackle the problem of misplaced items, where an item is accidentally placed at a different location. Our product aims to serve as a store manager that alerts the owner about items that need re-stocking and about misplaced items.

    Training the model:

    • Unzip the labelled dataset from kaggle and store it to your google drive.
    • Follow the tutorial and update the training parameters in custom-yolov4-detector.cfg file in /darknet/cfg/ directory.
    • filters = (number of classes + 5) * 3 for each yolo layer.
    • max_batches = (number of classes) * 2000
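
The two formulas above are easy to get wrong when editing the cfg file by hand, so here is a small sketch that computes both values from the number of classes. The function name `yolo_cfg_values` is made up for illustration; the formulas are the ones stated in the list.

```python
def yolo_cfg_values(num_classes):
    """Compute the cfg values that depend on the number of classes."""
    filters = (num_classes + 5) * 3   # set for each yolo layer, per the list above
    max_batches = num_classes * 2000
    return {"filters": filters, "max_batches": max_batches}


for n in (1, 3, 80):
    print(n, yolo_cfg_values(n))
# 1 {'filters': 18, 'max_batches': 2000}
# 3 {'filters': 24, 'max_batches': 6000}
# 80 {'filters': 255, 'max_batches': 160000}
```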

    Steps to run the prediction colab notebook:

    1. Install the required dependencies: pymongo, dnspython.
    2. Clone the darknet repository and the required Python scripts.
    3. Mount the Google Drive containing the weight file.
    4. Copy the pre-trained weight file to the yolo content directory.
    5. Run the detect.py script to perform the prediction.

    Presenting the predicted result

    The detect.py script has an option to send SMS notifications to the shopkeepers. We have built a front end for building the phone book that collects the shopkeepers' details. It also displays the latest prediction result and model accuracy.

  11. hagrid-classification-512p-no-gesture-150k

    • huggingface.co
    Updated Apr 2, 2025
    + more versions
    Cite
    Christian Mills (2025). hagrid-classification-512p-no-gesture-150k [Dataset]. https://huggingface.co/datasets/cj-mills/hagrid-classification-512p-no-gesture-150k
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 2, 2025
    Authors
    Christian Mills
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Dataset Card for "hagrid-classification-512p-no-gesture-150k"

    This dataset contains 153,735 training images from HaGRID (HAnd Gesture Recognition Image Dataset) modified for image classification instead of object detection. The original dataset is 716GB. I created this sample for a tutorial so readers can use the dataset in the free tiers of Google Colab and Kaggle Notebooks.

      Original Authors:
    

    Alexander Kapitanov Andrey Makhlyarchuk Karina Kvanchiani… See the full description on the dataset page: https://huggingface.co/datasets/cj-mills/hagrid-classification-512p-no-gesture-150k.

  12. cassava_in_class_ef_b5_folds_0_a_4

    • kaggle.com
    zip
    Updated Jul 31, 2021
    + more versions
    Cite
    João Luiz de Souza Torres (2021). cassava_in_class_ef_b5_folds_0_a_4 [Dataset]. https://www.kaggle.com/joaoluizsouzatorres/cassava-in-class-ef-b5-folds-0-a-4
    Explore at:
    Available download formats: zip (1596542791 bytes)
    Dataset updated
    Jul 31, 2021
    Authors
    João Luiz de Souza Torres
    Description
  13. GraySpectrogram

    • huggingface.co
    Updated Oct 31, 2023
    Cite
    make brain project 2023 (2023). GraySpectrogram [Dataset]. https://huggingface.co/datasets/mb23/GraySpectrogram
    Explore at:
    Dataset updated
    Oct 31, 2023
    Dataset authored and provided by
    make brain project 2023
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

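    Data created by converting Google/MusicCaps into spectrograms.

    The dataset viewer of this repository is truncated, so you may want to see this one instead.

      Dataset information

    image: 1025px × 216px
    caption: description of the music
    data_idx: which source data item the sample was generated from
    number: position of the sample among the 5-second segments
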
      How this dataset was made
    

    Code: https://colab.research.google.com/drive/13m792FEoXszj72viZuBtusYRUL1z6Cu2?usp=sharing Reference Kaggle Notebook:… See the full description on the dataset page: https://huggingface.co/datasets/mb23/GraySpectrogram.

  14. xView1 dataset yolov5

    • kaggle.com
    Updated Nov 29, 2023
    Cite
    Luigi Scotto Rosato (2023). xView1 dataset yolov5 [Dataset]. https://www.kaggle.com/datasets/luigiscottorosato/xview1-dataset-yolov5
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Luigi Scotto Rosato
    Description

    xView1 Adapted for YOLOv5 in Colab

    Overview:

    This dataset is a modified version of the xView1 dataset, specifically tailored for seamless integration with YOLOv5 in Google Colab. The xView1 dataset originally consists of high-resolution satellite imagery labeled for object detection tasks. In this adapted version, we have preprocessed the data and organized it to facilitate easy usage with YOLOv5, a popular deep learning framework for object detection.

    Dataset Contents:

    Images: The dataset includes a collection of high-resolution satellite images covering diverse geographic locations. These images have been resized and preprocessed to align with the requirements of YOLOv5, ensuring efficient training and testing.

    Annotations:

    Object annotations are provided for each image, specifying the bounding boxes and class labels of various objects present in the scenes. The annotations are formatted to match the YOLOv5 input specifications.
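
As an illustration of what "formatted to match the YOLOv5 input specifications" usually means: YOLOv5 label files hold one line per object, with a class index followed by the box centre, width, and height, all normalized to the image size. The sketch below converts a pixel-coordinate box to that form. It is not the conversion script used for this dataset, and the function name and sample numbers are made up.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box to a normalized YOLO label line."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 20x20 pixel box in a 100x200 image (illustrative values).
print(to_yolo_line(0, 10, 20, 30, 40, 100, 200))
# 0 0.200000 0.150000 0.200000 0.100000
```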

    Usage Instructions:

    1. Download the dataset files, including images and annotations.
    2. Clone the YOLOv5 repository in Colab.
    3. Move dataset files (train.txt and val.txt) to the yolov5 directory.
    4. Use the provided .yaml for training.
  15. Clean dirty containers in Montevideo

    • kaggle.com
    Updated Aug 21, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rodrigo Laguna (2021). Clean dirty containers in Montevideo [Dataset]. https://www.kaggle.com/rodrigolaguna/clean-dirty-containers-in-montevideo/code
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 21, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rodrigo Laguna
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Montevideo
    Description

    Context

    It all started during the #StayAtHome period of the 2020 pandemic: some neighbors were worried about trash in Montevideo's containers.

    The goal is to automatically distinguish clean from dirty containers in order to request maintenance.

    Want to know more about the entire process? Check out this thread on how it began, and this other one about the version 6 update process.

    Content

    Data is split into training and testing sets, which are independent. However, each split contains several near-duplicate images (typically the same container from different perspectives or on different days). Image sizes differ a lot among them.

    There are four major sources:

    • Images taken from Google Street View; they are 600x600 pixels and were automatically collected through its API.
    • Images contributed by individual persons, most of which I took myself.
    • Images taken from social networks (Twitter & Facebook) and news.
    • Images contributed by pormibarrio.uy - 17-11-2020

    Images were taken of green containers, the most popular in Montevideo, which are also widely used in some other cities.

    Current version (clean-dirty-garbage-containers-V6) is also available here, or you can download it as follows:

    wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1mdfJoOrO6MeTc3eMEjIDkAKlwK9bUFg6' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1mdfJoOrO6MeTc3eMEjIDkAKlwK9bUFg6" -O clean-dirty-garbage-containers-V6.zip && rm -rf /tmp/cookies.txt

    This is especially useful if you want to download it in Google Colab.

    This repo contains the code used during its building and documentation process, including the baselines for the proposed tasks.

    Dataset on news

    Since this is a hot topic in Montevideo, especially nowadays, with elections next week, it caught some attention from the local press:

    Acknowledgements

    Thanks to every single person who gave me images of their containers. Special thanks to my friend Diego, whose idea of using Google Street View as a source of data really helped to grow the dataset. And finally to my wife, who supported me during this project and contributed a lot to this dataset.

    Citation

    If you use these data in a publication, presentation, or other research project or product, please use the following citation:

    Laguna, Rodrigo. 2021. Clean dirty containers in Montevideo - Version 6.1. url: https://www.kaggle.com/rodrigolaguna/clean-dirty-containers-in-montevideo

    @dataset{RLaguna-clean-dirty:2021,
    author = {Rodrigo Laguna},
    title = {Clean dirty containers in Montevideo},
    year = {2021},
    url = {https://www.kaggle.com/rodrigolaguna/clean-dirty-containers-in-montevideo},
    version = {6.1}
    }
    

    Contact

    I'm on Twitter, @ro_laguna_, or write to me at r.laguna.queirolo at outlook.com

    Future steps:

    • Add images from mapillary, an open source project similar to GoogleStreetView.
    • Keep going on with manually taken images.
    • Add any image from anyone who would like to contribute.
    • Develop & deploy a bot to automatically report containers' status.
    • Translate docs to Spanish
    • Crop images to let one and only one container per image, taking most of the image

    Changelog

    • 19-05-2020: V1 - Initial version
    • 20-05-2020: V2 - Include more training samples
    • 12-09-2020: V3 - Include more training (+676) & testing (+64) samples:

      • train/clean from 574 to 1005 (+431)
      • train/dirty from 365 to 610 (+245)
      • test/clean from 100 to 128 (+28)
      • test/dirty from 100 to 136 (+36)
    • 21-12-2020: V4 - Include more training (+367) & testing (+794) samples, including ~400...

  16. Banana Classification

    • kaggle.com
    Updated Apr 23, 2024
    Cite
    Atri Thakar (2024). Banana Classification [Dataset]. https://www.kaggle.com/datasets/atrithakar/banana-classification/code
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 23, 2024
    Dataset provided by
    Kaggle
    Authors
    Atri Thakar
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This is a dataset for detecting banana quality using ML. It contains four categories: Unripe, Ripe, Overripe and Rotten. The dataset has a large number of images, which helps users train an ML model conveniently and easily.

    NOTE: THIS DATASET HAS BEEN PICKED FROM https://universe.roboflow.com/roboflow-universe-projects/banana-ripeness-classification. I WAS FACING DIFFICULTIES WHILE DOWNLOADING DATASET DIRECTLY TO THE GOOGLE COLAB TO TRAIN MY CNN MODEL AS A PART OF UNIVERSITY PROJECT. ALL CREDITS FOR THIS DATASET, AS FAR AS MY KNOWLEDGE GOES, GOES TO ROBOFLOW. I DO NOT INTEND TO TAKE ANY CREDITS MYSELF OR UNETHICALLY CLAIM OWNERSHIP, I JUST UPLOADED DATASET HERE FOR MY CONVENIENCE, THANK YOU.

  17. Common Voice Corpus 5.1

    • kaggle.com
    zip
    Updated Sep 15, 2023
    Cite
    Krish Baisoya (2023). Common Voice Corpus 5.1 [Dataset]. https://www.kaggle.com/datasets/krishbaisoya/cv-en-5
    Explore at:
    Available download formats: zip (54099708635 bytes)
    Dataset updated
    Sep 15, 2023
    Authors
    Krish Baisoya
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Common Voice is a corpus of speech data read by users on the Common Voice website, and based upon text from a number of public domain sources like user submitted blog posts, old books, movies, and other public speech corpora. Its primary purpose is to enable the training and testing of automatic speech recognition (ASR) systems.

    How it was collected

    In Google Colab, I downloaded the .tar.gz from Common Voice (Mozilla), placed the compressed file in a folder, marked the folder as a dataset, and uploaded it as-is.

  18. RSNA Test jpg (512x512) Stage2

    • kaggle.com
    zip
    Updated Nov 5, 2019
    Cite
    Jung (2019). RSNA Test jpg (512x512) Stage2 [Dataset]. https://www.kaggle.com/ratthachat/rsnajpg512stage2
    Explore at:
    Available download formats: zip (5332050421 bytes)
    Dataset updated
    Nov 5, 2019
    Authors
    Jung
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Short Description

    This data comes from the RSNA-Intracranial-Hemorrhage-Detection competition (Stage 2, test set only).

    We converted the images from .dcm to .jpg to save space and run faster.

    IMPORTANT NOTE: this data uses a particular set of preprocessing windows that may not match your Stage-1 submission. So apart from educational purposes, if you want to use it for Stage 2, use it at your own risk.
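
For readers unfamiliar with the term, a "window" in CT preprocessing clips the raw intensity values to a range around a chosen centre and rescales that range to 0..255 before saving an 8-bit image. The sketch below shows the general idea only. The centre and width used here (40 and 80) are illustrative assumptions, not the windows used to build this dataset, which are defined in the linked Colab notebook.

```python
def apply_window(value, center, width):
    """Clip an intensity to [center - width/2, center + width/2], scale to 0..255."""
    low = center - width / 2
    high = center + width / 2
    clipped = min(max(value, low), high)
    return int((clipped - low) / (high - low) * 255)


# Illustrative intensities: far below, at the low edge, centre, high edge, far above.
row = [-1000, 0, 40, 80, 1000]
print([apply_window(v, center=40, width=80) for v in row])
# [0, 0, 127, 255, 255]
```

Two images of the same scan produced with different windows can look very different, which is why a model trained on one windowing may not transfer to data prepared with another.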

    The code used for generating data can be found here : https://colab.research.google.com/drive/1FunZZyl88I_PNqjddGss4wy2MkHpxND9

    Credits :

    (1) Most of the code comes from @guiferviz (thanks so much for your contribution): https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection/discussion/109978#latest-656304

    (2) The main windows ideas are copied from @Appian github (with little modification) https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection/discussion/112819#latest-668603

    (3) The main windows ideas are inspired by @dcstang https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection/discussion/110728#latest-659011

  19. wheat_yolov7_tensorrt

    • kaggle.com
    Updated Sep 8, 2022
    Cite
    Prateek Gupta (2022). wheat_yolov7_tensorrt [Dataset]. https://www.kaggle.com/datasets/iamprateek/wheat-yolov7-tensorrt
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 8, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Prateek Gupta
    Description

    YOLOv7 to TensorRT converted model file for wheat detection challenge. The conversion script link is here

  20. Nike,Adidas Shoes for Image Classification Dataset

    • kaggle.com
    Updated Jul 24, 2022
    Cite
    Ifeanyi Nneji (2022). Nike,Adidas Shoes for Image Classification Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/3980041
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 24, 2022
    Dataset provided by
    Kaggle
    Authors
    Ifeanyi Nneji
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset can be used to build a CNN model that can classify if a shoe is an Adidas or Nike brand.

    The images were pulled from Bing using bing_image_search from PyPI. 400 images of each class were downloaded, and the dataset was then trimmed to 300 (some unrelated images were removed in the process of compiling the dataset).

    Link to Notebook
