https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by FLuzmano
Released under CC0: Public Domain
CNN
For Google Colab practice
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset : part1_dataSorted_Diversevul_llama2_dataset
dataset lines : 2768
Kaggle Notebook (for dataset splitting) : https://www.kaggle.com/code/mrappplg/securix-diversevul-dataset
Google Colab Notebook : https://colab.research.google.com/drive/1z6fLQrcMSe1-AVMHp0dp6uDr4RtVIOzF?usp=sharing
The datasets were made by Anubhav Maity: https://github.com/anubhavmaity
I use them to make an example that is trained on Google Colab.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains 509,323 images from HaGRID (HAnd Gesture Recognition Image Dataset) downscaled to 384p. The original dataset is 716GB and contains 552,992 1080p images. I created this sample for a tutorial so readers can use the dataset in the free tiers of Google Colab and Kaggle Notebooks.
Original Authors:
Alexander Kapitanov, Andrey Makhlyarchuk, Karina Kvanchiani
Original Dataset Links
GitHub Kaggle Datasets Page
Object Classes
['call'… See the full description on the dataset page: https://huggingface.co/datasets/cj-mills/hagrid-sample-500k-384p.
This dataset was created during the challengeV2 of the INF473V course at École Polytechnique. It consists of additional images for the dataset, generated with Stable Diffusion. Code used to generate them: https://colab.research.google.com/drive/1zicIWGK7hd-TH_8tNJ4kgxrrPeHsgZWv?usp=sharing
Building this dataset took a very long time (weeks) and gave me extensive data-engineering experience. I used both GitHub and GCP as storage and both Kaggle and Colab to prepare this dataset. It would have been more useful to everyone had I done this much earlier.
All images from the original set are included. To reduce the dataset size, all images have been resized to a minimum dimension of (224320) using the TensorFlow resize API.
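The aspect-preserving resize described above can be sketched in plain Python; this hypothetical helper only computes the target size (the shape you would then hand to a resize call such as `tf.image.resize`), and the helper name and default are assumptions, not the original code:

```python
def target_size(height, width, min_dim=224):
    # Scale so the shorter side equals min_dim, preserving aspect ratio;
    # the rounded (h, w) pair is what a resize API would receive.
    scale = min_dim / min(height, width)
    return round(height * scale), round(width * scale)

# e.g. a 1080x1920 frame maps to (224, 398)
```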
I used Stack Overflow extensively to find the best solutions for many data-engineering tasks; thanks to everyone who solved those issues earlier.
The original dataset (99 GB) can't be used in Colab to train a custom model.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Accident Detection Model was made using YOLOv8, Google Colab, Python, Roboflow, deep learning, OpenCV, machine learning, and artificial intelligence. It can detect an accident from a live camera feed, an image, or a video. The model was trained on a dataset of 3200+ images, which were annotated on Roboflow.
Survey image: https://user-images.githubusercontent.com/78155393/233774342-287492bb-26c1-4acf-bc2c-9462e97a03ca.png
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Use this dataset with Misra's Pandas tutorial: How to use the Pandas GroupBy function | Pandas tutorial
The original dataset came from this site: https://data.cityofnewyork.us/City-Government/NYC-Jobs/kpav-sd4t/data
I used Google Colab to filter the columns with the following Pandas commands. Here's a Colab Notebook you can use with the commands listed below: https://colab.research.google.com/drive/17Jpgeytc075CpqDnbQvVMfh9j-f4jM5l?usp=sharing
Once the csv file is uploaded to Google Colab, use these commands to process the file.
import pandas as pd
# load the file and create a pandas dataframe
df = pd.read_csv('/content/NYC_Jobs.csv')
# keep only these columns
df = df[['Job ID', 'Civil Service Title', 'Agency', 'Posting Type', 'Job Category', 'Salary Range From', 'Salary Range To']]
# save the csv file without the index column
df.to_csv('/content/NYC_Jobs_filtered_cols.csv', index=False)
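Since the filtered file is intended for the GroupBy tutorial, here is a minimal sketch of the kind of `groupby` call that tutorial covers. The column names match the filtered file above, but the sample rows and agency codes are made up for illustration:

```python
import pandas as pd

# Made-up sample rows using the columns kept above
df = pd.DataFrame({
    'Agency': ['DOT', 'DOT', 'DOE'],
    'Salary Range From': [50000, 60000, 45000],
    'Salary Range To': [70000, 80000, 65000],
})

# Average posted salary range per agency
avg_by_agency = df.groupby('Agency')[['Salary Range From', 'Salary Range To']].mean()
print(avg_by_agency)
```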
scikit-learn - version 0.22.2.post1
This is the default scikit-learn version on Google Colab as of November 2021. Different versions of sklearn can produce different results (and will generate errors if two different versions are used in the same task).
Usage:
!pip -q install ../input/sklearn-1-0/scikit_learn-0.22.2.post1-cp37-cp37m-manylinux1_x86_64.whl
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We aim to build a robust shelf-monitoring system that helps storekeepers maintain accurate inventory details, restock items efficiently and on time, and tackle the problem of misplaced items, where an item is accidentally placed in a different location. Our product aims to serve as a store manager that alerts the owner about items that need restocking and about misplaced items.
Edit the custom-yolov4-detector.cfg file in the /darknet/cfg/ directory: set filters = (number of classes + 5) * 3 for each yolo layer, and max_batches = (number of classes) * 2000.
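The two cfg rules above can be sketched as a small helper; the function name is hypothetical (this is not part of darknet), it just evaluates the formulas:

```python
def yolov4_cfg_values(num_classes):
    # filters goes in the [convolutional] block before each [yolo] layer;
    # max_batches sets how many training iterations to run.
    filters = (num_classes + 5) * 3
    max_batches = num_classes * 2000
    return filters, max_batches

# e.g. for 3 classes: filters=24, max_batches=6000
```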
Run the detect.py script to perform the prediction.
## Presenting the predicted result.
The detect.py script has an option to send SMS notifications to the shopkeepers. We have built a front end for building the phone book that collects the shopkeepers' details. It also displays the latest prediction result and model accuracy.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset Card for "hagrid-classification-512p-no-gesture-150k"
This dataset contains 153,735 training images from HaGRID (HAnd Gesture Recognition Image Dataset) modified for image classification instead of object detection. The original dataset is 716GB. I created this sample for a tutorial so readers can use the dataset in the free tiers of Google Colab and Kaggle Notebooks.
Original Authors:
Alexander Kapitanov, Andrey Makhlyarchuk, Karina Kvanchiani… See the full description on the dataset page: https://huggingface.co/datasets/cj-mills/hagrid-classification-512p-no-gesture-150k.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Spectrogram data created from Google/MusicCaps.
The dataset viewer of this repository is truncated, so maybe you should see this one instead.
Dataset information
image (画像): 1025px × 216px
caption: a description of the music
data_idx: which source data the entry was generated from
number: the index of the entry among the 5-second segments
How this dataset was made
Code: https://colab.research.google.com/drive/13m792FEoXszj72viZuBtusYRUL1z6Cu2?usp=sharing Kaggle Notebook used as a reference: … See the full description on the dataset page: https://huggingface.co/datasets/mb23/GraySpectrogram.
This dataset is a modified version of the xView1 dataset, specifically tailored for seamless integration with YOLOv5 in Google Colab. The xView1 dataset originally consists of high-resolution satellite imagery labeled for object detection tasks. In this adapted version, we have preprocessed the data and organized it to facilitate easy usage with YOLOv5, a popular deep learning framework for object detection.
Images: The dataset includes a collection of high-resolution satellite images covering diverse geographic locations. These images have been resized and preprocessed to align with the requirements of YOLOv5, ensuring efficient training and testing.
Object annotations are provided for each image, specifying the bounding boxes and class labels of various objects present in the scenes. The annotations are formatted to match the YOLOv5 input specifications.
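As a rough sketch of what matching "the YOLOv5 input specifications" means for the labels: YOLOv5 expects one text line per object, with a class id followed by a center-format box normalized to the image size. The helper below is a hypothetical illustration, not code from this dataset:

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    # Convert a pixel-space corner box to YOLO's normalized
    # "class x_center y_center width height" label line.
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"
```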
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
It all started during the #StayAtHome period of the 2020 pandemic: some neighbors were worried about trash in Montevideo's containers.
The goal is to automatically distinguish clean containers from dirty ones so that maintenance can be requested.
Want to know more about the entire process? Check out this thread on how it began, and this other one about the version 6 update process.
Data is split into training and testing sets, which are independent. However, each split contains several near-duplicate images (typically the same container from different perspectives or on different days). Image sizes differ a lot.
There are four major sources:
* Images taken from Google Street View; they are 600x600 pixels, automatically collected through its API.
* Images contributed by individual people, most of which I took myself.
* Images taken from social networks (Twitter & Facebook) and news.
* Images contributed by pormibarrio.uy - 17-11-2020
Images were taken of green containers, the most popular kind in Montevideo and also widely used in some other cities.
Current version (clean-dirty-garbage-containers-V6) is also available here or you can download it as follows:
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate 'https://docs.google.com/uc?export=download&id=1mdfJoOrO6MeTc3eMEjIDkAKlwK9bUFg6' -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')&id=1mdfJoOrO6MeTc3eMEjIDkAKlwK9bUFg6" -O clean-dirty-garbage-containers-V6.zip && rm -rf /tmp/cookies.txt
This is especially useful if you want to download it in Google Colab.
This repo contains the code used during the dataset's construction and documentation process, including the baselines for the proposed tasks.
Since this is a hot topic in Montevideo, especially nowadays, with elections next week, it has caught some attention from the local press:
Thanks to every single person who gave me images of their containers. Special thanks to my friend Diego, whose idea of using Google Street View as a data source really helped grow the dataset. And finally to my wife, who supported me during this project and contributed a lot to this dataset.
If you use these data in a publication, presentation, or other research project or product, please use the following citation:
Laguna, Rodrigo. 2021. Clean dirty containers in Montevideo - Version 6.1. url: https://www.kaggle.com/rodrigolaguna/clean-dirty-containers-in-montevideo
@dataset{RLaguna-clean-dirty:2021,
author = {Rodrigo Laguna},
title = {Clean dirty containers in Montevideo},
year = {2021},
url = {https://www.kaggle.com/rodrigolaguna/clean-dirty-containers-in-montevideo},
version = {6.1}
}
I'm on Twitter, @ro_laguna_, or write to me at r.laguna.queirolo at outlook.com
12-09-2020: V3 - Include more training (+676) & testing (+64) samples:
21-12-2020: V4 - Include more training (+367) & testing (+794) samples, including ~400...
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This is a dataset for detecting banana quality using ML. It contains four categories: Unripe, Ripe, Overripe and Rotten. There is an enormous number of images, which will help users train an ML model conveniently and easily.
NOTE: THIS DATASET HAS BEEN TAKEN FROM https://universe.roboflow.com/roboflow-universe-projects/banana-ripeness-classification. I WAS FACING DIFFICULTIES DOWNLOADING THE DATASET DIRECTLY INTO GOOGLE COLAB TO TRAIN MY CNN MODEL AS PART OF A UNIVERSITY PROJECT. ALL CREDIT FOR THIS DATASET, AS FAR AS I KNOW, GOES TO ROBOFLOW. I DO NOT INTEND TO TAKE ANY CREDIT OR UNETHICALLY CLAIM OWNERSHIP; I JUST UPLOADED THE DATASET HERE FOR MY CONVENIENCE. THANK YOU.
https://creativecommons.org/publicdomain/zero/1.0/
Common Voice is a corpus of speech data read by users on the Common Voice website, and based upon text from a number of public domain sources like user submitted blog posts, old books, movies, and other public speech corpora. Its primary purpose is to enable the training and testing of automatic speech recognition (ASR) systems.
In Google Colab, I downloaded the .tar.gz from Mozilla Common Voice, placed the compressed file in a folder, marked the folder as a dataset, and uploaded it directly.
https://creativecommons.org/publicdomain/zero/1.0/
This data comes from the RSNA Intracranial Hemorrhage Detection competition, Stage-2 (test only).
We converted the files from .dcm to .jpg to save space and speed up runs.
IMPORTANT NOTE: this data uses specific preprocessing windows that may not match your Stage-1 submission. So besides using it for educational purposes, if you want to use it in Stage-2, use it at your own risk.
The code used for generating data can be found here : https://colab.research.google.com/drive/1FunZZyl88I_PNqjddGss4wy2MkHpxND9
(1) Most of the code comes from @guiferviz (thanks so much for your contribution): https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection/discussion/109978#latest-656304
(2) The main window ideas are copied from @Appian's GitHub (with slight modification): https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection/discussion/112819#latest-668603
(3) The main window ideas were inspired by @dcstang: https://www.kaggle.com/c/rsna-intracranial-hemorrhage-detection/discussion/110728#latest-659011
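The "preprocessing windows" mentioned above refer to CT windowing before saving as .jpg. A minimal sketch follows; the helper name is hypothetical, and the brain-window center/width shown are a common convention, not necessarily the exact values this dataset used:

```python
import numpy as np

def apply_window(hu, center=40, width=80):
    # Clip Hounsfield units to [center - width/2, center + width/2]
    # and rescale to 0-255 so the slice can be saved as a .jpg.
    lo, hi = center - width / 2, center + width / 2
    clipped = np.clip(hu, lo, hi)
    return ((clipped - lo) / (hi - lo) * 255).astype(np.uint8)
```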
YOLOv7-to-TensorRT converted model file for the wheat detection challenge. The conversion script link is here
https://creativecommons.org/publicdomain/zero/1.0/
This dataset can be used to build a CNN model that classifies whether a shoe is an Adidas or a Nike brand.
The images were pulled from Bing using bing_image_search from PyPI; 400 images of each class were downloaded, and the dataset was then trimmed to 300 (some unrelated images were removed in the process of compiling the dataset).