MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains metadata related to three categories of AI and computer vision applications:
Handwritten Math Solutions: Metadata on images of handwritten math problems with step-by-step solutions.
Multi-lingual Street Signs: Road sign images in various languages, with translations.
Security Camera Anomalies: Surveillance footage metadata distinguishing between normal and suspicious activities.
The dataset is useful for machine learning, image recognition, OCR (Optical Character Recognition), anomaly detection, and AI model training.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset for this project consists of photos of the buildings of the University of Salford. These photos were taken with a mobile phone camera from different angles and at different distances. Even though this task sounds easy, it encountered some challenges, summarized below:
1. Obstacles.
a. Fixed or unremovable objects.
When taking several photos of a building or a landscape from different angles and directions, some of these angles are blocked by a fixed object such as trees and plants, light poles, signs, statues, cabins, bicycle shelters, scooter stands, generators/transformers, construction barriers, construction equipment, or other service equipment, so it is unavoidable that some photos include these objects. This raises three questions:
- Will these objects confuse the model/application we intend to create? That is, will an obstacle prevent the model/application from identifying the designated building?
- Or will the photos be more representative with these objects included, giving the model/application the capability to identify the buildings despite these obstacles?
- What is the maximum detection distance? In other words, how far can the mobile device running the application be from the building before it can no longer detect the designated building?
b. Removable and moving objects.
- Any university is crowded with staff and students, especially during the rush hours of the day, so it is hard to take some photos without a person appearing in them at certain times of day. However, due to privacy concerns and out of respect for those individuals, such photos are better excluded.
- Parked vehicles, trolleys, and service equipment can be obstacles and may appear in the images; they can also block access to some areas, so an image from a certain angle cannot be obtained.
- Animals such as dogs, cats, birds, or even squirrels cannot be avoided in some photos, which raises the same questions as above.
2. Weather.
In a deep learning project, more data means higher accuracy and lower error. At this stage of the project it was agreed to collect 50 photos per building; increasing the number of photos would give more accurate results, but due to the time limitation of this project the number was fixed at 50 per building.
These photos were taken on cloudy days. To expand this work (as future work and recommendations), photos taken on sunny, rainy, foggy, snowy, and other weather-condition days can be included, as can photos at different times of day such as night, dawn, and sunset, to give the designated model every possibility of identifying these buildings in all available circumstances.
University House: 60 images.
Peel Building is an important figure of the University of Salford due to its distinct and amazing exterior design, but unfortunately it was excluded from the selection because of maintenance activities at the time the photos for this project were collected: it is partially covered with scaffolding, with a lot of movement of personnel and equipment. If the supervisor suggests that this would be another challenge to include in the project, then it is mandatory to collect its photos. There are many other buildings at the University of Salford, and to expand the project in the future we can include all of them. The full list of the university's buildings can be reviewed on the interactive map at www.salford.ac.uk/find-us
Expand Further. This project can be improved with many more capabilities; again, due to the time limitation of this project, these improvements can be implemented later as future work. In simple words, this project is to create an application that can display a building's name when a mobile device with a camera is pointed at that building. Future features to be added:
a. Address/location: this will require collecting additional data, namely the longitude and latitude of each building included, or the postcode (which may be the same for several buildings, considering how close they appear in interactive map applications such as Google Maps, Google Earth, or iMaps).
b. Description of the building: what the building is for, which school occupies it, and what facilities it contains.
c. Interior images: all the photos at this stage were taken of the exteriors of the buildings. Will interior photos make an impact on the model/application? For example, if the user is inside Newton or Chapman and opens the application, will the building be identified, especially since the interiors of these buildings have a high level of similarity across corridors, rooms, halls, and labs? Will the furniture and assets act as obstacles or as identification marks?
d. Directions to a specific area/floor inside the building: if interior images succeed with the model/application, it would be a good idea to add a search option that guides the user to a specific area, showing directions to it. For example, if the user is inside Newton Building and searches for lab 141, the application would direct them to the first floor with an interactive arrow that changes as they approach their destination. Alternatively, if the application can identify the building from its interior, a drop-down list could be activated with each floor of that building: if the model/application identifies Newton Building, pressing the drop-down list would present interactive tabs for each floor, and selecting a floor would display the facilities on it. If the model/application identifies another building, it should activate a different number of floors, as buildings differ in their number of floors. This feature could be improved with a voice assistant that directs the user after a search (similar to the voice assistant in Google Maps, but applied to the interiors of the university's buildings).
e. Top view: if a drone with a camera can be afforded, it can provide aerial images and top views of the buildings to add to the model/application. However, these images may face the same situation as the interior images: the buildings can look similar to each other from the top, with other obstacles included, such as water tanks and AC units.
Other Questions:
Will the model/application be reproducible? The presumed answer to this question should be yes, provided the model/application is fed the proper data (images), such as images of restaurants, schools, supermarkets, hospitals, government facilities, etc.
Custom license: https://dataverse.ird.fr/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.23708/N2UY4C
There are already many datasets for computer vision tasks (ImageNet, MS COCO, Pascal VOC, OpenImages, and numerous others), but they all suffer from important biases. One bias of significance for us is data origin: most datasets are composed of data coming from developed countries. Facing this situation, and the need for data with local context in developing countries, we try here to adapt a common data generation process to inclusive data, meaning data drawn from locations and cultural contexts that are unseen or poorly represented. We chose to replicate MS COCO's data generation process, as it is well documented and easy to implement. Data was collected from January to April 2022 through the Flickr platform. This dataset contains the results of our data collection process, as follows:
23 text files containing comma-separated URLs for each of the 23 geographic zones identified in the UN M49 norm. These text files are named according to the geographic zones they cover.
Annotations for 400 images per geographic zone. The annotations are COCO-style and indicate the presence or absence of 91 categories of objects or concepts in the images. They are shared in JSON format.
Licenses for the 400 annotated images per geographic zone, based on the original licenses of the data and specified per image. The licenses are shared in CSV format.
A document explaining the objectives and methodology underlying the data collection, also describing the different components of the dataset.
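As a rough illustration of how these components might be consumed, the sketch below reads one zone's URL list and its COCO-style annotation file. The file names are hypothetical, and the exact JSON schema should be checked against the methodology document shipped with the dataset:

```python
import json

# Hypothetical file names: the actual files are named after the
# UN M49 geographic zones they cover.
URL_LIST = "western_africa.txt"      # comma-separated image URLs
ANNOTATIONS = "western_africa.json"  # COCO-style annotations

# Read the comma-separated URL list for one geographic zone.
with open(URL_LIST) as f:
    urls = [u.strip() for u in f.read().split(",") if u.strip()]
print(f"{len(urls)} image URLs in this zone")

# Load the COCO-style JSON: images, annotations, and the 91 categories.
with open(ANNOTATIONS) as f:
    coco = json.load(f)

categories = {c["id"]: c["name"] for c in coco["categories"]}

# Map each image to the set of category names annotated on it.
per_image = {}
for ann in coco["annotations"]:
    per_image.setdefault(ann["image_id"], set()).add(categories[ann["category_id"]])
```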
This dataset was created by Ismail ELBOUKNIFY
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset captures cultural attitudes towards machine vision technologies as they are expressed in art, games, and narratives. The dataset includes records of 500 creative works (including 77 digital games, 191 digital artworks, and 236 movies, novels, and other narratives) that use or represent machine vision technologies like facial recognition, deepfakes, and augmented reality. The dataset is divided into three main tables, relating to the works, to specific situations in each work involving machine vision technologies, and to the characters that interact with the technologies. Data about each work includes title, author, year and country of publication; types of machine vision technologies featured; topics the work addresses; and sentiments associated with that machine vision usage in the work. In the various works we identified 884 specific situations where machine vision is central. The dataset includes detailed data about each of these situations that describes the actions of human and non-human agents, including machine vision technologies. The dataset is the product of a digital humanities project and can also be viewed as a database at http://machine-vision.no. Data was collected by a team of topic experts who followed an analytical model developed to explore relationships between humans and technologies, inspired by posthumanist and feminist new materialist theories. The project team identified relevant works by searching databases, visiting exhibitions and conferences, reading scholarship, and consulting other experts. The inclusion criteria were creative works (art, games, narratives such as movies and novels) where one of the following machine vision technologies was used in or represented by the work: 3D scans, AI, Augmented reality, Biometrics, Body scans, Camera, Cameraphone, Deepfake, Drones, Emotion recognition, Facial recognition, Filtering, Holograms, Image generation, Interactive panoramas, Machine learning, MicroscopeOrTelescope, Motion tracking, Non-Visible Spectrum, Object recognition, Ocular implant, Satellite images, Surveillance cameras, UGV, Virtual reality, and Webcams. The dataset, as well as the more detailed database, can be viewed, searched, extracted, or otherwise used or reused, and is considered particularly useful for humanities and social science scholars interested in the relationship between technology and culture, and for designers, artists, and scientists developing machine vision technologies.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Explore our detailed Small Object Detection Dataset designed for AI and machine learning applications.
Population distribution: the race distribution is Asians, Caucasians, and Black people; the gender distribution is male and female; the age distribution is from children to the elderly
Collecting environment: indoor and outdoor scenes (such as supermarkets, malls, residential areas, etc.)
Data diversity: different ages, different time periods, different cameras, different human body orientations and postures, different collecting environments
Device: surveillance cameras; the image resolution is not less than 1,920 × 1,080
Data format: the image data format is .jpg; the annotation file format is .json
Annotation content: human body rectangular bounding boxes, 15 human body attributes
Quality requirements: a human body bounding box is qualified when the deviation is not more than 3 pixels, and the qualified rate of bounding boxes shall not be lower than 97%; annotation accuracy of the attributes is over 97%
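A minimal parsing sketch for one of the .json annotation files is given below. The card does not document the JSON schema, so the field names ("annotations", "bbox", "attributes") and the [x, y, width, height] box layout are assumptions to be checked against a real file:

```python
import json

# Hypothetical file name and schema; the card only states that images
# are .jpg and annotations are .json with boxes plus 15 attributes.
with open("example_annotation.json") as f:
    record = json.load(f)

for person in record["annotations"]:
    x, y, w, h = person["bbox"]    # assumed [x, y, width, height] in pixels
    attrs = person["attributes"]   # assumed dict of the 15 human body attributes
    print(f"box=({x}, {y}, {w}, {h}), attributes={sorted(attrs)}")
```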
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains 4,599 high-quality, annotated images of 25 commonly used chemistry lab apparatuses. The images, each showing apparatuses in real-world settings, have been captured from different angles, backgrounds, and distances, with variations in lighting to aid the robustness of object detection models. Every image has been labeled with bounding box annotations in TXT (YOLO) format, containing class IDs and normalized bounding box coordinates, making object detection more precise. The annotations and bounding boxes were built using the Roboflow platform.
To achieve a better learning procedure, the dataset has been split into three sub-datasets: training (70% of the entire dataset), validation (20%), and testing (10%). In addition, all images are scaled to a standard 640x640 pixels and auto-oriented to rectify rotation discrepancies introduced by EXIF metadata. The dataset is structured in three main folders (train, valid, and test), each containing images/ and labels/ subfolders. Every image has a label file containing the class and bounding box data for each annotated object.
The whole dataset features 6,960 labeled instances across 25 apparatus categories, including beakers, conical flasks, measuring cylinders, and test tubes, among others. The dataset can be utilized for the development of automation systems, real-time monitoring and tracking systems, tools for safety monitoring, and AI educational tools.
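As a minimal sketch of reading one of these label files (the file path is hypothetical; the line layout, one object per line as a class ID followed by normalized center/width/height coordinates, is the standard YOLO TXT convention the description refers to):

```python
from pathlib import Path

def load_yolo_labels(label_path, img_w=640, img_h=640):
    """Parse a YOLO-format .txt label file into pixel-space boxes.

    Each line is: <class_id> <cx> <cy> <w> <h>, with coordinates
    normalized to [0, 1]; images in this dataset are scaled to 640x640.
    """
    boxes = []
    for line in Path(label_path).read_text().splitlines():
        if not line.strip():
            continue
        cls, cx, cy, w, h = line.split()
        cx, cy, w, h = (float(v) for v in (cx, cy, w, h))
        x0 = (cx - w / 2) * img_w   # left edge in pixels
        y0 = (cy - h / 2) * img_h   # top edge in pixels
        boxes.append((int(cls), x0, y0, w * img_w, h * img_h))
    return boxes

# Hypothetical path inside the train/labels/ subfolder:
# boxes = load_yolo_labels("train/labels/beaker_001.txt")
```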
GPL-3.0: http://www.gnu.org/licenses/gpl-3.0.en.html
A complete description of this dataset is available at https://robotology.github.io/iCubWorld .
## Overview
Computer Vision is a dataset for object detection tasks - it contains Person Basketball annotations for 6,846 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
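For example, the dataset can be fetched programmatically with the roboflow Python package (pip install roboflow); the API key and the workspace/project slugs below are placeholders to be copied from the dataset's Roboflow page:

```python
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
# Placeholder slugs; use the ones shown in the dataset's download snippet.
project = rf.workspace("your-workspace").project("computer-vision")
dataset = project.version(1).download("coco")  # or "yolov8", "voc", ...
print(dataset.location)  # local folder containing images and annotations
```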
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Description: Human Faces and Objects Dataset (HFO-5000) The Human Faces and Objects Dataset (HFO-5000) is a curated collection of 5,000 images, categorized into three distinct classes: male faces (1,500), female faces (1,500), and objects (2,000). This dataset is designed for machine learning and computer vision applications, including image classification, face detection, and object recognition. The dataset provides high-quality, labeled images with a structured CSV file for seamless integration into deep learning pipelines.
Column Description: The dataset is accompanied by a CSV file that contains essential metadata for each image. The CSV file includes the following columns:
file_name: The name of the image file (e.g., image_001.jpg).
label: The category of the image, with three possible values: "male" (for male face images), "female" (for female face images), or "object" (for images of various objects).
file_path: The full or relative path to the image file within the dataset directory.
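A minimal loading sketch, assuming pandas and a hypothetical CSV file name (the three columns are the ones documented above):

```python
import pandas as pd

# Hypothetical file name for the metadata CSV described above.
df = pd.read_csv("hfo5000_metadata.csv")

# The three documented columns: file_name, label, file_path.
assert {"file_name", "label", "file_path"} <= set(df.columns)

# Class counts should match the stated 1,500/1,500/2,000 split.
print(df["label"].value_counts())

# Example: keep only the face images for a gender-classification task.
faces = df[df["label"].isin(["male", "female"])]
```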
Uniqueness and Key Features:
1) Balanced Distribution: The dataset maintains an even distribution of human faces (male and female) to minimize bias in classification tasks.
2) Diverse Object Selection: The object category consists of a wide variety of items, ensuring robustness in distinguishing between human and non-human entities.
3) High-Quality Images: The dataset consists of clear and well-defined images, suitable for both training and testing AI models.
4) Structured Annotations: The CSV file simplifies dataset management and integration into machine learning workflows.
5) Potential Use Cases: This dataset can be used for tasks such as gender classification, facial recognition benchmarking, human-object differentiation, and transfer learning applications.
Conclusion: The HFO-5000 dataset provides a well-structured, diverse, and high-quality set of labeled images that can be used for various computer vision tasks. Its balanced distribution of human faces and objects ensures fairness in training AI models, making it a valuable resource for researchers and developers. By offering structured metadata and a wide range of images, this dataset facilitates advancements in deep learning applications related to facial recognition and object classification.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a corpus of about 500 computer vision datasets, from which we sampled 114 dataset publications across different vision tasks and coded them for themes through both structured and qualitative content analysis. This work most closely pairs with research question 1 in the genealogies of data project (https://arxiv.org/abs/2007.07399): how do dataset developers in CV and NLP research describe and motivate the decisions that go into their creation?
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Computer Vision Final is a dataset for object detection tasks - it contains Traffic Sign annotations for 2,227 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset collects images (1 enrollment photo and 20 historical photos per identity) and videos (1 indoor, 1 outdoor) from unique identities.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Lab Pics Computer Vision is a dataset for object detection tasks - it contains Beaker annotations for 1,083 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Images of small objects for small instance detections. Currently, four object types are available.
We collected four datasets of small objects from images/videos on the Internet (e.g. YouTube or Google).
Fly Dataset: contains 600 video frames with an average of 86 ± 39 flies per frame (648×72 @ 30 fps). 32 images are used for training (1:6:187) and 50 images for testing (301:6:600); the MATLAB-style start:step:end ranges are expanded in the sketch after this list.
Honeybee Dataset: contains 118 images with an average of 28 ± 6 honeybees per image (640×480). The dataset is divided evenly into training and test sets. Only the first 32 images are used for training.
Fish Dataset: contains 387 video frames with an average of 56 ± 9 fish per frame (300×410 @ 30 fps). 32 images are used for training (1:3:94) and 65 for testing (193:3:387).
Seagull Dataset: contains three high-resolution images (624×964) with an average of 866 ± 107 seagulls per image. The first image is used for training, and the rest for testing.
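The start:step:end expressions above are MATLAB-style inclusive index ranges. A small sketch showing how they expand to the stated split sizes:

```python
def matlab_range(start, step, stop):
    """Expand a MATLAB-style start:step:stop (inclusive) index range."""
    return list(range(start, stop + 1, step))

# Fly dataset: 32 training frames (1:6:187), 50 test frames (301:6:600).
fly_train = matlab_range(1, 6, 187)    # [1, 7, 13, ..., 187]
fly_test = matlab_range(301, 6, 600)   # [301, 307, ..., 595]

# Fish dataset: 32 training frames (1:3:94), 65 test frames (193:3:387).
fish_train = matlab_range(1, 3, 94)
fish_test = matlab_range(193, 3, 387)

assert len(fly_train) == 32 and len(fly_test) == 50
assert len(fish_train) == 32 and len(fish_test) == 65
```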
Citation: Zheng Ma, Lei Yu, and Antoni B. Chan, "Small Instance Detection by Integer Programming on Object Density Maps," in IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Boston, Jun 2015.
original form of dataset available here
The dataset supports developing object detection algorithms that are more accurate at detecting small objects or small instances of objects.
Race distribution: Asians, Caucasians, and Black people
Gender distribution: balanced
Age distribution: ranging from teenagers to the elderly; the middle-aged and the young are the majority
Collecting environment: indoor and outdoor scenes
Data diversity: different shooting heights, different ages, different light conditions, different collecting environments, clothes in different seasons, multiple human poses
Device: cameras
Data format: the image data format is .jpg and the video format is .mp4; the annotation file format is .json, the camera parameter file format is .json, and the point cloud file format is .pcd
Accuracy: pose accuracy exceeds 97%; the accuracy of the labels for gender, race, age, collecting environment, and clothes is more than 97%
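A small reading sketch for the stated file formats; the file names are placeholders, and since the card does not document the JSON schema, the camera-parameter file is loaded as plain JSON. Reading the .pcd point cloud uses the open3d package (pip install open3d):

```python
import json
import open3d as o3d

# Placeholder file names; the card specifies .jpg/.mp4 data, .json
# annotation and camera parameter files, and .pcd point clouds.
pcd = o3d.io.read_point_cloud("sample.pcd")  # point cloud file
print(pcd)  # e.g. "PointCloud with N points."

with open("sample_camera.json") as f:
    camera = json.load(f)  # schema not documented in the card
```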
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
PV Computer Vision Datasets is a dataset for object detection tasks - it contains SP annotations for 4,031 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
AI And Computer Vision V2 is a dataset for object detection tasks - it contains 3 annotations for 343 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
The proje/computer-vision-dataset dataset is hosted on Hugging Face and was contributed by the HF Datasets community.