100+ datasets found
  1. Face Image Database

    • kaggle.com
    Updated Mar 24, 2021
    Cite
    Yogesh Kantak (2021). Face Image Database [Dataset]. https://www.kaggle.com/yk7283/face-image-database/tasks
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 24, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yogesh Kantak
    Description

    Dataset

    This dataset was created by Yogesh Kantak

    Contents

  2. Data from: Tiny Images Dataset

    • academictorrents.com
    bittorrent
    Updated Jul 3, 2020
    Cite
    Antonio Torralba and Rob Fergus and William T Freeman (2020). Tiny Images Dataset [Dataset]. https://academictorrents.com/details/325fc900c2c7bb7a0cfcfd45851a65c2f5b5391d
    Explore at:
    Available download formats: bittorrent (426335124834)
    Dataset updated
    Jul 3, 2020
    Dataset authored and provided by
    Antonio Torralba and Rob Fergus and William T Freeman
    License

    https://academictorrents.com/nolicensespecified

    Description

    With the advent of the Internet, billions of images are now freely available online and constitute a dense sampling of the visual world. Using a variety of non-parametric methods, we explore this world with the aid of a large dataset of 79,302,017 images collected from the Internet. Motivated by psychophysical results showing the remarkable tolerance of the human visual system to degradations in image resolution, the images in the dataset are stored as 32 x 32 color images. Each image is loosely labeled with one of the 75,062 non-abstract nouns in English, as listed in the Wordnet lexical database. Hence the image database gives a comprehensive coverage of all object categories and scenes. The semantic information from Wordnet can be used in conjunction with nearest-neighbor methods to perform object classification over a range of semantic levels minimizing the effects of labeling noise. For certain classes that are particularly prevalent in the dataset, such as people, we are able to

  3. Google SERP Data, Web Search Data, Google Images Data | Real-Time API

    • datarade.ai
    .json, .csv
    Updated May 17, 2024
    Cite
    OpenWeb Ninja (2024). Google SERP Data, Web Search Data, Google Images Data | Real-Time API [Dataset]. https://datarade.ai/data-products/openweb-ninja-google-data-google-image-data-google-serp-d-openweb-ninja
    Explore at:
    Available download formats: .json, .csv
    Dataset updated
    May 17, 2024
    Dataset authored and provided by
    OpenWeb Ninja
    Area covered
    Uganda, Burundi, Panama, Barbados, Tokelau, South Georgia and the South Sandwich Islands, Ireland, Grenada, Virgin Islands (U.S.), Uruguay
    Description

    OpenWeb Ninja's Google Images Data (Google SERP Data) API provides real-time image search capabilities for images sourced from all public sources on the web.

    The API enables you to search and access more than 100 billion images from across the web including advanced filtering capabilities as supported by Google Advanced Image Search. The API provides Google Images Data (Google SERP Data) including details such as image URL, title, size information, thumbnail, source information, and more data points. The API supports advanced filtering and options such as file type, image color, usage rights, creation time, and more. In addition, any Advanced Google Search operators can be used with the API.

    OpenWeb Ninja's Google Images Data & Google SERP Data API common use cases:

    • Creative Media Production: Enhance digital content with a vast array of real-time images, ensuring engaging and brand-aligned visuals for blogs, social media, and advertising.

    • AI Model Enhancement: Train and refine AI models with diverse, annotated images, improving object recognition and image classification accuracy.

    • Trend Analysis: Identify emerging market trends and consumer preferences through real-time visual data, enabling proactive business decisions.

    • Innovative Product Design: Inspire product innovation by exploring current design trends and competitor products, ensuring market-relevant offerings.

    • Advanced Search Optimization: Improve search engines and applications with enriched image datasets, providing users with accurate, relevant, and visually appealing search results.

    OpenWeb Ninja's Annotated Imagery Data & Google SERP Data Stats & Capabilities:

    • 100B+ Images: Access an extensive database of over 100 billion images.

    • Images Data from all Public Sources (Google SERP Data): Benefit from a comprehensive aggregation of image data from various public websites, ensuring a wide range of sources and perspectives.

    • Extensive Search and Filtering Capabilities: Utilize advanced search operators and filters to refine image searches by file type, color, usage rights, creation time, and more, making it easy to find exactly what you need.

    • Rich Data Points: Each image comes with more than 10 data points, including URL, title (annotation), size information, thumbnail, and source information, providing a detailed context for each image.

  4. Optical Coherence Tomography Image Retinal Database

    • openicpsr.org
    • search.gesis.org
    Updated Feb 15, 2019
    Cite
    Peyman Gholami; Vasudevan Lakshminarayanan (2019). Optical Coherence Tomography Image Retinal Database [Dataset]. http://doi.org/10.3886/E108503V1
    Explore at:
    Dataset updated
    Feb 15, 2019
    Dataset provided by
    University of Waterloo
    Authors
    Peyman Gholami; Vasudevan Lakshminarayanan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    An open source Optical Coherence Tomography Image Database containing different retinal OCT images with different pathological conditions. Please use the following citation if you use the database: Peyman Gholami, Priyanka Roy, Mohana Kuppuswamy Parthasarathy, Vasudevan Lakshminarayanan, "OCTID: Optical Coherence Tomography Image Database", arXiv preprint arXiv:1812.07056, (2018). For more information and details about the database see: https://arxiv.org/abs/1812.07056

  5. Congress of Neurological Surgeons Online Image Database

    • dknet.org
    • scicrunch.org
    Updated Jan 29, 2022
    Cite
    (2022). Congress of Neurological Surgeons Online Image Database [Dataset]. http://identifiers.org/RRID:SCR_006310
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    Data set of almost 2,000 neurosurgical images, searchable using a variety of search options.

  6. Data from: WebVision Dataset

    • paperswithcode.com
    Updated Mar 18, 2017
    Cite
    Wen Li; Li-Min Wang; Wei Li; Eirikur Agustsson; Luc van Gool (2017). WebVision Dataset [Dataset]. https://paperswithcode.com/dataset/webvision-database
    Explore at:
    Dataset updated
    Mar 18, 2017
    Authors
    Wen Li; Li-Min Wang; Wei Li; Eirikur Agustsson; Luc van Gool
    Description

    The WebVision dataset is designed to facilitate research on learning visual representations from noisy web data. It is a large-scale web image dataset that contains more than 2.4 million images crawled from the Flickr website and Google Images search.

    The same 1,000 concepts as in the ILSVRC 2012 dataset are used for querying images, so that existing approaches can be directly investigated and compared with models trained on the ILSVRC 2012 dataset. This also makes it possible to study the dataset bias issue in a large-scale scenario. The textual information accompanying those images (e.g., caption, user tags, or description) is also provided as additional meta information. A validation set containing 50,000 images (50 images per category) is provided to facilitate algorithmic development.

  7. USDA ARS Image Gallery

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    • +2more
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). USDA ARS Image Gallery [Dataset]. https://catalog.data.gov/dataset/usda-ars-image-gallery-7f166
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    This Image Gallery is provided as a complimentary source of high-quality digital photographs available from the Agricultural Research Service information staff. Photos in the Image Gallery (over 2,000 .jpegs) are copyright-free, public domain images unless otherwise indicated.

    Resources in this dataset:

    • Resource Title: USDA ARS Image Gallery (Web page). File Name: Web Page, url: https://www.ars.usda.gov/oc/images/image-gallery/ Over 2000 copyright-free images from ARS staff.

  8. Images Alike

    • kaggle.com
    Updated Jan 2, 2021
    Cite
    Alin Cijov (2021). Images Alike [Dataset]. https://www.kaggle.com/alincijov/images-alike/notebooks
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 2, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Alin Cijov
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Image Similarity

    The 'images' folder contains pairs of look-alike images (each image is stored separately from its look-alike). The 'validate.csv' file lists the uids of each look-alike pair under 'image_a' and 'image_b'.

    Challenge

    The main challenge is to train a model so that for each image in 'images' folder, we can predict its look alike image. Use the 'validate.csv' to make sure that your model is accurate.
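
    One way to check predictions against 'validate.csv' is sketched below. It assumes the file has a header row with columns named image_a and image_b (the description names these fields, but the exact layout is an assumption), and that predictions are held in a hypothetical dict mapping each image uid to its predicted look-alike uid. The uids in the toy data are made up.

```python
import csv
import io


def load_pairs(csv_text):
    """Parse validate.csv-style text into a symmetric uid -> look-alike uid map."""
    pairs = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        a, b = row["image_a"], row["image_b"]
        pairs[a] = b
        pairs[b] = a
    return pairs


def pair_accuracy(predictions, pairs):
    """Fraction of images in `pairs` whose predicted look-alike is correct."""
    if not pairs:
        return 0.0
    correct = sum(1 for uid, match in pairs.items() if predictions.get(uid) == match)
    return correct / len(pairs)


# Toy data standing in for the real file (assumed column names, invented uids).
sample_csv = "image_a,image_b\nu1,u2\nu3,u4\n"
truth = load_pairs(sample_csv)
predicted = {"u1": "u2", "u2": "u1", "u3": "u1", "u4": "u3"}
accuracy = pair_accuracy(predicted, truth)
print(accuracy)
```

    Treating each pair symmetrically means a model is scored on both directions of every match, which is one reasonable reading of "for each image ... predict its look alike image".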

    Pattern

    Cornell University - Computer Vision and Pattern Recognition

  9. LIVE-itw Dataset

    • paperswithcode.com
    Updated Dec 24, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Deepti Ghadiyaram; Alan C. Bovik (2023). LIVE-itw Dataset [Dataset]. https://paperswithcode.com/dataset/live-itw
    Explore at:
    Dataset updated
    Dec 24, 2023
    Authors
    Deepti Ghadiyaram; Alan C. Bovik
    Description

    Image quality assessment (IQA) databases enable researchers to evaluate the performance of IQA algorithms and contribute towards attaining the ultimate goal of objective quality assessment research: matching human perception. Most publicly available image quality databases have been created under highly controlled conditions by introducing graded simulated distortions onto high-quality photographs. However, images captured using typical real-world mobile camera devices are usually afflicted by complex mixtures of multiple distortions, which are not necessarily well-modeled by the synthetic distortions found in existing databases. Our newly designed and created LIVE In the Wild Image Quality Challenge Database contains widely diverse authentic image distortions on a large number of images captured using a representative variety of modern mobile devices. We also designed and implemented a new online crowdsourcing system, which we have used to conduct a very large-scale, multi-month image quality assessment subjective study. The LIVE In the Wild Image Quality Database has over 350,000 opinion scores on 1,162 images evaluated by over 8,100 unique human observers.

  10. High-Quality Fashion Image Dataset

    • crawlfeeds.com
    jpg, zip
    Updated May 29, 2025
    Cite
    Crawl Feeds (2025). High-Quality Fashion Image Dataset [Dataset]. https://crawlfeeds.com/datasets/fashion-products-images-dataset
    Explore at:
    Available download formats: zip, jpg
    Dataset updated
    May 29, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Elevate your AI and machine learning projects with our comprehensive fashion image dataset, carefully curated to meet the needs of cutting-edge applications in e-commerce, product recommendation systems, and fashion trend analysis.

    Our fashion product images dataset includes over 111,000 high-resolution JPG images featuring labeled data for clothing, accessories, styles, and more. These images have been sourced from multiple platforms, ensuring diverse and representative content for your projects.

    Why Choose Our Fashion Dataset?

    • Extensive Image Collection: Gain access to a vast library of 111K+ fashion images, perfect for training machine learning models with precision.
    • Detailed Labels: The dataset includes annotated images for garments, accessories, and various fashion styles to enhance model accuracy.
    • Versatile Applications: Ideal for e-commerce platforms, AI-based fashion assistants, trend analysis, and product personalization.
    • Quality You Can Trust: Download a sample dataset to evaluate the quality and compatibility before diving into the complete collection.

    Whether you're building a product recommendation engine, a virtual stylist, or conducting advanced research in fashion AI, this dataset is your go-to resource.

    Download and Explore the Fashion Dataset Today!

    Get started now and unlock the potential of your AI projects with our reliable and diverse fashion images dataset. Perfect for professionals and researchers alike.

  11. EyeOnWater training dataset for assessing the inclusion of water images

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 20, 2025
    Cite
    Marine Information Service (2025). EyeOnWater training dataset for assessing the inclusion of water images [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10777440
    Explore at:
    Dataset updated
    Mar 20, 2025
    Dataset provided by
    Marine Information Service
    Krijger, Tjerk
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Training dataset

    The EyeOnWater app is designed to assess the ocean's water quality using images captured by regular citizens. In order to have an extra helping hand in determining whether an image meets the criteria for inclusion in the app, the YOLOv8 model for image classification is employed. With the help of this model all uploaded pictures are assessed. If the model deems a water image unsuitable, it is excluded from the app's online database. In order to train this model a training dataset containing a large pool of different images is required. The dataset contains a total of 13,766 images, categorized into three distinct classes: “water_good,” “water_bad,” and “other.” The “water_good” class includes images that meet the requirements of EyeOnWater. The “water_bad” class comprises images of water that do not fulfill these requirements. Finally, the “other” class consists of miscellaneous images that users submitted, which do not depict water. This categorization enables precise filtering and analysis of images relevant to water quality assessment.
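
    The filtering step described above can be sketched as follows. The three class names come from the description; the decision rule (keep only "water_good", exclude both "water_bad" and "other") and the function names are assumptions for illustration, not the app's actual code.

```python
# Class names are taken from the dataset description.
CLASSES = ("water_good", "water_bad", "other")


def keep_for_app(predicted_class):
    """Return True only for images the classifier labels as suitable water images."""
    if predicted_class not in CLASSES:
        raise ValueError(f"unknown class: {predicted_class!r}")
    # Assumed rule: anything other than "water_good" is excluded.
    return predicted_class == "water_good"


def filter_uploads(predictions):
    """Split (image_id, predicted_class) pairs into kept and excluded id lists."""
    kept, excluded = [], []
    for image_id, predicted_class in predictions:
        (kept if keep_for_app(predicted_class) else excluded).append(image_id)
    return kept, excluded


# Hypothetical classifier outputs for three uploads.
uploads = [("img_1", "water_good"), ("img_2", "water_bad"), ("img_3", "other")]
kept, excluded = filter_uploads(uploads)
print(kept, excluded)
```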

  12. Fashion images dataset extracted from farfetch

    • crawlfeeds.com
    csv, zip
    Updated May 29, 2025
    Cite
    Crawl Feeds (2025). Fashion images dataset extracted from farfetch [Dataset]. https://crawlfeeds.com/datasets/fashion-images-extracted-from-the-farfetch-website
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    May 29, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Explore our curated dataset of fashion images extracted from farfetch. Discover the latest trends and designer collections for style inspiration. Browse now!

    Images count: 63K+

    Products count: 15K+

  13. TNO Image Fusion Dataset

    • figshare.com
    • search.datacite.org
    zip
    Updated May 30, 2023
    Cite
    Alexander Toet (2023). TNO Image Fusion Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.1008029.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Alexander Toet
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The TNO Image Fusion Dataset contains multispectral (intensified visual, near-infrared, and longwave infrared or thermal) nighttime imagery of different military-relevant scenarios, registered with different multiband camera systems. The camera systems used to register this imagery are Athena, DHV, FEL, and TRICLOBS. Imagery recorded with these systems is stored in folders labeled with the corresponding camera system name. Information on the registration conditions and the respective camera systems is included in the REFERENCES sections in each of the folders. The images can freely be used for research purposes, and may be used in publications without prior notice, provided proper credit is given to the owner (TNO, Soesterberg, The Netherlands) and this figshare dataset is properly referenced.

  14. Image Search Results from Google Images

    • openwebninja.com
    json
    Updated Oct 6, 2024
    Cite
    OpenWeb Ninja (2024). Image Search Results from Google Images [Dataset]. https://www.openwebninja.com/api/real-time-image-search
    Explore at:
    Available download formats: json
    Dataset updated
    Oct 6, 2024
    Dataset authored and provided by
    OpenWeb Ninja
    Area covered
    Global Image Search
    Description

    This dataset provides comprehensive access to image search results from Google Images in real-time. It supports all image search filters and parameters available on Google Images Advanced Search, enabling precise and targeted image queries. The dataset is delivered in a JSON format via REST API.

  15. AIDER (Aerial Image Dataset for Emergency Response Applications)

    • zenodo.org
    zip
    Updated Aug 3, 2020
    Cite
    Christos Kyrkou (2020). AIDER (Aerial Image Dataset for Emergency Response Applications) [Dataset]. http://doi.org/10.5281/zenodo.3888300
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 3, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Christos Kyrkou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    AIDER (Aerial Image Dataset for Emergency Response applications): The dataset construction involved manually collecting all images for four disaster events, namely Fire/Smoke, Flood, Collapsed Building/Rubble, and Traffic Accidents, as well as one class for the Normal case.

    The aerial images for the disaster events were collected through various online sources (e.g. Google Images, Bing Images, YouTube, news agency web sites, etc.) using the keywords “Aerial View”, “UAV”, or “Drone” together with an event such as “Fire”, “Earthquake”, “Highway accident”, etc. Images are initially of different sizes but are standardized prior to training. All images were manually inspected, first to confirm that they contain the event of interest and then to ensure the event is centered in the image, so that geometric transformations during augmentation would not remove it from the image view. During the data collection process the various disaster events were captured at different resolutions and under various conditions with regard to illumination and viewpoint. Finally, to replicate real-world scenarios the dataset is imbalanced in the sense that it contains more images from the Normal class.

    This subset includes around 500 images for each disaster class and over 4000 images for the normal class. This makes it an imbalanced classification problem.
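
    A common way to compensate for this kind of imbalance is to weight classes inversely to their frequency. The sketch below uses the rounded counts quoted above (about 500 per disaster class, about 4000 for Normal); the class key names and the weighting formula total / (num_classes * count) are illustrative choices, not something the dataset prescribes.

```python
def balanced_class_weights(counts):
    """Inverse-frequency weights: total / (num_classes * count) per class."""
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


# Approximate counts from the description; key names are hypothetical.
approx_counts = {
    "fire_smoke": 500,
    "flood": 500,
    "collapsed_building": 500,
    "traffic_accident": 500,
    "normal": 4000,
}

weights = balanced_class_weights(approx_counts)
for name, weight in weights.items():
    print(f"{name}: {weight:.2f}")
```

    With these rounded counts each disaster class receives eight times the weight of the Normal class, mirroring the 4000:500 ratio.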

    To further enhance the dataset, it is advised that random augmentations be applied probabilistically to each image before it is added to the training batch. Possible transformations include geometric ones (rotations, translations, horizontal axis mirroring, cropping and zooming) as well as image manipulations (illumination changes, color shifting, blurring, sharpening, and shadowing).
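
    A minimal sketch of probabilistic augmentation is given below, using plain nested lists as tiny grayscale images so that it runs without extra libraries. Only two of the listed transformations are shown (a left-right mirror, which is one reading of "horizontal axis mirroring", and an illumination change); the probabilities, the 0.7 to 1.3 brightness range, and all function names are assumptions.

```python
import random


def mirror_left_right(image):
    """Flip a row-major grayscale image left to right."""
    return [list(reversed(row)) for row in image]


def change_illumination(image, factor):
    """Scale pixel values by `factor`, clamping to the 0-255 range."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in image]


def augment(image, rng, p_mirror=0.5, p_light=0.5):
    """Apply each transformation independently with its own probability."""
    out = [row[:] for row in image]
    if rng.random() < p_mirror:
        out = mirror_left_right(out)
    if rng.random() < p_light:
        out = change_illumination(out, rng.uniform(0.7, 1.3))
    return out


# A 2x3 toy image; a seeded generator keeps the sketch reproducible.
tiny = [[0, 50, 100], [150, 200, 250]]
rng = random.Random(0)
batch = [augment(tiny, rng) for _ in range(4)]
print(len(batch))
```

    Because each transformation is drawn independently per image, the same source image can appear in different forms across batches, which is the effect the advice above is aiming for.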

  16. Data from: Inventory of online public databases and repositories holding...

    • s.cnmilf.com
    • datadiscoverystudio.org
    • +4more
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Inventory of online public databases and repositories holding agricultural data in 2017 [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/inventory-of-online-public-databases-and-repositories-holding-agricultural-data-in-2017-d4c81
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve as both a current landscape analysis and a baseline for future studies of ag research data.

    Purpose

    As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication.

    The goals of this study were to:

    • establish where agricultural researchers in the United States (land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies) currently publish their data, including general research data repositories, domain-specific databases, and the top journals
    • compare how much data is in institutional vs. domain-specific vs. federal platforms
    • determine which repositories are recommended by top journals that require or recommend the publication of supporting data
    • ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data

    Approach

    The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data.

    To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered.

    Search methods

    We first compiled a list of known domain-specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” / “ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain-specific, though some contained a mix of data subjects.

    We then used search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag-specific university repositories and general university repositories that housed a portion of agricultural data. Ag-specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied. General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories.

    Next we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGAEA, Protein Data Bank, and Zenodo.

    Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals in which USDA published in 2012 and 2016 were compiled, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories.

    Data are provided for Journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each of the top 50 journals.

    Evaluation

    We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results.

    We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind.

    Results

    A summary of the major findings from our data review:

    • Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors.
    • There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection.
    • Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation.

    See included README file for descriptions of each individual data file in this dataset.

    Resources in this dataset:

    • Resource Title: Journals. File Name: Journals.csv
    • Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv
    • Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx
    • Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv
    • Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv
    • Resource Title: General repositories containing ag data. File Name: general_repos_1.csv
    • Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt
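
    The evaluation thresholds described for this inventory (a search term's share of the collection at 1% and 5%, and result counts above 100 and 500) can be expressed as a small helper. This is only a sketch of that bookkeeping; the function name, field names, and example numbers are hypothetical and do not come from the dataset files.

```python
def summarize_term(term_hits, total_datasets):
    """Flag a search term against the share and count thresholds from the study."""
    if total_datasets <= 0:
        raise ValueError("total_datasets must be positive")
    share = term_hits / total_datasets
    return {
        "share": share,
        "at_least_1_percent": share >= 0.01,
        "at_least_5_percent": share >= 0.05,
        "over_100_results": term_hits > 100,
        "over_500_results": term_hits > 500,
    }


# Invented numbers: one large repository with few hits, one smaller with many.
sparse = summarize_term(term_hits=120, total_datasets=20000)
dense = summarize_term(term_hits=600, total_datasets=10000)
print(sparse)
print(dense)
```

    A repository like the second example would meet the "over 500 datasets and at least 5% of the collection" condition mentioned in the results, while the first would not.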

  17. Retinal Fundus Glaucoma Image dataset

    • figshare.com
    bin
    Updated Feb 12, 2024
    Cite
    Fu Huazhu; Li Fei; José Ignacio Orlando; Bogunović Hrvoje; Liao Jingan; Zhang Shaochong; Zhang Xiulan; Zhuo Zhang; Feng Shou Yin; Jiang Liu; Wing Kee Wong; Ngan Meng Tan; Beng Hai Lee; Jun Cheng; Tien Yin Wong (2024). Retinal Fundus Glaucoma Image dataset [Dataset]. http://doi.org/10.6084/m9.figshare.24549217.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    Feb 12, 2024
    Dataset provided by
    figshare
    Authors
    Fu Huazhu; Li Fei; José Ignacio Orlando; Bogunović Hrvoje; Liao Jingan; Zhang Shaochong; Zhang Xiulan; Zhuo Zhang; Feng Shou Yin; Jiang Liu; Wing Kee Wong; Ngan Meng Tan; Beng Hai Lee; Jun Cheng; Tien Yin Wong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Z. Zhang et al., "ORIGA-light: An online retinal fundus image database for glaucoma analysis and research," 2010 Annual International Conference of the IEEE Engineering in Medicine and Biology, Buenos Aires, Argentina, 2010, pp. 3065-3068, doi: 10.1109/IEMBS.2010.5626137.

    Huazhu Fu, Fei Li, José Ignacio Orlando, Hrvoje Bogunović, Xu Sun, Jingan Liao, Yanwu Xu, Shaochong Zhang, Xiulan Zhang, July 9, 2019, "REFUGE: Retinal Fundus Glaucoma Challenge", IEEE Dataport, doi: https://dx.doi.org/10.21227/tz6e-r977.

    https://www.kaggle.com/datasets/arnavjain1/glaucoma-datasets

  18. Relevant Image Dataset

    • data.mendeley.com
    Updated Dec 22, 2020
    Cite
    Hayri Volkan Agun (2020). Relevant Image Dataset [Dataset]. http://doi.org/10.17632/mbk294tthf.1
    Dataset updated
    Dec 22, 2020
    Authors
    Hayri Volkan Agun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains relevant and irrelevant image tags from Web pages of 125 different domains. For each image, it records the web domain, file number, the text of the image HTML element, attributes of the image element, the size attributes, the parent HTML element of the image, and the relevancy of the image. Each Web domain contains 100 Web pages with a varying number of image elements.

  19. Wirestock's AI/ML Image Training Data, 4.5M Files with Metadata

    • datarade.ai
    .csv
    Updated Jul 18, 2023
    Cite
    WIRESTOCK (2023). Wirestock's AI/ML Image Training Data, 4.5M Files with Metadata [Dataset]. https://datarade.ai/data-products/wirestock-s-ai-ml-image-training-data-4-5m-files-with-metadata-wirestock
    Explore at:
    Available download formats: .csv
    Dataset updated
    Jul 18, 2023
    Dataset provided by
    Wirestock, Inc.
    Authors
    WIRESTOCK
    Area covered
    Peru, Georgia, Belarus, Sudan, Jersey, New Caledonia, Chile, Estonia, Swaziland, Pakistan
    Description

    Wirestock's AI/ML Image Training Data, 4.5M Files with Metadata: This data product is a unique offering in the realm of AI/ML training data. What sets it apart is the sheer volume and diversity of the dataset, which includes 4.5 million files spanning across 20 different categories. These categories range from Animals/Wildlife and The Arts to Technology and Transportation, providing a rich and varied dataset for AI/ML applications.

    The data is sourced from Wirestock's platform, where creators upload and sell their photos, videos, and AI art online. This means that the data is not only vast but also constantly updated, ensuring a fresh and relevant dataset for your AI/ML needs. The data is collected in a GDPR-compliant manner, ensuring the privacy and rights of the creators are respected.

    The primary use-cases for this data product are numerous. It is ideal for training machine learning models for image recognition, improving computer vision algorithms, and enhancing AI applications in various industries such as retail, healthcare, and transportation. The diversity of the dataset also means it can be used for more niche applications, such as training AI to recognize specific objects or scenes.

    This data product fits into Wirestock's broader data offering as a key resource for AI/ML training. Wirestock is a platform for creators to sell their work, and this dataset is a collection of that work. It represents the breadth and depth of content available on Wirestock, making it a valuable resource for any company working with AI/ML.

    The core benefits of this dataset are its volume, diversity, and quality. With 4.5 million files, it provides a vast resource for AI training. The diversity of the dataset, spanning 20 categories, ensures a wide range of images for training purposes. The quality of the images is also high, as they are sourced from creators selling their work on Wirestock.

    In terms of how the data is collected, creators upload their work to Wirestock, where it is then sold on various marketplaces. This means the data is sourced directly from creators, ensuring a diverse and unique dataset. The data includes both the images themselves and associated metadata, providing additional context for each image.

    The different image categories included in this dataset are Animals/Wildlife, The Arts, Backgrounds/Textures, Beauty/Fashion, Buildings/Landmarks, Business/Finance, Celebrities, Education, Emotions, Food Drinks, Holidays, Industrial, Interiors, Nature Parks/Outdoor, People, Religion, Science, Signs/Symbols, Sports/Recreation, Technology, Transportation, Vintage, Healthcare/Medical, Objects, and Miscellaneous. This wide range of categories ensures a diverse dataset that can cater to a variety of AI/ML applications.

  20. Data from: Dataset "Privacy-aware image classification and search"

    • data.niaid.nih.gov
    • eprints.soton.ac.uk
    Updated Oct 15, 2021
    Cite
    Siersdorfer, Stefan (2021). Dataset "Privacy-aware image classification and search" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4568970
    Dataset updated
    Oct 15, 2021
    Dataset provided by
    Siersdorfer, Stefan
    Zerr, Sergej
    Demidova, Elena
    Hare, Jonathon
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Modern content sharing environments such as Flickr or YouTube contain a large number of private resources such as photos showing weddings, family holidays, and private parties. These resources can be of a highly sensitive nature, disclosing many details of the users' private sphere. In order to support users in making privacy decisions in the context of image sharing and to provide them with a better overview of privacy-related visual content available on the Web, we propose techniques to automatically detect private images and to enable privacy-oriented image search. To classify images, we use metadata such as title and tags, and plan to use visual features, which are described in our scientific paper. The data set used in the paper is now available.

    Picalet! cleaned dataset - (recommended for experiments)
    userstudy - (images annotated with queries, anonymized user id and privacy value)
