100+ datasets found
  1. alfred-dataset

    • huggingface.co
    Updated Oct 28, 2024
    Cite
    JackLiu (2024). alfred-dataset [Dataset]. https://huggingface.co/datasets/JackLiuAngel/alfred-dataset
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 28, 2024
    Authors
    JackLiu
    License

    https://choosealicense.com/licenses/openrail/

    Description

    JackLiuAngel/alfred-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  2. US-4 Dataset

    • paperswithcode.com
    • opendatalab.com
    Cite
    Yixiong Chen; Chunhui Zhang; Li Liu; Cheng Feng; Changfeng Dong; Yongfang Luo; Xiang Wan, US-4 Dataset [Dataset]. https://paperswithcode.com/dataset/us-4
    Authors
    Yixiong Chen; Chunhui Zhang; Li Liu; Cheng Feng; Changfeng Dong; Yongfang Luo; Xiang Wan
    Description

    The US-4 is a dataset of Ultrasound (US) images. It is a video-based image dataset that contains over 23,000 high-resolution images from four US video sub-datasets, where two sub-datasets are newly collected by experienced doctors for this dataset.

  3. sp-dataset

    • huggingface.co
    Updated Nov 14, 2024
    Cite
    themex (2024). sp-dataset [Dataset]. https://huggingface.co/datasets/themex1380/sp-dataset
    Dataset updated
    Nov 14, 2024
    Authors
    themex
    Description

    themex1380/sp-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. Paper and Plastic Detection Dataset

    • universe.roboflow.com
    zip
    Updated May 30, 2023
    Cite
    Methods of Research Final Project (2023). Paper and Plastic Detection Dataset [Dataset]. https://universe.roboflow.com/methods-of-research-final-project/paper-and-plastic-detection
    Available download formats: zip
    Dataset updated
    May 30, 2023
    Dataset authored and provided by
    Methods of Research Final Project
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Paper Plastic Bounding Boxes
    Description

    Here are a few use cases for this project:

    1. Waste Classification and Recycling: Industries or municipal bodies could employ this model to automatically sort waste into paper or plastic categories, facilitating more efficient recycling processes.

    2. Environmental Protection: Various organizations or government departments might use the model for capturing and monitoring plastic waste in public areas or natural environments, helping to measure pollution levels.

    3. Retail and Supermarkets: It could be integrated into self-service checkout systems to automatically identify the difference between plastic and paper packaging, allowing for potential pricing differences or recycling initiatives.

    4. Education and Research: Teachers, students, and researchers can use it as a practical tool for exploring machine learning or environmental sciences and promoting the importance of waste separation.

    5. Smart Home Integration: The model could be integrated into a smart home system to guide residents in sorting their trash accurately and educating them on recycling.

  5. sci_tail

    • tensorflow.org
    • paperswithcode.com
    • +3 more
    Updated Dec 23, 2022
    Cite
    (2022). sci_tail [Dataset]. https://www.tensorflow.org/datasets/catalog/sci_tail
    Dataset updated
    Dec 23, 2022
    Description

    The SciTail dataset is an entailment dataset created from multiple-choice science exams and web sentences. Each question and the correct answer choice are converted into an assertive statement to form the hypothesis. Information retrieval is used to obtain relevant text from a large corpus of web sentences, and these sentences are used as the premise P. Each such premise-hypothesis pair is then annotated by crowdsourcing as supports (entails) or not (neutral) to create the SciTail dataset. The dataset contains 27,026 examples: 10,101 with the entails label and 16,925 with the neutral label.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('sci_tail', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.
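
    The label counts quoted in the description can be checked with a few lines of arithmetic. The sketch below is illustrative only: the variable names are our own, and the counts are copied from the description rather than read from the dataset itself.

```python
# Label counts as stated in the SciTail description (not read from the data).
label_counts = {"entails": 10_101, "neutral": 16_925}

# Total number of examples implied by the two label counts.
total = sum(label_counts.values())

# Share of each label as a fraction of all examples.
label_share = {label: count / total for label, count in label_counts.items()}

for label, share in label_share.items():
    print(f"{label}: {share:.1%}")
print("total:", total)
```

    The two counts add up to the stated total, and the neutral label is the larger class, which may matter when choosing an evaluation metric.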

  6. CoNaLa Dataset

    • paperswithcode.com
    Updated Feb 9, 2024
    + more versions
    Cite
    Pengcheng Yin; Bowen Deng; Edgar Chen; Bogdan Vasilescu; Graham Neubig (2024). CoNaLa Dataset [Dataset]. https://paperswithcode.com/dataset/conala
    Dataset updated
    Feb 9, 2024
    Authors
    Pengcheng Yin; Bowen Deng; Edgar Chen; Bogdan Vasilescu; Graham Neubig
    Description

    CMU CoNaLa, the Code/Natural Language Challenge dataset, is a joint project of the Carnegie Mellon University NeuLab and Strudel labs. It is intended for testing the generation of code snippets from natural language. The data comes from StackOverflow questions. There are 2379 training and 500 test examples that were manually annotated. Every example has a natural language intent and its corresponding Python snippet. In addition to the manually annotated dataset, there are also 598,237 mined intent-snippet pairs. These examples are similar to the hand-annotated ones except that they carry a probability that the pair is valid.
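
    Because the mined pairs carry a validity probability, a natural first step is to keep only pairs above some threshold. The sketch below assumes each record is a dictionary with `intent`, `snippet`, and `prob` keys; those field names, the sample records, and the 0.5 threshold are illustrative assumptions, not taken from the dataset files.

```python
from typing import Iterable

# Hypothetical records: field names and values are assumptions for illustration,
# not copied from the CoNaLa files.
mined_pairs = [
    {"intent": "sort a list in reverse", "snippet": "sorted(xs, reverse=True)", "prob": 0.91},
    {"intent": "read a file", "snippet": "x = 1", "prob": 0.12},
    {"intent": "join strings with commas", "snippet": "','.join(items)", "prob": 0.67},
]


def filter_by_probability(pairs: Iterable[dict], threshold: float) -> list[dict]:
    """Keep only pairs whose validity probability is at least `threshold`."""
    return [pair for pair in pairs if pair["prob"] >= threshold]


# The threshold is arbitrary here; a real value would need tuning.
confident_pairs = filter_by_probability(mined_pairs, threshold=0.5)
for pair in confident_pairs:
    print(pair["intent"], "->", pair["snippet"])
```

    A higher threshold trades coverage for cleaner intent-snippet pairs.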

  7. Tree-O dataset

    • catalog.data.gov
    • datasets.ai
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Tree-O dataset [Dataset]. https://catalog.data.gov/dataset/tree-o-dataset
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency: http://www.epa.gov/
    Description

    The dataset contains, by Census block-group, the variables used in the analysis and the resulting ranks and scores derived as described in the manuscript text. This dataset is associated with the following publication: Almeter, A., A. Tashie, A. Proctor, T. McAlexander, D. Browning, C. Rudder, L. Jackson, and R. Araujo. A Needs-Driven, Multi-Objective Approach to Allocate Urban Ecosystem Services from 10,000 Trees. Sustainability. MDPI AG, Basel, SWITZERLAND, 10(12): 4488, (2018).

  8. synthetic-dataset-1m-dalle3-high-quality-captions

    • huggingface.co
    Updated May 3, 2024
    Cite
    Ben (2024). synthetic-dataset-1m-dalle3-high-quality-captions [Dataset]. https://huggingface.co/datasets/ProGamerGov/synthetic-dataset-1m-dalle3-high-quality-captions
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 3, 2024
    Authors
    Ben
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for Dalle3 1 Million+ High Quality Captions

    Alt name: Human Preference Synthetic Dataset

    Example grids for landscapes, cats, creatures, and fantasy are also available.

    Description:

    This dataset comprises AI-generated images sourced from various websites and individuals, primarily focusing on Dalle 3 content, along with contributions from other AI systems of sufficient quality like Stable Diffusion and Midjourney (MJ v5 and above). As users… See the full description on the dataset page: https://huggingface.co/datasets/ProGamerGov/synthetic-dataset-1m-dalle3-high-quality-captions.

  9. What's Happening LA Calendar Dataset - ARCHIVED

    • catalog.data.gov
    • data.lacity.org
    • +1 more
    Updated Nov 29, 2021
    + more versions
    Cite
    data.lacity.org (2021). What's Happening LA Calendar Dataset - ARCHIVED [Dataset]. https://catalog.data.gov/dataset/whats-happening-la-calendar-dataset-archived
    Dataset updated
    Nov 29, 2021
    Dataset provided by
    data.lacity.org
    Description

    All-City event calendar - ARCHIVED. For the new LA City Events dataset (refreshed daily), see https://data.lacity.org/A-Prosperous-City/LA-City-Events/rx9t-fp7k

  10. Eye Dataset

    • kaggle.com
    zip
    Updated Feb 1, 2021
    Cite
    Prasad V Patil (2021). Eye Dataset [Dataset]. https://www.kaggle.com/datasets/prasadvpatil/eye-dataset
    Available download formats: zip (3265889 bytes)
    Dataset updated
    Feb 1, 2021
    Authors
    Prasad V Patil
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Prasad V Patil

    Released under CC0: Public Domain

    Contents

  11. Liver Dataset

    • kaggle.com
    zip
    Updated Dec 17, 2023
    Cite
    Figo (2023). Liver Dataset [Dataset]. https://www.kaggle.com/datasets/figolm10/liver-dataset
    Available download formats: zip (273901 bytes)
    Dataset updated
    Dec 17, 2023
    Authors
    Figo
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Figo

    Released under Apache 2.0

    Contents

  12. Python Programming Dataset

    • pub.uni-bielefeld.de
    Updated Feb 10, 2020
    Cite
    Benjamin Paaßen (2020). Python Programming Dataset [Dataset]. https://pub.uni-bielefeld.de/record/2941052
    Dataset updated
    Feb 10, 2020
    Authors
    Benjamin Paaßen
    Description

    This repository contains programming data collected from 15 students during November and December of 2019 at Bielefeld University. Students were asked to implement gradient descent. Note that this data set contains only source code snapshots and neither timestamps nor personal information. All students programmed in a web environment, which is also contained in this repository.

  13. Documents dataset

    • kaggle.com
    zip
    Updated Oct 24, 2017
    Cite
    Leo Arruda (2017). Documents dataset [Dataset]. https://www.kaggle.com/datasets/leoarruda/documents-dataset
    Available download formats: zip (226618 bytes)
    Dataset updated
    Oct 24, 2017
    Authors
    Leo Arruda
    Description

    Dataset

    This dataset was created by Leo Arruda

    Contents

  14. Data from: UF FLMNH Ichthyology

    • gbif.org
    • es.bionomia.net
    • +4 more
    Updated Dec 23, 2024
    Cite
    Rob Robins; Rob Robins (2024). UF FLMNH Ichthyology [Dataset]. http://doi.org/10.15468/8mjsel
    Dataset updated
    Dec 23, 2024
    Dataset provided by
    Global Biodiversity Information Facility: https://www.gbif.org/
    Florida Museum of Natural History
    Authors
    Rob Robins; Rob Robins
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Area covered
    Description

    The UF Fish Collection, dating to 1917, contains 214,205 lots and 2,300,803 specimens. Included are representatives of 8,250 species from 400 families. The collection includes 93 primary types and approximately 1,600 lots of secondary types representing 563 species. Also in the collection are 5,825 specimens of disarticulated and articulated skeletons representing 875 species. Especially notable are historic collections of large and important marine fishes as well as rapidly growing collections of freshwater fishes from Southeast Asia. In 2006, the museum expanded its program to archive frozen tissue samples with a newly established UF Genetic Resources Collection. Tissues of fishes are stored in -20ºC freezers and number 4,150 samples of 900 species. All specimens and tissues are databased online and available for loan.

  15. Minneola, KS Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 19, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Minneola, KS Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/d0a58acc-c980-11ee-9145-3860777c1fe6/
    Available download formats: json, csv
    Dataset updated
    Feb 19, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Kansas, Minneola
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Minneola by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Minneola across both sexes and to determine which sex constitutes the majority.

    Key observations

    Males form a slight majority, making up 52.37% of the total population. Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: This column shows the population of each gender in Minneola.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of Minneola's total population. Please note that the percentages may not add up to exactly 100% due to rounding.
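
    The "% of Total Population" column and the majority-sex question can be reproduced from the two population counts. The sketch below uses made-up counts purely for illustration; they are not Minneola's figures, and the variable names are our own.

```python
# Hypothetical counts for illustration only; these are NOT Minneola's figures.
population_by_sex = {"Male": 120, "Female": 80}

total_population = sum(population_by_sex.values())

# Percentage of the total population for each sex, rounded to two decimals
# (rounding is why published percentages may not sum to exactly 100).
percent_of_total = {
    sex: round(100 * count / total_population, 2)
    for sex, count in population_by_sex.items()
}

# The sex with the larger population count.
majority_sex = max(population_by_sex, key=population_by_sex.get)

print(percent_of_total)
print("majority:", majority_sex)
```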

    Good to know

    Margin of Error

    Data in the dataset are estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Minneola Population by Race & Ethnicity. You can refer to the same here

  16. My Game Pics Dataset

    • universe.roboflow.com
    zip
    Updated Dec 19, 2024
    Cite
    My Game Pics (2024). My Game Pics Dataset [Dataset]. https://universe.roboflow.com/my-game-pics/my-game-pics
    Available download formats: zip
    Dataset updated
    Dec 19, 2024
    Dataset authored and provided by
    My Game Pics
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Deer Hog Bounding Boxes
    Description

    This dataset contains annotated pictures of animals (like wild pigs and deer) from trail cameras in East Texas.

    You can use this dataset and the detection API to create computer vision applications for hunting, monitoring animal population health, counting deer sightings, and more!

    Automatically filter through hours of trail cam footage to find the times/frames when wild game is caught on camera.

  17. CA Geographic Boundaries

    • data.ca.gov
    • catalog.data.gov
    shp
    Updated May 3, 2024
    Cite
    California Department of Technology (2024). CA Geographic Boundaries [Dataset]. https://data.ca.gov/dataset/ca-geographic-boundaries
    Available download formats: shp (2597712), shp (136046), shp (10153125)
    Dataset updated
    May 3, 2024
    Dataset provided by
    California Department of Technology: http://cdt.ca.gov/
    Description

    This dataset contains shapefile boundaries for CA State, counties and places from the US Census Bureau's 2023 MAF/TIGER database. Current geography in the 2023 TIGER/Line Shapefiles generally reflects the boundaries of governmental units in effect as of January 1, 2023.

  18. Rivers Dataset - Dataset - LDM

    • service.tib.eu
    Updated Aug 10, 2023
    Cite
    (2023). Rivers Dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/rivers-dataset
    Dataset updated
    Aug 10, 2023
    Description

    Global rivers. License: ODbL, by OpenStreetMap.

  19. UCSB - University of California Santa Barbara Herbarium

    • gbif.org
    • bionomia.net
    Updated Dec 3, 2024
    + more versions
    Cite
    GBIF (2024). UCSB - University of California Santa Barbara Herbarium [Dataset]. http://doi.org/10.15468/qpxmw0
    Dataset updated
    Dec 3, 2024
    Dataset provided by
    Global Biodiversity Information Facility: https://www.gbif.org/
    Cheadle Center for Biodiversity and Ecological Restoration
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Santa Barbara
    Description

    The University of California, Santa Barbara (UCSB) Herbarium has approximately 120,000 herbarium specimens of vascular plants, lichens, bryophytes, and marine macroalgae. The herbarium is housed at the Cheadle Center for Biodiversity and Ecological Restoration on the campus of UCSB. The vascular plant collection consists mainly of specimens from Santa Barbara County, including the northern Channel Islands, with additional collections from San Luis Obispo, Kern, and Ventura Counties, the southern Sierra Nevada region, southern California, and northern Mexico. Special collections include the J. R. Haller pine collection (5,000 specimens), with emphasis on population-level sampling of many western North American pine species, and the Cornelius H. Muller oak collection, with ca. 7,000 specimens from the USA and Mexico. Also conserved in the herbarium are ca. 69,000 slide preparations and spirit collections of Vernon I. Cheadle and Katherine Esau. There are 43 type specimens of plants and marine macroalgae. Incorporated collections include the Santa Rosa Island Reserve (SCIR) herbarium (1,500) and the marine macroalgae of the Santa Barbara Museum of Natural History (1,035), which contains some of the earliest collections of California seaweeds. Greg Wahlert is the current collections manager. Taxonomy and nomenclature follow the second edition of the Jepson Manual (Baldwin et al., 2012). Financial assistance with digitization efforts is provided in part by the UCSB Coastal Fund.

  20. Signing in the Wild dataset

    • ieee-dataport.org
    Updated Feb 23, 2019
    Cite
    Mark Borg (2019). Signing in the Wild dataset [Dataset]. http://doi.org/10.21227/w24f-yh32
    Dataset updated
    Feb 23, 2019
    Dataset provided by
    IEEE Dataport
    Authors
    Mark Borg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Our Signing in the Wild dataset consists of various videos harvested from YouTube containing people signing in various sign languages and doing so in diverse settings, environments, under complex signer and camera motion, and even group signing. This dataset is intended to be used for sign language detection.

    For the negative set, we created two classes of videos, labelled ‘speaking’ and ‘other’. Our motivation for the ‘speaking’ class is that speech is often accompanied by hand gestures (gesticulation), which can be easily confused with signing. Signing can be discriminated by its linguistic nature, i.e., its distinct phonological, morphological and categorical (discrete) structures, while gesticulation tends to be more spontaneous, idiosyncratic and analogue in nature. For the ‘other’ class, we looked for distractors to both ‘signing’ and ‘speaking’, i.e., videos containing hand movements that are quite similar to signing/gesticulation and thus might confuse a classifier. Examples include: miming, hand exercises, various manual activities like playing instruments, painting, writing, yoga and martial arts, sports like table tennis, etc. Also included are activities similar to speech, like people laughing, clapping, nodding, listening to other speakers, etc.

    A total of 1120 videos are included in our dataset, each video contributing the first 6.6 minutes, resulting in 2000 frames per video when sampled at 5 Hz. We have 1.45 million video frames in total. Our videos are untrimmed, i.e., a video can contain multiple activities, background scenes, scene cuts, and other actions done by the same or different actors. Thus the videos are unconstrained both spatially and also temporally. This is in line with recent trends in video action recognition [8], and unlike ASLR datasets where trimmed videos are the norm. In particular, several videos in our dataset contain all 3 classes (occasionally with temporal overlap), and sometimes the same person alternating between signing and speaking.

    We performed manual groundtruthing at video frame level. Since action boundaries can be inherently fuzzy, we consider a short temporal context (10 frames) surrounding the frame to be labelled in order to decide on its class label. We also adopt certain spatial guidelines, e.g. mouth movements must be visible for action ‘speaking’, thus eliminating distant views and when the speaker turns his/her back to the camera. Ambiguous cases are left unlabelled. We annotate video segments that do not contain signing or speaking as ‘other’, including opening/closing credits, title screens, scene transitions, animations, background scenes, etc.

    If you find this dataset useful, please cite the following paper: Mark Borg, Kenneth P. Camilleri, "Sign Language Detection "In The Wild" With Recurrent Neural Networks", ICASSP 2019. Any comments, suggestions, feedback are welcome: mborg2005 gmail com
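
    The frame counts quoted in the description follow from the clip length and sampling rate. The sketch below simply redoes that arithmetic with the stated figures; the variable names are our own, and the upper bound assumes, hypothetically, that every video runs for the full 6.6 minutes.

```python
# Figures quoted in the dataset description.
clip_minutes = 6.6       # portion of each video that is used
sampling_rate_hz = 5     # frames sampled per second
num_videos = 1120

# Frames contributed by one clip that runs the full 6.6 minutes.
frames_per_full_clip = round(clip_minutes * 60 * sampling_rate_hz)

# Upper bound on the total, ASSUMING every video is at least 6.6 minutes long.
max_total_frames = num_videos * frames_per_full_clip

print(frames_per_full_clip)
print(max_total_frames)
```

    A full clip yields 1980 frames, which the description rounds to 2000. The upper bound exceeds the stated total of 1.45 million frames; one possible reading is that some videos are shorter than 6.6 minutes, though the description does not say so.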

Cite
JackLiu (2024). alfred-dataset [Dataset]. https://huggingface.co/datasets/JackLiuAngel/alfred-dataset

alfred-dataset

JackLiuAngel/alfred-dataset

Explore at:
78 scholarly articles cite this dataset (View in Google Scholar)
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Oct 28, 2024
Authors
JackLiu
License

https://choosealicense.com/licenses/openrail/

Description

JackLiuAngel/alfred-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
