Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Fill In The Blanks is a dataset for object detection tasks - it contains FIB annotations for 1,496 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
The Polyvore Outfit dataset is one of the largest datasets for fashion compatibility prediction and Fill-in-the-Blank (FITB) tasks. It provides structured information about outfits and individual fashion items, making it a valuable resource for research in outfit recommendation and fashion compatibility modeling.
The dataset consists of two types of sets, disjoint and non-disjoint.
Each item has information in polyvore_item_metadata.json:
```json
{
  "url_name": "bean scotch plaid shirt relaxed",
  "description": "The same great tartan flannel as in our men's shirt, designed just for you. Relaxed Fit: Our most generous fit sits farthest from the body. Falls at low hip. etc.",
  "categories": ["Women's Fashion", "Clothing", "Tops", "L.L.Bean tops"],
  "title": "L.L.Bean Scotch Plaid Shirt, Relaxed",
  "related": [
    "Plaid shirts",
    "Flannel shirt",
    "Shirt top",
    "Button front shirt",
    "Bright shirts",
    "Tartan shirt"
  ],
  "category_id": "11",
  "semantic_category": "tops"
}
```
Category ids can be matched via categories.csv:
| category_id | sub_category | main_category |
| --- | --- | --- |
| 3 | dress | all-body |
| 7 | skirt | bottoms |
| 11 | sweater | tops |
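The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the dataset authors' code: it assumes `categories.csv` has a header row with the three column names shown in the table, and it uses an inline copy of the excerpt rather than the real file. The helper names `load_categories` and `describe_item` are hypothetical.

```python
import csv
import io

# Rows copied from the categories.csv excerpt above.
# Assumption: the real file has a header row with these column names.
CATEGORIES_CSV = """category_id,sub_category,main_category
3,dress,all-body
7,skirt,bottoms
11,sweater,tops
"""


def load_categories(csv_text):
    """Map category_id (kept as a string) to (sub_category, main_category)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["category_id"]: (row["sub_category"], row["main_category"])
        for row in reader
    }


def describe_item(item, categories):
    """Return the item's title with its category names, or None if the id is unknown."""
    entry = categories.get(item["category_id"])
    if entry is None:
        return None
    sub_category, main_category = entry
    return f'{item["title"]}: {sub_category} ({main_category})'


categories = load_categories(CATEGORIES_CSV)
# Only the two metadata fields needed for the lookup are reproduced here.
item = {"title": "L.L.Bean Scotch Plaid Shirt, Relaxed", "category_id": "11"}
print(describe_item(item, categories))
```

Keeping `category_id` as a string matches the item metadata, where the id is stored as `"11"` rather than as a number.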
Each set has information in polyvore_outfit_titles.json:
```json
{
  "url_name": "parka time is now",
  "title": "Parkas"
}
```
The Fill-in-the-Blank (FITB) task is designed to evaluate how well a model understands fashion compatibility. Given a sequence of items in an outfit, the model must predict the missing (target) item from a set of candidate items.
```
{
  question: item sequence of a set,
  blank_position: position of the target (missing) item,
  answers: multiple candidate items, including the target item
}
```
![polyvore_outfit.png](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3530207%2Fdfbcb126319732404616d40e8e4adcee%2Fpolyvore_outfit.png?generation=1741297192521567&alt=media)
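As a rough illustration of how FITB accuracy might be computed, the sketch below picks the candidate with the highest compatibility score for each question and compares it with the labeled target. Several things here are assumptions made for the example: the `target_index` field is hypothetical (the record layout above does not say how the correct answer is marked), items are represented as toy sets of style tags rather than real item ids, and `shared_tag_score` stands in for a learned model.

```python
def answer_fitb(question_items, candidates, score_fn):
    """Return the index of the candidate that scores highest against the visible items."""
    scores = [score_fn(question_items, candidate) for candidate in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


def fitb_accuracy(records, score_fn):
    """Fraction of records where the top-scoring candidate is the labeled target."""
    correct = sum(
        answer_fitb(r["question"], r["answers"], score_fn) == r["target_index"]
        for r in records
    )
    return correct / len(records)


def shared_tag_score(question_items, candidate):
    """Toy scorer (not a real model): count style tags shared with the visible items."""
    return sum(len(item & candidate) for item in question_items)


# Toy records; "target_index" is a hypothetical label added for this example.
records = [
    {
        "question": [{"casual", "denim"}, {"casual", "sneaker"}],
        "answers": [{"formal", "silk"}, {"casual", "cotton"}, {"sport", "lycra"}],
        "target_index": 1,
    },
    {
        "question": [{"formal", "wool"}],
        "answers": [{"formal", "leather"}, {"beach", "straw"}],
        "target_index": 0,
    },
    {
        # The toy scorer gets this one wrong: candidate 1 shares more tags.
        "question": [{"boho", "floral"}],
        "answers": [{"floral", "punk"}, {"boho", "linen", "floral"}],
        "target_index": 0,
    },
]

accuracy = fitb_accuracy(records, shared_tag_score)
print(f"FITB accuracy: {accuracy:.2f}")
```

Passing the scorer in as a function keeps the evaluation loop independent of whichever compatibility model is being tested.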
The Compatibility Task is used to assess how well a model can determine whether a given set of fashion items are compatible with each other. The model learns an embedding space where visually and semantically similar items are placed closer together, making it possible to predict outfit compatibility effectively.
![polyvore_outfit-Page_2.png](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3530207%2F0773192a4e3639df5bfe3a9a583e2aad%2Fpolyvore_outfit-Page_2.png?generation=1741325987776694&alt=media)
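One simple way to turn such an embedding space into an outfit-level score is to average the pairwise distances between item embeddings. The sketch below does this with plain Euclidean distance in a single shared space; it is a simplified stand-in chosen for illustration, not the method of any particular paper, and the 2-D vectors are made-up toy values rather than learned embeddings.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def outfit_compatibility(embeddings):
    """Negative mean pairwise distance: a higher score means the items sit closer together."""
    pairs = list(combinations(embeddings, 2))
    if not pairs:
        raise ValueError("an outfit needs at least two items")
    return -sum(euclidean(a, b) for a, b in pairs) / len(pairs)


# Toy 2-D embeddings (assumed values, not taken from the dataset).
coherent_outfit = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
mismatched_outfit = [(0.0, 0.0), (0.0, 5.0), (5.0, 0.0)]

print(outfit_compatibility(coherent_outfit) > outfit_compatibility(mismatched_outfit))
```

A threshold on this score would give a binary compatible/incompatible prediction; where to set it is left open here.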
@misc{vasileva2018learningtypeawareembeddingsfashion,
title={Learning Type-Aware Embeddings for Fashion Compatibility},
author={Mariya I. Vasileva and Bryan A. Plummer and Krishna Dusad and Shreya Rajpal and Ranjitha Kumar and David Forsyth},
year={2018},
eprint={1803.09196},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/1803.09196},
}
No source license from the original authors could be found for this dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
All_blanks_ghw is a dataset for object detection tasks - it contains annotations for 781 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
This dataset contains information about every Dominion card released so far, including promo cards (blue-backed randomizer cards are not included). I wanted to make the set pretty comprehensive, so there is a lot of data about each card.
Check out the metadata table to find out what each column contains.
Thanks to Rio Grande Games for such an amazing game, but also please stop. All these cards are super heavy!
Feel free to take a look at the tables. Please let me know if you find any mistakes. I tried to be careful but I did enter all this data manually. I plan to use R and/or Python to do some analysis of the data.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains oil chemistry data from the Deepwater Horizon (DWH) accident collected cooperatively by BP and the agencies, including agencies that serve as DWH natural resource damage trustees (Trustees). This report provides additional context for the oil chemistry dataset, including information about the collection, analysis, and organization of the data.

This data posting differs from other recently published datasets relating to the Gulf of Mexico in several respects:

- This oil chemistry dataset focuses on information related to petroleum-related chemical constituents in oil samples and other matrices collected during studies focused primarily on the sampling of oil. Chemical compounds that are not present in Mississippi Canyon lease block 252 (MC252) oil (such as polychlorinated biphenyls, pesticides, and halogenated volatiles) are not included in this dataset.
- This dataset includes data from independent studies performed by BP consultants. BP has been working to produce and organize these independent data, and has engaged outside consultants to perform quality assurance and quality control (QA/QC) checks. As a result, these data have not previously been publicly accessible.
- This dataset combines results from 24 NRDA studies and 20 Response studies to create a unified data file. Before posting, extensive work was done independently by BP contractors to verify some aspects of the posted data (e.g., positional coordinates and field data).

The chemical parameters provided in this data posting include:

- Parent and alkylated polycyclic aromatic hydrocarbons (PAHs)
- Saturated hydrocarbons (SHC)
- Total petroleum hydrocarbons (TPH), including parameters reported as total extractable material (TEM) and total extractable hydrocarbon (TEH)
- Benzene, toluene, ethylbenzene, and xylenes (BTEX) and other volatile hydrocarbons classified as paraffins, isoparaffins, aromatics, naphthenes, and olefins (PIANO)
- Geochemical biomarkers (sterane and triterpane), where available
- Dispersant markers
- Metals
- Total organic carbon

The chemical analyte lists are generally consistent between studies for the standard PAH, SHC, and BTEX compounds. However, portions of the dataset also include analysis of the extended PIANO volatile hydrocarbon list, TPH, or geochemical biomarkers. Additionally, NRDA PAH analyses include an extended list of PAHs (parent and alkylated PAHs) and non-PAH compounds (decalins, benzothiophenes, naphthobenzothiophenes, and related chemicals) that were not included in all of the Response analyses.

This dataset includes data associated with natural field samples for oil and other sample matrices, along with the associated field-collected quality control samples, such as field replicates, equipment blanks, field blanks, and trip blanks. Laboratory duplicate sample data are provided, where available. MC252 control oils associated with chemistry analyses are part of a separate data posting (see Version History). No other laboratory quality control samples (e.g., laboratory blanks or spike samples) are included in this data posting.

In addition to the detailed data file, a cross-tab summary version of these data is also provided (OilChemistry_O-05v01-01_xTab.zip) in Excel format. Results for each sample are presented in a single row with individual chemicals in columns. In addition, a summed concentration for total PAHs is included as "PAH50 Sum" and was calculated using 50 individual PAH results. This summary is provided to enable researchers to access this large dataset in a more manageable format. A sample summary table in Excel format is provided to summarize the number of analytical results for each parameter type for each sample.
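The cross-tab layout and summed PAH column can be illustrated with a small sketch. Everything specific in it is an assumption made for the example: the field names `sample_id`, `analyte`, and `result` are hypothetical, the values are invented, only two PAHs are summed instead of the 50 used for "PAH50 Sum", and treating a missing analyte as zero is a simplification (the posting does not say here how non-detects were handled).

```python
from collections import defaultdict


def cross_tab(long_rows):
    """Pivot (sample_id, analyte, result) rows into one dict of analytes per sample."""
    table = defaultdict(dict)
    for row in long_rows:
        table[row["sample_id"]][row["analyte"]] = row["result"]
    return dict(table)


def add_pah_sum(table, pah_analytes, column="PAH Sum (toy)"):
    """Add a summed-PAH column to every sample; missing analytes count as 0 (assumption)."""
    for analytes in table.values():
        analytes[column] = sum(analytes.get(name, 0.0) for name in pah_analytes)
    return table


# Invented long-format rows; field names and values are hypothetical.
long_rows = [
    {"sample_id": "S1", "analyte": "Naphthalene", "result": 1.5},
    {"sample_id": "S1", "analyte": "Phenanthrene", "result": 2.25},
    {"sample_id": "S1", "analyte": "Benzene", "result": 0.5},
    {"sample_id": "S2", "analyte": "Naphthalene", "result": 0.25},
]

# Benzene is a BTEX compound, not a PAH, so it is left out of the sum.
table = add_pah_sum(cross_tab(long_rows), ["Naphthalene", "Phenanthrene"])
for sample_id, analytes in table.items():
    print(sample_id, analytes)
```

Each sample ends up as a single row keyed by analyte name, which mirrors the one-row-per-sample layout of the cross-tab file.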
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This report contains water chemistry data and provides additional context for the posted dataset, including information about the collection, analysis, and organization of the data.

This data posting differs from other recently published datasets relating to the Gulf of Mexico in several respects:

- This Water Chemistry dataset focuses only on information related to petroleum-related chemical constituents in water samples. Chemical compounds that are not present in oil (such as polychlorinated biphenyls, pesticides, and halogenated volatiles) are not included in this dataset.
- This dataset includes data from independent studies performed by BP consultants. BP has been working to produce and organize these independent data, and has engaged outside consultants to perform quality assurance and quality control (QA/QC) checks. These data were initially made publicly accessible in November 2013 and are updated with additional results in this publication to reflect the results of further QA/QC.
- This dataset combines results from 54 NRDA studies and 13 Response studies to create a unified data file. Certain Response data have been reprocessed with lower analytical method detection limits (MDLs) than originally reported, and other data have been adjusted for surrogate recoveries as described in the documentation.

The focus of this data posting is chemistry data associated with water column samples collected in both federal and state jurisdictional waters in the Gulf of Mexico from May 2010 through July 2012. More than 20,000 water samples with associated chemistry analyses collected at more than 6,300 sampling stations are included in this posting. These samples were collected during 67 studies, using more than 100 sampling cruises and surveys.

These studies can be classified into four general categories:

- NRDA Cooperative: Studies conducted as part of the NRDA which were agreed to and executed cooperatively by the National Oceanic and Atmospheric Administration (NOAA), U.S. Department of Interior (DOI), and/or other Trustees, and BP.
- BP NRDA Independent: Studies conducted by BP independently to develop data to support and inform the NRDA.
- Trustee Independent: Studies conducted by NOAA, DOI, and/or other Trustees independently to develop data to support and inform the NRDA.
- Response (non-NRDA): Studies conducted by BP and/or government representatives under the direction of the Unified Area Command and in association with activities performed in response to the DWH accident (the Response).

The chemical parameters provided in this data posting include:

- Parent and alkylated polycyclic aromatic hydrocarbons (PAHs)
- Saturated hydrocarbons (SHC)
- Total petroleum hydrocarbons (TPH), including parameters reported as total extractable material (TEM) and total extractable hydrocarbon (TEH)
- Benzene, toluene, ethylbenzene, and xylenes (BTEX) and other volatile hydrocarbons classified as paraffins, isoparaffins, aromatics, naphthenes, and olefins (PIANO)
- Geochemical biomarkers (sterane and triterpane), where available

The chemical analyte lists are generally consistent between studies for the standard PAH, SHC, and BTEX compounds. However, portions of the dataset also include analysis of the extended PIANO volatile hydrocarbon list, TPH, or geochemical biomarkers. Additionally, NRDA PAH analyses include an extended list of parent and alkylated decalins, benzothiophenes, naphthobenzothiophenes, and several other PAHs and related chemicals that were not included in Response analyses.

This dataset includes data associated with natural field samples for whole water and for Payne filter and filtrate pairs (Payne et al. 1999), along with the associated field-collected quality control samples, such as field replicates, equipment blanks, field blanks, and trip blanks. Laboratory duplicate sample data are provided, where available. Mississippi Canyon lease block 252 (MC252) control oils analyses associated with water chemistry analyses are provided separately from this data posting (see Version History). No other laboratory quality control samples (e.g., laboratory blanks and spike samples) are included in this data posting. Before posting, extensive work was done to verify some aspects of the posted information (e.g., positional coordinates and field data).
Majors Questions Text Data: about 6.03 million majors questions, with and without explanations combined. Each question includes the question type, question, answer, and explanation; some questions may have errors in their question types. Majors include Party Building, Law, Engineering, Civil Service, Computer Science, Economics, Graduate Studies, Medicine, Language, Self-Study, Comprehensive and Policy Essay Writing. Question types include Multiple Choice, Single Choice, True/False, Fill in the Blanks, Short Answer, and Essay. This dataset can be used for tasks such as LLM training, chatgpt