Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please note that the file msl-labeled-data-set-v2.1.zip below contains the latest images and labels associated with this data set.
Data Set Description
The data set consists of 6,820 images collected by the Mars Science Laboratory (MSL) Curiosity Rover with three instruments: (1) the Mast Camera (Mastcam) Left Eye; (2) the Mast Camera (Mastcam) Right Eye; (3) the Mars Hand Lens Imager (MAHLI). With help from Dr. Raymond Francis, a member of the MSL operations team, we identified 19 classes of science and engineering interest (see the "Classes" section for more information), and each image is assigned one class label. We split the data set into training, validation, and test sets in order to train and evaluate machine learning algorithms. The training set contains 5,920 images (including augmented images; see the "Image Augmentation" section for more information); the validation set contains 300 images; the test set contains 600 images. The training set images were randomly sampled from sol (Martian day) range 1-948, the validation set images from sol range 949-1920, and the test set images from sol range 1921-2224. All images were resized to 227 x 227 pixels without preserving the original height/width aspect ratio.
Directory Contents
images - contains all 6,820 images
class_map.csv - string-integer class mappings
train-set-v2.1.txt - label file for the training set
val-set-v2.1.txt - label file for the validation set
test-set-v2.1.txt - label file for the test set
The label files are formatted as below:
"Image-file-name class_in_integer_representation"
Labeling Process
Each image was labeled with help from three different volunteers (see the Contributor list). The final labels were determined using the following process:
If all three labels agree, that label is used as the final label.
If the three labels do not agree, we manually review the labels and decide the final label.
We also performed error analysis as a post-processing step to correct noisy/incorrect labels in the data set.
Classes
There are 19 classes identified in this data set. To simplify our training and evaluation algorithms, we mapped the class names from string to integer representations. The class names, string-integer mappings, and class distributions are shown below:
Class name, counts (training set), counts (validation set), counts (test set), integer representation
Arm cover, 10, 1, 4, 0
Other rover part, 190, 11, 10, 1
Artifact, 680, 62, 132, 2
Nearby surface, 1554, 74, 187, 3
Close-up rock, 1422, 50, 84, 4
DRT, 8, 4, 6, 5
DRT spot, 214, 1, 7, 6
Distant landscape, 342, 14, 34, 7
Drill hole, 252, 5, 12, 8
Night sky, 40, 3, 4, 9
Float, 190, 5, 1, 10
Layers, 182, 21, 17, 11
Light-toned veins, 42, 4, 27, 12
Mastcam cal target, 122, 12, 29, 13
Sand, 228, 19, 16, 14
Sun, 182, 5, 19, 15
Wheel, 212, 5, 5, 16
Wheel joint, 62, 1, 5, 17
Wheel tracks, 26, 3, 1, 18
Image Augmentation
Only the training set contains augmented images: 3,920 of the 5,920 training images are augmented versions of the remaining 2,000 original images. Images taken by different instruments were augmented differently. As listed below, we employed five augmentation methods: images taken by the Mastcam left- and right-eye cameras were augmented with horizontal flipping only, while images taken by the MAHLI camera were augmented with all five methods. Note that non-augmented images can be obtained by filtering the file names listed in train-set-v2.1.txt (a minimal sketch follows the list below).
90 degrees clockwise rotation (file name ends with -r90.jpg)
180 degrees clockwise rotation (file name ends with -r180.jpg)
270 degrees clockwise rotation (file name ends with -r270.jpg)
Horizontal flip (file name ends with -fh.jpg)
Vertical flip (file name ends with -fv.jpg)
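A minimal sketch of that filtering step, assuming the label-file layout described above (augmented images are identified purely by the file-name suffixes listed here):

# Suffixes appended to augmented image file names; names without them are originals.
AUG_SUFFIXES = ("-r90.jpg", "-r180.jpg", "-r270.jpg", "-fh.jpg", "-fv.jpg")

originals = []
with open("train-set-v2.1.txt") as f:
    for line in f:
        filename, label = line.split()
        if not filename.endswith(AUG_SUFFIXES):
            originals.append((filename, int(label)))

print(len(originals), "non-augmented training images")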
Acknowledgment
The authors would like to thank the volunteers (listed as Contributors) who provided annotations for this data set. We would also like to thank the PDS Imaging Node for its continued support of this work.
https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.11588/DATA/LWN9XE
This repository contains code for reproducing the experiments in Marasovic and Frank (2018).
Paper abstract: For over a decade, machine learning has been used to extract opinion-holder-target structures from text to answer the question "Who expressed what kind of sentiment towards what?". Recent neural approaches do not outperform the state-of-the-art feature-based models for Opinion Role Labeling (ORL). We suspect this is due to the scarcity of labeled training data and address this issue using different multi-task learning (MTL) techniques with a related task which has substantially more data, i.e. Semantic Role Labeling (SRL). We show that two MTL models improve significantly over the single-task model for labeling of both holders and targets, on the development and the test sets. We found that the vanilla MTL model, which makes predictions using only shared ORL and SRL features, performs the best. With deeper analysis, we determine what works and what might be done to make further improvements for ORL.
Data for ORL: Download the MPQA 2.0 corpus. Check mpqa2-pytools for example usage. Splits can be found in the datasplit folder.
Data for SRL: The data is provided by the CoNLL-2005 Shared Task, but the original words are from the Penn Treebank dataset, which is not publicly available.
How to train models?
python main.py --adv_coef 0.0 --model fs --exp_setup_id new --n_layers_orl 0 --begin_fold 0 --end_fold 4
python main.py --adv_coef 0.0 --model html --exp_setup_id new --n_layers_orl 1 --n_layers_shared 2 --begin_fold 0 --end_fold 4
python main.py --adv_coef 0.0 --model sp --exp_setup_id new --n_layers_orl 3 --begin_fold 0 --end_fold 4
python main.py --adv_coef 0.1 --model asp --exp_setup_id prior --n_layers_orl 3 --begin_fold 0 --end_fold 10
LandCoverNet is a global annual land cover classification training dataset with labels for the multi-spectral satellite imagery from Sentinel-1, Sentinel-2 and Landsat-8 missions in 2018. LandCoverNet Asia contains data across Asia, which accounts for ~31% of the global dataset. Each pixel is identified as one of the seven land cover classes based on its annual time series. These classes are water, natural bare ground, artificial bare ground, woody vegetation, cultivated vegetation, (semi) natural vegetation, and permanent snow/ice.
There are a total of 2753 image chips of 256 x 256 pixels in LandCoverNet Asia V1.0, spanning 92 tiles. Each image chip contains temporal observations from the following satellite products with an annual class label, all stored in raster format (GeoTIFF files); a short loading sketch follows the list below:
* Sentinel-1 Ground Range Detected (GRD) product with radiometric calibration and orthorectification at 10m spatial resolution
* Sentinel-2 surface reflectance product (L2A) at 10m spatial resolution
* Landsat-8 surface reflectance product from Collection 2 Level-2
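A minimal sketch of reading one temporal observation of a chip (the directory layout and file name below are hypothetical, and rasterio is an assumed third-party choice for GeoTIFF I/O, not part of the dataset):

import rasterio  # assumed third-party library for reading GeoTIFF rasters

# Hypothetical path to one Sentinel-2 observation of a 256 x 256 chip.
chip_path = "landcovernet_asia/chip_0001/S2_2018-01-15.tif"
with rasterio.open(chip_path) as src:
    bands = src.read()  # NumPy array of shape (band_count, 256, 256)
    print(src.count, "bands,", src.width, "x", src.height, "pixels, CRS:", src.crs)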
Radiant Earth Foundation designed and generated this dataset with a grant from Schmidt Futures, with additional support from NASA ACCESS and Microsoft AI for Earth, and in-kind technology support from Sinergise.
LandCoverNet is a global annual land cover classification training dataset with labels for the multi-spectral satellite imagery from Sentinel-1, Sentinel-2 and Landsat-8 missions in 2018. LandCoverNet North America contains data across North America, which accounts for ~13% of the global dataset. Each pixel is identified as one of the seven land cover classes based on its annual time series. These classes are water, natural bare ground, artificial bare ground, woody vegetation, cultivated vegetation, (semi) natural vegetation, and permanent snow/ice.
There are a total of 1561 image chips of 256 x 256 pixels in LandCoverNet North America V1.0 spanning 40 tiles. Each image chip contains temporal observations from the following satellite products with an annual class label, all stored in raster format (GeoTIFF files):
* Sentinel-1 Ground Range Detected (GRD) product with radiometric calibration and orthorectification at 10m spatial resolution
* Sentinel-2 surface reflectance product (L2A) at 10m spatial resolution
* Landsat-8 surface reflectance product from Collection 2 Level-2
Radiant Earth Foundation designed and generated this dataset with a grant from Schmidt Futures, with additional support from NASA ACCESS and Microsoft AI for Earth, and in-kind technology support from Sinergise.
https://www.nist.gov/open/license
The open dataset, software, and other files accompanying the manuscript "An Open Combinatorial Diffraction Dataset Including Consensus Human and Machine Learning Labels with Quantified Uncertainty for Training New Machine Learning Models," submitted for publication to Integrating Materials and Manufacturing Innovation. Machine learning and autonomy are increasingly prevalent in materials science, but existing models are often trained or tuned using idealized data as absolute ground truths. In actual materials science, "ground truth" is often a matter of interpretation and is more readily determined by consensus. Here we present the data, software, and other files for a study using as-obtained diffraction data as a test case for evaluating the performance of machine learning models in the presence of differing expert opinions. We demonstrate that experts with similar backgrounds can disagree greatly even for something as intuitive as using diffraction to identify the start and end of a phase transformation. We then use a logarithmic likelihood method to evaluate the performance of machine learning models in relation to the consensus expert labels and their variance. We further illustrate this method's efficacy in ranking a number of state-of-the-art phase mapping algorithms. We propose a materials data challenge centered around the problem of evaluating models based on consensus with uncertainty. The data, labels, and code used in this study are all available online at data.gov, and the interested reader is encouraged to replicate and improve the existing models or to propose alternative methods for evaluating algorithmic performance.
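As an illustrative sketch only (a generic stand-in, not necessarily the manuscript's exact formulation), a model's prediction can be scored against a consensus expert label with quantified uncertainty via a Gaussian log-likelihood:

import numpy as np

def gaussian_log_likelihood(pred, consensus_mean, consensus_std):
    # Log-likelihood of a predicted value under a Gaussian centered on the
    # expert-consensus label, with the experts' spread as its standard deviation.
    var = consensus_std ** 2
    return -0.5 * (np.log(2 * np.pi * var) + (pred - consensus_mean) ** 2 / var)

# Hypothetical example: predicted vs. consensus onset index of a phase transformation.
print(gaussian_log_likelihood(pred=41.0, consensus_mean=45.0, consensus_std=3.0))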
LandCoverNet is a global annual land cover classification training dataset with labels for the multi-spectral satellite imagery from Sentinel-1, Sentinel-2 and Landsat-8 missions in 2018. LandCoverNet Australia contains data across Australia, which accounts for ~7% of the global dataset. Each pixel is identified as one of the seven land cover classes based on its annual time series. These classes are water, natural bare ground, artificial bare ground, woody vegetation, cultivated vegetation, (semi) natural vegetation, and permanent snow/ice.
There are a total of 600 image chips of 256 x 256 pixels in LandCoverNet Australia V1.0 spanning 20 tiles. Each image chip contains temporal observations from the following satellite products with an annual class label, all stored in raster format (GeoTIFF files):
* Sentinel-1 Ground Range Detected (GRD) product with radiometric calibration and orthorectification at 10m spatial resolution
* Sentinel-2 surface reflectance product (L2A) at 10m spatial resolution
* Landsat-8 surface reflectance product from Collection 2 Level-2
Radiant Earth Foundation designed and generated this dataset with a grant from Schmidt Futures, with additional support from NASA ACCESS and Microsoft AI for Earth, and in-kind technology support from Sinergise.
NLP: Multi-label Classification Dataset.
The dataset contains 6 labels (Computer Science, Physics, Mathematics, Statistics, Quantitative Biology, Quantitative Finance) used to classify research papers based on their Abstract and Title. A value of 1 in a label column means that label applies to the paper, and a paper can have multiple labels set to 1.
This dataset is from an Analytics Vidhya Hackathon.
Can you solve it and get the best score?
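A minimal sketch of loading the label columns described above (the file name and exact column headers are assumptions about the CSV layout):

import pandas as pd

df = pd.read_csv("train.csv")  # hypothetical file name
label_cols = ["Computer Science", "Physics", "Mathematics",
              "Statistics", "Quantitative Biology", "Quantitative Finance"]

text = df["Title"] + " " + df["Abstract"]   # model input text
labels = df[label_cols].values              # multi-hot matrix, 1 = label applies
print("papers with more than one label:", int((labels.sum(axis=1) > 1).sum()))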
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Automated classification of research-data metadata by discipline(s) of research can be used in scientometric research, by repository service providers, and in the context of research data aggregation services. Openly available metadata from the DataCite index for research data were used to compile a large training and evaluation set comprising 609,524 records. This is the cleaned and vectorized version with a large feature-selection size.
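A minimal baseline sketch for such a classification task (the arrays below are random placeholders standing in for the vectorized records and their discipline labels; scikit-learn is an assumed tool, not part of the dataset):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Placeholder data: substitute the dataset's vectorized feature matrix and labels.
rng = np.random.default_rng(0)
X = rng.random((1000, 300))
y = rng.integers(0, 8, size=1000)  # e.g. 8 hypothetical discipline codes

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("macro F1:", f1_score(y_test, clf.predict(X_test), average="macro"))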
Our fashion dataset is composed of information about 24,752 posts by 13,350 people on Instagram. The data collection was done over a one-month period in January 2015. We searched for posts mentioning 48 internationally renowned fashion brand names as hashtags. Our data contain information about hashtags as well as image features based on deep learning (a Convolutional Neural Network, CNN). The learned features include selfies, body snaps, marketing shots, non-fashion, faces, logos, etc. Please refer to our paper for a full description of how we built our deep learning model.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Traffic camera images from the New York State Department of Transportation (511ny.org) are used to create a hand-labeled dataset of images classified into one of six road surface conditions: 1) severe snow, 2) snow, 3) wet, 4) dry, 5) poor visibility, or 6) obstructed. Six labelers (authors Sutter, Wirz, Przybylo, Cains, Radford, and Evans) went through a series of four labeling trials in which reliability across all six labelers was assessed using the Krippendorff's alpha (KA) metric (Krippendorff, 2007). The online tool by Dr. Freelon (Freelon, 2013; Freelon, 2010) was used to calculate reliability metrics after each trial, and the group achieved inter-coder reliability with a KA of 0.888 on the 4th trial. This process is known as quantitative content analysis, and three pieces of data used in this process are shared: 1) a PDF of the codebook, which serves as the set of rules for labeling images, 2) images from each of the four labeling trials, including the use of New York State Mesonet weather observation data (Brotzge et al., 2020), and 3) an Excel spreadsheet with the calculated inter-coder reliability metrics and other summaries used to assess reliability after each trial.
The broader purpose of this work is that the six human labelers, having achieved inter-coder reliability, can then label large sets of images independently, each contributing to a larger labeled dataset used for training supervised machine learning models to predict road surface conditions from camera images. The xCITE lab (xCITE, 2023) is used to store camera images from 511ny.org and provides the computing resources for training machine learning models.
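As a small sketch of that reliability check (the krippendorff Python package is an assumed substitute for the online tool actually used, and the ratings below are made up), Krippendorff's alpha for nominal labels can be computed like this:

import krippendorff  # assumed third-party package; the study used Dr. Freelon's online tool

# Hypothetical ratings: rows are the six labelers, columns are images,
# values are road-condition codes 1-6 as defined in the codebook.
ratings = [
    [1, 3, 4, 4, 2, 6],
    [1, 3, 4, 4, 2, 6],
    [1, 2, 4, 4, 2, 6],
    [1, 3, 4, 3, 2, 6],
    [1, 3, 4, 4, 2, 5],
    [1, 3, 4, 4, 2, 6],
]
alpha = krippendorff.alpha(reliability_data=ratings, level_of_measurement="nominal")
print("Krippendorff's alpha:", round(alpha, 3))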
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
A data set consisting of various metrics from the literature computed on the Apache Calcite project. The presence of defects is marked with Boolean labels.
This dataset was developed as part of a challenge to segment building footprints from aerial imagery. The goal of the challenge was to accelerate the development of more accurate, relevant, and usable open-source AI models to support mapping for disaster risk management in African cities. The data consist of drone imagery from 10 different cities and regions across Africa.
This dataset was created by Phan Dinh Khoi
This dataset contains crop types of agricultural fields in four states of northern India: Uttar Pradesh, Rajasthan, Odisha, and Bihar. There are 13 classes in the dataset: fallow land and 12 crop types (Wheat, Mustard, Lentil, Green pea, Sugarcane, Garlic, Maize, Gram, Coriander, Potato, Bersem, and Rice). The dataset is split into train and test collections as part of the AgriFieldNet India Competition. Ground reference data for this dataset were collected by IDinsight's Data on Demand team. Radiant Earth Foundation carried out the training dataset curation and publication. This training dataset was generated through a grant from the Enabling Crop Analytics at Scale (ECAAS) Initiative, funded by The Bill & Melinda Gates Foundation and implemented by Tetra Tech.
No description was included in this Dataset collected from the OSF
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Power is a ubiquitous term in political science, and yet the discipline lacks a metric of power that can be applied to both formal and informal political contexts. Building on past work on power and power resources, this paper develops a method to estimate the power of different actors over an organization. It uses this method to analyze the power of the public, private, and civil sectors within an original dataset of 245 cases of corporate sustainability ratings and product eco-labels, such as ENERGY STAR, LEED Certification, and Newsweek's Greenest Company Rankings. These initiatives have received limited attention from the political science literature, but they have become an increasingly prominent political phenomenon. The paper finds that the private and civil sectors have more power over these information-based governance initiatives than the public sector. It also reveals their lack of transparency and hybrid accountability relationships, which complicate their legitimacy and effectiveness.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset was created by Avinash Bagul
Released under Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
A presentation held by Lyubomir Penev in the iDiv Seminar Series at the Biodiversity Informatics Unit of the German Centre for Integrative Biodiversity Research (iDiv), Leipzig, 15 February 2017.
This chipped training dataset covers Jashore and includes high-resolution imagery (.tif format) and corresponding building footprint vector labels (.geojson format) in 256 x 256 pixel tile/label pairs. This dataset is a ramp Tier 1 dataset, meaning it has been thoroughly reviewed and improved. It was used in the development and testing of a localized ramp model and contains 7,310 tiles and 80,050 individual buildings. The satellite imagery resolution is 35 cm and was sourced from Maxar ODP (104001003BA7C900). Dataset keywords: Urban, Peri-urban, Rural
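A minimal sketch of pairing one image tile with its building-footprint labels (the directory layout and file names are hypothetical; rasterio and geopandas are assumed third-party tools):

import geopandas as gpd
import rasterio

tile_path = "jashore/chips/tile_000123.tif"        # hypothetical 256 x 256 image tile
label_path = "jashore/labels/tile_000123.geojson"  # matching building footprints

with rasterio.open(tile_path) as src:
    image = src.read()                 # array of shape (bands, 256, 256)
buildings = gpd.read_file(label_path)  # one polygon per building
print(image.shape, "-", len(buildings), "buildings in this tile")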
No description was included in this Dataset collected from the OSF