Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Reasons given for a report as a percentage of the overall number of reports.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A global reference dataset on cropland was collected through a crowdsourcing campaign implemented using Geo-Wiki. This reference dataset is based on a systematic sample at latitude and longitude intersections, enhanced in locations where the cropland probability varies between 25% and 75% for a better representation of cropland globally. Over a three-week period, around 36K cropland samples were collected. For the purpose of quality assessment, additional datasets are provided. One is a control dataset of 1793 sample locations validated by students trained in image interpretation; it was used to assess the quality of the crowd validations as the campaign progressed. Another contains 60 expert or gold-standard validations for additional evaluation of participant quality. Each of these three datasets has two parts: one showing cropland only and one compiled per location and user. This reference dataset will be used to validate and compare medium- and high-resolution cropland maps that have been generated using remote sensing. It can also be used to train classification algorithms for developing new maps of land cover and cropland extent.
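As a rough illustration of the sampling design described above, a minimal Python sketch follows; it assumes a hypothetical points table with latitude, longitude and cropland_probability columns, and the released files may use different field names.

import pandas as pd

# Hypothetical file and column names; the published Geo-Wiki exports may differ.
points = pd.read_csv("geowiki_cropland_samples.csv")

# Systematic base sample: points lying on whole-degree latitude/longitude intersections.
base = points[(points["latitude"] % 1 == 0) & (points["longitude"] % 1 == 0)]

# Enhancement: extra points where the prior cropland probability is between 25% and 75%.
enhanced = points[points["cropland_probability"].between(0.25, 0.75)]

sample = pd.concat([base, enhanced]).drop_duplicates()
print(len(sample), "sample locations")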
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the data analyzed in the paper titled "Crowdsourcing Requirements: Does Teamwork Enhance Crowd Creativity?"
The dataset contains the following csv files:
presurvey-questions: List of presurvey questions to collect demographics
disc-questions: List of DISC personality questions used to assess a crowd worker's personality. Each group has a set of 4 statements out of which the worker was expected to select one
post-survey-questions: List of postsurvey questions
users: List of crowd workers in the study; values 1 and 2 of the column ‘group_type’ correspond to workers in solo and interacting teams respectively
presurvey-responses: Workers' responses to the presurvey
personality_data: Workers’ IPIP (O, C, E, A, N metrics) and DISC (raw and normalized) scores
post-survey-responses: Workers' responses to the postsurvey
all_requirements: Requirements in a user story format, elicited by the crowd workers
creativity-ratings.csv: Authors’ average ratings for each requirement for the metrics ‘detailedness’, ‘novelty’ and ‘usefulness’
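A minimal Python sketch of combining these files follows; it assumes the files carry .csv extensions and that hypothetical key columns user_id and requirement_id, as well as rating columns named detailedness, novelty and usefulness, exist (only group_type is documented above).

import pandas as pd

users = pd.read_csv("users.csv")
requirements = pd.read_csv("all_requirements.csv")
ratings = pd.read_csv("creativity-ratings.csv")

# Attach the authors' average ratings to each elicited requirement (hypothetical key).
rated = requirements.merge(ratings, on="requirement_id", how="left")

# Attach the worker's condition: group_type 1 = solo, 2 = interacting teams.
rated = rated.merge(users[["user_id", "group_type"]], on="user_id", how="left")

# Compare average creativity ratings between the two conditions.
print(rated.groupby("group_type")[["detailedness", "novelty", "usefulness"]].mean())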
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project originated as an unconference-style panel at the 2016 Research Data Access and Preservation Summit. It collects case studies of research data-related events hosted or co-hosted by academic libraries. Case studies collected for the panel describe four flavors of event: the Center for Open Science Workshop on Reproducible Research, Data Carpentry, Software Carpentry, and Day of Data. Subsequent case study contributions describe more instances of these events as well as other events specific to their host institutions. Libraries and other potential sponsors are encouraged to use these case studies as resources for planning their own data-related events. The project organizers will periodically solicit more contributions to this collection of event case studies.
http://data.europa.eu/eli/dec/2011/833/oj
Timely and reliable monitoring of commodity food prices is an essential requirement for assessing market and food security risks and establishing early warning systems, especially in developing economies. However, data from regional or national systems for tracking changes in food prices in sub-Saharan Africa lack the necessary temporal or spatial richness and are often insufficient to inform targeted interventions. In addition to the limited opportunity for [near-]real-time assessment of food prices, various stages in the commodity supply chain are mostly unrepresented, limiting insights into stage-related price evolution. Yet governments and market stakeholders rely on commodity price data to make decisions on appropriate interventions or commodity-focused investments. Recent rapid technological development indicates that digital devices and connectivity services are becoming affordable for many, including in remote areas of developing economies. This offers a great opportunity for harvesting price data (via new data collection methodologies such as crowdsourcing/crowdsensing, i.e. citizen-generated data, using mobile apps/devices) and disseminating it (via web dashboards or other means) in real time. This real-time data can support decisions at various levels and related policy-making processes. However, market information that aims at improving the functioning of markets and supply chains requires a continuous data flow as well as quality, accessibility and trust. More data does not necessarily translate into better information. Citizen-based data-generation systems are often confronted by challenges related to data quality and citizen participation, which may be further complicated by the volume of data generated compared to traditional approaches. Following the food price hikes during the first decade of the 21st century, the European Commission's Joint Research Centre (JRC) started working on innovative methodologies for real-time food price data collection and analysis in developing countries. The work carried out so far includes a pilot initiative to crowdsource data from selected markets across several African countries, two workshops (with relevant stakeholders and experts), and the development of a spatial statistical quality methodology to facilitate the best possible exploitation of geo-located data. Based on the latter, the JRC designed the Food Price Crowdsourcing Africa (FPCA) project and implemented it initially in two states in northern Nigeria, then expanded it to two further states. The FPCA is a credible methodology based on the voluntary provision of data by a crowd (people living in urban, suburban, and rural areas) using a mobile app, with monetary and non-monetary incentives to encourage contribution; it makes it possible to collect, analyse, validate, and disseminate staple food price data in real time across market segments. The granularity and high frequency of the crowdsourced data open the door to real-time space-time analysis, which can be essential for policy and decision making and for rapid response in specific geographic regions.
This dataset contains longitudinal purchase data from 5027 Amazon.com users in the US, spanning 2018 through 2022 (amazon-purchases.csv). It also includes demographic data and other consumer-level variables for each user in the dataset. These consumer-level variables were collected through an online survey and are included in survey.csv. fields.csv describes the columns in the survey.csv file, where fields/survey columns correspond to survey questions. The dataset also contains the survey instrument used to collect the data; more details about the survey questions, the possible responses, and the format in which they were presented can be found by viewing the survey instrument. A 'Survey ResponseID' column is present in both the amazon-purchases.csv and survey.csv files. It links a user's survey responses to their Amazon.com purchases. The 'Survey ResponseID' was randomly generated at the time of data collection.
amazon-purchases.csv: each row in this file corresponds to an Amazon order and has the following columns:
Survey ResponseID
Order date
Shipping address state
Purchase price per unit
Quantity
ASIN/ISBN (Product Code)
Title
Category
The data were exported by the Amazon users from Amazon.com and shared by users with their informed consent. PII and other information not listed above were stripped from the data. This processing occurred on users' machines before sharing with researchers.
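A minimal Python sketch of linking the two files on 'Survey ResponseID' follows, assuming the CSV headers match the column names listed above.

import pandas as pd

purchases = pd.read_csv("amazon-purchases.csv", parse_dates=["Order date"])
survey = pd.read_csv("survey.csv")

# 'Survey ResponseID' appears in both files and links survey responses to orders.
linked = purchases.merge(survey, on="Survey ResponseID", how="left")

# Example: total yearly spend per user (price per unit times quantity).
linked["spend"] = linked["Purchase price per unit"] * linked["Quantity"]
yearly = linked.groupby(["Survey ResponseID", linked["Order date"].dt.year])["spend"].sum()
print(yearly.head())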
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Crowdsourced Replication Initiative (CRI) involved 204 researchers who volunteered to engage in a replication of a well-known study on immigration and social policy preferences. In this project, the participants were surveyed four times between August 20th, 2018 and January 20th, 2019. Survey questions with identifying features have been removed to protect participant anonymity, and the data are available in the file cri_survey_long_public (with labels) or *_nolabs (without). The survey included both objective criteria, such as experience with methods and with the substantive topic of the replication, and subjective criteria, such as the participants' own beliefs about the hypothesis and about immigration in general. In addition, they were asked questions about their time commitment, the constraints they faced, and other feedback about the process of crowdsourcing. As of 2024, we provide data on the participants' reviews of the other teams' models. These review scores were initially not directly usable due to some problems with the 4th wave of the participant survey: the participants were given model descriptions that did not always match the models they should have reflected. However, we have now used these description paragraphs to match reviews to models and were able to match roughly 95% of all models. The new data file peer_model_dyad allows users to analyze data in participant-model dyad format. These data are linkable both to the participant survey here and to the CRI model specification and results data on Github (https://github.com/nbreznau/CRI). Because of matching and uneven numbers of models per team, some participants' rankings apply to dozens of models and others to only a few. The variable descriptions for these data are in the peer_model_dyad_codebook file. We also now provide dyadic data that matches each participant with each model specification produced by their team in df_dyad. These data contain all model specifications and the AME (Average Marginal Effect) produced by each model.
This data set was collected from various sources: the research team, ANAS employees, and Uber drivers. The method for data collection and data processing for each dataset can be found in the related works.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The spatial distribution of cropland is an important input to many applications including food security monitoring and economic land use modeling. Global land cover maps derived from remote sensing are one source of cropland data, but they are currently not accurate enough in the cropland domain to meet the needs of the user community. Moreover, when compared with one another, these land cover products show large areas of spatial disagreement, which makes it very difficult to choose which land cover product to use. This paper takes an entirely different approach to mapping cropland, using crowdsourcing of Google Earth imagery via tools in Geo-Wiki. Using sample data generated by a crowdsourcing campaign for the collection of the degree of cultivation and settlement in Ethiopia, a cropland map was created using simple inverse distance weighted interpolation. The map was validated using data from the GOFC-GOLD validation portal and an independent crowdsourced dataset from Geo-Wiki. The results show that the crowdsourced cropland map for Ethiopia has a higher overall accuracy than the individual global land cover products for this country. Such an approach has great potential for mapping cropland in other countries where such data do not currently exist. Not only is the approach inexpensive, but the data can be collected over a very short period of time using an existing network of volunteers.
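The mapping step relies on simple inverse distance weighted interpolation; the following generic Python sketch (not the authors' exact implementation, and using planar distances for simplicity) shows the idea.

import numpy as np

def idw(sample_xy, sample_values, query_xy, power=2.0, eps=1e-12):
    # sample_xy: (n, 2) lon/lat of crowdsourced points
    # sample_values: (n,) observed degree of cultivation at those points
    # query_xy: (m, 2) grid cell centres to interpolate onto
    d = np.linalg.norm(query_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, eps) ** power  # closer samples receive larger weights
    return (w * sample_values).sum(axis=1) / w.sum(axis=1)

# Toy usage: three observations interpolated onto two grid cells.
pts = np.array([[38.7, 9.0], [39.0, 9.5], [38.5, 8.8]])
vals = np.array([0.8, 0.2, 0.6])
grid = np.array([[38.8, 9.1], [39.1, 9.4]])
print(idw(pts, vals, grid))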
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Purpose: This is the 2019 Hurricanes Crowdsourced Photos Public Feature Layer View. This is a live publicly accessible layer for the Crowdsource Story Map accessible here: This layer cannot be edited; it is view only. Hidden Field: 0 = Needs Review, 1 = Already Reviewed, 2 = Hidden (not available in this public view). Audience: GIS Staff and Technologists who would like to add this layer to their own web maps and apps. If you need access to this layer in other formats, see the Open Data link. Please send us an email at triage@publicsafetygis.org to tell us if you are going to use this layer and if you have any questions or need assistance with this layer. Need to download the photos? See this technical support article.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset is about: A global dataset of crowdsourced land cover and land use reference data (2011-2012). Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.869682 for more information.
This web map is a component of the CrowdMag Visualization App. NOAA's CrowdMag is a crowdsourced data collection project that uses a mobile app to collect geomagnetic data from the magnetometers that modern smartphones use as part of their navigation systems. NCEI collects these data from citizen scientists around the world and provides quality control services before making them available through a series of aggregated maps and charts. These data have the potential to provide a high-resolution alternative to geomagnetic satellite data, as well as near real-time information about changes in the magnetic field. This map shows data collected from phones around the world! Displayed are the crowdsourced magnetic data that fall within a tolerance level of the World Magnetic Model prediction. We have added some uncertainty to each data point shown to ensure the privacy of our contributors. The data points are grouped together (or "aggregated") into small areas, and we display the median data value across all the readings for each point.
This map is updated every day. Layers are available for Median Intensity, Median Horizontal Component (Y), and Median Vertical Component (Z).
Use the time slider to select the date range. Select the different layers under the "Crowdmag Observations" menu. View a color scale using the legend tool. Zoom to your location using the "Find my Location" tool. Click or tap on a data point to view a popup containing more information.
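A minimal Python sketch of the aggregation idea described above (grouping readings into small areas and reporting the median) follows; the file and column names are hypothetical and the actual CrowdMag processing is more involved.

import pandas as pd

readings = pd.read_csv("crowdmag_readings.csv")  # hypothetical export

# Group readings into roughly 0.1-degree cells and take the median per cell.
readings["cell_lat"] = readings["latitude"].round(1)
readings["cell_lon"] = readings["longitude"].round(1)
medians = (readings
           .groupby(["cell_lat", "cell_lon"])[["intensity", "horizontal_y", "vertical_z"]]
           .median()
           .reset_index())
print(medians.head())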
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Computer vision models that can recognize plant diseases in the field would be valuable tools for disease management and resistance breeding. Generating enough data to train these models is difficult, however, since only trained experts can accurately identify symptoms. In this study, we describe and implement a two-step method for generating a large amount of high-quality training data with minimal expert input. First, experts located symptoms of northern leaf blight (NLB) in field images taken by unmanned aerial vehicles (UAVs), annotating them quickly at low resolution. Second, non-experts were asked to draw polygons around the identified diseased areas, producing high-resolution ground truths that were automatically screened based on agreement between multiple workers. We then used these crowdsourced data to train a convolutional neural network (CNN), feeding the output into a conditional random field (CRF) to segment images into lesion and non-lesion regions with accuracy of 0.9979 and F1 score of 0.7153. The CNN trained on crowdsourced data showed greatly improved spatial resolution compared to one trained on expert-generated data, despite using only one fifth as many expert annotations. The final model was able to accurately delineate lesions down to the millimeter level from UAV-collected images, the finest scale of aerial plant disease detection achieved to date. The two-step approach to generating training data is a promising method to streamline deep learning approaches for plant disease detection, and for complex plant phenotyping tasks in general.
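A minimal Python sketch of agreement-based screening (not the authors' exact procedure) follows; it assumes each worker's polygon has already been rasterised into a boolean mask and keeps an annotation only when the mean pairwise IoU across workers exceeds an illustrative threshold.

from itertools import combinations
import numpy as np

def mean_pairwise_iou(masks):
    # masks: list of equally shaped boolean arrays, one per worker
    ious = []
    for a, b in combinations(masks, 2):
        union = np.logical_or(a, b).sum()
        inter = np.logical_and(a, b).sum()
        ious.append(inter / union if union else 1.0)
    return float(np.mean(ious)) if ious else 0.0

def keep_annotation(masks, threshold=0.5):
    # The threshold is illustrative; the study's screening rule may differ.
    return mean_pairwise_iou(masks) >= threshold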
https://spdx.org/licenses/CC0-1.0.html
Accurate species identification is fundamental to biodiversity science, but the natural history skills required for this are neglected in formal education at all levels. In this paper we describe how the web application ispotnature.org and its sister site ispot.org.za (collectively, "iSpot") are helping to solve this problem by combining learning technology with crowdsourcing to connect beginners with experts. Over 94% of observations submitted to iSpot receive a determination. To date (2014), iSpot has crowdsourced the identification of 30,000 taxa (>80% at species level) in >390,000 observations with a global community numbering >42,000 registered participants. More than half the observations on ispotnature.org were named within an hour of submission. iSpot uses a unique, 9-dimensional reputation system to motivate and reward participants and to verify determinations. Taxon-specific reputation points are earned when a participant proposes an identification that achieves agreement from other participants, weighted by the agreers' own reputation scores for the taxon. This system is able to discriminate effectively between competing determinations when two or more are proposed for the same observation. In 57% of such cases the reputation system improved the accuracy of the determination, while in the remainder it either improved precision (e.g. by adding a species name to a genus) or revealed false precision, for example where a determination to species level was not supported by the available evidence. We propose that the success of iSpot arises from the structure of its social network, which efficiently connects beginners and experts, overcoming the social as well as geographic barriers that normally separate the two.
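A minimal Python sketch of reputation-weighted agreement follows, illustrating only the general idea (the actual iSpot system is 9-dimensional and taxon-specific): each proposed determination is scored by the summed reputation of the participants agreeing with it, and the highest-scoring determination wins.

def pick_determination(proposals, reputation):
    # proposals: mapping determination -> list of participant ids who agree with it
    # reputation: mapping participant id -> taxon-specific reputation score
    def support(det):
        return sum(reputation.get(p, 0.0) for p in proposals[det])
    return max(proposals, key=support)

# Toy example: one experienced participant outweighs two beginners.
proposals = {"Quercus robur": ["alice", "bob"], "Quercus petraea": ["carol"]}
reputation = {"alice": 1.0, "bob": 0.5, "carol": 4.0}
print(pick_determination(proposals, reputation))  # -> "Quercus petraea"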
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all the exported data contributed by volunteers during the SKILLNET project via the crowdsourcing platform CEMROL: https://cemrol.hum.uu.nl/#/ (more information is also available here: https://skillnet.nl/cemrol/). Via this platform, volunteers helped by selecting (marking) and adding basic metadata to every letter in a series of selected letter editions. The raw, unprocessed files are provided in this dataset. A selection of this raw data was manually cleaned by a student assistant and is made available as a separate dataset: https://doi.org/10.34894/NJKUF0 Note: The latest version of CEMROL's code, developed by Sheean Spoel from the Digital Humanities Lab at Utrecht University, is publicly available at https://github.com/UUDigitalHumanitieslab/scribeAPI and is also deposited in Zenodo. The historical date entry is public at https://github.com/UUDigitalHumanitieslab/historical-dates and https://github.com/UUDigitalHumanitieslab/historical-dates-ui. Sheean indicates that the date calculator itself is not open source (the user interface is). The calculations are based on Axel Findling's Roman date converter and Nikolaus A. Bär's Easter date calculator.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time periods for testing and non-testing.
https://www.datainsightsmarket.com/privacy-policy
The size of the Crowdsourced Testing Industry market was valued at USD XX Million in 2023 and is projected to reach USD XXX Million by 2032, with an expected CAGR of 10.50% during the forecast period. Crowdsourced testing carries out testing activity through an independent network of testers. Engaging a broad crowd to exercise applications on the many devices and systems on which they may run helps assess the usability, functionality, and performance of those applications. Using a crowd gives organizations broader testing exposure, shorter testing cycles, and lower testing expenses. Crowdsourced testing is most useful for organizations that want to enhance the user experience, detect bugs early in the development cycle, and ensure cross-platform and cross-device compatibility. Recent developments include: January 2022: Testlio, the pioneer of networked testing, introduced fused testing. This new methodology combined expert manual testing with the efficiency of test automation, assisting engineering and product leaders in meeting increased customer demands for exceptional digital experiences. February 2021: Applause App Quality Inc., a crowdsourced testing solutions company, announced the launch of its Excellent Product Platform, which would provide customers with enterprise-grade software-as-a-service infrastructure, digital testing solutions, and access to the world's largest community of digital experts. Key drivers for this market are: Rise in the Number of Operating Systems, Devices, and Applications; Demand for Scaling Quality Assurance of Software to Magnify Customer Experience. Potential restraints include: Concerns Over Data Privacy Regulations Over the Globe. Notable trends are: Large Enterprises to Constitute a Significant Market Size.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These data support the synonymy proposal task (SPT) experiment described in the paper "Reply & Supply: Efficient crowdsourcing when workers do more than answer questions", https://arxiv.org/abs/1611.00954. The data consist of two CSV files, one describing the questions built by the crowd as they work, the other recording the responses of workers when presented with questions. A "question id" (qid) field links these data. The IDs of Mechanical Turk workers were deidentified. The task interface is described in Fig. 2 of the paper. Three question sampling algorithms were tested in the experiment. These are recorded in the algorithm field. Note that workers may participate in multiple algorithms, and that qid is only unique for a given algorithm (qid 1 under the random algorithm and qid 1 under the binomial algorithm are not the same question).
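A minimal Python sketch of joining the two CSV files follows; it respects that qid is only unique within an algorithm. The file names are assumptions, while the qid and algorithm column names come from the description above.

import pandas as pd

questions = pd.read_csv("questions.csv")   # hypothetical file name
responses = pd.read_csv("responses.csv")   # hypothetical file name

# qid alone is ambiguous across sampling algorithms, so join on both keys.
merged = responses.merge(questions, on=["algorithm", "qid"], how="left")
print(merged.groupby("algorithm")["qid"].nunique())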
NCBI Sequence Read Archive under BioProject number PRJNA606798
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Lameness assessments are rarely conducted routinely on dairy farms and when completed typically underestimate lameness prevalence, hampering early diagnosis and treatment. A well-known feature of many perceptual tasks is that relative assessments are more accurate than absolute assessments, suggesting that creating methods that allow for the relative scoring of ‘which cow is more lame’ will allow for reliable lameness assessments. Here we developed and tested a remote comparative lameness assessment method: we recruited non-experienced crowd workers via an online platform and asked them to watch two videos side-by-side, each showing a cow walking, and to identify which cow was more lame and by how much (on a scale of -3 to 3). We created 11 tasks, each with 10 video pairs for comparison, and recruited 50 workers per task. All tasks were also completed by 5 experienced cattle lameness assessors. We evaluated data filtering and clustering methods based on worker responses and determined the agreement among workers, among experienced assessors, and between these groups. A moderate to high interobserver reliability was observed (intraclass correlation coefficient, ICC=0.46 to 0.77) for crowd workers and agreement was high among the experienced assessors (ICC=0.87). Average crowd worker responses showed excellent agreement with the average of experienced assessor responses (ICC= 0.89 to 0.91), regardless of data processing method. To investigate if we could use fewer workers per task while still retaining high agreement with experienced assessors, we randomly sub-sampled 2 to 43 (1 less than the minimum number of workers retained per task after data cleaning) workers from each task. The agreement with experienced assessors increased substantially as we increased the number of workers from 2 to 10, but little increase was observed after 10 or more workers were used (ICC>0.80). The proposed method provides a fast and cost-effective way to assess lameness in commercial herds. In addition, this method allows for large-scale data collection useful for training computer vision algorithms that could be used to automate lameness assessments on farm.
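A minimal Python sketch of aggregating the crowd's paired comparisons follows; it assumes a hypothetical responses table with pair_id and a score column holding the -3 to 3 rating, and it does not reproduce the paper's filtering, clustering or ICC analyses.

import pandas as pd

responses = pd.read_csv("crowd_responses.csv")  # hypothetical export

# Average the -3..3 comparison score per video pair across all workers.
pair_scores = responses.groupby("pair_id")["score"].mean()

# Mirror the sub-sampling analysis: keep at most 10 workers per pair, then re-average.
subsampled = (responses.groupby("pair_id", group_keys=False)
              .apply(lambda g: g.sample(min(10, len(g)), random_state=0)))
sub_scores = subsampled.groupby("pair_id")["score"].mean()
print(pair_scores.corr(sub_scores))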
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Reasons given for a report as a percentage of the overall number of reports.