100+ datasets found

Replication Data for: How Cross-Validation Can Go Wrong and What to Do About...
dataverse.harvard.edu
Updated Jul 19, 2018
Cite
Marcel Neunhoeffer; Sebastian Sternberg (2018). Replication Data for: How Cross-Validation Can Go Wrong and What to Do About it. [Dataset]. http://doi.org/10.7910/DVN/Y9KMJW
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/Y9KMJW
Dataset updated
Jul 19, 2018
Dataset provided by
Harvard Dataverse
Authors
Marcel Neunhoeffer; Sebastian Sternberg
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
The introduction of new “machine learning” methods and terminology to political science complicates the interpretation of results. Even more so, when one term – like cross-validation – can mean very different things. We find different meanings of cross-validation in applied political science work. In the context of predictive modeling, cross-validation can be used to obtain an estimate of true error or as a procedure for model tuning. Using a single cross-validation procedure to obtain an estimate of the true error and for model tuning at the same time leads to serious misreporting of performance measures. We demonstrate the severe consequences of this problem with a series of experiments. We also observe this problematic usage of cross-validation in applied research. We look at Muchlinski et al. (2016) on the prediction of civil war onsets to illustrate how the problematic cross-validation can affect applied work. Applying cross-validation correctly, we are unable to reproduce their findings. We encourage researchers in predictive modeling to be especially mindful when applying cross-validation.
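The separation the description argues for, one loop for tuning and a different one for error estimation, is commonly implemented as nested cross-validation. The sketch below only builds the index splits; it is an illustration under that assumption, not the authors' replication code, and the function names are hypothetical.

```python
import random


def k_fold_indices(indices, k, seed=0):
    """Shuffle the given indices and split them into k roughly equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def nested_cv_splits(n_samples, outer_k=5, inner_k=3, seed=0):
    """Yield (inner_splits, outer_test) for each outer fold.

    inner_splits is a list of (train, validation) index pairs drawn only from
    the outer training data; they are meant for model tuning. outer_test is
    held back and used once, after tuning, to estimate the true error.
    """
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [
            idx for j, fold in enumerate(outer_folds) if j != i for idx in fold
        ]
        inner_folds = k_fold_indices(outer_train, inner_k, seed + 1 + i)
        inner_splits = []
        for m, validation in enumerate(inner_folds):
            train = [
                idx for j, fold in enumerate(inner_folds) if j != m for idx in fold
            ]
            inner_splits.append((train, validation))
        yield inner_splits, outer_test


if __name__ == "__main__":
    for inner_splits, outer_test in nested_cv_splits(20, outer_k=4, inner_k=3):
        print(len(inner_splits), "inner splits; outer test fold:", sorted(outer_test))
```

The key property is that no observation in an outer test fold ever appears in the inner splits used to pick hyperparameters, so the reported error is not contaminated by tuning.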
Error Detection V2 Dataset
universe.roboflow.com
zip
Updated Nov 12, 2024
Cite
3dprinting (2024). Error Detection V2 Dataset [Dataset]. https://universe.roboflow.com/3dprinting/error-detection-v2
Explore at:
Available download formats: zip
Dataset updated
Nov 12, 2024
Dataset authored and provided by
3dprinting
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Variables measured
Warping Q42p Bounding Boxes
Description
Error Detection V2

## Overview
Error Detection V2 is a dataset for object detection tasks. It contains Warping Q42p annotations for 1,827 images.

## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License
This dataset is available under the [MIT license](https://creativecommons.org/licenses/MIT).
AKCES-GEC Grammatical Error Correction Dataset for Czech
live.european-language-grid.eu
binary format
Updated Sep 26, 2019
+ more versions
Cite
(2019). AKCES-GEC Grammatical Error Correction Dataset for Czech [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/1280
Explore at:
Available download formats: binary format
Dataset updated
Sep 26, 2019
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
AKCES-GEC is a grammatical error correction corpus for Czech generated from a subset of AKCES. It contains train, dev and test files annotated in M2 format.

Note that, in comparison to the CZESL-GEC dataset, this dataset contains separated edits together with their type annotations in M2 format and has twice as many sentences.
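For readers unfamiliar with M2, here is a minimal parsing sketch. It assumes the usual M2 layout (an `S` line with the tokenized source sentence, followed by `A` lines with six `|||`-separated fields). The sample sentence and the error-type label are hypothetical and are not taken from AKCES-GEC, whose type inventory may differ.

```python
def parse_m2(text):
    """Parse M2-formatted text into a list of {"tokens", "edits"} dicts."""
    sentences = []
    for block in text.strip().split("\n\n"):
        tokens, edits = [], []
        for line in block.splitlines():
            if line.startswith("S "):
                tokens = line[2:].split()
            elif line.startswith("A "):
                span, error_type, correction, _, _, annotator = line[2:].split("|||")
                start, end = map(int, span.split())
                edits.append({
                    "start": start,
                    "end": end,
                    "type": error_type,
                    "correction": correction,
                    "annotator": int(annotator),
                })
        sentences.append({"tokens": tokens, "edits": edits})
    return sentences


def apply_edits(tokens, edits):
    """Apply one annotator's edits right to left so token offsets stay valid."""
    corrected = list(tokens)
    for edit in sorted(edits, key=lambda e: e["start"], reverse=True):
        if edit["start"] < 0:
            continue  # "-1 -1" spans mark sentences with no correction
        if edit["correction"] in ("", "-NONE-"):
            replacement = []  # deletion
        else:
            replacement = edit["correction"].split()
        corrected[edit["start"]:edit["end"]] = replacement
    return corrected


# Hypothetical English sample, for illustration only.
SAMPLE = """S This are a sentence .
A 1 2|||R:VERB:SVA|||is|||REQUIRED|||-NONE-|||0
"""

if __name__ == "__main__":
    for sentence in parse_m2(SAMPLE):
        print(" ".join(apply_edits(sentence["tokens"], sentence["edits"])))
```

Files with several annotators would need the edits filtered by the `annotator` field before calling `apply_edits`.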

If you use this dataset, please use the following citation:
@article{naplava2019wnut,
  title={Grammatical Error Correction in Low-Resource Scenarios},
  author={N{\'a}plava, Jakub and Straka, Milan},
  journal={arXiv preprint arXiv:1910.00353},
  year={2019}
}
Data from: A Streamlined and High-Throughput Error-Corrected Next-Generation...
datadryad.org
zenodo.org
zip
Updated Aug 9, 2019
Cite
Page B. McKinzie; Michelle E. Bishop (2019). A Streamlined and High-Throughput Error-Corrected Next-Generation Sequencing Method for Low Variant Allele Frequency Quantitation [Dataset]. http://doi.org/10.5061/dryad.jj4g11s
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.jj4g11s
Dataset updated
Aug 9, 2019
Dataset provided by
Dryad
Authors
Page B. McKinzie; Michelle E. Bishop
Time period covered
Jul 30, 2019
Description
Quantifying mutant or variable allele frequencies (VAFs) of ≤10⁻³ using next-generation sequencing (NGS) has utility in both clinical and nonclinical settings. Two common approaches for quantifying VAFs using NGS are tagged single-strand sequencing and duplex sequencing. While duplex sequencing is reported to have sensitivity up to 10⁻⁸ VAF, it is not a quick, easy, or inexpensive method. We report a method for quantifying VAFs that are ≥10⁻⁴ that is as easy and quick for processing samples as standard sequencing kits, yet less expensive than the kits. The method was developed using PCR fragment-based VAFs of Kras codon 12 in log10 increments from 10⁻⁵ to 10⁻¹, then applied and tested on native genomic DNA. For both sources of DNA, there is a proportional increase in the observed VAF to input VAF from 10⁻⁴ to 100% mutant samples. Variability of quantitation was evaluated within experimental replicates and shown to be consistent across sample preparations. The error at each successive ba...
Relative L2 error (%) on the test set for the source domain (TL7).
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated May 22, 2025
Cite
Zhou, Yuqian; Liu, Qian; Yang, Haolin; Li, Kebing; Xu, Jinghong (2025). Relative L2 error (%) on the test set for the source domain (TL7). [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002035388
Dataset updated
May 22, 2025
Authors
Zhou, Yuqian; Liu, Qian; Yang, Haolin; Li, Kebing; Xu, Jinghong
Description
Relative L2 error (%) on the test set for the source domain (TL7).
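The entry does not define the metric, so the sketch below uses the conventional definition of relative L2 error, the norm of the difference divided by the norm of the reference, expressed in percent. This is an assumption about the dataset, and the function name and sample values are hypothetical.

```python
import math


def relative_l2_error_percent(predicted, reference):
    """Return 100 * ||predicted - reference||_2 / ||reference||_2."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    difference_norm = math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)))
    reference_norm = math.sqrt(sum(r ** 2 for r in reference))
    if reference_norm == 0:
        raise ValueError("reference has zero norm; relative error is undefined")
    return 100.0 * difference_norm / reference_norm


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the function.
    print(relative_l2_error_percent([3.0, 4.5], [3.0, 4.0]))
```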
Citation Trends for "Many-electron self-interaction error in approximate...
shibatadb.com
Updated Nov 28, 2006
Cite
Yubetsu (2006). Citation Trends for "Many-electron self-interaction error in approximate density functionals" [Dataset]. https://www.shibatadb.com/article/e9vMYnwh
Dataset updated
Nov 28, 2006
Dataset authored and provided by
Yubetsu
License
https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt
Time period covered
2007 - 2025
Variables measured
New Citations per Year
Description
Yearly citation counts for the publication titled "Many-electron self-interaction error in approximate density functionals".
Mckenzie Road Cross Street Data in Bad Axe, MI
ownerly.com
Updated Apr 3, 2022
+ more versions
Cite
Ownerly (2022). Mckenzie Road Cross Street Data in Bad Axe, MI [Dataset]. https://www.ownerly.com/mi/bad-axe/mckenzie-rd-home-details
Dataset updated
Apr 3, 2022
Dataset authored and provided by
Ownerly
Area covered
Bad Axe, McKenzie Road, Michigan
Description
This dataset provides information about the number of properties, residents, and average property values for Mckenzie Road cross streets in Bad Axe, MI.
Error in the understanding estimation using eye gaze features or the answer...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Oct 25, 2018
Cite
Sanches, Charles Lima; Augereau, Olivier; Kise, Koichi (2018). Error in the understanding estimation using eye gaze features or the answer feature. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000680650
Dataset updated
Oct 25, 2018
Authors
Sanches, Charles Lima; Augereau, Olivier; Kise, Koichi
Description
Error in the understanding estimation using eye gaze features or the answer feature.
Predictor importance in RF.
plos.figshare.com
xls
Updated Feb 5, 2025
+ more versions
Cite
Halit Tutar; Senol Celik; Hasan Er; Erdal Gönülal (2025). Predictor importance in RF. [Dataset]. http://doi.org/10.1371/journal.pone.0318230.t006
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0318230.t006
Dataset updated
Feb 5, 2025
Dataset provided by
PLOS ONE
Authors
Halit Tutar; Senol Celik; Hasan Er; Erdal Gönülal
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
In this study, the effect of morphological traits on the fresh herbage yield of sorghum x sudangrass hybrid plants grown in Konya province, the largest cereal production area in Turkey, was analyzed with several data mining methods. For this purpose, Artificial Neural Networks (ANN), Automatic Linear Model (ALM), Random Forest (RF) Algorithm and Multivariate Adaptive Regression Spline (MARS) Algorithm were used, and the prediction performances of these methods were compared. The computed descriptive statistics were: plant height of 251.22 cm, stem diameter of 7.03 mm, fresh herbage yield of 8010.69 kg da⁻¹, crude protein ratio of 9.09%, acid detergent fiber 33.23%, neutral detergent fiber 57.44%, acid detergent lignin 7.43%, dry matter digestibility of 63.01%, dry matter intake 2.11%, and relative feed value of 103.02. Model fit statistics, including coefficient of determination (R²), adjusted R², root mean square error (RMSE), mean absolute percentage error (MAPE), standard deviation ratio (SD ratio), mean absolute error (MAE) and relative absolute error (RAE), were used to evaluate the prediction abilities of the fitted models. The MARS method was shown to be the best model for describing fresh herbage yield, with the lowest values of RMSE, MAPE, SD ratio, MAE and RAE (137.7, 1.488, 0.072, 109.718 and 0.017, respectively), as well as the highest R² value (0.995) and adjusted R² value (0.991). The experimental results show that the MARS algorithm is the most suitable model for predicting fresh herbage yield in sorghum x sudangrass hybrid, providing a good alternative to other data mining algorithms.
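Several of the fit statistics named above have standard textbook definitions, sketched below. The study's exact formulas (in particular for SD ratio and RAE, which are omitted here) are not given in this description, so treat this as an assumption-laden illustration; the sample numbers are hypothetical and are not the study's data.

```python
import math


def fit_statistics(observed, predicted):
    """Compute RMSE, MAE, MAPE (in percent) and R^2 with textbook definitions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_observed = sum(observed) / n
    ss_residual = sum(r * r for r in residuals)
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return {
        "rmse": math.sqrt(ss_residual / n),
        "mae": sum(abs(r) for r in residuals) / n,
        # MAPE is undefined when an observed value is zero.
        "mape": 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n,
        "r2": 1.0 - ss_residual / ss_total,
    }


if __name__ == "__main__":
    # Hypothetical yields, for illustration only.
    print(fit_statistics([100.0, 200.0, 300.0], [110.0, 190.0, 300.0]))
```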
Lebanon Exports of wigs, false beards, eyebrow, eyelashes, switches;...
tradingeconomics.com
csv, excel, json, xml
Updated Dec 5, 2018
Cite
TRADING ECONOMICS (2018). Lebanon Exports of wigs, false beards, eyebrow, eyelashes, switches; articles of human hair to Cyprus [Dataset]. https://tradingeconomics.com/lebanon/exports/cyprus/wigs-hair-human-hair-articles
Explore at:
Available download formats: csv, excel, json, xml
Dataset updated
Dec 5, 2018
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Jan 1, 1990 - Dec 31, 2025
Area covered
Lebanon
Description
Lebanon Exports of wigs, false beards, eyebrow, eyelashes, switches; articles of human hair to Cyprus was US$3.43 Thousand during 2021, according to the United Nations COMTRADE database on international trade.
Macau Imports of wigs, false beards, eyebrow, eyelashes, switches; articles...
tradingeconomics.com
csv, excel, json, xml
Updated Apr 24, 2025
Cite
TRADING ECONOMICS (2025). Macau Imports of wigs, false beards, eyebrow, eyelashes, switches; articles of human hair from Switzerland [Dataset]. https://tradingeconomics.com/macau/imports/switzerland/wigs-hair-human-hair-articles
Explore at:
Available download formats: csv, excel, json, xml
Dataset updated
Apr 24, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Jan 1, 1990 - Dec 31, 2025
Area covered
Macao
Description
Macau Imports of wigs, false beards, eyebrow, eyelashes, switches; articles of human hair from Switzerland was US$30.14 Thousand during 2023, according to the United Nations COMTRADE database on international trade.
Data/software underlying the publication: Fault-tolerant structures for...
data.4tu.nl
zip
Updated Jan 19, 2024
Cite
Yves van Montfort; Sébastian de Bone; David Elkouss (2024). Data/software underlying the publication: Fault-tolerant structures for measurement-based quantum computation on a network [Dataset]. http://doi.org/10.4121/929e24f9-31fa-4816-99fa-3356e272df43.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.4121/929e24f9-31fa-4816-99fa-3356e272df43.v1
Dataset updated
Jan 19, 2024
Dataset provided by
4TU.ResearchData
Authors
Yves van Montfort; Sébastian de Bone; David Elkouss
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Dataset funded by
Dutch Research Council
Description
In this work, we introduce a method to construct fault-tolerant measurement-based quantum computation (MBQC) architectures and numerically estimate their performance over various types of networks. A possible application of such a paradigm is distributed quantum computation, where separate computing nodes work together on a fault-tolerant computation through entanglement. We gauge error thresholds of the architectures with an efficient stabilizer simulator to investigate the resilience against both circuit-level and network noise. We show that, for both monolithic (i.e., non-distributed) and distributed implementations, an architecture based on the diamond lattice may outperform the conventional cubic lattice. Moreover, the high erasure thresholds of non-cubic lattices may be exploited further in a distributed context, as their performance may be boosted through entanglement distillation by trading in entanglement success rates against erasure errors during the error decoding process. These results highlight the significance of lattice geometry in the design of fault-tolerant measurement-based quantum computing on a network, emphasizing the potential for constructing robust and scalable distributed quantum computers.
Impact of erroneous a priori information on the UT1-UTC determination from...
researchdata.tuwien.at
zip
Updated Jun 25, 2024
+ more versions
Cite
Lisa Kern; Matthias Schartner; Sigrid Böhm (2024). Impact of erroneous a priori information on the UT1-UTC determination from VLBI Intensive sessions (simulation study) [Dataset]. http://doi.org/10.48436/08qqz-ymp66
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.48436/08qqz-ymp66
Dataset updated
Jun 25, 2024
Dataset provided by
TU Wien
Authors
Lisa Kern; Matthias Schartner; Sigrid Böhm
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The dataset was generated by researchers from the TU Wien Department of Geodesy and Geoinformation and the ETH Zürich Department of Civil, Environmental and Geomatic Engineering, as a fundamental part of a study related to the analysis of the impact of erroneous a priori information on the UT1-UTC determination from VLBI Intensive sessions. The corresponding publication (title: "On the importance of accurate pole and station coordinates for VLBI Intensive baselines") has been submitted to the Journal of Geodesy.
In addition, a conference paper, and presentations at the IVS General Meeting 2022, EGU General Assembly 2022 and REFAG 2022 are available.
Context and methodology
The dataset contains monthly simulated UT1-UTC values of an artificial global grid of VLBI antennas (VGOS) where realistic errors in the a priori values of the station coordinates, polar motion and nutation offsets are introduced. With the help of these simulated values the global impact of erroneous a priori information is analysed.
VieSched++ and VieVS (both developed at the TU Wien) were used to generate the schedules and simulations.
Technical details
The dataset is structured as follows. There are 7 subfolders in the zipped folder that contain the simulation results of the evaluations with modified a priori values:
folder "errSTAu" - error of 5 mm introduced in the up-direction of the second station
folder "errSTAe" - error of 5 mm introduced in the east-direction of the second station
folder "errSTAn" - error of 5 mm introduced in the north-direction of the second station
folder "errPMx" - error of 162 microarcseconds introduced in the x-component of the polar motion
folder "errPMy" - error of 162 microarcseconds introduced in the y-component of the polar motion
folder "errNUTx" - error of 162 microarcseconds introduced in the x-component of the nutation offsets
folder "errNUTy" - error of 162 microarcseconds introduced in the y-component of the nutation offsets
Within these folders, there are .txt files with the following naming convention: "N%E%_N%E%_d#.txt".
"N%E%_N%E%" represents the location (North and East in degrees = latitude and longitude) of the reference and remote station
"d#" again shows the error that has been introduced in the simulation process
The files contain the monthly simulation results of UT1-UTC and its accuracy in milliseconds.
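The naming convention above can be unpacked with a regular expression. The sketch assumes that each "%" stands for a number of degrees (possibly signed or decimal) and that "#" is an arbitrary error label; neither detail is confirmed by the description, and the example filename is hypothetical.

```python
import re

# Assumed expansion of "N%E%_N%E%_d#.txt": "%" is a number of degrees,
# "#" is a free-form label for the introduced error.
NUMBER = r"-?\d+(?:\.\d+)?"
FILENAME_PATTERN = re.compile(
    rf"^N(?P<ref_lat>{NUMBER})E(?P<ref_lon>{NUMBER})"
    rf"_N(?P<rem_lat>{NUMBER})E(?P<rem_lon>{NUMBER})"
    r"_d(?P<error>.+)\.txt$"
)


def parse_result_filename(name):
    """Split a result filename into reference station, remote station and error label."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"filename does not follow the assumed convention: {name!r}")
    fields = match.groupdict()
    return {
        "reference": (float(fields["ref_lat"]), float(fields["ref_lon"])),
        "remote": (float(fields["rem_lat"]), float(fields["rem_lon"])),
        "error": fields["error"],
    }


if __name__ == "__main__":
    # Hypothetical filename, not taken from the dataset.
    print(parse_result_filename("N40E10_N50E20_d1.txt"))
```

If the real files encode coordinates differently, only `NUMBER` and the pattern need to change.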
Data from: Investigating students’ awareness of their own and others’...
tandf.figshare.com
docx
Updated May 12, 2025
Cite
Linda Hämmerle; Andrea Möller; Alexander Bergmann-Gering; Theresa Krause-Wichmann; Judith Lederman (2025). Investigating students’ awareness of their own and others’ deviations from controlled science experiments [Dataset]. http://doi.org/10.6084/m9.figshare.28281315.v1
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.6084/m9.figshare.28281315.v1
Dataset updated
May 12, 2025
Dataset provided by
Taylor & Francis
Authors
Linda Hämmerle; Andrea Möller; Alexander Bergmann-Gering; Theresa Krause-Wichmann; Judith Lederman
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) (https://creativecommons.org/licenses/by-nc-nd/4.0/)
License information was derived automatically
Description
Despite an emphasis on experimentation in science curricula worldwide as part of efforts to improve scientific literacy, students encounter challenges, especially with designing unconfounded experiments as part of the control-of-variables strategy (CVS). Becoming aware of experimental design errors is a potential starting point for learners to enhance their experimental skills. Few studies have investigated whether learners can accurately assess experiments and are aware of errors. This experimental study, conducted during the COVID-19 pandemic, investigates the accuracy of students’ assessment of experiments and their awareness of self-generated and others’ (vicarious) design errors. A total of 127 students (grades 7–8) were randomly split into two groups. One group conducted an experiment themselves. The other examined an erroneous example of a fictitious peer. Afterwards, both received the same instructions on the CVS and prompts to assess the experiments. Data were collected via worksheets and photos of their experiments. Analysis reveals difficulties controlling all variables in an experiment, especially if they were continuous. Interestingly, while self- and peer assessment accuracy was generally high, students were significantly more aware of vicarious errors than of self-generated ones. This highlights the potential of using assessment of experimental design errors as a learning opportunity for experimentation skills, especially when using vicarious errors.
Singapore Exports of wigs, false beards, eyebrow, eyelashes, switches;...
tradingeconomics.com
csv, excel, json, xml
Updated Nov 30, 2022
+ more versions
Cite
TRADING ECONOMICS (2022). Singapore Exports of wigs, false beards, eyebrow, eyelashes, switches; articles of human hair to Turkey [Dataset]. https://tradingeconomics.com/singapore/exports/turkey/wigs-hair-human-hair-articles
Explore at:
Available download formats: json, excel, xml, csv
Dataset updated
Nov 30, 2022
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Jan 1, 1990 - Dec 31, 2025
Area covered
Singapore
Description
Singapore Exports of wigs, false beards, eyebrow, eyelashes, switches; articles of human hair to Turkey was US$9.07 Thousand during 2023, according to the United Nations COMTRADE database on international trade.
Veteran Status 2018-2022 - STATES
hub.arcgis.com
covid19-uscensus.hub.arcgis.com
Updated Feb 5, 2024
+ more versions
Cite
US Census Bureau (2024). Veteran Status 2018-2022 - STATES [Dataset]. https://hub.arcgis.com/maps/a66f7c567e014a0892d956d73a24bf74
Explore at:
Dataset updated
Feb 5, 2024
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
US Census Bureau
Area covered
Description
This service contains the 2018-2022 release of data from the American Community Survey (ACS) 5-year data about Veteran Status, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the percentage of the civilian population over the age of 18 that are Veterans. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Current Vintage: 2018-2022
ACS Table(s): DP02
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: January 18, 2024
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS): About the Survey; Geography & ACS; Technical Documentation; News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.
Data Note from the Census:
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Data Processing Notes:
Boundaries come from the Cartographic Boundaries via US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates, and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).
The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico. The Counties (and equivalent) layer contains 3221 records - all counties and equivalent, Washington D.C., and Puerto Rico municipios. See Areas Published.
Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
Field alias names were created based on the Table Shells.
Margin of error (MOE) values of -555555555 in the API (or "*****" (five asterisks) on data.census.gov) are displayed as 0 in this dataset. The estimates associated with these MOEs have been controlled to independent counts in the ACS weighting and have zero sampling error. So, the MOEs are effectively zeroes, and are treated as zeroes in MOE calculations.
Other negative values on the API, such as -222222222, -666666666, -888888888, and -999999999, all represent estimates or MOEs that can't be calculated or can't be published, usually due to small sample sizes. All of these are rendered in this dataset as null (blank) values.
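The sentinel handling and the margin-of-error interval described above can be sketched as follows. The sentinel codes and the estimate-plus-or-minus-MOE bounds come from the notes; the function names and sample numbers are hypothetical, and this is not code from the Census Bureau or ArcGIS.

```python
ZERO_MOE_SENTINEL = -555555555
UNAVAILABLE_SENTINELS = {-222222222, -666666666, -888888888, -999999999}


def clean_moe(raw_value):
    """Map raw API margin-of-error codes to the values shown in this dataset."""
    if raw_value == ZERO_MOE_SENTINEL:
        return 0  # estimate is controlled, so sampling error is zero
    if raw_value in UNAVAILABLE_SENTINELS:
        return None  # cannot be calculated or published
    return raw_value


def confidence_bounds(estimate, moe):
    """Return the lower and upper 90 percent bounds, or None if MOE is missing."""
    if moe is None:
        return None
    return estimate - moe, estimate + moe


if __name__ == "__main__":
    # Hypothetical estimate and raw MOE codes, for illustration only.
    for raw in (250, -555555555, -666666666):
        print(raw, confidence_bounds(12000, clean_moe(raw)))
```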
chunk_157
huggingface.co
Updated Jul 24, 2024
+ more versions
Cite
distilled-false-pos-one-sec-cv12 (2024). chunk_157 [Dataset]. https://huggingface.co/datasets/distilled-false-pos-one-sec-cv12/chunk_157
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jul 24, 2024
Dataset authored and provided by
distilled-false-pos-one-sec-cv12
Description
distilled-false-pos-one-sec-cv12/chunk_157 dataset hosted on Hugging Face and contributed by the HF Datasets community
2023 American Community Survey: B99258 | Allocation of Bedrooms (ACS 1-Year...
data.census.gov
+ more versions
Cite
ACS, 2023 American Community Survey: B99258 | Allocation of Bedrooms (ACS 1-Year Estimates Detailed Tables) [Dataset]. https://data.census.gov/table?tid=ACSDT1Y2023.B99258
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
ACS
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
2023
Description
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, the decennial census is the official source of population totals for April 1st of each decennial year. In between censuses, the Census Bureau's Population Estimates Program produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units and the group quarters population for states and counties.
Information about the American Community Survey (ACS) can be found on the ACS website. Supporting documentation including code lists, subject definitions, data accuracy, and statistical testing, and a full list of ACS tables and table shells (without estimates) can be found on the Technical Documentation section of the ACS website. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Source: U.S. Census Bureau, 2023 American Community Survey 1-Year Estimates.
ACS data generally reflect the geographic boundaries of legal and statistical areas as of January 1 of the estimate year. For more information, see Geography Boundaries by Year.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.
Users must consider potential differences in geographic boundaries, questionnaire content or coding, or other methodological issues when comparing ACS data from different years. Statistically significant differences shown in ACS Comparison Profiles, or in data users' own analysis, may be the result of these differences and thus might not necessarily reflect changes to the social, economic, housing, or demographic characteristics being compared. For more information, see Comparing ACS Data.
When information is missing or inconsistent, the Census Bureau logically assigns an acceptable value using the response to a related question or questions. If a logical assignment is not possible, data are filled using a statistical process called allocation, which uses a similar individual or household to provide a donor value. The "Allocated" section is the number of respondents who received an allocated value for a particular subject.
Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on 2020 Census data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
Explanation of Symbols:
"-": The estimate could not be computed because there were an insufficient number of sample observations. For a ratio of medians estimate, one or both of the median estimates falls in the lowest interval or highest interval of an open-ended distribution. For a 5-year median estimate, the margin of error associated with a median was larger than the median itself.
"N": The estimate or margin of error cannot be displayed because there were an insufficient number of sample cases in the selected geographic area.
"(X)": The estimate or margin of error is not applicable or not available.
"median-": The median falls in the lowest interval of an open-ended distribution (for example "2,500-").
"median+": The median falls in the highest interval of an open-ended distribution (for example "250,000+").
"**": The margin of error could not be computed because there were an insufficient number of sample observations.
"***": The margin of error could not be computed because the median falls in the lowest interval or highest interval of an open-ended distribution.
"*****": A margin of error is not appropriate because the corresponding estimate is controlled to an independent population or housing estimate. Effectively, the corresponding estimate has no sampling error and the margin of error may be treated as zero.
DeepLabCut network trained to track mouse body parts during open field...
zenodo.org
zip
Updated Sep 15, 2023
+ more versions
Cite
Marie A. Labouesse; Shana Gershbaum; Julia Greenwald; Christoph Kellendonk (2023). DeepLabCut network trained to track mouse body parts during open field locomotion (top-down view) [Dataset]. http://doi.org/10.5281/zenodo.6448595
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.6448595
Dataset updated
Sep 15, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Marie A. Labouesse; Shana Gershbaum; Julia Greenwald; Christoph Kellendonk
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
DeepLabCut (https://github.com/DeepLabCut/) (Mathis et al., 2018; Nath et al., 2019) was used to track the body parts of mice in an open field arena or on the rotarod. DeepLabCut 2.1.8.2 (local version on Windows with CPU, using the GUI) and 2.1.10.2 (Google Colab, to train the network) were run with default parameters and the pretrained resnet50 network with imgaug augmentation. Frames were extracted with the k-means method and outlier frames with the jump method. Open field: 20 images from each of 19 videos (10 or 30 fps) were extracted, for a total of 380 labeled pictures. Eight body parts (snout, both ears, body center, both side laterals, tail base and tail end) and the 4 corners of the field arena were manually labeled and linked to each other using skeletons. A neural network was trained on these images for 170K iterations. 20 outlier frames were extracted from each video and relabeled. An additional 20 images from 19 videos with different recording conditions were labeled. The network was then refined for 210K iterations (from scratch), yielding a train error of 3.33 pixels and a test error of 8.83 pixels (with a likelihood p-cutoff of 0.6). This process was repeated a second time (using an additional 20 images from 15 new videos) to improve the pixel error, for a final 400K iterations (train error: 2.65 pixels, test error: 3.71 pixels). 67 videos from 5 different experiments were analyzed with the final network.

Used to analyze videos for a publication (Labouesse et al., Nature Communications 2023)
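
The description above reports train and test errors in pixels "with a likelihood p-cutoff of 0.6". The sketch below is one illustrative reading of such a number, not DeepLabCut's own code: it assumes the error is the mean Euclidean distance between predicted and manually labeled points, and that predictions whose likelihood falls below the cutoff are left out. The function name `mean_pixel_error` and all coordinates are hypothetical.

```python
import math


def mean_pixel_error(predictions, labels, p_cutoff=None):
    """Mean Euclidean distance in pixels between predictions and labels.

    predictions: list of (x, y, likelihood) tuples (hypothetical layout)
    labels:      list of (x, y) tuples in the same order
    p_cutoff:    if given, skip predictions with likelihood below this value
    """
    distances = []
    for (px, py, likelihood), (lx, ly) in zip(predictions, labels):
        if p_cutoff is not None and likelihood < p_cutoff:
            continue  # low-confidence prediction is excluded from the average
        distances.append(math.hypot(px - lx, py - ly))
    if not distances:
        return float("nan")  # nothing passed the cutoff
    return sum(distances) / len(distances)


# Made-up example values, not taken from the dataset.
predictions = [(10.0, 10.0, 0.95), (53.0, 44.0, 0.80), (200.0, 90.0, 0.30)]
labels = [(10.0, 12.0), (50.0, 40.0), (120.0, 30.0)]

error_all = mean_pixel_error(predictions, labels)                # all points
error_cut = mean_pixel_error(predictions, labels, p_cutoff=0.6)  # confident points only
print(error_all, error_cut)
```

In this toy input the third prediction is far off but also has a low likelihood, so applying the cutoff removes it and the reported error drops sharply. That is why an error quoted with a p-cutoff should be read together with the cutoff value.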
1996 Czech Election: Post-Election Study June 1996
search.gesis.org
Updated Apr 13, 2010
Cite
Toka, Gabor (2010). 1996 Czech Election: Post-Election Study June 1996 [Dataset]. http://doi.org/10.4232/1.3633
Explore at:
Available download formats: application/x-stata-dta (287772), application/x-spss-por (528900), application/x-spss-sav (311796)
Unique identifier
https://doi.org/10.4232/1.3633
Dataset updated
Apr 13, 2010
Dataset provided by
GESIS search
GESIS Data Archive
Authors
Toka, Gabor
License
https://www.gesis.org/en/institute/data-usage-terms
Time period covered
Jun 9, 1996 - Jun 19, 1996
Area covered
Czechia
Variables measured
V148 - gender, V161 - industry, V1 - ZA Studynumber, V167 - denomination, V72 - LAST ELECTION, V87 - SYMPATHY: ODA, V88 - SYMPATHY: ODS, V117 - Death penalty, V84 - SYMPATHY: CSSD, V86 - SYMPATHY: KSCM, and 169 more
Description
Voting behaviour and political attitudes. Topics: Household finances in last and next 12 months; national economy in last and next 12 months; participated in the 1996 lower house election; vote in the 1996 lower house election; spouse: vote in the 1996 election; use of preference vote in 1996; tactical voting; knowledge: representation of party respondent voted for; radio and TV: sufficient information; radio and TV: impartial; party which media favoured most; like about the parties (open question); government performance; provide a job for everyone; reducing income differences is harmful; the economic situation is unfavourable; privatisation is going to help; unprofitable enterprises should be closed down; atheists are unfit for public office; nationalism is always harmful; chance of getting ahead; politicians should care more about crime; abortion should be allowed; preference of patriotic politician; church has too much influence; split of Czechoslovakia was wrong; restitution was wrong; left-right self-placement (7-point scale); satisfaction with democracy; last election; respondent close to any party; first party close to respondent; second party close to respondent; third party close to respondent; party closest to respondent; any party closer than others; which party closer than others; how close to closest party; parties care what people want; parties are necessary; recall of name of candidate; sympathy of parties; state of economy; change in economic situation; MPs' idea what people think; contact with MP; who is in power; the way people vote; people say or hide opinion; left-right placement of parties; elections help to keep politicians honest; in election campaigns people can learn; elections divide the country; benefits of elections far outweigh the costs; death penalty; husband is to earn the money; clergy should not influence vote; too many people rely on government assistance; smooth cooperation in firms is impossible; not enough respect for traditional Czech values; schools should teach children to obey; get rid of conflicts between the parties; for democracy turnout does not matter; voters decide how things are run; most voters cannot make intelligent decisions; better leaders would be chosen through exams; Czech Rep join the NATO; Czech Rep join the EEC; preferred relationship between Czech R and Slovakia; present regime compared to pre-1989 regime; people should refrain from criticizing Czech officials; politician may act contrary to the law; some people earn millions; people are responsible for their poverty; competent people can earn a lot of money; people get rich here mainly in an illegal way; private ownership should be expanded; more efforts to reduce inequalities; less government intervention; more toughness needed against Romany offenders; Romanians should be let to lead their own way of life; knowledge about electoral threshold, name of Minister of Transport, number of seats in Czech lower house; language spoken at home; occupation (respondent and spouse): ISCO code, EGP-10 classification and EGP-6 classification; strength of religious belief; frequency of church attendance; denomination; union membership: respondent; union membership: somebody else in household; gypsy or not (judgment of interviewer); date and length of the interview; number of contact attempts for interview; interview demanding; respondent's primary electoral district.


