100+ datasets found

Sexual orientation (detailed), comparison of corrected and original data,...
ons.gov.uk
xlsx
Updated Nov 1, 2023
Cite
Office for National Statistics (2023). Sexual orientation (detailed), comparison of corrected and original data, England and Wales: Census 2021 [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/culturalidentity/sexuality/datasets/sexualorientationdetailedcomparisonofcorrectedandoriginaldataenglandandwalescensus2021
Explore at:
Available download formats: xlsx
Dataset updated
Nov 1, 2023
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Area covered
England, Wales
Description
Dataset provided to help users interpret the correction made to the detailed Census 2021 sexual orientation estimates. More information in quality notice.
The banksia plot: a method for visually comparing point estimates and...
bridges.monash.edu
researchdata.edu.au
txt
Updated Oct 15, 2024
Cite
Simon Turner; Amalia Karahalios; Elizabeth Korevaar; Joanne E. McKenzie (2024). The banksia plot: a method for visually comparing point estimates and confidence intervals across datasets [Dataset]. http://doi.org/10.26180/25286407.v2
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.26180/25286407.v2
Dataset updated
Oct 15, 2024
Dataset provided by
Monash University
Authors
Simon Turner; Amalia Karahalios; Elizabeth Korevaar; Joanne E. McKenzie
License
Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Companion data for the creation of a banksia plot.

Background: In research evaluating statistical analysis methods, a common aim is to compare point estimates and confidence intervals (CIs) calculated from different analyses. This can be challenging when the outcomes (and their scale ranges) differ across datasets. We therefore developed a plot to facilitate pairwise comparisons of point estimates and confidence intervals from different statistical analyses both within and across datasets.

Methods: The plot was developed and refined over the course of an empirical study. To compare results from a variety of different studies, a system of centring and scaling is used. Firstly, the point estimates from reference analyses are centred to zero, followed by scaling confidence intervals to span a range of one. The point estimates and confidence intervals from matching comparator analyses are then adjusted by the same amounts. This enables the relative positions of the point estimates and CI widths to be quickly assessed while maintaining the relative magnitudes of the difference in point estimates and confidence interval widths between the two analyses. Banksia plots can be graphed in a matrix, showing all pairwise comparisons of multiple analyses. In this paper, we show how to create a banksia plot and present two examples: the first relates to an empirical evaluation assessing the difference between various statistical methods across 190 interrupted time series (ITS) data sets with widely varying characteristics, while the second example assesses data extraction accuracy comparing results obtained from analysing original study data (43 ITS studies) with those obtained by four researchers from datasets digitally extracted from graphs from the accompanying manuscripts.

Results: In the banksia plot of statistical method comparison, it was clear that there was no difference, on average, in point estimates and it was straightforward to ascertain which methods resulted in smaller, similar or larger confidence intervals than others. In the banksia plot comparing analyses from digitally extracted data to those from the original data it was clear that both the point estimates and confidence intervals were all very similar among data extractors and original data.

Conclusions: The banksia plot, a graphical representation of centred and scaled confidence intervals, provides a concise summary of comparisons between multiple point estimates and associated CIs in a single graph. Through this visualisation, patterns and trends in the point estimates and confidence intervals can be easily identified.

This collection of files allows the user to create the images used in the companion paper and amend this code to create their own banksia plots using either Stata version 17 or R version 4.3.1.
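The centring and scaling step described in this entry can be sketched in a few lines of Python. This is only one reading of the description, not the authors' Stata or R code: the `Interval` container and the `centre_and_scale` function are hypothetical names introduced here, and the sketch assumes "adjusted by the same amounts" means subtracting the reference point estimate and dividing by the reference CI width.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Interval:
    """A point estimate with its confidence interval (hypothetical container)."""
    estimate: float
    lower: float
    upper: float


def centre_and_scale(reference: Interval, comparator: Interval) -> Tuple[Interval, Interval]:
    """Centre the reference estimate at zero and scale its CI to width one.

    The comparator is shifted and scaled by the same amounts (assumed reading).
    """
    shift = reference.estimate
    width = reference.upper - reference.lower
    if width <= 0:
        raise ValueError("reference interval must have positive width")

    def transform(iv: Interval) -> Interval:
        return Interval(
            estimate=(iv.estimate - shift) / width,
            lower=(iv.lower - shift) / width,
            upper=(iv.upper - shift) / width,
        )

    return transform(reference), transform(comparator)


# Illustrative numbers only; they are not taken from the dataset.
ref, comp = centre_and_scale(Interval(2.0, 1.0, 3.0), Interval(2.5, 1.0, 4.0))
print(ref)
print(comp)
```

After the transformation the reference analysis always has an estimate of zero and a CI width of one, so the comparator's position and width can be read directly as relative differences.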
Title: Comparing Transaction Logs to ILL - Raw Data Open Access Deposited
datacore.iu.edu
Updated May 8, 2018
Cite
Cohen, Rachael; Michaels, Sherri (2018). Title: Comparing Transaction Logs to ILL - Raw Data Open Access Deposited [Dataset]. https://datacore.iu.edu/concern/data_sets/z603qx40z?locale=en
Explore at:
Dataset updated
May 8, 2018
Dataset provided by
IU Scholarworks
Authors
Cohen, Rachael; Michaels, Sherri
License
Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Dataset for "Comparing Transaction Logs to ILL requests to Determine the Persistence of Library Patrons In Obtaining Materials" article. Excel file contains all data in four worksheets. Zip file contains four csv files, one for each worksheet:
- Comparing Transaction Logs to ILL - 2016 ILL Raw ...Data.csv
- Comparing Transaction Logs to ILL - 2015 ILL Raw Data.csv
- Comparing Transaction Logs to ILL - 2016 Zero Search Raw Data.csv
- Comparing Transaction Logs to ILL - 2015 Zero Search Raw Data.csv
Experimental (raw) Data of Statistical Comparison Between Formal and...
ieee-dataport.org
Updated May 9, 2020
Cite
Chris Karanikolas (2020). Experimental (raw) Data of Statistical Comparison Between Formal and Simulated Models’ Outcomes for CIBI vs. CVP General Problem [Dataset]. https://ieee-dataport.org/documents/experimental-raw-data-statistical-comparison-between-formal-and-simulated-models-outcomes
Explore at:
Dataset updated
May 9, 2020
Authors
Chris Karanikolas
License
Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Data corresponds to quantitative (raw) effort assessments/predictions during maintenance process of a sample of 1000 possible instances of the general selection problem among Visitor and Inheritance Based Implementation over the Composite design patterns (CIBI vs CVP).
Data from: Sample-comparison mapping and joint stimulus control
datarepositorium.uminho.pt
pdf, tsv
Updated Apr 7, 2025
Cite
Repositório de Dados da Universidade do Minho (2025). Sample-comparison mapping and joint stimulus control [Dataset]. http://doi.org/10.34622/datarepositorium/9SRSKQ
Explore at:
Available download formats: tsv (3692), pdf (20125)
Unique identifier
https://doi.org/10.34622/datarepositorium/9SRSKQ
Dataset updated
Apr 7, 2025
Dataset provided by
Repositório de Dados da Universidade do Minho
License
Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Dataset funded by
FCT
Description
Data from experiment "Sample-comparison mapping and joint stimulus control"
Supplementary material from "Visual comparison of two data sets: Do people...
figshare.com
xlsx
Updated Mar 14, 2017
Cite
Robin Kramer; Caitlin Telfer; Alice Towler (2017). Supplementary material from "Visual comparison of two data sets: Do people use the means and the variability?" [Dataset]. http://doi.org/10.6084/m9.figshare.4751095.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.4751095.v1
Dataset updated
Mar 14, 2017
Dataset provided by
Figshare (http://figshare.com/)
Authors
Robin Kramer; Caitlin Telfer; Alice Towler
License
Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
In our everyday lives, we are required to make decisions based upon our statistical intuitions. Often, these involve the comparison of two groups, such as luxury versus family cars and their suitability. Research has shown that the mean difference affects judgements where two sets of data are compared, but the variability of the data has only a minor influence, if any at all. However, prior research has tended to present raw data as simple lists of values. Here, we investigated whether displaying data visually, in the form of parallel dot plots, would lead viewers to incorporate variability information. In Experiment 1, we asked a large sample of people to compare two fictional groups (children who drank ‘Brain Juice’ versus water) in a one-shot design, where only a single comparison was made. Our results confirmed that only the mean difference between the groups predicted subsequent judgements of how much they differed, in line with previous work using lists of numbers. In Experiment 2, we asked each participant to make multiple comparisons, with both the mean difference and the pooled standard deviation varying across data sets they were shown. Here, we found that both sources of information were correctly incorporated when making responses. Taken together, we suggest that increasing the salience of variability information, through manipulating this factor across items seen, encourages viewers to consider this in their judgements. Such findings may have useful applications for best practices when teaching difficult concepts like sampling variation.
Percentage Differences Streamflow
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Cite
U.S. Geological Survey (2024). Percentage Differences Streamflow [Dataset]. https://catalog.data.gov/dataset/percentage-differences-streamflow
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
U.S. Geological Survey
Description
A comma separated values (csv) file that is a snapshot of percent difference between November 19, 2008 and November 14, 2016 peak streamflow. The file lists station identification, water year, original (2008) peak Q, current (2016) peak Q and percent difference calculated per water year. The percent difference was calculated as the absolute value of [(current peak Q - original peak Q)/(original peak Q) x 100], where current peak Q is the 2016 peak and the original peak Q is the 2008 peak. When an original peak Q value is 0, the resultant percent difference calculation is undefined because of division by 0. In these cases, the percent difference field is populated with NA. Those entries are included in the data file so that users can make their own comparisons between the 2008 and 2016 peaks for those cases where the original peak value was 0.
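The percent-difference rule in this description, including the NA case for a zero original peak, can be sketched as follows. The function and parameter names are hypothetical and are not taken from the USGS file; Python's `None` stands in for the NA value the file uses.

```python
from typing import Optional


def percent_difference(original_peak_q: float, current_peak_q: float) -> Optional[float]:
    """Absolute percent difference between the current and original peak Q.

    Returns None (standing in for NA) when the original peak is 0, because the
    calculation would divide by zero.
    """
    if original_peak_q == 0:
        return None
    return abs((current_peak_q - original_peak_q) / original_peak_q * 100)


# Illustrative values only; they are not taken from the dataset.
for original, current in [(200.0, 250.0), (200.0, 150.0), (0.0, 10.0)]:
    print(original, current, percent_difference(original, current))
```

Because the absolute value is taken, an increase and a decrease of the same size give the same percent difference, so the sign of the change has to be recovered from the two peak columns themselves.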
CONGRUENCE
figshare.com
application/x-rar
Updated Feb 21, 2025
Cite
ayman baniamer (2025). CONGRUENCE [Dataset]. http://doi.org/10.6084/m9.figshare.28462568.v1
Explore at:
Available download formats: application/x-rar
Unique identifier
https://doi.org/10.6084/m9.figshare.28462568.v1
Dataset updated
Feb 21, 2025
Dataset provided by
figshare
Authors
ayman baniamer
License
Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
comparisons of MI VS ORIGINAL, EM VS ORIGINAL, and CIM VS ORIGINAL
Benchmark Multi-Omics Datasets for Methods Comparison
zenodo.org
data.niaid.nih.gov
bin, zip
Updated Nov 14, 2021
Cite
Gabriel Odom; Lily Wang (2021). Benchmark Multi-Omics Datasets for Methods Comparison [Dataset]. http://doi.org/10.5281/zenodo.5683002
Explore at:
Available download formats: bin, zip
Unique identifier
https://doi.org/10.5281/zenodo.5683002
Dataset updated
Nov 14, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Gabriel Odom; Lily Wang
License
Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Pathway Multi-Omics Simulated Data

These are synthetic variations of the TCGA COADREAD data set (original data available at http://linkedomics.org/data_download/TCGA-COADREAD/). This data set is used as a comprehensive benchmark data set to compare multi-omics tools in the manuscript "pathwayMultiomics: An R package for efficient integrative analysis of multi-omics datasets with matched or un-matched samples".

There are 100 sets (stored as 100 sub-folders, the first 50 in "pt1" and the second 50 in "pt2") of random modifications to centred and scaled copy number, gene expression, and proteomics data saved as compressed data files for the R programming language. These data sets are stored in subfolders labelled "sim001", "sim002", ..., "sim100". Each folder contains the following contents:

(1) "indicatorMatricesXXX_ls.RDS" is a list of simple triplet matrices showing which genes (in which pathways) and which samples received the synthetic treatment (where XXX is the simulation run label: 001, 002, ...).
(2) "CNV_partitionA_deltaB.RDS" is the synthetically modified copy number variation data (where A represents the proportion of genes in each gene set to receive the synthetic treatment [partition 1 is 20%, 2 is 40%, 3 is 60% and 4 is 80%] and B is the signal strength in units of standard deviations).
(3) "RNAseq_partitionA_deltaB.RDS" is the synthetically modified gene expression data (same parameter legend as CNV).
(4) "Prot_partitionA_deltaB.RDS" is the synthetically modified protein expression data (same parameter legend as CNV).

Supplemental Files

The file "cluster_pathway_collection_20201117.gmt" is the collection of gene sets used for the simulation study in Gene Matrix Transpose format. Scripts to create and analyze these data sets available at: https://github.com/TransBioInfoLab/pathwayMultiomics_manuscript_supplement
Taichung City's new and old land number comparison data
data.gov.tw
csv, json, xml
Updated Jun 13, 2025
Cite
Land Administration Bureau, Taichung City Government (2025). Taichung City's new and old land number comparison data [Dataset]. https://data.gov.tw/en/datasets/130155
Explore at:
Available download formats: xml, json, csv
Dataset updated
Jun 13, 2025
Dataset authored and provided by
Land Administration Bureau, Taichung City Government
License
https://data.gov.tw/license
Area covered
Taichung City
Description
Handle the re-survey of cadastre maps or cadastre organization areas, and the comparison table of old and new sections and plot numbers.
llm-comparison
huggingface.co
Updated Dec 20, 2024
Cite
Alex Karev (2024). llm-comparison [Dataset]. https://huggingface.co/datasets/alex-karev/llm-comparison
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 20, 2024
Authors
Alex Karev
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
LLM Similarity Comparison Dataset

This dataset is based on the original Alpaca dataset and was synthetically generated for LLM similarity comparison using the ConSCompF framework, as described in the original paper. The script used for generating the data is available on Kaggle. It is divided into 3 subsets:

- quantization: contains 156,000 samples (5,200 for each model) generated by the original Tinyllama and its 8-bit, 4-bit, and 2-bit GGUF quantized versions.
- comparison: contains 28,600… See the full description on the dataset page: https://huggingface.co/datasets/alex-karev/llm-comparison.
Original data for article: Comparison of epifluorescence microscopy and flow...
jyx.jyu.fi
Updated Feb 13, 2025
Cite
Pauliina Salmi; Anita Mäki; Anu Mikkonen; Veli-Mikko Puupponen; Kristiina Vuorio; Marja Tiirola (2025). Original data for article: Comparison of epifluorescence microscopy and flow cytometry in counting freshwater picophytoplankton [Dataset]. http://doi.org/10.17011/jyx/dataset/66278
Explore at:
Unique identifier
https://doi.org/10.17011/jyx/dataset/66278
Dataset updated
Feb 13, 2025
Authors
Pauliina Salmi; Anita Mäki; Anu Mikkonen; Veli-Mikko Puupponen; Kristiina Vuorio; Marja Tiirola
License
https://rightsstatements.org/page/InC/1.0/
Description
The dataset is divided into four subfolders:
1) "SEM experiment data" contains Scanning Electron Microscopy data, epifluorescence microscopy data and flow cytometry data of cultured Synechococcus, Chroococcus and Snowella.
2) "raw data" contains epifluorescence microscopy and flow cytometry data of picophytoplankton from Finnish lakes. It has two subfolders, "flow cytometry raw" and "microscopy raw".
3) "flow cytometry calibration data" contains data for cell size calibration with latex beads and volumetric calibration for the flow cytometer.
4) "processed flow and microscopy data" contains Excel workbooks for the figures shown in the manuscript.
Comparison of original and final budgets 2009-10
data.wu.ac.at
csv
Updated Mar 1, 2014
Cite
HM Treasury (2014). Comparison of original and final budgets 2009-10 [Dataset]. https://data.wu.ac.at/odso/data_gov_uk/NWZjNmVlMDAtMTExNi00NDkyLTg3YWYtMDA5YjkxYzZmYTk3
Explore at:
Available download formats: csv
Dataset updated
Mar 1, 2014
Dataset provided by
HM Treasury (https://gov.uk/hm-treasury)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
Comparison of outturn information with final plans by department for 2009-10, taken from snapshots 31 and 11 (Main Estimate outturn snapshot April 2010 and Spring Supplementary Estimates plans snapshot February 2010). The 2009-10 data are consistent with the raw COINS data published in June 2010. The 2009-10 data will not match the provisional outturn for 2009-10 published by the Treasury on 26 July 2010. These datasets, and the COINS raw data will be updated at the end of September, to reflect the latest outturn for 2009-10, once all related national statistic releases have taken place.
Data from: Social Media as an Alternative to Surveys of Opinions about the...
datasearch.gesis.org
openicpsr.org
Updated May 3, 2019
Cite
Conrad, Frederick (2019). Social Media as an Alternative to Surveys of Opinions about the Economy [Dataset]. http://doi.org/10.3886/E109581V1
Explore at:
Unique identifier
https://doi.org/10.3886/E109581V1
Dataset updated
May 3, 2019
Dataset provided by
da|ra (Registration agency for social science and economic data)
Authors
Conrad, Frederick
Description
There is interest in using social media content to supplement or even substitute for survey data. O’Connor et al. (2010) report reasonably high correlations between the sentiment of tweets containing the word “jobs” and survey-based measures of consumer confidence in 2008-2009. Other researchers report a similar relationship through 2011 but after that time it is no longer observed, suggesting such tweets may not be as promising an alternative to survey responses as originally hoped. But, it’s possible that with the right analytic techniques, the sentiment of “jobs” tweets might still be an acceptable alternative. We explore this possibility by attempting to strengthen the original relationship and then extending the most successful approaches to more recent years. We classify “jobs” tweets into categories whose content is related to employment and categories whose content is not, to see if sentiment of the former correlates more highly with a survey-based measure of consumer sentiment. We use five sentiment-scoring tools, calculate daily sentiment three different ways, and use a measure of association less sensitive to outliers than correlation. None of these approaches improved the size of the relationship in the original or more recent data. We discuss the possibility that weighting and better understanding why users tweet might help recover the original relationship between the sentiment of tweets and survey responses. However, despite the earlier promise of tweets as an alternative to survey responses, we find no evidence that the original relationship was more than a chance occurrence.
Data from: Raw data for Comparison of Self-reported Measures of Hearing to...
eprints.soton.ac.uk
data.mendeley.com
Updated Aug 15, 2020
Cite
Tsimpida, Dalia (2020). Raw data for Comparison of Self-reported Measures of Hearing to an Objective Audiometric Measure in Adults in the English Longitudinal Study of Ageing (ELSA) [Dataset]. https://eprints.soton.ac.uk/486931/
Explore at:
Dataset updated
Aug 15, 2020
Dataset provided by
Mendeley Data
Authors
Tsimpida, Dalia
Description
Raw data, computed data and statistical code for all main analyses and subgroup analyses presented in JAMA Netw Open. 2020;3(8):e2015009. doi:10.1001/jamanetworkopen.2020.15009 Data sharing statement: Access to The English Longitudinal Study of Ageing (ELSA) dataset is publicly available via the UK Data Service (https://www.ukdataservice.ac.uk) Note: Statistical code to create the subcategories of some demographic variables included in the analyses (e.g. age categories of participants) may not be available in the current dataset. Additional statistical code is available from the corresponding author upon reasonable request at: dialechti.tsimpida@manchester.ac.uk
NACP Regional: Original Observation Data and Biosphere and Inverse Model...
data.nasa.gov
data.staging.idas-ds1.appdat.jsc.nasa.gov
Updated Apr 1, 2025
Cite
nasa.gov (2025). NACP Regional: Original Observation Data and Biosphere and Inverse Model Outputs - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/nacp-regional-original-observation-data-and-biosphere-and-inverse-model-outputs-7a660
Explore at:
Dataset updated
Apr 1, 2025
Dataset provided by
NASA (http://nasa.gov/)
Description
This data set contains the originally-submitted observation measurement data, terrestrial biosphere model output data, and inverse model simulations that various investigator teams contributed to the North American Carbon Program (NACP) Regional Synthesis activities. The data set provides nine (9) data packages of remote sensing and ground observation measurements (OM) (MODIS gross primary productivity (GPP), MODIS net primary production (NPP), MODIS fraction of photosynthetically active radiation (fPar), MODIS leaf area index (LAI), MODIS enhanced vegetation index (EVI), MODIS normalized difference vegetation index (NDVI), Forest Inventory and Analysis (FIA) forest biomass, National Agricultural Statistics Service (NASS) crop NPP, and Flux Anomaly). The data set also provides data packages of simulation results from 19 terrestrial biosphere models (TBM) and eight (8) inverse models (IM). The data packages are respectively OM, TBM, and IM data files listed in Tables 4-6. Each OM, TBM, and IM data package contains all of the original data (and documentation, if any) that the NACP Modeling and Synthesis Thematic Data Center (MAST-DC) acquired or received. These originally-submitted data were processed by the MAST-DC to produce the three standardized gridded data sets of carbon flux for inter-comparison purposes (see Related Data Products below). These original data and documentation are provided so that users of the standardized gridded data products can trace back to the data origins when needed. The Data Center (ORNL DAAC) transformed some of the originally-submitted data files to file formats that are more suitable for long-term archiving. For example, .xlsx files were saved as .csv, ERDAS Imagine files were converted to GeoTIFFs, and MATLAB files were converted to GeoTIFF and NetCDF formats as appropriate. Files received in NetCDF, GeoTIFF, and HDF formats were not transformed.
Data comparison of old and new land numbers in the 105th year of the Bade...
data.gov.tw
csv
Updated Jun 10, 2024
Cite
Department of Land Administration, Taoyuan (2024). Data comparison of old and new land numbers in the 105th year of the Bade district [Dataset]. https://data.gov.tw/en/datasets/28720
Explore at:
Available download formats: csv
Dataset updated
Jun 10, 2024
Dataset authored and provided by
Department of Land Administration, Taoyuan
License
https://data.gov.tw/license
Area covered
Bade District
Description
The data of the comparison of new and old land numbers for the re-testing business in the history of the Bade District (until the end of 2015)
One Classifier Ignores a Feature
data.niaid.nih.gov
Updated Apr 29, 2022
Cite
Maier, Karl (2022). One Classifier Ignores a Feature [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6502642
Explore at:
Dataset updated
Apr 29, 2022
Dataset authored and provided by
Maier, Karl
License
Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The data sets are used in a controlled experiment, where two classifiers should be compared. train_a.csv and explain.csv are slices from the original data set. train_b.csv contains the same instances as in train_a.csv, but with feature x1 set to 0 to make it unusable to classifier B.

The original data set was created and split using this Python code:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Generate a two-feature classification problem and rescale the features.
X, y = make_classification(n_samples=300, n_features=2, n_redundant=0,
                           n_informative=2, n_clusters_per_class=1,
                           class_sep=0.75, random_state=0)
X *= 100

# Classifier A is trained on the unmodified features.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)
lm = LogisticRegression()
lm.fit(X_train, y_train)
clf_a = lm

# Classifier B is trained on a copy with feature x1 (column 0) set to 0.
clf_b = LogisticRegression()
X2 = X.copy()
X2[:, 0] = 0
X2_train, X2_test, y2_train, y2_test = train_test_split(X2, y, test_size=0.5, random_state=0)
clf_b.fit(X2_train, y2_train)

# The held-out half of the original data is used for explanation.
X_explain = X_test
y_explain = y_test
Simulation Data & R scripts for: "Introducing recurrent events analyses to...
data.niaid.nih.gov
Updated Apr 29, 2024
Cite
Ferry, Nicolas (2024). Simulation Data & R scripts for: "Introducing recurrent events analyses to assess species interactions based on camera trap data: a comparison with time-to-first-event approaches" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11085005
Explore at:
Dataset updated
Apr 29, 2024
Dataset authored and provided by
Ferry, Nicolas
License
Attribution 4.0 (CC BY 4.0; https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Files descriptions:

All csv files refer to results from the different models (PAMM, AARs, Linear models, MRPPs) on each iteration of the simulation, with one row per iteration. "results_perfect_detection.csv" refers to the results from the first simulation part with all the observations. "results_imperfect_detection.csv" refers to the results from the first simulation part with randomly thinned observations to mimic imperfect detection.

- ID_run: identifier of the iteration (N: number of sites, D_AB: duration of the effect of A on B, D_BA: duration of the effect of B on A, AB: effect of A on B, BA: effect of B on A, Se: seed number of the iteration).
- PAMM30: p-value of the PAMM running on the 30-days survey.
- PAMM7: p-value of the PAMM running on the 7-days survey.
- AAR1: ratio value for the Avoidance-Attraction-Ratio calculating AB/BA.
- AAR2: ratio value for the Avoidance-Attraction-Ratio calculating BAB/BB.
- Harmsen_P: p-value from the linear model with interaction Species1*Species2 from Harmsen et al. (2009).
- Niedballa_P: p-value from the linear model comparing AB to BA (Niedballa et al. 2021).
- Karanth_permA: rank of the observed interval duration median (AB and BA undifferentiated) compared to the randomized median distribution, when permuting on species A (Karanth et al. 2017).
- MurphyAB_permA: rank of the observed AB interval duration median compared to the randomized median distribution, when permuting on species A (Murphy et al. 2021).
- MurphyBA_permA: rank of the observed BA interval duration median compared to the randomized median distribution, when permuting on species A (Murphy et al. 2021).
- Karanth_permB: rank of the observed interval duration median (AB and BA undifferentiated) compared to the randomized median distribution, when permuting on species B (Karanth et al. 2017).
- MurphyAB_permB: rank of the observed AB interval duration median compared to the randomized median distribution, when permuting on species B (Murphy et al. 2021).
- MurphyBA_permB: rank of the observed BA interval duration median compared to the randomized median distribution, when permuting on species B (Murphy et al. 2021).

"results_int_dir_perf_det.csv" refers to the results from the second simulation part, with all the observations. "results_int_dir_imperf_det.csv" refers to the results from the second simulation part, with randomly thinned observations to mimic imperfect detection.

- ID_run: identifier of the iteration (N: number of sites, D_AB: duration of the effect of A on B, D_BA: duration of the effect of B on A, AB: effect of A on B, BA: effect of B on A, Se: seed number of the iteration).
- p_pamm7_AB: p-value of the PAMM running on the 7-days survey testing for the effect of A on B.
- p_pamm7_AB: p-value of the PAMM running on the 7-days survey testing for the effect of B on A.
- AAR1: ratio value for the Avoidance-Attraction-Ratio calculating AB/BA.
- AAR2_BAB: ratio value for the Avoidance-Attraction-Ratio calculating BAB/BB.
- AAR2_ABA: ratio value for the Avoidance-Attraction-Ratio calculating ABA/AA.
- Harmsen_P: p-value from the linear model with interaction Species1*Species2 from Harmsen et al. (2009).
- Niedballa_P: p-value from the linear model comparing AB to BA (Niedballa et al. 2021).
- Karanth_permA: rank of the observed interval duration median (AB and BA undifferentiated) compared to the randomized median distribution, when permuting on species A (Karanth et al. 2017).
- MurphyAB_permA: rank of the observed AB interval duration median compared to the randomized median distribution, when permuting on species A (Murphy et al. 2021).
- MurphyBA_permA: rank of the observed BA interval duration median compared to the randomized median distribution, when permuting on species A (Murphy et al. 2021).
- Karanth_permB: rank of the observed interval duration median (AB and BA undifferentiated) compared to the randomized median distribution, when permuting on species B (Karanth et al. 2017).
- MurphyAB_permB: rank of the observed AB interval duration median compared to the randomized median distribution, when permuting on species B (Murphy et al. 2021).
- MurphyBA_permB: rank of the observed BA interval duration median compared to the randomized median distribution, when permuting on species B (Murphy et al. 2021).

Script files description:

- 1_Functions: R script containing the functions:
  - MRPP from Karanth et al. (2017) adapted here for time efficiency.
  - MRPP from Murphy et al. (2021) adapted here for time efficiency.
  - Version of the ct_to_recurrent() function from the recurrent package adapted to run in parallel on the simulation datasets.
  - The simulation() function used to simulate two species observations with reciprocal effect on each other.
- 2_Simulations: R script containing the parameter definitions for all iterations (for the two parts of the simulations), the simulation parallelization and the random thinning mimicking imperfect detection.
- 3_Approaches comparison: R script containing the fit of the different models tested on the simulated data.
- 3_1_Real data comparison: R script containing the fit of the different models tested on the real data example from Murphy et al. 2021.
- 4_Graphs: R script containing the code for plotting results from the simulation part and appendices.
- 5_1_Appendix - Check for similarity between codes for Karanth et al 2017 method: R script containing the Karanth et al. (2017) and Murphy et al. (2021) code lines, the version adapted for time efficiency, and a comparison to verify similarity of results.
- 5_2_Appendix - Multi-response procedure permutation difference: R script containing R code to test for differences between the MRPP approaches according to the species on which permutations are done.
Raw data for "Modular comparison of untargeted metabolomics processing...
data.niaid.nih.gov
zenodo.org
Updated Oct 28, 2024
Cite
Aigensberger, Markus (2024). Raw data for "Modular comparison of untargeted metabolomics processing steps" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13643189
Explore at:
Dataset updated
Oct 28, 2024
Dataset authored and provided by
Aigensberger, Markus
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Raw data for the paper titled "Modular comparison of untargeted metabolomics processing steps". The dataset comprises 42 samples: 3 solvent blanks, 7 QC samples, and 32 biological samples (4 biological replicates: Banane, Bergrose, Narbe, Ricky) spiked with 42 compounds at different concentrations (0 ng/mL, 30 ng/mL, 100 ng/mL, 300 ng/mL).


