CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The receiver operating characteristic (ROC) curve is typically employed to evaluate the discriminatory capability of a continuous or ordinal biomarker when two groups are to be distinguished, commonly the 'healthy' and the 'diseased'. There are cases for which the disease status has three categories. Such cases employ the ROC surface, which is a natural generalization of the ROC curve to three classes. In this paper, we explore new methodologies for comparing two continuous biomarkers that refer to a trichotomous disease status, when both markers are applied to the same patients. Comparisons based on the volume under the surface have been proposed, but that measure is often not clinically relevant. Here, we focus on comparing two correlated ROC surfaces at given pairs of true classification rates, which are more relevant to patients and physicians. We propose delta-based parametric techniques, power transformations to normality, and bootstrap-based smooth nonparametric techniques to investigate the performance of an appropriate test. We evaluate our approaches through an extensive simulation study and apply them to a real data set from prostate cancer screening.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Excel file performs a statistical test of whether two ROC curves differ from each other based on the area under the curve (AUC). You will need the coefficient from the table presented in the following article to enter the correct AUC value for the comparison: Hanley JA, McNeil BJ (1983) A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148:839-843.
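As a cross-check outside the spreadsheet, the Hanley-McNeil comparison can be sketched in a few lines of Python (a minimal sketch, not the contents of the Excel file; function names are my own, and r is the correlation coefficient read from the table in the article):

```python
import math

def hanley_mcneil_se(auc, n_abnormal, n_normal):
    # Standard error of a single AUC (Hanley & McNeil, 1982).
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    return math.sqrt(
        (auc * (1 - auc)
         + (n_abnormal - 1) * (q1 - auc ** 2)
         + (n_normal - 1) * (q2 - auc ** 2))
        / (n_abnormal * n_normal)
    )

def correlated_auc_z(auc1, auc2, se1, se2, r):
    # z statistic for two AUCs derived from the same cases; the
    # covariance term 2*r*se1*se2 accounts for the paired design.
    return (auc1 - auc2) / math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
```

The resulting z is referred to the standard normal distribution; with r = 0 the formula reduces to the usual unpaired comparison.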
The Excel file contains the model input-output data sets that were used to evaluate the two-layer soil moisture and flux dynamics model. The model is original and was developed by Dr. Hantush by integrating the well-known Richards equation over the root layer and the lower vadose zone. The input-output data are used for: 1) verification of the numerical scheme by comparison against the HYDRUS model as a benchmark; 2) model validation by comparison against real site data; and 3) estimation of model predictive uncertainty and sources of modeling errors. This dataset is associated with the following publication: He, J., M.M. Hantush, L. Kalin, and S. Isik. Two-Layer numerical model of soil moisture dynamics: Model assessment and Bayesian uncertainty estimation. Journal of Hydrology. Elsevier Science Ltd, New York, NY, USA, 613 part A: 128327, (2022).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistical comparison of multiple time series in their underlying frequency patterns has many real applications. However, existing methods are only applicable to a small number of mutually independent time series, and empirical results for dependent time series are limited to comparing two time series. We propose scalable methods based on a new algorithm that enables us to compare the spectral density of a large number of time series. The new algorithm helps us efficiently obtain all pairwise feature differences in frequency patterns between M time series, which plays an essential role in our methods. When all M time series are independent of each other, we derive the joint asymptotic distribution of their pairwise feature differences. The asymptotic dependence structure between the feature differences motivates our proposed test for multiple mutually independent time series. We then adapt this test to the case of multiple dependent time series by partially accounting for the underlying dependence structure. Additionally, we introduce a global test to further enhance the approach. To examine the finite sample performance of our proposed methods, we conduct simulation studies. The new approaches demonstrate the ability to compare a large number of time series, whether independent or dependent, while exhibiting competitive power. Finally, we apply our methods to compare multiple mechanical vibrational time series.
This data release contains time series and plots summarizing mean monthly temperature (TAVE) and total monthly precipitation (PPT), and runoff (RO) from the U.S. Geological Survey Monthly Water Balance Model at 115 National Wildlife Refuges within the U.S. Fish and Wildlife Service Mountain-Prairie Region (CO, KS, MT, NE, ND, SD, UT, and WY). These three variables are derived from two sets of statistically-downscaled general circulation models from 1951 through 2099. Three variables (TAVE, PPT, and RO for refuge areas) were summarized for comparison across four 19-year periods: historic (1951-1969), baseline (1981-1999), 2050 (2041-2059), and 2080 (2071-2089). For each refuge, mean monthly plots, seasonal box plots, and annual envelope plots were produced for each of the four periods.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data were collected in order to compare the quality of the signal acquired by two devices – BITalino (Da Silva, Guerreiro, Lourenço, Fred, & Martins, 2014) and BioNomadix (BIOPAC Systems Inc., Goleta, CA, USA).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Three experimental data sets (WNRA0103, WNRA0305 and WNRA0506) involving three grapevine varieties and a range of deficit irrigation and pruning treatments are described. The purpose for obtaining the data sets was twofold: (1) to meet the research goals of the Cooperative Research Centre for Viticulture (CRCV) during its tenure 1999-2006, and (2) to test the capacity of the VineLOGIC grapevine growth and development model to predict timing of bud burst, flowering, veraison and harvest, yield and yield components, berry attributes and components of water balance. A test script, included with the VineLOGIC source code publication (https://doi.org/10.25919/5eb3536b6a8a8), enables comparison between model-predicted and measured values for key variables. Key references relating to the model and data sets are provided under Related Links. A description of selected terms and outcomes of regression analysis between values predicted by the model and observed values are provided under Supporting Files. Version 3 included the following amendments: (1) to WNRA0103 – alignment of settings for irrigation simulation control and initial soil water contents for soil layers with those in WNRA0305 and WNRA0506, and addition of missing berry anthocyanin data for season 2002-03; (2) to WNRA0305 – minor corrections to values for berry and bunch number and weight, and correction of the target Brix value for harvest to 24.5 Brix; (3) minor corrections to some measured berry anthocyanin concentrations as mg/g fresh weight, minor amendments to treatment names for consistency across data sets, and to the name for irrigation type to improve clarity; and (4) update of the regression analysis between VineLOGIC-predicted and observed values for key variables. Version 4 (this version) includes a metadata-only amendment with two additions to Related Links: 'VineLOGIC View' and a recent publication.
Lineage: The data sets were obtained at a commercial wine company vineyard in the Mildura region of north western Victoria, Australia. Vines were spaced 2.4 m within rows and 3 m between rows, trained to a two-wire vertical trellis and drip irrigated. The soil was a Nookamka sandy loam. Data Set 1 (WNRA0103): An experiment comparing the effects on grapevine growth and development of three pruning treatments, spur, light mechanical hedging and minimal pruning, involving Shiraz on Schwarzmann rootstock, irrigated with industry standard drip irrigation and collected over three seasons 2000-01, 2001-02 and 2002-03. The experiment was established and conducted by Dr Rachel Ashley with input from Peter Clingeleffer (CSIRO), Dr Bob Emmett (Department of Primary Industries, Victoria) and Dr Peter Dry (University of Adelaide). Seasons in the southern hemisphere span two calendar years, with budburst in the second half of the first calendar year and harvest in the first half of the second calendar year. Data Set 2 (WNRA0305): An experiment comparing the effects of three irrigation treatments, industry standard drip, Regulated Deficit (RDI) and Prolonged Deficit (PD) irrigation involving Cabernet Sauvignon on own roots and pruned by light mechanical hedging, over three seasons 2002-03, 2003-04 and 2004-05. The RDI treatment involved application of a water deficit in the post-fruit set to pre-veraison period. The PD treatment was initially the same as RDI but with an extended period of extreme deficit (no irrigation) after the RDI stress period until veraison. The experiment was established and conducted by Dr Nicola Cooley with input from Peter Clingeleffer and Dr Rob Walker (CSIRO). Data Set 3 (WNRA0506): Compared basic grapevine growth, development and berry maturation post fruit set at three Trial Sites over two seasons 2004-05 and 2005-06. Trial Site one is the same site used to collect Data Set 1. 
Data were collected from all three pruning treatments in season 2004-05 but only from the spur and light mechanical hedging treatments in season 2005-06. Trial Site two involved comparison of two scions, Chardonnay and Shiraz, both on Schwarzmann rootstock, irrigated with industry standard drip irrigation and pruned using light mechanical hedging. Data were collected in season 2004-05. Trial Site three is the same site used to collect Data Set 2. Data were collected from all three irrigation treatments in season 2004-05 but only from the industry standard drip and PD treatments in 2005-06. Establishment and conduct of experiments at Trial Sites one, two and three was by Dr Anne Pellegrino and Deidre Blackmore with input from Peter Clingeleffer and Dr Rob Walker. The decision to develop Data Set 3 followed a mid-term CRCV review and analysis of available Australian data sets and relevant literature, which identified the need to obtain a data set covering all of the required variables necessary to run VineLOGIC and in particular, to obtain data on berry development commencing as soon as possible after fruit set. Most prior data sets were from veraison onwards, which is later than desirable from a modelling perspective. Data Set 1, 2 and 3 compilation for VineLOGIC was by Deidre Blackmore with input from Dr Doug Godwin. Review and testing of the Data Sets with VineLOGIC was conducted by David Benn with input from Dr Paul Petrie (South Australian Research and Development Institute), Dr Vinay Pagay (University of Adelaide) and Drs Everard Edwards and Rob Walker (CSIRO). A collaboration agreement with University of Adelaide established in 2017 enabled further input to review of the Data Sets and their testing with VineLOGIC by Dr Sam Culley.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
General information
This data is supplementary material to the paper by Watson et al. on sex differences in global reporting of adverse drug reactions [1]. Readers are referred to this paper for a detailed description of the context in which the data was generated. Anyone intending to use this data for any purpose should read the publicly available information on the VigiBase source data [2, 3]. The conditions specified in the caveat document [3] must be adhered to.
Source dataset
The dataset published here is based on analyses performed in VigiBase, the WHO global database of individual case safety reports [4]. All reports entered into VigiBase from its inception in 1967 up to 2 January 2018 with patient sex coded as either female or male have been included, except suspected duplicate reports [5]. In total, the source dataset contained 9,056,566 female and 6,012,804 male reports.
Statistical analysis
The characteristics of the female reports were compared to those of the male reports using a method called vigiPoint [6]. This is a method for comparing two or more sets of reports (here female and male reports) on a large set of reporting variables, and highlighting any features in which the sets differ in a statistically and clinically relevant manner. For example, patient age group is a reporting variable, and the different age groups 0 - 27 days, 28 days - 23 months et cetera are features within this variable. The statistical analysis is based on shrinkage log odds ratios computed as a comparison between the two sets of reports for each feature, including all reports without missing information for the variable under consideration. The specific output from vigiPoint is defined precisely below. Here, the results for 18 different variables with a total of 44,486 features are presented. 74 of these features were highlighted as so-called vigiPoint key features, suggesting a statistically and clinically significant difference between female and male reports in VigiBase.
Description of published dataset
The dataset is provided in the form of a MS Excel spreadsheet (.xlsx file) with nine columns and 44,486 rows (excluding the header), each corresponding to a specific feature. Below follows a detailed description of the data included in the different columns.
Variable: This column indicates the reporting variable to which the specific feature belongs. Several of these variables are described in the original publication by Watson et al.: country of origin, geographical region of origin, type of reporter, patient age group, MedDRA SOC, ATC level 2 of reported drugs, seriousness, and fatality [1]. The remaining variables are described here:
The Variable column can be useful for filtering the data, for example if one is interested in one or a few specific variables.
Feature: This column contains each of the 44,486 included features. The vast majority should be self-explanatory, or else they have been explained above, or in the original paper [1].
Female reports and Male reports: These columns show the number of female and male reports, respectively, for which the specific feature is present.
Proportion among female reports and Proportion among male reports: These columns show the proportions within the female and male reports, respectively, for which the specific feature is present. Comparing these crude proportions is the simplest and most intuitive way to contrast the female and male reports, and a useful complement to the specific vigiPoint output.
Odds ratio: The odds ratio is a basic measure of association between the classification of reports into female and male reports and a given reporting feature, and hence can be used to compare female and male reports with respect to this feature. It is formally defined as a / (bc / d), where
This crude odds ratio can also be computed as (p_female / (1 - p_female)) / (p_male / (1 - p_male)), where p_female and p_male are the proportions described earlier. If the odds ratio is above 1, the feature is more common among the female than the male reports; if below 1, the feature is less common among the female than the male reports. Note that the odds ratio can be mathematically undefined, in which case it is missing in the published data.
vigiPoint score: This score is defined based on an odds ratio with added statistical shrinkage, defined as (a + k) / ((bc / d) + k), where k is 1% of the total number of female reports, or about 90,000. While the shrinkage adds robustness to the measure of association, it makes interpretation more difficult, which is why the crude proportions and unshrunk odds ratios are also presented. Further, 99% credibility intervals are computed for the shrinkage odds ratios, and these intervals are transformed onto a log2 scale [6]. The vigiPoint score is then defined as the lower endpoint of the interval, if that endpoint is above 0; as the upper endpoint of the interval, if that endpoint is below 0; and otherwise as 0. The vigiPoint score is useful for sorting the features from strongest positive to strongest negative associations, and/or for filtering the features according to some user-defined criteria.
vigiPoint key feature: Features are classified as vigiPoint key features if their vigiPoint score is either above 0.5 or below -0.5. The specific threshold of 0.5 is arbitrary, but chosen to identify features where the two sets of reports (here female and male reports) differ in a clinically significant way.
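The crude odds ratio and the score rule above can be sketched as follows (an illustrative sketch only; the shrinkage estimation and credibility-interval computation are not reproduced here, so the log2 interval endpoints are taken as given):

```python
def crude_odds_ratio(p_female, p_male):
    # Crude OR from the two proportions; mathematically undefined when a
    # proportion is exactly 0 or 1, reported as missing (None) here.
    if p_female in (0.0, 1.0) or p_male in (0.0, 1.0):
        return None
    return (p_female / (1 - p_female)) / (p_male / (1 - p_male))

def vigipoint_score(ci_low_log2, ci_high_log2):
    # Lower endpoint of the 99% credibility interval (on the log2 scale)
    # if above 0, upper endpoint if below 0, otherwise 0.
    if ci_low_log2 > 0:
        return ci_low_log2
    if ci_high_log2 < 0:
        return ci_high_log2
    return 0.0

def is_key_feature(score):
    # Key features are those with score above 0.5 or below -0.5.
    return abs(score) > 0.5
```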
References
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
2D and 3D QSAR techniques are widely used in lead optimization-like processes. A compilation of 40 diverse data sets is described. It is proposed that these can be used as a common benchmark for comparisons of QSAR methodologies, primarily in terms of predictive ability. The benchmark set should be useful both for assessing new methods and for optimizing existing ones.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Until recently, researchers who wanted to examine the determinants of state respect for most specific negative rights needed to rely on data from the CIRI or the Political Terror Scale (PTS). The new V-DEM dataset offers scholars a potential alternative to the individual human rights variables from CIRI. We analyze a set of key Cingranelli-Richards (CIRI) Human Rights Data Project and Varieties of Democracy (V-DEM) negative rights indicators, finding unusual and unexpectedly large patterns of disagreement between the two sets. First, we discuss the new V-DEM dataset by comparing it to the disaggregated CIRI indicators, discussing the history of each project, and describing its empirical domain. Second, we identify a set of disaggregated human rights measures that are similar across the two datasets and discuss each project's measurement approach. Third, we examine how these measures compare to each other empirically, showing that they diverge considerably across both time and space. These findings point to several important directions for future work, such as how conceptual approaches and measurement strategies affect rights scores. For the time being, our findings suggest that researchers should think carefully about using the measures as substitutes.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data used in the development of the research entitled: "Comparison between machine learning classification and trajectory-based change detection for identifying eucalyptus areas in Landsat time series"
The dataset was created by this notebook: https://www.kaggle.com/douglaskgaraujo/sentence-complexity-comparison-dataset
This data is a pairwise comparison of sentences, together with information about their relative complexity. The original dataset is from the CommonLit Readability Prize competition, and interested readers are referred there (especially the competitions' discussion forums) for more information on the data itself.
Important notice! As per that competition's rules, the license is as follows:
A. Data Access and Use. Competition Use and Non-Commercial & Academic Research: *You may access and use the Competition Data for non-commercial purposes only, including for participating in the Competition and on Kaggle.com forums, and for academic research and education. *The Competition Sponsor reserves the right to disqualify any participant who uses the Competition Data other than as permitted by the Competition Website and these Rules.
B. Data Security. You agree to use reasonable and suitable measures to prevent persons who have not formally agreed to these Rules from gaining access to the Competition Data. You agree not to transmit, duplicate, publish, redistribute or otherwise provide or make available the Competition Data to any party not participating in the Competition. You agree to notify Kaggle immediately upon learning of any possible unauthorized transmission of or unauthorized access to the Competition Data and agree to work with Kaggle to rectify any unauthorized transmission or access.
C. External Data. You may use data other than the Competition Data (“External Data”) to develop and test your Submissions. However, you will ensure the External Data is publicly available and equally accessible to use by all participants of the Competition for purposes of the competition at no cost to the other participants. The ability to use External Data under this Section 7.C (External Data) does not limit your other obligations under these Competition Rules, including but not limited to Section 11 (Winners Obligations).
This dataset is a pairwise comparison of each sentence in the CommonLit competition with 500 other randomly-matched sentences. Sentences are divided into training and validation sets before being matched randomly. The relative complexity of each sentence is measured, and features are computed such as the difference between the two sentences' scores and a column indicating whether or not the first sentence's readability score is greater than or equal to the score of the second sentence.
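The pairing step can be sketched as follows (a hypothetical reconstruction, not the linked notebook's code; all names are my own, and the real dataset uses 500 partners per sentence rather than the small number shown here):

```python
import random

def make_pairs(sentences, scores, n_pairs_per_sentence=2, seed=0):
    # Match each sentence with randomly chosen partners; for every pair,
    # record the score difference and whether the first sentence's
    # readability score is greater than or equal to the second's.
    rng = random.Random(seed)
    rows = []
    for i, (sent, score) in enumerate(zip(sentences, scores)):
        others = [j for j in range(len(sentences)) if j != i]
        for j in rng.sample(others, n_pairs_per_sentence):
            rows.append({
                "sentence_1": sent,
                "sentence_2": sentences[j],
                "score_diff": score - scores[j],
                "first_not_easier": score >= scores[j],
            })
    return rows
```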
Thanks to the organisers of this competition for providing this dataset.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Estimates of excess deaths can provide information about the burden of mortality potentially related to the COVID-19 pandemic, including deaths that are directly or indirectly attributed to COVID-19. Excess deaths are typically defined as the difference between the observed numbers of deaths in specific time periods and expected numbers of deaths in the same time periods. This visualization provides weekly estimates of excess deaths by the jurisdiction in which the death occurred. Weekly counts of deaths are compared with historical trends to determine whether the number of deaths is significantly higher than expected.

Counts of deaths from all causes of death, including COVID-19, are presented. As some deaths due to COVID-19 may be assigned to other causes of death (for example, if COVID-19 was not diagnosed or not mentioned on the death certificate), tracking all-cause mortality can provide information about whether an excess number of deaths is observed, even when COVID-19 mortality may be undercounted. Additionally, deaths from all causes excluding COVID-19 were also estimated. Comparing these two sets of estimates (excess deaths with and without COVID-19) can provide insight about how many excess deaths are identified as due to COVID-19, and how many excess deaths are reported as due to other causes of death. These deaths could represent misclassified COVID-19 deaths, or potentially could be indirectly related to the COVID-19 pandemic (e.g., deaths from other causes occurring in the context of health care shortages or overburdened health care systems).

Estimates of excess deaths can be calculated in a variety of ways, and will vary depending on the methodology and assumptions about how many deaths are expected to occur. Estimates of excess deaths presented in this webpage were calculated using Farrington surveillance algorithms (1).
A range of values for the number of excess deaths was calculated as the difference between the observed count and one of two thresholds (either the average expected count or the upper bound of the 95% prediction interval), by week and jurisdiction.

Provisional death counts are weighted to account for incomplete data. However, data for the most recent week(s) are still likely to be incomplete. Weights are based on completeness of provisional data in prior years, but the timeliness of data may have changed in 2020 relative to prior years, so the resulting weighted estimates may be too high in some jurisdictions and too low in others. As more information about the accuracy of the weighted estimates is obtained, further refinements to the weights may be made, which will impact the estimates. Any changes to the methods or weighting algorithm will be noted in the Technical Notes when they occur. More detail about the methods, weighting, data, and limitations can be found in the Technical Notes.

This visualization includes several different estimates:

Number of excess deaths: A range of estimates for the number of excess deaths was calculated as the difference between the observed count and one of two thresholds (either the average expected count or the upper bound threshold), by week and jurisdiction. Negative values, where the observed count fell below the threshold, were set to zero.

Percent excess: The percent excess was defined as the number of excess deaths divided by the threshold.

Total number of excess deaths: The total number of excess deaths in each jurisdiction was calculated by summing the excess deaths in each week, from February 1, 2020 to present.
Similarly, the total number of excess deaths for the US overall was computed as a sum of jurisdiction-specific numbers of excess deaths (with negative values set to zero), and not directly estimated using the Farrington surveillance algorithms.

Select a dashboard from the menu, then click on "Update Dashboard" to navigate through the different graphics. The first dashboard shows the weekly predicted counts of deaths from all causes, and the threshold for the expected number of deaths. Select a jurisdiction from the drop-down menu to show data for that jurisdiction. The second dashboard shows the weekly predicted counts of deaths from all causes and the weekly count of deaths from all causes excluding COVID-19. Select a jurisdiction from the drop-down menu to show data for that jurisdiction.
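The estimate definitions above reduce to a few lines of arithmetic. The sketch below assumes the threshold (average expected count or upper prediction bound) has already been computed; function names are my own:

```python
def excess_deaths(observed, threshold):
    # Weekly excess for one jurisdiction: observed minus threshold,
    # with negative values set to zero.
    return max(observed - threshold, 0)

def percent_excess(observed, threshold):
    # Percent excess = excess deaths divided by the threshold.
    return excess_deaths(observed, threshold) / threshold

def total_excess(weekly_observed, weekly_thresholds):
    # Total excess: sum of weekly excess deaths (negatives zeroed).
    return sum(excess_deaths(o, t)
               for o, t in zip(weekly_observed, weekly_thresholds))
```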
This Readme file summarizes the result files of my phylogenetic analyses deposited in the DRYAD repository for the paper: Cai, C., 2024. Ant backbone phylogeny resolved by modelling compositional heterogeneity among sites in genomic data. Communications Biology.
The results of my phylogenetic analyses are listed in two root folders, corresponding to the two previously published studies: Borowiec et al. (2019) and Romiguier et al. (2022).
1-Borowiec et al. 2019: This folder includes four folders, showing results based on four datasets of Borowiec et al. (2019) under the site-heterogeneous CAT-GTR+G4 model in PhyloBayes.
1-Full_data_set_unconstrained-7451 NT sites: Full 11-gene matrix (123 taxa, 7,451 nucleotide [NT] sites).
2-AT-rich_outgr_removed-7451 NT sites: Full matrix with the most AT-rich outgroups excluded (117 taxa, 7,451 NT sites).
Western U.S. rangelands have been quantified as six fractional cover (0-100%) components over the Landsat archive (1985-2018) at 30-m resolution, termed the “Back-in-Time” (BIT) dataset. Robust validation through space and time is needed to quantify product accuracy. We leverage field data observed concurrently with HRS imagery over multiple years and locations in the Western U.S. to dramatically expand the spatial extent and sample size of validation analysis relative to a direct comparison to field observations and to previous work. We compare HRS and BIT data in the corresponding space and time. Our objectives were to evaluate the temporal and spatio-temporal relationships between HRS and BIT data, and to compare their response to spatio-temporal variation in climate. We hypothesize that strong temporal and spatio-temporal relationships will exist between HRS and BIT data and that they will exhibit similar climate response. We evaluated a total of 42 HRS sites across the western U.S. with 32 sites in Wyoming, and 5 sites each in Nevada and Montana. HRS sites span a broad range of vegetation, biophysical, climatic, and disturbance regimes. Our HRS sites were strategically located to collectively capture the range of biophysical conditions within a region. Field data were used to train 2-m predictions of fractional component cover at each HRS site and year. The 2-m predictions were degraded to 30-m, and some were used to train regional Landsat-scale, 30-m, “base” maps of fractional component cover representing circa 2016 conditions. A Landsat-imagery time-series spanning 1985-2018, excluding 2012, was analyzed for change through time. Pixels and times identified as changed from the base were trained using the base fractional component cover from the pixels identified as unchanged. Changed pixels were labeled with the updated predictions, while the base was maintained in the unchanged pixels. 
The resulting BIT suite includes the fractional cover of the six components described above for 1985-2018. We compare the two datasets, HRS and BIT, in space and time. The two tabular data sets presented here correspond to a temporal and a spatio-temporal validation of the BIT data. First, the temporal data are HRS and BIT component cover and climate variable means by site by year. Second, the spatio-temporal data are HRS and BIT component cover and associated climate variables at individual pixels in a site-year.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
More and more customers rely on online reviews of products and comments on the Web to make decisions about buying one product over another. In this context, sentiment analysis techniques constitute the traditional way to summarize a user's opinions, criticizing or highlighting the positive aspects of a product. Sentiment analysis of reviews usually relies on extracting positive and negative aspects of products, neglecting comparative opinions. Such opinions do not directly express a positive or negative view but contrast aspects of products from different competitors.
Here, we present the first effort to study comparative opinions in Portuguese, creating two new Portuguese datasets with comparative sentences marked by three humans. This repository consists of three important files: (1) lexicon that contains words frequently used to make a comparison in Portuguese; (2) Twitter dataset with labeled comparative sentences; and (3) Buscapé dataset with labeled comparative sentences.
The lexicon is a set of 176 words frequently used to express a comparative opinion in the Portuguese language. The lexicon is used as a filter to build two sets of data with comparative sentences from two important contexts: (1) online social networks; and (2) product reviews.
For Twitter, we collected all Portuguese tweets published in Brazil on 2018/01/10 and filtered all tweets that contained at least one keyword present in the lexicon, obtaining 130,459 tweets. Our work is based on the sentence level. Thus, all sentences were extracted and a sample of 2,053 sentences was created, which was manually labeled by three human annotators, reaching 83.2% agreement by Fleiss' kappa coefficient. For Buscapé, a Brazilian website (https://www.buscape.com.br/) used to compare product prices on the web, the same methodology was applied, creating a set of 2,754 labeled sentences obtained from comments made in 2013. This dataset was also labeled by three annotators, reaching an agreement of 83.46% by Fleiss' kappa coefficient.
The Twitter dataset has 2,053 labeled sentences, of which 918 are comparative. The Buscapé dataset has 2,754 labeled sentences, of which 1,282 are comparative.
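The inter-annotator agreement figures quoted above can be reproduced from a table of per-sentence label counts; this is my own sketch of the standard Fleiss' kappa formula, not the authors' code:

```python
def fleiss_kappa(ratings):
    # ratings: one row per sentence; each row gives the number of
    # annotators (here 3) who assigned each label category.
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_cats = len(ratings[0])
    # Overall proportion of assignments falling in each category.
    p = [sum(row[j] for row in ratings) / (n_items * n_raters)
         for j in range(n_cats)]
    # Per-item observed agreement.
    P = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
         for row in ratings]
    P_bar = sum(P) / n_items          # mean observed agreement
    P_e = sum(pj * pj for pj in p)    # chance agreement
    return (P_bar - P_e) / (1 - P_e)
```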
The datasets contain these labeled properties:
text: the sentence extracted from the review comment.
entity_s1: the first entity compared in the sentence.
entity_s2: the second entity compared in the sentence.
keyword: the comparative keyword used in the sentence to express comparison.
preferred_entity: the preferred entity.
id_start: the keyword's initial position in the sentence.
id_end: the keyword's final position in the sentence.
type: the sentence label, which specifies whether the phrase is a comparison.
Additional Information:
1 - The sentences were separated using a sentence tokenizer.
2 - If the compared entity is not specified, the field receives the value "_".
3 - The property "type" can take one of five values:
0: Non-comparative (Não Comparativa).
1: Non-Equal-Gradable (Gradativa com Predileção).
2: Equative (Equitativa).
3: Superlative (Superlativa).
4: Non-Gradable (Não Gradativa).
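A small sketch of working with the "type" codes, assuming the column names match the property list above (the row dicts here are toy examples, not actual dataset records):

```python
# Hedged sketch: mapping the "type" codes to their English label names.
TYPE_LABELS = {
    0: "Non-comparative",
    1: "Non-Equal-Gradable",
    2: "Equative",
    3: "Superlative",
    4: "Non-Gradable",
}

def is_comparative(type_code):
    """A sentence is comparative when its type is anything but 0."""
    return type_code != 0

# Toy rows mimicking the dataset's "text" and "type" properties.
rows = [{"text": "A é melhor que B", "type": 1},
        {"text": "gostei do produto", "type": 0}]
n_comparative = sum(is_comparative(r["type"]) for r in rows)
print(n_comparative)  # 1
```

Counting this way over the real files should reproduce the totals above: 918 of 2,053 sentences for Twitter and 1,282 of 2,754 for Buscapé.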
If you use this data, please cite our paper as follows:
"Daniel Kansaon, Michele A. Brandão, Julio C. S. Reis, Matheus Barbosa, Breno Matos, and Fabrício Benevenuto. 2020. Mining Portuguese Comparative Sentences in Online Reviews. In Brazilian Symposium on Multimedia and the Web (WebMedia '20), November 30-December 4, 2020, São Luís, Brazil. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3428658.3431081"
Further Information:
We make the raw sentences available in the dataset so that future work can test different pre-processing steps. If you want to obtain the exact sentences used in the paper above, you must reproduce the pre-processing step described in the paper (Figure 2).
For each sentence with more than one keyword in the dataset:
Extract three words before and three words after each comparative keyword, creating a new sentence that receives the existing value in the "type" field as its label;
The original sentence is thus divided into n new sentences, where n is the number of keywords in the sentence;
Stopwords are not counted as part of this three-word range;
Note that the final processed sentence can therefore have more than six words, because stopwords do not count toward the range.
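The windowing rule above can be sketched as follows: walk outward from the keyword, collecting tokens until three non-stopwords have been gathered on each side, keeping any stopwords that fall inside the span. The stopword list here is a tiny illustrative subset, not the one used in the paper.

```python
# Hedged sketch of the three-content-word window around a keyword.
STOPWORDS = {"de", "que", "o", "a", "é", "do", "da"}  # illustrative subset

def window_around(tokens, kw_index, n=3):
    """Collect tokens outward from the keyword until n non-stopwords
    are gathered on each side; stopwords inside the span are kept."""
    left, count = [], 0
    i = kw_index - 1
    while i >= 0 and count < n:
        left.append(tokens[i])
        if tokens[i] not in STOPWORDS:
            count += 1
        i -= 1
    right, count = [], 0
    i = kw_index + 1
    while i < len(tokens) and count < n:
        right.append(tokens[i])
        if tokens[i] not in STOPWORDS:
            count += 1
        i += 1
    return left[::-1] + [tokens[kw_index]] + right

sent = "o celular novo é melhor que o modelo antigo".split()
print(window_around(sent, sent.index("melhor")))
```

For a sentence with several keywords, calling this once per keyword yields the n new sentences described above.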
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The top table shows the average classifier performance for cross-validation on the 9-locus public STR data. The bottom table is the performance for the same test, but on a 9-locus subset of our ground-truth training data. While overall performance is lower than the 15-locus cross-validation test on our ground-truth data (Table 1), the two data sets perform similarly here, indicating that increasing the number of markers in the data set can significantly improve performance.
Hydrodynamic and sediment transport time-series data, including water depth, velocity, turbidity, conductivity, and temperature, were collected by the U.S. Geological Survey (USGS) Pacific Coastal and Marine Science Center at shallow subtidal and intertidal sites in Corte Madera Bay and San Pablo Bay National Wildlife Refuge (SPNWF) in San Francisco Bay, CA, as well as on the marsh plain of SPNWF marsh and in a tidal creek and on the marsh plain of Corte Madera Marsh, in 2022 and 2023. Data files are grouped by station, San Pablo subtidal, San Pablo intertidal, San Pablo marsh, Corte Madera subtidal, Corte Madera intertidal, Corte Madera marsh, or Corte Madera tidal creek, then by instrument type. At most stations there were periods of low water when sensors were no longer submerged, resulting in spurious data. In addition, most instruments experienced some degree of biofouling, particularly at the subtidal and intertidal stations. The subtidal stations also occasionally show signs of platform rocking or movement due to strong water flow, and/or from accidental fisher/boater interference. Users are advised to assess data quality carefully, and to check the metadata for instrument information, as platform deployment times and data-processing methods varied.
This dataset provides NDVI time series data in comma-delimited format from the phenocam location, using five satellite products: (1) the Proba-V L1c product; (2) the Landsat 7 SR product; (3) the Sentinel-2 Level-1C product; (4) the Sentinel-2 Level-2A product; and (5) the Suomi National Polar-Orbiting Partnership (S-NPP) NASA Visible Infrared Imaging Radiometer Suite (VIIRS) VNP13A1 product. The dataset also includes scripts to download these data from Google Earth Engine. The data are provided in support of the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States". The data and scripts allow users to replicate, test, or further explore the results. The comma-delimited CSV files are named according to the satellite and product. The JavaScript Google Earth Engine code files (within the folder "Code") are also named by satellite/product, with the Proba and VIIRS time series code combined into a single file and the other products as separate files. A graph of the data is included as the file 'SensorCompare4SB.jpg' and shows the NDVI time series from the products described above. The data in this graph can also be viewed in figure 10 of the associated journal article.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In research evaluating statistical analysis methods, a common aim is to compare point estimates and confidence intervals (CIs) calculated from different analyses. This can be challenging when the outcomes (and their scale ranges) differ across datasets. We therefore developed a plot to facilitate pairwise comparisons of point estimates and confidence intervals from different statistical analyses both within and across datasets.
The plot was developed and refined over the course of an empirical study. To compare results from a variety of different studies, a system of centring and scaling is used. First, the point estimates from the reference analyses are centred to zero, and their confidence intervals are then scaled to span a range of one. The point estimates and confidence intervals from the matching comparator analyses are adjusted by the same amounts. This enables the relative positions of the point estimates and CI widths to be quickly assessed while maintaining the relative magnitudes of the differences in point estimates and confidence interval widths between the two analyses. Banksia plots can be graphed in a matrix, showing all pairwise comparisons of multiple analyses. In this paper, we show how to create a banksia plot and present two examples: the first relates to an empirical evaluation assessing the difference between various statistical methods across 190 interrupted time series (ITS) data sets with widely varying characteristics, while the second example assesses data extraction accuracy, comparing results obtained from analysing original study data (43 ITS studies) with those obtained by four researchers from datasets digitally extracted from graphs in the accompanying manuscripts.
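The centring-and-scaling step described above can be sketched numerically. This is an illustrative reimplementation in Python (the published code is in Stata and R); the function and variable names are assumptions.

```python
# Hedged sketch: centre the reference point estimate at zero, scale its
# CI to span one, and apply the same shift and scale to the comparator.

def centre_and_scale(ref_est, ref_lo, ref_hi, cmp_est, cmp_lo, cmp_hi):
    shift = ref_est
    scale = ref_hi - ref_lo  # reference CI width becomes 1 after scaling
    t = lambda x: (x - shift) / scale
    return (t(ref_est), t(ref_lo), t(ref_hi)), (t(cmp_est), t(cmp_lo), t(cmp_hi))

ref, comp = centre_and_scale(2.0, 1.0, 3.0, 2.5, 1.5, 3.5)
print(ref)   # (0.0, -0.5, 0.5): centred at zero, CI spanning one
print(comp)  # (0.25, -0.25, 0.75): same width, shifted relative to reference
```

Because both analyses receive the same shift and scale, the relative difference in point estimates and the ratio of CI widths are preserved, which is what the plot displays.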
In the banksia plot of statistical method comparison, it was clear that there was no difference, on average, in point estimates and it was straightforward to ascertain which methods resulted in smaller, similar or larger confidence intervals than others. In the banksia plot comparing analyses from digitally extracted data to those from the original data it was clear that both the point estimates and confidence intervals were all very similar among data extractors and original data.
The banksia plot, a graphical representation of centred and scaled confidence intervals, provides a concise summary of comparisons between multiple point estimates and associated CIs in a single graph. Through this visualisation, patterns and trends in the point estimates and confidence intervals can be easily identified.
This collection of files allows the user to create the images used in the companion paper and to amend the code to create their own banksia plots, using either Stata version 17 or R version 4.3.1.