Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Figures in scientific publications are critically important because they often show the data supporting key findings. Our systematic review of research articles published in top physiology journals (n = 703) suggests that, as scientists, we urgently need to change our practices for presenting continuous data in small sample size studies. Papers rarely included scatterplots, box plots, and histograms that allow readers to critically evaluate continuous data. Most papers presented continuous data in bar and line graphs. This is problematic, as many different data distributions can lead to the same bar or line graph. The full data may suggest different conclusions from the summary statistics. We recommend training investigators in data presentation, encouraging a more complete presentation of data, and changing journal editorial policies. Investigators can quickly make univariate scatterplots for small sample size studies using our Excel templates.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Journal of Physiology and British Journal of Pharmacology jointly published an editorial series in 2011 to improve standards in statistical reporting and data analysis. It is not known whether reporting practices changed in response to the editorial advice. We conducted a cross-sectional analysis of reporting practices in a random sample of research papers published in these journals before (n = 202) and after (n = 199) publication of the editorial advice. Descriptive data are presented. There was no evidence that reporting practices improved following publication of the editorial advice. Overall, 76-84% of papers with written measures that summarized data variability used standard errors of the mean, and 90-96% of papers did not report exact p-values for primary analyses and post-hoc tests. 76-84% of papers that plotted measures to summarize data variability used standard errors of the mean, and only 2-4% of papers plotted raw data used to calculate variability. Of papers that reported p-values between 0.05 and 0.1, 56-63% interpreted these as trends or statistically significant. Implied or gross spin was noted incidentally in papers before (n = 10) and after (n = 9) the editorial advice was published. Overall, poor statistical reporting, inadequate data presentation and spin were present before and after the editorial advice was published. While the scientific community continues to implement strategies for improving reporting practices, our results indicate stronger incentives or enforcements are needed.
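The distinction the survey draws between standard deviations and standard errors can be made concrete with a short calculation. This is an illustrative Python sketch with invented numbers, not anything provided by the study itself:

```python
import math
import statistics

def sd_and_sem(values):
    """Return (sample SD, standard error of the mean).

    The SD describes the spread of the observations themselves;
    the SEM = SD / sqrt(n) describes the precision of the mean and
    shrinks as n grows, which is why SEM bars can make variable
    data look deceptively tight.
    """
    sd = statistics.stdev(values)
    sem = sd / math.sqrt(len(values))
    return sd, sem

# Hypothetical measurements from a single small-sample experiment.
data = [4.1, 5.3, 3.8, 6.0, 4.9, 5.5, 4.4, 5.1]
sd, sem = sd_and_sem(data)
print(f"SD = {sd:.3f}, SEM = {sem:.3f}")
```

Because the SEM is always smaller than the SD (by a factor of sqrt(n)), plotting SEM bars understates the variability of the underlying observations.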
This statistic depicts the distribution of tools used to compile data and present analytics and/or reports to management, according to a marketing survey of C-level executives, conducted in ************* by Black Ink. As of *************, * percent of respondents used statistical modeling tools, such as IBM's SPSS or the SAS Institute's Statistical Analysis System package, to compile and present their reports.
This guidance note sets out the recommended standard presentation of statistics for health areas within the countries and regions of the UK. (File Size 125 KB)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Most studies in the life sciences and other disciplines involve generating and analyzing numerical data of some type as the foundation for scientific findings. Working with numerical data involves multiple challenges. These include reproducible data acquisition, appropriate data storage, computationally correct data analysis, appropriate reporting and presentation of the results, and suitable data interpretation.
Finding and correcting mistakes when analyzing and interpreting data can be frustrating and time-consuming. Presenting or publishing incorrect results is embarrassing but not uncommon. Particular sources of errors are inappropriate use of statistical methods and incorrect interpretation of data by software. To detect mistakes as early as possible, one should frequently check intermediate and final results for plausibility. Clearly documenting how quantities and results were obtained facilitates correcting mistakes.
Properly understanding data is indispensable for reaching well-founded conclusions from experimental results. Units are needed to make sense of numbers, and uncertainty should be estimated to know how meaningful results are. Descriptive statistics and significance testing are useful tools for interpreting numerical results if applied correctly. However, blindly trusting in computed numbers can also be misleading, so it is worth thinking about how data should be summarized quantitatively to properly answer the question at hand. Finally, a suitable form of presentation is needed so that the data can properly support the interpretation and findings. By additionally sharing the relevant data, others can access, understand, and ultimately make use of the results.
These quick tips are intended to provide guidelines for correctly interpreting, efficiently analyzing, and presenting numerical data in a useful way.
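The plausibility-checking tip above can be illustrated with a small Python sketch that flags values outside a stated plausible range alongside basic descriptive statistics. The range, unit, and data here are invented for the example:

```python
import statistics

def check_plausibility(values, low, high, unit):
    """Summarize values and flag any outside a plausible range.

    `low`, `high`, and `unit` are illustrative assumptions; adapt
    them to the quantity actually being measured.
    """
    outliers = [v for v in values if not (low <= v <= high)]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
        "unit": unit,
        "implausible": outliers,
    }

# Body temperatures in degrees Celsius; 370.0 is a likely unit or
# data-entry mistake that a plausibility check catches immediately.
temps = [36.5, 37.1, 36.8, 370.0, 36.9]
print(check_plausibility(temps, 30.0, 45.0, "degC"))
```

Running such checks on intermediate results, not only final ones, makes it far easier to trace where an implausible number entered the analysis.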
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract This paper presents the results of an analysis of statistical graphs, examined against the curricular guidelines, in eighteen primary education mathematics textbooks in Perú, corresponding to three complete series from different publishers. Through a content analysis, we examined the sections where graphs appeared, identifying the type of activity posed around the graphs involved, the reading level demanded, and the semiotic complexity of the task. The textbooks are partially aligned with the curricular guidelines regarding the presentation of graphs by educational level, and the number of activities proposed by the three publishers is similar. The main activities required in the textbooks are calculating and building. The study also observes a predominance of bar graphs, a basic reading level, and graphs representing a univariate data distribution.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sheet 1 (Raw-Data): The raw data of the study, presenting the tagging results for the measures described in the paper. For each subject, it includes the following columns:
A. a sequential student ID
B. an ID that defines a random group label and the notation
C. the notation used: User Story or Use Case
D. the case they were assigned to: IFA, Sim, or Hos
E. the subject's exam grade (total points out of 100); empty cells mean that the subject did not take the first exam
F. a categorical representation of the grade (L/M/H), where H is greater than or equal to 80, M is at least 65 and below 80, and L is anything lower
G. the total number of classes in the student's conceptual model
H. the total number of relationships in the student's conceptual model
I. the total number of classes in the expert's conceptual model
J. the total number of relationships in the expert's conceptual model
K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, and missing (see tagging scheme below)
P. the researchers' judgement of how well the derivation process was explained by the student: well explained (a systematic mapping that can be easily reproduced), partially explained (a vague indication of the mapping), or not present
Tagging scheme:
Aligned (AL) - A concept is represented as a class in both models, either
with the same name or using synonyms or clearly linkable names;
Wrongly represented (WR) - A class in the domain expert model is
incorrectly represented in the student model, either (i) via an attribute,
method, or relationship rather than a class, or
(ii) using a generic term (e.g., "user" instead of "urban planner");
System-oriented (SO) - A class in CM-Stud that denotes a technical
implementation aspect, e.g., access control. Classes that represent a legacy
system or the system under design (portal, simulator) are legitimate;
Omitted (OM) - A class in CM-Expert that does not appear in any way in
CM-Stud;
Missing (MI) - A class in CM-Stud that does not appear in any way in
CM-Expert.
All the calculations and information provided in the following sheets
originate from that raw data.
Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection,
including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.
Sheet 3 (Size-Ratio):
The number of classes in the student model divided by the number of classes in the expert model is calculated (describing the size ratio). We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus in this study is on the number of classes; however, we also provide the size ratio for the number of relationships between the student and expert models.
Sheet 4 (Overall):
Provides an overview of all subjects regarding the encountered situations, completeness, and correctness. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model. It is calculated by dividing the number of aligned concepts (AL) by the sum of the numbers of aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model. It is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of aligned concepts (AL), wrong representations (WR), and omitted concepts (OM). The overview is complemented with general diverging stacked bar charts that illustrate correctness and completeness.
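The two definitions above translate directly into code. This is an illustrative Python rendering of the stated formulas, not part of the original spreadsheet; the tag counts are made up:

```python
def correctness(al, wr, so, om):
    """Correctness = AL / (AL + OM + SO + WR), as defined in Sheet 4."""
    return al / (al + om + so + wr)

def completeness(al, wr, om):
    """Completeness = (AL + WR) / (AL + WR + OM), as defined in Sheet 4."""
    return (al + wr) / (al + wr + om)

# Hypothetical tag counts for one student model.
al, wr, so, om = 8, 2, 1, 3
print(correctness(al, wr, so, om))   # 8 / 14
print(completeness(al, wr, om))      # 10 / 13
```

Note that system-oriented classes (SO) lower correctness but do not enter completeness, since completeness only compares against the classes present in the expert model.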
For Sheet 4, as well as for the following four sheets, diverging stacked bar
charts are provided to visualize the effect of each of the independent and moderating variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (t-test) and effect size (Hedges' g) for both completeness and correctness are provided. Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html. The independent and moderating variables can be found as follows:
Sheet 5 (By-Notation):
Model correctness and model completeness are compared by notation - UC, US.
Sheet 6 (By-Case):
Model correctness and model completeness are compared by case - SIM, HOS, IFA.
Sheet 7 (By-Process):
Model correctness and model completeness are compared by how well the derivation process is explained - well explained, partially explained, not present.
Sheet 8 (By-Grade):
Model correctness and model completeness are compared by the exam grades, converted to the categorical values High, Medium, and Low.
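The Hedges' g values in Sheets 5-8 were computed with an online tool; for readers who prefer a scriptable route, the standard formula can be sketched in Python as follows. The two groups below are invented, and rounding may differ slightly from the online tool:

```python
import math
import statistics

def hedges_g(group1, group2):
    """Hedges' g: Cohen's d with a small-sample bias correction.

    A sketch of the standard formula (pooled SD, then the
    correction factor J = 1 - 3 / (4 * (n1 + n2) - 9)).
    """
    n1, n2 = len(group1), len(group2)
    s1, s2 = statistics.stdev(group1), statistics.stdev(group2)
    pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (statistics.mean(group1) - statistics.mean(group2)) / pooled
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * correction

# Hypothetical correctness scores for two groups of students.
a = [0.71, 0.64, 0.80, 0.55, 0.68]
b = [0.52, 0.49, 0.61, 0.43, 0.57]
print(round(hedges_g(a, b), 3))
```

The correction matters most at exactly the sample sizes typical here: with only a handful of subjects per group, uncorrected Cohen's d overstates the effect.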
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Summary statistics are fundamental to data science and are the building blocks of statistical reasoning. Most of the data and statistics made available on government web sites are aggregate; however, until now, we have not had a suitable linked data representation available. We propose a way to express summary statistics across aggregate groups as linked data using Web Ontology Language (OWL) Class based sets, where members of the set contribute to the overall aggregate value. Additionally, many clinical studies in the biomedical field rely on demographic summaries of their study cohorts and the patients assigned to each arm. While most data query languages, including SPARQL, allow for computation of summary statistics, they do not provide a way to integrate those values back into the RDF graphs they were computed from. We represent this knowledge, which would otherwise be lost, through the use of OWL 2 punning semantics, the expression of aggregate grouping criteria as OWL classes with variables, and constructs from the Semanticscience Integrated Ontology (SIO) and the World Wide Web Consortium's provenance ontology, PROV-O, providing interoperable representations that are well supported across the web of Linked Data. We evaluate these semantics using a Resource Description Framework (RDF) representation of patient case information from the Genomic Data Commons, a data portal from the National Cancer Institute.
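The core idea, computing an aggregate over a group and asserting the result back into the same triple set so the summary is queryable alongside the raw data, can be sketched very loosely in plain Python. The predicate and class names below are illustrative stand-ins, not the actual SIO or PROV-O terms the paper uses:

```python
# A minimal toy triple store as a set of (subject, predicate, object)
# tuples; real systems would use an RDF library and proper IRIs.
triples = {
    ("patient1", "type", "FemaleCohortMember"),
    ("patient1", "age", 61),
    ("patient2", "type", "FemaleCohortMember"),
    ("patient2", "age", 47),
    ("patient3", "type", "MaleCohortMember"),
    ("patient3", "age", 55),
}

def add_mean_age(triples, group_class):
    """Compute the mean age of a group's members and assert it back
    onto the class node itself (mirroring the OWL 2 punning idea,
    where the class also acts as an individual)."""
    members = {s for s, p, o in triples if p == "type" and o == group_class}
    ages = [o for s, p, o in triples if p == "age" and s in members]
    triples.add((group_class, "meanAge", sum(ages) / len(ages)))

add_mean_age(triples, "FemaleCohortMember")
print(("FemaleCohortMember", "meanAge", 54.0) in triples)
```

Once the summary triple lives in the same graph as the raw data, downstream queries can retrieve it without recomputation, which is the knowledge-preservation point the abstract makes.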
This document sets out the general principles for the standard presentation of statistics in the UK. (File Size - 83 KB)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains several files related to our research paper titled "Attention Allocation to Projection Level Alleviates Overconfidence in Situation Awareness". These files are intended to provide a comprehensive overview of the data analysis process and the presentation of results. Below is a list of the files included and a brief description of each:
R Scripts: These are scripts written in the R programming language for data processing and analysis. The scripts detail the steps for data cleaning, transformation, statistical analysis, and the visualization of results. To replicate the study findings or to conduct further analyses on the dataset, users should run these scripts.
R Markdown File: Offers a dynamic document that combines R code with rich text elements such as paragraphs, headings, and lists. This file is designed to explain the logic and steps of the analysis in detail, embedding R code chunks and the outcomes of code execution. It serves as a comprehensive guide to understanding the analytical process behind the study.
HTML File: Generated from the R Markdown file, this file provides an interactive report of the results that can be viewed in any standard web browser. For those interested in browsing the study's findings without delving into the specifics of the analysis, this HTML file is the most convenient option. It presents the final analysis outcomes in an intuitive and easily understandable manner. For optimal viewing, we recommend opening the HTML file with the latest version of Google Chrome or any other modern web browser. This approach ensures that all interactive functionalities are fully operational.
Together, these files form a complete framework for the research analysis, aimed at enhancing the transparency and reproducibility of the study.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Visual analytics: Exposing the past, understanding the present, and looking to the future. Dan Ariely, founder of The Center for Advanced Hindsight, once posted on Facebook, "Big data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it..." This is especially true in Higher Education, as much of the work being done to organize, connect, and analyze big data is happening in the for-profit sector. This multimedia presentation (video, photos, and text) has three goals: (1) discuss how the field of visual analytics is tackling the problem of analyzing big data; (2) explore when visual analytics is superior and inferior to typical statistics; and (3) offer tactics and tools for Institutional Researchers to use in their everyday work to turn data into actionable intelligence.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract The essence of statistics is the possibility of extracting relevant information from data collected on aspects of reality. To this end, data are transformed into tables, graphs, and abstract measures. Students' understanding of these transformations is crucial for learning. In this article, we offer theoretical reflections on the actions of students and the role of manipulated concrete material, as ostensives, in the handling and display of qualitative statistical data that undergo transformations across different semiotic registers. For this, we draw on the Theory of Registers of Semiotic Representation and the Anthropological Theory of Didactics, the latter limited to its perspective on ostensive and non-ostensive objects. The considerations indicate that the use of ostensives is motivating in the context of teaching and learning. Thus, we expect the reflections made here to contribute to a better understanding and teaching of statistical concepts, as well as to the conduct of new research.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/4.2/customlicense?persistentId=doi:10.7910/DVN/BDWIC3
Social scientists rarely take full advantage of the information available in their statistical results. As a consequence, they miss opportunities to present quantities that are of greatest substantive interest for their research and to express the appropriate degree of certainty about these quantities. In this article, we offer an approach, built on the technique of statistical simulation, to extract the currently overlooked information from any statistical method and to interpret and present it in a reader-friendly manner. Using this technique requires some expertise, which we try to provide herein, but its application should make the results of quantitative articles more informative and transparent. To illustrate our recommendations, we replicate the results of several published works, showing in each case how the authors' own conclusions can be expressed more sharply and informatively, and, without changing any data or statistical assumptions, how our approach reveals important new information about the research questions at hand. We also offer very easy-to-use Clarify software that implements our suggestions. See also: Unifying Statistical Analysis
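The simulation technique described here can be illustrated with a minimal sketch: draw many plausible parameter values from their estimated sampling distribution, push each draw through the quantity of interest, and summarize the results with a point estimate and an uncertainty interval. The coefficient, its standard error, and the covariate value below are invented for the example:

```python
import random
import statistics

random.seed(42)

beta_hat, se = 0.8, 0.25   # hypothetical estimated slope and its standard error
x = 2.0                    # covariate value of substantive interest

# Simulate the quantity of interest (beta * x) under parameter uncertainty.
draws = [random.gauss(beta_hat, se) * x for _ in range(10_000)]
draws.sort()
point = statistics.mean(draws)
lo, hi = draws[249], draws[9749]  # approximate 95% interval from the draws
print(f"predicted effect = {point:.2f} (95% interval {lo:.2f} to {hi:.2f})")
```

The payoff is that any function of the parameters, not just the coefficients themselves, can be reported with an honest uncertainty interval, which is the reader-friendly presentation the article advocates.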
https://www.ons.gov.uk/methodology/geography/licences
This document sets out the recommended standard presentation of statistics for administrative areas at regional and sub-regional levels in the UK. This version of the guidance is now available in accessible format. (File Size - 142 KB)
https://data.gov.uk/dataset/4628452d-2806-4a88-84b0-9c4ac89256e5/guide-to-presenting-statistics-for-2011-travel-to-work-areas-november-2015#licence-info
This document sets out the recommended standard presentation of statistics for 2011 travel to work areas in the UK.
Scientific investigation is of value only insofar as relevant results are obtained and communicated, a task that requires organizing, evaluating, analysing and unambiguously communicating the significance of data. In this context, working with ecological data, reflecting the complexities and interactions of the natural world, can be a challenge. Recent innovations for statistical analysis of multifaceted interrelated data make obtaining more accurate and meaningful results possible, but key decisions of the analyses to use, and which components to present in a scientific paper or report, may be overwhelming. We offer a 10-step protocol to streamline analysis of data that will enhance understanding of the data, the statistical models and the results, and optimize communication with the reader with respect to both the procedure and the outcomes. The protocol takes the investigator from study design and organization of data (formulating relevant questions, visualizing data collection, data...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data set has been used to generate visual presentations, using graphs and charts, of the techniques for current research trends over a six-year period (2013 to 2018).
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Measuring the quality of Question Answering (QA) systems is a crucial task to validate the results of novel approaches. However, there are already indicators of a reproducibility crisis, as many published systems have used outdated datasets or subsets of QA benchmarks, making it hard to compare results. We identified the following core problems: there is no standard data format; instead, proprietary data representations are used by the different, partly inconsistent datasets; additionally, the characteristics of datasets are typically not reported by the dataset maintainers or the system publishers. To overcome these problems, we established an ontology---the Question Answering Dataset Ontology (QADO)---for representing QA datasets in RDF. The following datasets were mapped into the ontology: the QALD series, the LC-QuAD series, the RuBQ series, ComplexWebQuestions, and Mintaka. Hence, the integrated data in QADO covers widely used datasets and multilinguality. Additionally, we performed intensive analyses of the datasets to identify their characteristics, making it easier for researchers to identify specific research questions and to select well-defined subsets. The provided resource will enable the research community to improve the quality of their research and support the reproducibility of experiments.
Here, the mapping results of the QADO process, the SPARQL queries for data analytics, and the archived analytics results file are provided.
Up-to-date statistics can be created automatically by the script provided at the corresponding QADO GitHub RDFizer repository.
This guidance note sets out the recommended standard presentation of statistics for Middle Layer and Lower Layer Super Output Areas nested within relevant local authority districts in the UK and is now available in accessible format (File Size - 2.6 MB).
One of the first steps in a reference interview is determining what it is the user really wants or needs. In many cases, the question comes down to the unit of analysis: what is it that is being investigated or researched? This presentation will take us through the concept of the unit of analysis so that we can improve our reference service, and make our lives easier as a result! Note: This presentation precedes Working with Complex Surveys: Canadian Travel Survey by Chuck Humphrey (14-Mar-2002).