100+ datasets found

Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm
plos.figshare.com
docx
Updated May 31, 2023
Cite
Tracey L. Weissgerber; Natasa M. Milic; Stacey J. Winham; Vesna D. Garovic (2023). Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm [Dataset]. http://doi.org/10.1371/journal.pbio.1002128
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pbio.1002128
Dataset updated
May 31, 2023
Dataset provided by
PLOS Biology
Authors
Tracey L. Weissgerber; Natasa M. Milic; Stacey J. Winham; Vesna D. Garovic
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Figures in scientific publications are critically important because they often show the data supporting key findings. Our systematic review of research articles published in top physiology journals (n = 703) suggests that, as scientists, we urgently need to change our practices for presenting continuous data in small sample size studies. Papers rarely included scatterplots, box plots, and histograms that allow readers to critically evaluate continuous data. Most papers presented continuous data in bar and line graphs. This is problematic, as many different data distributions can lead to the same bar or line graph. The full data may suggest different conclusions from the summary statistics. We recommend training investigators in data presentation, encouraging a more complete presentation of data, and changing journal editorial policies. Investigators can quickly make univariate scatterplots for small sample size studies using our Excel templates.
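As a small illustration of why identical bars can hide different data, the sketch below builds three made-up samples (hypothetical values, not taken from the paper) that share a mean of 10 but are distributed very differently. A bar graph of the means alone would draw three bars of the same height.

```python
from statistics import mean, stdev

# Hypothetical samples (n = 5 each), invented purely for illustration.
samples = {
    "symmetric": [8, 9, 10, 11, 12],
    "outlier": [7, 7, 7, 7, 22],
    "bimodal": [5, 5, 10, 15, 15],
}

for name, values in samples.items():
    # The mean is what the bar height would show; the raw values and the
    # spread are what a univariate scatterplot would reveal.
    print(f"{name:10s} mean={mean(values):.1f} "
          f"sd={stdev(values):.2f} values={sorted(values)}")
```

Plotting the individual points instead of the summary would make the outlier and the two clusters visible at a glance.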
Supplementary Materials for A Linked Data Representation for Summary...
dataverse.harvard.edu
Updated Aug 28, 2019
Cite
James McCusker (2019). Supplementary Materials for A Linked Data Representation for Summary Statistics and Grouping Criteria [Dataset]. http://doi.org/10.7910/DVN/OK0BUG
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/OK0BUG
Dataset updated
Aug 28, 2019
Dataset provided by
Harvard Dataverse
Authors
James McCusker
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Summary statistics are fundamental to data science and are the building blocks of statistical reasoning. Most of the data and statistics made available on government web sites are aggregate; until now, however, we have not had a suitable linked data representation available. We propose a way to express summary statistics across aggregate groups as linked data using Web Ontology Language (OWL) Class based sets, where members of the set contribute to the overall aggregate value. Additionally, many clinical studies in the biomedical field rely on demographic summaries of their study cohorts and the patients assigned to each arm. While most data query languages, including SPARQL, allow for computation of summary statistics, they do not provide a way to integrate those values back into the RDF graphs they were computed from. We represent this knowledge, which would otherwise be lost, through the use of OWL 2 punning semantics, the expression of aggregate grouping criteria as OWL classes with variables, and constructs from the Semanticscience Integrated Ontology (SIO) and the World Wide Web Consortium's provenance ontology, PROV-O, providing interoperable representations that are well supported across the web of Linked Data. We evaluate these semantics using a Resource Description Framework (RDF) representation of patient case information from the Genomic Data Commons, a data portal from the National Cancer Institute.
Leading data compilation and analytics presentation/reporting tools in U.S....
statista.com
Updated Apr 30, 2016
Cite
Statista (2016). Leading data compilation and analytics presentation/reporting tools in U.S. 2015 [Dataset]. https://www.statista.com/statistics/562654/united-states-data-analytics-data-compilation-and-presentation-tools/
Dataset updated
Apr 30, 2016
Dataset authored and provided by
Statista http://statista.com/
Area covered
United States
Description
This statistic depicts the distribution of tools used to compile data and present analytics and/or reports to management, according to a marketing survey of C-level executives, conducted in ************* by Black Ink. As of *************, * percent of respondents used statistical modeling tools, such as IBM's SPSS or the SAS Institute's Statistical Analysis System package, to compile and present their reports.
Data from: Translating statistical information given in other registers into...
scielo.figshare.com
tiff
Updated May 31, 2023
Cite
José António Fernandes; Paula Maria Barros (2023). Translating statistical information given in other registers into the tabular register: a study with prospective teachers of the first school years [Dataset]. http://doi.org/10.6084/m9.figshare.22774625.v1
Explore at:
Available download formats: tiff
Unique identifier
https://doi.org/10.6084/m9.figshare.22774625.v1
Dataset updated
May 31, 2023
Dataset provided by
SciELO journals
Authors
José António Fernandes; Paula Maria Barros
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract: This article addresses the problem of translating statistical information given in other registers into the tabular register, with two objectives: 1) to study the performance of prospective teachers in translating information given in the other registers into the tabular register; and 2) to compare the performance of future teachers in the different translations. The study included 30 students, future teachers of the first school years, who were attending the 1st or 2nd year of the Degree in Basic Education at a Higher Education School in the north of Portugal. The data were obtained from the answers given by the students to four questions, which required the translation of statistical information given in the graphic, numeric-verbal and simple data list registers into the tabular register. In terms of results, it is noteworthy that students were more successful in building the simple frequency tables than in building the two two-way tables and the data table grouped into class intervals, the latter being the one that proved to be the most difficult. These results, related to the translation of different registers into the tabular register, are the main contribution of the study and imply that the prospective teachers must deepen their skills of tabular representation.
Ten quick tips for getting the most scientific value out of numerical data
plos.figshare.com
pdf
Updated May 30, 2023
Cite
Lars Ole Schwen; Sabrina Rueschenbaum (2023). Ten quick tips for getting the most scientific value out of numerical data [Dataset]. http://doi.org/10.1371/journal.pcbi.1006141
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.1371/journal.pcbi.1006141
Dataset updated
May 30, 2023
Dataset provided by
PLOS Computational Biology
Authors
Lars Ole Schwen; Sabrina Rueschenbaum
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Most studies in the life sciences and other disciplines involve generating and analyzing numerical data of some type as the foundation for scientific findings. Working with numerical data involves multiple challenges. These include reproducible data acquisition, appropriate data storage, computationally correct data analysis, appropriate reporting and presentation of the results, and suitable data interpretation.

Finding and correcting mistakes when analyzing and interpreting data can be frustrating and time-consuming. Presenting or publishing incorrect results is embarrassing but not uncommon. Particular sources of errors are inappropriate use of statistical methods and incorrect interpretation of data by software. To detect mistakes as early as possible, one should frequently check intermediate and final results for plausibility. Clearly documenting how quantities and results were obtained facilitates correcting mistakes. Properly understanding data is indispensable for reaching well-founded conclusions from experimental results. Units are needed to make sense of numbers, and uncertainty should be estimated to know how meaningful results are. Descriptive statistics and significance testing are useful tools for interpreting numerical results if applied correctly. However, blindly trusting in computed numbers can also be misleading, so it is worth thinking about how data should be summarized quantitatively to properly answer the question at hand. Finally, a suitable form of presentation is needed so that the data can properly support the interpretation and findings. By additionally sharing the relevant data, others can access, understand, and ultimately make use of the results.

These quick tips are intended to provide guidelines for correctly interpreting, efficiently analyzing, and presenting numerical data in a useful way.
UC_vs_US Statistic Analysis.xlsx
figshare.com
xlsx
Updated Jul 9, 2020
Cite
F. (Fabiano) Dalpiaz (2020). UC_vs_US Statistic Analysis.xlsx [Dataset]. http://doi.org/10.23644/uu.12631628.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.23644/uu.12631628.v1
Dataset updated
Jul 9, 2020
Dataset provided by
Utrecht University
Authors
F. (Fabiano) Dalpiaz
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sheet 1 (Raw-Data): The raw data of the study, presenting the tagging results for the measures described in the paper. For each subject, it includes the following columns:
A. a sequential student ID
B. an ID that defines a random group label and the notation
C. the notation used: User Stories or Use Cases
D. the case they were assigned to: IFA, Sim, or Hos
E. the subject's exam grade (total points out of 100); empty cells mean that the subject did not take the first exam
F. a categorical representation of the grade (L/M/H), where H is greater than or equal to 80, M is between 65 (included) and 80 (excluded), and L otherwise
G. the total number of classes in the student's conceptual model
H. the total number of relationships in the student's conceptual model
I. the total number of classes in the expert's conceptual model
J. the total number of relationships in the expert's conceptual model
K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, missing (see tagging scheme below)
P. the researchers' judgement on how well the derivation process was explained by the student: well explained (a systematic mapping that can be easily reproduced), partially explained (vague indication of the mapping), or not present.
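The grade bands described for column F can be written as a small function. This is only a sketch of the stated thresholds; the function name and the use of `None` for an empty cell are assumptions for illustration, not part of the spreadsheet.

```python
from typing import Optional


def grade_category(points: Optional[float]) -> Optional[str]:
    """Map an exam grade (0-100) to L/M/H using the thresholds in the description.

    `None` stands in for an empty cell (subject did not take the first exam);
    that convention is an assumption made for this sketch.
    """
    if points is None:
        return None
    if points >= 80:   # H: greater than or equal to 80
        return "H"
    if points >= 65:   # M: 65 included, 80 excluded
        return "M"
    return "L"         # L: everything else


# Example with hypothetical grades, including both boundary values.
for grade in (64.9, 65, 79.9, 80, None):
    print(grade, grade_category(grade))
```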

Tagging scheme:
Aligned (AL) - A concept is represented as a class in both models, either with the same name or using synonyms or clearly linkable names;
Wrongly represented (WR) - A class in the domain expert model is incorrectly represented in the student model, either (i) via an attribute, method, or relationship rather than class, or (ii) using a generic term (e.g., "user" instead of "urban planner");
System-oriented (SO) - A class in CM-Stud that denotes a technical implementation aspect, e.g., access control. Classes that represent a legacy system or the system under design (portal, simulator) are legitimate;
Omitted (OM) - A class in CM-Expert that does not appear in any way in CM-Stud;
Missing (MI) - A class in CM-Stud that does not appear in any way in CM-Expert.

All the calculations and information provided in the following sheets originate from that raw data.

Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection, including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.

Sheet 3 (Size-Ratio):

The number of classes within the student model divided by the number of classes within the expert model is calculated (describing the size ratio). We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus in this study is on the number of classes. However, we also provide the size ratio for the number of relationships between student and expert model.

Sheet 4 (Overall):

Provides an overview of all subjects regarding the encountered situations, completeness, and correctness, respectively. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model. It is calculated by dividing the number of aligned concepts (AL) by the sum of the number of aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model. Completeness is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of the number of aligned concepts (AL), wrong representations (WR) and omitted concepts (OM). The overview is complemented with general diverging stacked bar charts that illustrate correctness and completeness.
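The two ratios defined above can be sketched directly from the tag counts. The function names, the handling of an all-zero denominator, and the example counts are assumptions for illustration; only the formulas come from the description.

```python
import math


def correctness(al: int, wr: int, so: int, om: int) -> float:
    """AL / (AL + OM + SO + WR), as stated in the sheet description."""
    denominator = al + om + so + wr
    # Returning NaN for an empty model is a choice made for this sketch.
    return al / denominator if denominator else math.nan


def completeness(al: int, wr: int, om: int) -> float:
    """(AL + WR) / (AL + WR + OM), as stated in the sheet description."""
    denominator = al + wr + om
    return (al + wr) / denominator if denominator else math.nan


# Hypothetical counts for one student model.
counts = {"al": 6, "wr": 2, "so": 1, "om": 1}
print("correctness :", round(correctness(**counts), 3))
print("completeness:", round(completeness(counts["al"], counts["wr"], counts["om"]), 3))
```

Note that missing concepts (MI) do not enter either ratio as described.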

For sheet 4 as well as for the following four sheets, diverging stacked bar charts are provided to visualize the effect of each of the independent and mediated variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (T-test) and effect size (Hedges' g) for both completeness and correctness are provided. Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html. The independent and moderating variables can be found as follows:

Sheet 5 (By-Notation):

Model correctness and model completeness are compared by notation - UC, US.

Sheet 6 (By-Case):

Model correctness and model completeness are compared by case - SIM, HOS, IFA.

Sheet 7 (By-Process):

Model correctness and model completeness are compared by how well the derivation process is explained - well explained, partially explained, not present.

Sheet 8 (By-Grade):

Model correctness and model completeness are compared by the exam grades, converted to categorical values High, Low, and Medium.
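The sheets report Hedges' g as computed with the online tool mentioned earlier. For readers who want to sanity-check such values locally, the following is a sketch using a common textbook definition: Cohen's d with a pooled standard deviation, multiplied by the approximate small-sample correction 1 - 3/(4(n1 + n2) - 9). Whether this matches the tool's exact variant is an assumption, and the function name and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Hedges' g for two independent groups (approximate correction factor)."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled variance from the two sample variances (n - 1 denominators).
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")
    cohens_d = (mean(group_a) - mean(group_b)) / sqrt(pooled_var)
    # Small-sample bias correction (approximation to the exact factor).
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)
    return cohens_d * correction


# Hypothetical completeness scores for two notations.
print(round(hedges_g([0.70, 0.80, 0.90, 0.85], [0.60, 0.75, 0.65, 0.70]), 3))
```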
Global Sales Presentation Tool Market Historical Impact Review 2025-2032
statsndata.org
excel, pdf
Updated May 2025
Cite
Stats N Data (2025). Global Sales Presentation Tool Market Historical Impact Review 2025-2032 [Dataset]. https://www.statsndata.org/report/sales-presentation-tool-market-103539
Explore at:
Available download formats: pdf, excel
Dataset updated
May 2025
Dataset authored and provided by
Stats N Data
License
https://www.statsndata.org/how-to-order
Area covered
Global
Description
The Sales Presentation Tool market has evolved into a crucial component of the sales process, enabling businesses to convey their value propositions effectively and engage potential customers. These tools facilitate seamless presentations by providing dynamic features such as slides, templates, and interactive eleme
Collection, Editing and Presentation of Data
paper.erudition.co.in
html
Updated Jul 13, 2025
Cite
Einetic (2025). Collection, Editing and Presentation of Data [Dataset]. https://paper.erudition.co.in/makaut/bachelor-of-business-administration/1/fundamentals-of-statistics
Explore at:
Available download formats: html
Dataset updated
Jul 13, 2025
Dataset authored and provided by
Einetic
License
https://paper.erudition.co.in/terms
Description
Question paper solutions for the chapter "Collection, Editing and Presentation of Data" of Fundamentals of Statistics, 1st Semester, Bachelor of Business Administration
Diversity and Ethnic Representation Data
bernardhiller.com
Updated Jun 18, 2025
Cite
(2025). Diversity and Ethnic Representation Data [Dataset]. https://bernardhiller.com/hollywood-acting-industry-statistics/
Dataset updated
Jun 18, 2025
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Representation statistics showing 29.2% of lead actors are people of color, representing an 8% increase
Amount of data created, consumed, and stored 2010-2023, with forecasts to...
statista.com
Updated Jun 30, 2025
Cite
Statista (2025). Amount of data created, consumed, and stored 2010-2023, with forecasts to 2028 [Dataset]. https://www.statista.com/statistics/871513/worldwide-data-created/
Dataset updated
Jun 30, 2025
Dataset authored and provided by
Statista http://statista.com/
Time period covered
May 2024
Area covered
Worldwide
Description
The total amount of data created, captured, copied, and consumed globally is forecast to increase rapidly, reaching *** zettabytes in 2024. Over the next five years up to 2028, global data creation is projected to grow to more than *** zettabytes. In 2020, the amount of data created and replicated reached a new high. The growth was higher than previously expected, caused by the increased demand due to the COVID-19 pandemic, as more people worked and learned from home and used home entertainment options more often.

Storage capacity also growing

Only a small percentage of this newly created data is kept though, as just * percent of the data produced and consumed in 2020 was saved and retained into 2021. In line with the strong growth of the data volume, the installed base of storage capacity is forecast to increase, growing at a compound annual growth rate of **** percent over the forecast period from 2020 to 2025. In 2020, the installed base of storage capacity reached *** zettabytes.
Iceland No of Registered Enterprises: Media Representation
ceicdata.com
Updated Mar 15, 2023
Cite
CEICdata.com (2023). Iceland No of Registered Enterprises: Media Representation [Dataset]. https://www.ceicdata.com/en/iceland/number-of-registered-enterprises-statistical-classification-of-economic-activities-revision-2/no-of-registered-enterprises-media-representation
Dataset updated
Mar 15, 2023
Dataset provided by
CEIC Data
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2008 - Dec 1, 2017
Area covered
Iceland
Variables measured
Enterprises Statistics
Description
Iceland Number of Registered Enterprises: Media Representation data was reported at 50.000 Unit in 2017. This stayed constant from the previous number of 50.000 Unit for 2016. Iceland Number of Registered Enterprises: Media Representation data is updated yearly, averaging 61.500 Unit (median) from Dec 2008 to 2017, with 10 observations. The data reached an all-time high of 83.000 Unit in 2008 and a record low of 50.000 Unit in 2017. Iceland Number of Registered Enterprises: Media Representation data remains in active status in CEIC and is reported by Statistics Iceland. The data is categorized under Global Database’s Iceland – Table IS.O013: Number of Registered Enterprises: Statistical Classification of Economic Activities Revision 2.
Data from: Constitutionalism and Democracy Dataset (CDD)
aura.american.edu
zip
Updated Feb 12, 2025
Cite
Todd Eisenstadt; Carl LeVan; Tofigh Maboudi (2025). Constitutionalism and Democracy Dataset (CDD) [Dataset]. http://doi.org/10.57912/23889456.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.57912/23889456.v1
Dataset updated
Feb 12, 2025
Dataset provided by
American University (Washington, D.C.)
Authors
Todd Eisenstadt; Carl LeVan; Tofigh Maboudi
License
https://rightsstatements.org/vocab/UND/1.0/
Description
The downloadable ZIP file contains the dataset in Stata 13 format.
QADO: An RDF Representation of Question Answering Datasets and their...
figshare.com
zip
Updated May 31, 2023
Cite
Andreas Both; Oliver Schmidtke; Aleksandr Perevalov (2023). QADO: An RDF Representation of Question Answering Datasets and their Analyses for Improving Reproducibility [Dataset]. http://doi.org/10.6084/m9.figshare.21750029.v3
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.21750029.v3
Dataset updated
May 31, 2023
Dataset provided by
Figshare http://figshare.com/
Authors
Andreas Both; Oliver Schmidtke; Aleksandr Perevalov
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
Measuring the quality of Question Answering (QA) systems is a crucial task to validate the results of novel approaches. However, there are already indicators of a reproducibility crisis, as many published systems have used outdated datasets or use subsets of QA benchmarks, making it hard to compare results. We identified the following core problems: there is no standard data format; instead, proprietary data representations are used by the different, partly inconsistent datasets. Additionally, the characteristics of datasets are typically not reflected by the dataset maintainers nor by the system publishers. To overcome these problems, we established an ontology, the Question Answering Dataset Ontology (QADO), for representing the QA datasets in RDF. The following datasets were mapped into the ontology: the QALD series, LC-QuAD series, RuBQ series, ComplexWebQuestions, and Mintaka. Hence, the integrated data in QADO covers widely used datasets and multilinguality. Additionally, we did intensive analyses of the datasets to identify their characteristics to make it easier for researchers to identify specific research questions and to select well-defined subsets. The provided resource will enable the research community to improve the quality of their research and support the reproducibility of experiments.

Here, the mapping results of the QADO process, the SPARQL queries for data analytics, and the archived analytics results file are provided.

Up-to-date statistics can be created automatically by the script provided at the corresponding QADO GitHub RDFizer repository.
REPORT ON THE INDONESIAN INDUSTRIAL STATISTICS DATA COLLECTION AND...
unido.org
Updated Jul 11, 2025
Cite
The citation is currently not available for this dataset.
Dataset updated
Jul 11, 2025
Dataset authored and provided by
UNIDO
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
1983
Area covered
Indonesia, Asia and the Pacific
Description
UNIDO pub. Expert report on industrial statistics and data collecting in Indonesia with special reference to small scale industry - covers (1) summaries and survey of statistical data for small scale, large and medium scale industries and cottage industries; (2) suggested creation of a central industrial register and carrying out a census of small scale industries; (3) a flow chart for the programming of the industrial inquiry and relevant questionnaire; survey method; training of enumerators. Recommendations, statistics.
Replication Data for: Government Rhetoric and the Representation of Public...
dataone.org
search.dataone.org
Updated Nov 8, 2023
Cite
Wratil, Christopher; Wäckerle, Jens; Proksch, Sven-Oliver (2023). Replication Data for: Government Rhetoric and the Representation of Public Opinion in International Negotiations [Dataset]. http://doi.org/10.7910/DVN/JCT3F7
Unique identifier
https://doi.org/10.7910/DVN/JCT3F7
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Wratil, Christopher; Wäckerle, Jens; Proksch, Sven-Oliver
Description
This Dataverse provides data, code, and instructions to replicate all analyses and descriptive statistics in Wratil, Christopher, Wäckerle, Jens and Proksch, Sven-Oliver (forthcoming): "Government Rhetoric and the Representation of Public Opinion in International Negotiations", in: American Political Science Review. Please consider the readme.txt first.
Treemap Representation of UK Statistical Geographies (December 2017)
data.wu.ac.at
html, pdf
Updated Jul 28, 2018
Cite
Office for National Statistics (2018). Treemap Representation of UK Statistical Geographies (December 2017) [Dataset]. https://data.wu.ac.at/schema/data_gov_uk/YWE2YjgwOTUtNGM5Ni00Y2RhLWFkNmUtZTZjZGUzMmZkNzU3
Explore at:
Available download formats: pdf, html
Dataset updated
Jul 28, 2018
Dataset provided by
Office for National Statistics http://www.ons.gov.uk/
License
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Area covered
United Kingdom
Description
(File Size - 611 KB) The 'Treemap Representation of Statistical Geographies' in the UK as at December 2017 shows the hierarchical representation of the different statistical geographies that ONS support, within their geography groups and geographical areas in a 'UK shaped treemap'.
Hierarchical Representation of UK Geographies (April 2021)
hub.arcgis.com
geoportal.statistics.gov.uk
Updated Apr 1, 2021
Cite
Office for National Statistics (2021). Hierarchical Representation of UK Geographies (April 2021) [Dataset]. https://hub.arcgis.com/documents/e8025cbedd1543a4809da7d826b5e52d
Explore at:
Dataset updated
Apr 1, 2021
Dataset authored and provided by
Office for National Statistics
License
https://www.ons.gov.uk/methodology/geography/licences
Description
The Hierarchical Representation of UK Statistical Geographies shows all UK statistical geographies within their geography groups and geographical areas, as at April 2021. The pdf is scaled to A3 paper size. (File Size - 4 MB) [Updated version published 12/04/2021 12:04PM]
U.S. unfair labor practice and representation cases filed to NLRB FY...
statista.com
Updated Jul 9, 2025
Cite
Statista (2025). U.S. unfair labor practice and representation cases filed to NLRB FY 2012-2023 [Dataset]. https://www.statista.com/statistics/1331399/unfair-labor-practice-cases-us/
Dataset updated
Jul 9, 2025
Dataset authored and provided by
Statista http://statista.com/
Area covered
United States
Description
In 2023, there were around ****** unfair labor and representation cases filed with the National Labor Relations Board in the United States. This is a significant increase from the previous year, when there were ****** cases filed.
Data from: Optimizing representations for integrative structural modeling...
data.niaid.nih.gov
Updated Apr 12, 2024
Cite
Viswanath, Shruthi (2024). Optimizing representations for integrative structural modeling using Bayesian model selection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10360718
Dataset updated
Apr 12, 2024
Dataset provided by
Arvindekar, Shreyas
Pathak, Aditi
Viswanath, Shruthi
Majila, Kartik
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Integrative structural modeling combines data from experiments, physical principles, statistics of previous structures, and prior models to obtain structures of macromolecular assemblies that are challenging to characterize experimentally. The choice of model representation is a key decision in integrative modeling, as it dictates the accuracy of scoring, efficiency of sampling, and resolution of analysis. But currently, the choice is usually made ad hoc, manually. Here, we have deposited NestOR (Nested Sampling for Optimizing Representation), a fully automated, statistically rigorous method based on Bayesian model selection to identify the optimal coarse-grained representation for a given integrative modeling setup. We have also deposited a benchmark of four macromolecular assemblies which was used to assess the performance of NestOR.
Representation in Children's Literature
kaggle.com
Updated Nov 24, 2020
Cite
BENJAMIN WRIGHT (2020). Representation in Children's Literature [Dataset]. https://www.kaggle.com/benjaminwright/representation/tasks
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Nov 24, 2020
Dataset provided by
Kaggle
Authors
BENJAMIN WRIGHT
Description
I wanted to find good data about representation and diversity in literature, which brought me to the following page of the Cooperative Children's Book Center (CCBC): https://ccbc.education.wisc.edu/literature-resources/ccbc-diversity-statistics/. The following is data on books by and about Black, Indigenous and People of Color published for children and teens compiled by the Cooperative Children’s Book Center, School of Education, University of Wisconsin-Madison.

There are two .csv files in the data set. One shows books received by the CCBC from US publishers per year that are authored and/or illustrated by a Black/African/Indigenous/Asian/Pacific Islander/Latinx person, and the other shows books received by the CCBC from US publishers per year that feature a BIPOC character. Further explanation can be found at the CCBC FAQ page.

Please note that for 2018 and 2019, the .csv files below represent Asian/Pacific Islander people as one column, which is how the CCBC published the data between 2002 and 2017. Also note that the attached data are not the entire data collected by the CCBC. The CCBC also collects books from international publishers, and since 2018, the CCBC has been publishing data about books by/about Arabs.

All data was collected by the CCBC. Please see the following page (with the complete data) about how to cite the data in your publications/blogs/notebooks: https://ccbc.education.wisc.edu/literature-resources/ccbc-diversity-statistics/books-by-about-poc-fnn/.

I am curious to see what sorts of visualizations people can make in exploratory analysis of this data! Also, can you predict how many BIPOC books the CCBC will receive in 2020? What happens when you study against US population data?

Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm: 327 scholarly articles cite this dataset.