Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data analysis can be accurate and reliable only if the underlying assumptions of the chosen statistical method are satisfied; violations of these assumptions can change the outcomes and conclusions of the analysis. In this study, we developed Smart Data Analysis V2 (SDA-V2), an interactive and user-friendly web application that assists users with limited statistical knowledge in data analysis; it can be freely accessed at https://jularatchumnaul.shinyapps.io/SDA-V2/. SDA-V2 automatically explores and visualizes data, examines the underlying assumptions associated with parametric tests, and selects an appropriate statistical method for the given data. Furthermore, SDA-V2 can assess the quality of research instruments and determine the minimum sample size required for a meaningful study. However, while SDA-V2 is a valuable tool for simplifying statistical analysis, it does not replace a fundamental understanding of statistical principles. Researchers are encouraged to combine their expertise with the software's capabilities to achieve the most accurate and credible results.
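The kind of decision logic SDA-V2 automates can be illustrated with a short sketch. This is a hedged example, not SDA-V2's actual rules: the chosen tests (Shapiro-Wilk, Levene) and the alpha threshold are illustrative assumptions.

```python
# Illustrative assumption checking for a two-sample comparison:
# test normality and variance homogeneity, then pick a test.
from scipy import stats

def choose_two_sample_test(x, y, alpha=0.05):
    # Fall back to a rank-based test if either sample looks non-normal
    normal = (stats.shapiro(x).pvalue > alpha and
              stats.shapiro(y).pvalue > alpha)
    if not normal:
        return "Mann-Whitney U", stats.mannwhitneyu(x, y)
    # Student's t if variances look equal, Welch's t otherwise
    equal_var = stats.levene(x, y).pvalue > alpha
    return ("Student t" if equal_var else "Welch t",
            stats.ttest_ind(x, y, equal_var=equal_var))
```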
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Abstract: We present annotated MATLAB scripts (and specific guidelines for their use) for Q-mode factor analysis, a constrained least squares multiple linear regression technique, and a total inversion protocol, based on the well-known approaches of Dymond (1981), Leinen and Pisias (1984), Kyte et al. (1993), and their predecessors. Although these techniques have been used by investigators for decades, their application has been neither consistent nor transparent, as the code has remained in-house or in formats not commonly used by many of today's researchers (e.g., FORTRAN). In addition to the annotated scripts and instructions for use, we include a sample data set so users can test their own manipulation of the scripts. Other Description: Pisias, N. G., R. W. Murray, and R. P. Scudder (2013), Multivariate statistical analysis and partitioning of sedimentary geochemical data sets: General principles and specific MATLAB scripts, Geochem. Geophys. Geosyst., 14, 4015–4020, doi:10.1002/ggge.20247.
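The published scripts are MATLAB; as a language-neutral illustration of the constrained least squares partitioning step they implement, the following Python sketch solves for non-negative mixing fractions of hypothetical end-members. It is a conceptual stand-in, not a port of the published code.

```python
# Partition a sample's composition among end-members by non-negative
# least squares, then normalize the fractions to sum to one.
import numpy as np
from scipy.optimize import nnls

def partition_sample(sample, end_members):
    """sample: (n_elements,) composition; end_members: (n_sources, n_elements)."""
    fractions, residual = nnls(end_members.T, sample)
    return fractions / fractions.sum(), residual
```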
Natural history museums often contain large collections of the same species and are therefore a resource for studying intraspecific variation. This module uses 172 images of rock pocket mouse skulls from the UTEP Biodiversity Collections to introduce students to collecting data from images and to principles of basic statistics. The module focuses on immersing students in the development of study design, analysis, discussion, and communication without overwhelming them. Students enter their data into a Google Sheets app that combines data entry, statistical analysis, and presentation in one place. The collaborative framework asks students to work together, share resources, and develop their own questions while learning the principles behind taking measurements from images of museum specimens.
When evaluating real-world treatment effects, analyses based on randomized clinical trials (RCTs) often suffer from generalizability bias due to differences in risk factors between trial participants and the real-world patient population. This lack of generalizability in RCT-only analyses can be addressed by leveraging observational studies with large sample sizes that are representative of the real-world population. A set of novel statistical methods, termed “genRCT”, for improving the generalizability of trial findings has been developed using calibration weighting, which enforces covariate balance between the RCT and the observational study. This paper reviews statistical methods for generalizing RCT findings by harnessing information from large observational studies that represent real-world patients. Specifically, we discuss the choice of data sources and variables needed to meet key theoretical assumptions and principles. We introduce and compare estimation methods for continuous, binary, and survival endpoints. We showcase the use of the R package genRCT through a case study that estimates the average treatment effect of adjuvant chemotherapy for stage IB non-small cell lung cancer patients represented by a large cancer registry.
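genRCT itself is an R package; the core idea of calibration weighting can be sketched in Python as entropy balancing, where RCT weights are chosen so that weighted covariate means match the observational sample. This is a conceptual illustration under that assumption, not genRCT's implementation.

```python
# Entropy-balancing calibration weights: w_i proportional to exp(X_i @ lam),
# with lam chosen so weighted RCT covariate means equal the target means.
import numpy as np
from scipy.optimize import minimize

def calibration_weights(X_rct, X_obs):
    target = X_obs.mean(axis=0)
    def dual(lam):  # convex dual of the entropy-balancing problem
        return np.log(np.exp(X_rct @ lam).sum()) - target @ lam
    lam = minimize(dual, np.zeros(X_rct.shape[1]), method="BFGS").x
    w = np.exp(X_rct @ lam)
    return w / w.sum()  # weights sum to one; weighted means match target
```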
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparative overview of the statistical packages available in moreThanANOVA and SDA-V2.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
These are condensed notes covering selected key points in data analysis and statistics. They were developed by James Kirchner for the course "Analysis of Environmental Data" at Berkeley in the 1990s and 2000s. They are not intended to be comprehensive, and thus are not a substitute for a good textbook or a good education! License: These notes are released by James Kirchner under a Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Background: Ethics is fundamental to all human interactions, yet understanding of its precise elements and their hierarchy remains superficial. Prior attempts to establish a ranking of ethical principles have yielded varied results, indicating the need for further exploration. This study aims to contribute to the understanding of ethics by exploring the relative importance of its elements through a cross-sectional analysis. Objective: The aim of this study was to determine the relative ranking of the elements of ethics, including justice, nonmaleficence, autonomy, beneficence, fidelity, veracity, public good, and loyalty, in order to establish a working hierarchy of the principles. Methods: Participants were tasked with evaluating ethical conflicts depicted in scenarios. Using a scale of 1 to 10, participants rated their preferred responses to each scenario, allowing for the comparison of different ethical principles. Statistical analysis, including independent samples t-tests, was employed to determine significant differences in preferences. Results: Analysis of participant responses revealed discernible trends in the hierarchy of ethical principles. Notably, justice, nonmaleficence, lawfulness, and autonomy emerged as top-tier principles, while beneficence and fidelity constituted second-tier elements. Public good, veracity, and loyalty comprised the third tier. These findings align with and extend existing literature, providing valuable insights into the relative importance of ethical principles. Conclusion: Our results indicate that justice, nonmaleficence, lawfulness, and autonomy were most important as first-tier principles. Following them, beneficence and fidelity were recognized as second-tier principles, with public good, veracity, and loyalty falling into the third tier.
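The comparison described in Methods reduces to independent samples t-tests on 1-10 ratings. A minimal sketch, with hypothetical ratings standing in for the study's data:

```python
# Compare mean preference ratings for two principles; Welch's t-test
# avoids assuming equal variances. The numbers are placeholders.
from scipy import stats

justice = [9, 8, 10, 7, 9, 8, 10, 9]   # hypothetical ratings
loyalty = [6, 7, 5, 8, 6, 7, 6, 5]
t, p = stats.ttest_ind(justice, loyalty, equal_var=False)
print(f"t = {t:.2f}, p = {p:.4f}")
```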
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of features in SDA-V2 and well-known statistical analysis software packages (Minitab and SPSS).
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The Iris dataset is a classic dataset in the field of machine learning and statistics. It's often used for demonstrating various data analysis, machine learning, and statistical techniques. Here are some key details about it:
Background
- Origin: The dataset was introduced by the British statistician and biologist Ronald Fisher in his 1936 paper titled "The use of multiple measurements in taxonomic problems."
- Purpose: Fisher developed the dataset as an example of linear discriminant analysis.

Data Composition
- Data Points: The dataset consists of 150 samples from three species of Iris flowers: Iris Setosa, Iris Versicolour, and Iris Virginica.
- Features: There are four features measured in centimeters for each sample:
  1. Sepal Length
  2. Sepal Width
  3. Petal Length
  4. Petal Width
- Classes: The dataset contains three classes, corresponding to the three species of Iris. Each class has 50 samples.

Usage
- Classification: The Iris dataset is widely used for classification tasks, especially to illustrate the principles of supervised machine learning algorithms.
- Testing Algorithms: It's often used to test out algorithms for linear regression, classification, and clustering due to its simplicity and small size.
- Educational Purpose: Because of its clarity and simplicity, it's frequently used in teaching data science and machine learning.

Characteristics
- Simple and Clean: The dataset is straightforward, with minimal preprocessing required, making it ideal for beginners.
- Well-Behaved Classes: The species are relatively well separated, though there's some overlap between Versicolour and Virginica.
- Multivariate Data: It involves understanding the relationship between multiple variables (the four features).

Applications
- Benchmarking: The Iris dataset serves as a benchmark for evaluating the performance of different algorithms.
- Visualization: It's great for practicing data visualization, especially for exploring techniques like scatter plots, box plots, and pair plots to understand feature relationships.
Despite its simplicity, the Iris dataset remains one of the most famous datasets in the world of data science and machine learning. It serves as an excellent starting point for anyone new to the field and remains a baseline for testing algorithms and teaching concepts.
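As a quick illustration of the classification use described above, the following scikit-learn snippet trains a simple classifier on the dataset (the 0.3 test split and logistic regression model are arbitrary choices):

```python
# Train and test a basic classifier on Iris with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=200).fit(X_tr, y_tr)
print(f"Test accuracy: {clf.score(X_te, y_te):.2f}")  # well-separated classes give high accuracy
```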
Concentrations of particulate organic carbon (POC) and dissolved organic carbon (DOC), which together comprise total organic carbon, were measured in a reconnaissance study at sampling sites in the Upper Klamath River, Lost River, and Klamath Straits Drain in 2013–16. In addition, data for total nitrogen and chlorophyll a were collected. Optical absorbance and fluorescence properties of dissolved organic matter (DOM), which contains DOC, were also analyzed. Excitation-emission matrices (EEMs) and full absorbance spectra were produced for each sample. The EEMs were compiled, and key data points and regions of the spectra were extracted for each site. Parallel factor analysis was used to decompose the optical fluorescence data into five key components across all samples.
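The parallel factor analysis (PARAFAC) step can be sketched with the tensorly library; the tensor dimensions and placeholder data below are illustrative, with the rank of 5 matching the five components reported:

```python
# Decompose a stack of EEMs (samples x excitation x emission) into
# rank-5 trilinear components, as PARAFAC does for fluorescence data.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

eems = np.random.rand(40, 60, 80)  # placeholder for the measured EEM stack
weights, factors = parafac(tl.tensor(eems), rank=5, normalize_factors=True)
sample_scores, excitation_loadings, emission_loadings = factors
```

In practice, EEM studies usually constrain the loadings to be non-negative (tensorly's non_negative_parafac); the unconstrained call above keeps the sketch minimal.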
Reference Id: SFR18/2011
Publication Type: Statistical First Release
Local Authority data: LA data
Region: England
Release Date: 02 August 2011
Coverage status: Provisional
Publication Status: Published
National curriculum tests provide a snapshot of attainment at the end of key stage 2. Teacher assessment is the teachers’ judgement of pupils’ performance in the whole subject over the whole key stage programme of study.
The SFR contains statistics on KS2 attainment of pupils in science that were previously published separately in the 2009 to 2010 academic year. Science tests are administered only to a nationally representative sample of pupils at the end of key stage 2. The sample is used to monitor national standards in science; it is not designed to produce regional or local level statistics.
The Qualifications and Curriculum Development Agency (QCDA) undertakes the delivery of statutory tests at the end of Key Stage 2 and the national collection of Key Stage 2 and 3 teacher assessment and test results.
These statistics will be revised in late 2011.
The percentages of pupils achieving Level 4 or above in the 2011 Key Stage 2 tests by subject are as follows:
The percentages of pupils achieving Level 4 or above in the 2011 Key Stage 2 science sampling tests are as follows:
The percentages of pupils achieving Level 5 in the 2011 Key Stage 2 tests by subject are as follows:
The percentages of pupils achieving Level 5 in the 2011 Key Stage 2 science sampling tests are as follows:
The percentages of pupils achieving Level 4 or above in the 2011 Key Stage 2 teacher assessments by subject are as follows:
The percentages of pupils achieving Level 5 or above in the 2011 Key Stage 3 teacher assessments by subject are as follows:
On 3 August 2011 a small issue was identified with some Key Stage 2 and 3 local authority figures, arising from a small number of schools converting to academy status on 1 July 2011. Amended tables and underlying data were published on 5 August 2011. At Key Stage 2, no local authority figure was affected by more than one percentage point.
Adam Hatton - Attainment Statistics Team
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Seven Points by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Seven Points. The dataset can be used to understand the population distribution of Seven Points by gender and age; for example, it can identify the largest age group for both men and women, and show how the male-to-female ratio changes from birth to the most senior age group.
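A hypothetical starter for the use case just described, computing the male-to-female ratio per age group with pandas; the file and column names are assumptions based on the description, not the dataset's verified schema:

```python
# Rank age groups by male-to-female ratio (hypothetical column names).
import pandas as pd

df = pd.read_csv("seven-points-population-by-age-and-gender.csv")
df["male_to_female"] = df["male_population"] / df["female_population"]
print(df.sort_values("male_to_female", ascending=False)[["age_group", "male_to_female"]])
```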
Key observations
Largest age group (population): Male: 0-4 years (106); Female: 75-79 years (66). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Age groups:
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender, and respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com about the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Seven Points Population by Gender. You can refer to it here.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
The "Segmentation and Key Points of Human Body Dataset" is designed for the apparel and visual entertainment sectors, featuring a collection of internet-collected images with resolutions ranging from 1280 x 960 to 5184 x 3456 pixels. The dataset is comprehensive, including instance and semantic segmentation of 27 categories of body parts along with 24 key-point annotations, providing detailed data for human body analysis and applications.
If you are interested in the full version of the dataset, featuring 6.6k annotated images, please visit our website maadaa.ai and leave a request.
| Dataset ID | MD-Image-053 |
|---|---|
| Dataset Name | Segmentation and Key Points of Human Body Dataset |
| Data Type | Image |
| Volume | About 6.6k |
| Data Collection | Internet-collected images; resolution ranges from 1280 x 960 to 5184 x 3456 |
| Annotation | Semantic Segmentation, Instance Segmentation |
| Annotation Notes | The dataset includes 27 categories of body parts and 24 key points. |
| Application Scenarios | Apparel, Visual Entertainment |
Since 2015, maadaa.ai has been dedicated to delivering specialized AI data services. Our key offerings include:
Data Collection: Comprehensive data gathering tailored to your needs.
Data Annotation: High-quality annotation services for precise data labeling.
Off-the-Shelf Datasets: Ready-to-use datasets to accelerate your projects.
Annotation Platform: Maid-X, our platform built for efficient data annotation.
We cater to various sectors, including automotive, healthcare, retail, and more, ensuring our clients receive the best data solutions for their AI initiatives.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This research aims to develop a principle-based framework for audit analytics (AA) implementation that addresses the challenges of AA implementation and acknowledges its socio-technical complexities and the interdependencies among challenges. The research relies on mixed methods to capture the phenomena from the research participants through various approaches, i.e., MICMAC-ISM, a case study, and interviews with practitioners, with literature exploration as the starting point. The raw data collected consist of multimedia data (audio and video recordings of interviews and a focus group discussion), which were then transformed into text transcripts, complemented with soft copies of documents from the case study object.
The published data in this dataset consist of summarized or analyzed data, as the raw data (including transcripts) are not allowed to be published according to the decision of the Human Research Ethics Committee pertinent to this research (Approval #1979, 14 February 2022). The published files are text files representing the summarized/analyzed raw data, serving as online appendices to the thesis.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Instructions
- Examine the data: Start by thoroughly examining the dataset within the Claims Data resource. Focus on key variables such as claim dates, types of claims, amounts claimed, and additional details about the incidents.
- Manipulate the data: Derive the missing values in columns F, O, P, and Q. Use hints if needed. This step emphasizes data manipulation, a key component of account pricing analysis.
- Identify patterns and anomalies: Conduct EDA using the data in the Claims Data resource. Identify patterns, trends, and anomalies. Utilize visual tools such as histograms, scatter plots, and bar charts within Excel to help you visualize and interpret the data.

2. Apply actuarial principles to the data

- Risk assessment: Use the actuarial principles you learned in Task 1 to assess the risks associated with the claims data. Calculate key metrics such as claim frequency, severity, and loss ratios based on the data provided (see the sketch after this list).
- Calculate premiums: Develop a pricing model using experience-based rating. This involves adjusting historical data from the Claims Data resource to project future claims costs, considering factors such as inflation and changes in exposure.

3. Develop comprehensive reports in Excel

- Analysis report: Compile your findings. Organize your EDA into a well-structured section within the Excel workbook. This section should include a detailed evaluation of the Marine Liability insurance claims data, visualizations of key findings, and a commentary on observed trends and anomalies.
- Commentary on risks and uncertainties: Provide a clear commentary on the risks and uncertainties associated with your assessment. Discuss how different scenarios could impact the pricing model and the potential financial implications for Oceanic Shipping Co.
- Pricing calculation: Perform a numbers-based premium calculation. Use the Claims Data resource to calculate the appropriate premiums for the Marine Liability insurance policy. Apply actuarial principles such as loss frequency, loss severity, and pure premium calculation, and adjust for expenses and profit margins.
- Sensitivity analysis: Include a sensitivity analysis within the Excel workbook to assess how changes in key assumptions (e.g., an increase in loss severity) could impact the final premium.
- Document your calculations: Ensure your premium calculation section in Excel clearly documents your methodology, assumptions, and final premium recommendations. Discuss the potential risks and uncertainties in your pricing model, including any external factors that could impact future claims.
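The key metrics named in the risk-assessment step reduce to a few ratios. A minimal sketch with hypothetical figures (the brief's actual data lives in the Excel Claims Data resource):

```python
# Claim frequency, severity, loss ratio, and pure premium from
# hypothetical aggregates; replace with values from the Claims Data.
n_claims, n_exposures = 120, 4_000
total_losses, total_premium = 1_800_000, 3_000_000

frequency = n_claims / n_exposures          # claims per unit of exposure
severity = total_losses / n_claims          # average cost per claim
loss_ratio = total_losses / total_premium   # losses as a share of premium
pure_premium = frequency * severity         # expected loss per exposure unit

print(f"frequency={frequency:.3f}, severity={severity:,.0f}, "
      f"loss ratio={loss_ratio:.2f}, pure premium={pure_premium:,.0f}")
```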
Reference Id: SFR24/2010
Publication Type: Statistical First Release
Publication data: Underlying Statistical data
Region: England
Release Date: 10 August 2010
Coverage status: Final
Publication Status: Published
The figures in this SFR are produced from data provided to the department by the Qualifications and Curriculum Development Agency (QCDA) on 13 July 2010.
National curriculum assessment provides a measurement of achievement against the precise attainment targets of the national curriculum rather than any generalised concept of ability in any of the subject areas. The national curriculum standards have been designed so that most pupils progress by approximately one level every two years, which means that by the end of KS2 pupils are expected to achieve Level 4.
Details about the methodology and design of the sample can be found on the National Archives QCDA website: http://webarchive.nationalarchives.gov.uk/20110810144333/http://qcda.gov.uk/assessment/85.aspx
The estimated percentages of children achieving Level 4 or above based on the 2010 Key Stage 2 science sample tests are as follows:
Based on the confidence intervals given, it is not possible to conclude that girls perform significantly better than boys.
When the whole Key Stage 2 cohort took tests in 2009, the overall percentage of pupils achieving Level 4 or above was 88%.
Comparisons with previous years are difficult, as earlier tests were taken under a regime in which results fed into the school accountability framework. These sample tests play no part in school accountability.
The estimated percentages of children achieving Level 5 based on the 2010 Key Stage 2 science sample tests are as follows:
Based on the confidence intervals given, it is not possible to conclude that boys perform significantly better than girls at Level 5.
When the whole Key Stage 2 cohort took tests in 2009 the overall percentage of pupils achieving Level 5 was 43%.
The underlying data for this publication was made available on 29 September 2010.
Adam Hatton - Attainment Statistics Team
CC0 1.0 Universal: https://spdx.org/licenses/CC0-1.0.html
This study presents a framework to automatically analyze head motion in birds from videos of natural behaviors. The process involves detecting birds, identifying key points on their heads, and tracking changes in their positions over time. Bird detection and key point extraction were trained on publicly available datasets featuring videos and images of diverse bird species in uncontrolled settings. Initial challenges with complex video backgrounds causing misidentifications and inaccurate key points were addressed through validation, refinement, filtering, and smoothing. Head angular velocities and rotation frequencies were computed from the refined key points. The algorithm performed well at moderate speeds but was limited by the 30 Hz frame rate of most videos, which constrained measurable angular velocities and frequencies and caused motion blur, affecting key point detection. Our findings suggest that the framework may provide plausible estimates of head motion but also emphasize the importance of high frame rate videos in future research, including extensive comparisons against ground truth data, to fully characterize bird head movements. Importantly, this work is a foundational effort to understand the evolutionary drivers of the semicircular canals, the biosensor that monitors head rotations, for both extinct and extant tetrapods.
Methods: For the development of the bird head pose estimation (BHPE) module, a new 2D BHPE annotated dataset is proposed, entitled BirdGaze, which includes images from four prominent sources: the Animal Kingdom, NABirds, Birdsnap, and eBird. These datasets represent the largest publicly available collections and are widely recognized in the literature for their significant role in avian research. Their extensive morphological diversity is crucial for this study. Besides the bird images, the proposed BirdGaze dataset includes a set of annotations, notably:
- Center of the bounding box containing the bird body, described by its 2D coordinates;
- Scale factor, defining a multiplying factor to apply to the bird bounding box so it fits a fixed rectangle size, which is used as input to the adopted key point extraction model;
- Coordinates of the four selected 2D key points: top of head, tip of beak, left eye, right eye.
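The angular-velocity computation described above can be sketched as follows; the use of the head-to-beak vector as the head direction and the variable names are illustrative assumptions, with the 30 Hz rate taken from the text:

```python
# Estimate head angular velocity (rad/s) from two head key points
# tracked across frames: angle of the head-to-beak vector, unwrapped
# over time and differentiated at the video frame rate.
import numpy as np

def head_angular_velocity(head_xy, beak_xy, fps=30.0):
    """head_xy, beak_xy: (n_frames, 2) arrays of key point coordinates."""
    direction = beak_xy - head_xy
    angle = np.unwrap(np.arctan2(direction[:, 1], direction[:, 0]))
    return np.gradient(angle) * fps
```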
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
Explore the dynamic world of Adidas US Sales with this comprehensive dataset. The dataset encapsulates detailed information on sales transactions, retailer details, product categories, and more. Each entry includes critical metrics such as total sales, operating profit, units sold, and various operational aspects.
Key Points:
- Rich sales data spanning 2020 to 2021.
- Granular details on product types, retailers, and sales methods.
- Insights into regional performance, pricing strategies, and operating margins.
- Ideal for exploratory data analysis, predictive modeling, and business strategy formulation.

Dataset columns I am using for analysis:
- Retailer
- Retailer ID
- Invoice Date
- Region
- State
- City
- Product
- Price per Unit
- Units Sold
- Total Sales
- Operating Profit
- Operating Margin
- Sales Method
- Year

Use this dataset to derive actionable insights, refine business strategies, and elevate your data analysis skills. Dive into the world of Adidas US Sales and uncover the stories hidden in the numbers.
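A hypothetical starting point for the exploratory analysis suggested above; the file name is an assumption, while the column labels follow the list just given:

```python
# Summarize total sales and operating profit by region with pandas.
import pandas as pd

df = pd.read_csv("adidas_us_sales.csv")  # hypothetical file name
by_region = (df.groupby("Region")[["Total Sales", "Operating Profit"]]
               .sum()
               .sort_values("Total Sales", ascending=False))
print(by_region)
```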
This publication presents information on the number and pass rates of driving and riding tests conducted in Great Britain up to 31 March 2012 (covering the whole of the 2011/12 financial year).
The statistics are derived from data held by the Driver and Vehicle Standards Agency (DVSA), which administers the driving test and training schemes in Great Britain.
A supplementary bulletin will be released in July. This will contain more detailed tables providing breakdowns of test passes by age of candidate and number of test attempts.
Information on Driver and Rider Test and Instructor statistics, including the pre-release access list, and related technical documentation can be found here.
The first part of this report discusses the overall statistical planning, coordination, and design for several tar sand wastewater treatment projects contracted by the Laramie Energy Technology Center (LETC) of the Department of Energy. A general discussion of the benefits of consistent statistical design and analysis for data-oriented projects is included, with recommendations for implementation. A detailed outline of the principles of general linear models design is followed by an introduction to recent developments in general linear models by ranks (GLMR) analysis and a comparison to standard analysis using Gaussian or normal theory (GLMN). A listing of routines contained in the VPI Nonparametric Statistics Package (NPSP), installed on the Cyber computer system at the University of Wyoming, is included. Part 2 describes in detail the design and analysis for treatments by Gas Flotation, Foam Separation, Coagulation, and Ozonation, with comparisons among the first three methods. Rank methods are used for most analyses, and several detailed examples are included. For optimization studies, the powerful tools of response surface analysis (RSA) are employed, and several sections discuss the benefits of RSA. All four treatment methods proved effective for removal of TOC and suspended solids from the wastewater. Because the processes and equipment designs were new, optimum removals were not achieved in these initial studies, and the reasons for that are discussed. Pollutant levels were nevertheless reduced to levels appropriate for recycling within the process and for such reuses as steam generation, according to the DOE/LETC project officer. 12 refs., 8 figs., 21 tabs.