100+ datasets found

Random Test Data
figshare.com
bin
Updated Jan 19, 2016
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Stefan Proell (2016). Random Test Data [Dataset]. http://doi.org/10.6084/m9.figshare.1096255.v2
Explore at:
binAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.1096255.v2
Dataset updated
Jan 19, 2016
Dataset provided by
Figsharehttp://figshare.com/
figshare
Authors
Stefan Proell
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a test
18 excel spreadsheets by species and year giving reproduction and growth...
catalog.data.gov
data.wu.ac.at
Updated Aug 17, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. EPA Office of Research and Development (ORD) (2024). 18 excel spreadsheets by species and year giving reproduction and growth data. One excel spreadsheet of herbicide treatment chemistry. [Dataset]. https://catalog.data.gov/dataset/18-excel-spreadsheets-by-species-and-year-giving-reproduction-and-growth-data-one-excel-sp
Explore at:
Dataset updated
Aug 17, 2024
Dataset provided by
United States Environmental Protection Agencyhttp://www.epa.gov/
Description
Excel spreadsheets by species (4 letter code is abbreviation for genus and species used in study, year 2010 or 2011 is year data collected, SH indicates data for Science Hub, date is date of file preparation). The data in a file are described in a read me file which is the first worksheet in each file. Each row in a species spreadsheet is for one plot (plant). The data themselves are in the data worksheet. One file includes a read me description of the column in the date set for chemical analysis. In this file one row is an herbicide treatment and sample for chemical analysis (if taken). This dataset is associated with the following publication: Olszyk , D., T. Pfleeger, T. Shiroyama, M. Blakely-Smith, E. Lee , and M. Plocher. Plant reproduction is altered by simulated herbicide drift toconstructed plant communities. ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY. Society of Environmental Toxicology and Chemistry, Pensacola, FL, USA, 36(10): 2799-2813, (2017).
Datasheet.Excel.Pretest-posttest.Statistical t-test results.xlsx
figshare.com
xlsx
Updated Nov 13, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Theophile Shyiramunda (2023). Datasheet.Excel.Pretest-posttest.Statistical t-test results.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.24486685.v1
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.24486685.v1
Dataset updated
Nov 13, 2023
Dataset provided by
Figsharehttp://figshare.com/
Authors
Theophile Shyiramunda
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data set is for the research entitled"Group Discussions in Secondary School Chemistry: Unveiling Pedagogical Alchemy for Academic Advancement".
Students Test Data
kaggle.com
Updated Sep 12, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
ATHARV BHARASKAR (2023). Students Test Data [Dataset]. https://www.kaggle.com/datasets/atharvbharaskar/students-test-data
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 12, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
ATHARV BHARASKAR
License
ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Description
Dataset Overview: This dataset pertains to the examination results of students who participated in a series of academic assessments at a fictitious educational institution named "University of Exampleville." The assessments were administered across various courses and academic levels, with a focus on evaluating students' performance in general management and domain-specific topics.

Columns: The dataset comprises 12 columns, each representing specific attributes and performance indicators of the students. These columns encompass information such as the students' names (which have been anonymized), their respective universities, academic program names (including BBA and MBA), specializations, the semester of the assessment, the type of examination domain (general management or domain-specific), general management scores (out of 50), domain-specific scores (out of 50), total scores (out of 100), student ranks, and percentiles.

Data Collection: The examination data was collected during a standardized assessment process conducted by the University of Exampleville. The exams were designed to assess students' knowledge and skills in general management and their chosen domain-specific subjects. It involved students from both BBA and MBA programs who were in their final year of study.

Data Format: The dataset is available in a structured format, typically as a CSV file. Each row represents a unique student's performance in the examination, while columns contain specific information about their results and academic details.

Data Usage: This dataset is valuable for analyzing and gaining insights into the academic performance of students pursuing BBA and MBA degrees. It can be used for various purposes, including statistical analysis, performance trend identification, program assessment, and comparison of scores across domains and specializations. Furthermore, it can be employed in predictive modeling or decision-making related to curriculum development and student support.

Data Quality: The dataset has undergone preprocessing and anonymization to protect the privacy of individual students. Nevertheless, it is essential to use the data responsibly and in compliance with relevant data protection regulations when conducting any analysis or research.

Data Format: The exam data is typically provided in a structured format, commonly as a CSV (Comma-Separated Values) file. Each row in the dataset represents a unique student's examination performance, and each column contains specific attributes and scores related to the examination. The CSV format allows for easy import and analysis using various data analysis tools and programming languages like Python, R, or spreadsheet software like Microsoft Excel.

Here's a column-wise description of the dataset:

Name OF THE STUDENT: The full name of the student who took the exam. (Anonymized)

UNIVERSITY: The university where the student is enrolled.

PROGRAM NAME: The name of the academic program in which the student is enrolled (BBA or MBA).

Specialization: If applicable, the specific area of specialization or major that the student has chosen within their program.

Semester: The semester or academic term in which the student took the exam.

Domain: Indicates whether the exam was divided into two parts: general management and domain-specific.

GENERAL MANAGEMENT SCORE (OUT of 50): The score obtained by the student in the general management part of the exam, out of a maximum possible score of 50.

Domain-Specific Score (Out of 50): The score obtained by the student in the domain-specific part of the exam, also out of a maximum possible score of 50.

TOTAL SCORE (OUT of 100): The total score obtained by adding the scores from the general management and domain-specific parts, out of a maximum possible score of 100.
m
Raw data outputs 1-18
bridges.monash.edu
researchdata.edu.au
xlsx
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Abbas Salavaty Hosein Abadi; Sara Alaei; Mirana Ramialison; Peter Currie (2023). Raw data outputs 1-18 [Dataset]. http://doi.org/10.26180/21259491.v1
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.26180/21259491.v1
Dataset updated
May 30, 2023
Dataset provided by
Monash University
Authors
Abbas Salavaty Hosein Abadi; Sara Alaei; Mirana Ramialison; Peter Currie
Description
Raw data outputs 1-18 Raw data output 1. Differentially expressed genes in AML CSCs compared with GTCs as well as in TCGA AML cancer samples compared with normal ones. This data was generated based on the results of AML microarray and TCGA data analysis. Raw data output 2. Commonly and uniquely differentially expressed genes in AML CSC/GTC microarray and TCGA bulk RNA-seq datasets. This data was generated based on the results of AML microarray and TCGA data analysis. Raw data output 3. Common differentially expressed genes between training and test set samples the microarray dataset. This data was generated based on the results of AML microarray data analysis. Raw data output 4. Detailed information on the samples of the breast cancer microarray dataset (GSE52327) used in this study. Raw data output 5. Differentially expressed genes in breast CSCs compared with GTCs as well as in TCGA BRCA cancer samples compared with normal ones. Raw data output 6. Commonly and uniquely differentially expressed genes in breast cancer CSC/GTC microarray and TCGA BRCA bulk RNA-seq datasets. This data was generated based on the results of breast cancer microarray and TCGA BRCA data analysis. CSC, and GTC are abbreviations of cancer stem cell, and general tumor cell, respectively. Raw data output 7. Differential and common co-expression and protein-protein interaction of genes between CSC and GTC samples. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis. CSC, and GTC are abbreviations of cancer stem cell, and general tumor cell, respectively. Raw data output 8. Differentially expressed genes between AML dormant and active CSCs. This data was generated based on the results of AML scRNA-seq data analysis. Raw data output 9. Uniquely expressed genes in dormant or active AML CSCs. This data was generated based on the results of AML scRNA-seq data analysis. Raw data output 10. Intersections between the targeting transcription factors of AML key CSC genes and differentially expressed genes between AML CSCs vs GTCs and between dormant and active AML CSCs or the uniquely expressed genes in either class of CSCs. Raw data output 11. Targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 12. CSC-specific targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 13. The protein-protein interactions between AML key CSC genes with themselves and their targeting transcription factors. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis. Raw data output 14. The previously confirmed associations of genes having the highest targeting desirableness and CSC-specific targeting desirableness scores with AML or other cancers’ (stem) cells as well as hematopoietic stem cells. These data were generated based on a PubMed database-based literature mining. Raw data output 15. Drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 16. CSC-specific drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section. Raw data output 17. Candidate drugs for experimental validation. These drugs were selected based on their respective (CSC-specific) drug scores. CSC is the abbreviation of cancer stem cell. Raw data output 18. Detailed information on the samples of the AML microarray dataset GSE30375 used in this study.
Test Excel Spreadsheet - Datasets - CKAN
ckan.earlham.ac.uk
Updated Jul 25, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
ckan.earlham.ac.uk (2019). Test Excel Spreadsheet - Datasets - CKAN [Dataset]. https://ckan.earlham.ac.uk/dataset/6236464b-de7e-4c4c-b261-d54f3dd8a76a
Explore at:
Dataset updated
Jul 25, 2019
Dataset provided by
CKANhttps://ckan.org/
Description
1000 lines of test data
a
Excel Import Test
data.acgov.org
Updated Dec 17, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
AlamedaCounty.CA.US (2021). Excel Import Test [Dataset]. https://data.acgov.org/maps/91b061ea25214a49a69bc6d56a808099
Explore at:
Dataset updated
Dec 17, 2021
Dataset provided by
AlamedaCounty.CA.US
Description
This is the description
f
Data from: Excel Templates: A Helpful Tool for Teaching Statistics
tandf.figshare.com
zip
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alejandro Quintela-del-Río; Mario Francisco-Fernández (2023). Excel Templates: A Helpful Tool for Teaching Statistics [Dataset]. http://doi.org/10.6084/m9.figshare.3408052.v2
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.3408052.v2
Dataset updated
May 30, 2023
Dataset provided by
Taylor & Francis
Authors
Alejandro Quintela-del-Río; Mario Francisco-Fernández
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This article describes a free, open-source collection of templates for the popular Excel (2013, and later versions) spreadsheet program. These templates are spreadsheet files that allow easy and intuitive learning and the implementation of practical examples concerning descriptive statistics, random variables, confidence intervals, and hypothesis testing. Although they are designed to be used with Excel, they can also be employed with other free spreadsheet programs (changing some particular formulas). Moreover, we exploit some possibilities of the ActiveX controls of the Excel Developer Menu to perform interactive Gaussian density charts. Finally, it is important to note that they can be often embedded in a web page, so it is not necessary to employ Excel software for their use. These templates have been designed as a useful tool to teach basic statistics and to carry out data analysis even when the students are not familiar with Excel. Additionally, they can be used as a complement to other analytical software packages. They aim to assist students in learning statistics, within an intuitive working environment. Supplementary materials with the Excel templates are available online.
B
Data Cleaning Sample
borealisdata.ca
Updated Jul 13, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Rong Luo (2023). Data Cleaning Sample [Dataset]. http://doi.org/10.5683/SP3/ZCN177
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.5683/SP3/ZCN177
Dataset updated
Jul 13, 2023
Dataset provided by
Borealis
Authors
Rong Luo
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Sample data for exercises in Further Adventures in Data Cleaning.
l
Excel dataset for Chi-squared test, risk ratio and odds ratio
repository.lincoln.ac.uk
xls
Updated Mar 22, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
John Flynn (2024). Excel dataset for Chi-squared test, risk ratio and odds ratio [Dataset]. http://doi.org/10.24385/lincoln.24325441.v1
Explore at:
xlsAvailable download formats
Unique identifier
https://doi.org/10.24385/lincoln.24325441.v1
Dataset updated
Mar 22, 2024
Dataset provided by
University of Lincoln
Authors
John Flynn
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Excel dataset to demonstrate the Chi-squared test. To be used in conjunction with Word file of same title.
m
CBC Dataset
data.mendeley.com
Updated Nov 22, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
safa sami (2022). CBC Dataset [Dataset]. http://doi.org/10.17632/28s2bhdjfd.1
Explore at:
Unique identifier
https://doi.org/10.17632/28s2bhdjfd.1
Dataset updated
Nov 22, 2022
Authors
safa sami
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
About Dataset Safa S. Abdul-Jabbar, Alaa k. Farhan

Context This is the first Dataset for various ordinary patients in Iraq. The Dataset provides the patients’ Cell Blood Count test information that can be used to create a Hematology diagnosis/prediction system. Also, this Data was collected in 2022 from Al-Zahraa Al-Ahly Hospital. These data can be cleaned & analyzed using any programming language because it is provided in an excel file that can be accessed and manipulated easily. The user just needs to understand how rows and columns are arranged because the data was collected as images(CBC images) from the laboratories and then stored the extracted data in an excel file.  Content This Dataset contains 500 rows. For each row (patient information), there are 21 columns containing CBC test features that can be described as follows:

ID: Patients Identifier

WBC: White Blood Cell, Normal Ranges: 4.0 to 10.0, Unit: 10^9/L.

LYMp: Lymphocytes percentage, which is a type of white blood cell, Normal Ranges: 20.0 to 40.0, Unit: %

MIDp: Indicates the percentage combined value of the other types of white blood cells not classified as lymphocytes or granulocytes, Normal Ranges: 1.0 to 15.0, Unit: %

NEUTp: Neutrophils are a type of white blood cell (leukocytes); neutrophils percentage, Normal Ranges: 50.0 to 70.0, Unit: %

LYMn: Lymphocytes number are a type of white blood cell, Normal Ranges: 0.6 to 4.1, Unit: 10^9/L.

MIDn: Indicates the combined number of other white blood cells not classified as lymphocytes or granulocytes, Normal Ranges: 0.1 to 1.8, Unit: 10^9/L.

NEUTn: Neutrophils Number, Normal Ranges: 2.0 to 7.8, Unit: 10^9/L.

RBC: Red Blood Cell, Normal Ranges: 3.50 to 5.50, Unit: 10^12/L

HGB: Hemoglobin, Normal Ranges: 11.0 to 16.0, Unit: g/dL

HCT: Hematocrit is the proportion, by volume, of the Blood that consists of red blood cells, Normal Ranges: 36.0 to 48.0, Unit: %

MCV: Mean Corpuscular Volume, Normal Ranges: 80.0 to 99.0, Unit: fL

MCH: Mean Corpuscular Hemoglobin is the average amount of haemoglobin in the average red cell, Normal Ranges: 26.0 to 32.0, Unit: pg

MCHC: Mean Corpuscular Hemoglobin Concentration, Normal Ranges: 32.0 to 36.0, Unit: g/dL

RDWSD: Red Blood Cell Distribution Width, Normal Ranges: 37.0 to 54.0, Unit: fL

RDWCV: Red blood cell distribution width, Normal Ranges: 11.5 to 14.5, Unit: %

PLT: Platelet Count, Normal Ranges: 100 to 400, Unit: 10^9/L

MPV: Mean Platelet Volume, Normal Ranges: 7.4 to 10.4, Unit: fL

PDW: Red Cell Distribution Width, Normal Ranges: 10.0 to 17.0, Unit: %

PCT: The level of Procalcitonin in the Blood, Normal Ranges: 0.10 to 0.28, Unit: %

PLCR: Platelet Large Cell Ratio, Normal Ranges: 13.0 to 43.0, Unit: %

Acknowledgements We thank the entire Al-Zahraa Al-Ahly Hospital Hospital team, especially the hospital manager, for cooperating with us in collecting this data while maintaining patients' confidentiality.
f
Independent Data Aggregation, Quality Control and Visualization of...
arizona.figshare.com
png
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Chun Ly; Jill McCleary; Cheryl Knott; Santiago Castiello-Gutiérrez (2023). Independent Data Aggregation, Quality Control and Visualization of University of Arizona COVID-19 Re-Entry Testing Data [Dataset]. http://doi.org/10.25422/azu.data.12966581.v2
Explore at:
pngAvailable download formats
Unique identifier
https://doi.org/10.25422/azu.data.12966581.v2
Dataset updated
May 30, 2023
Dataset provided by
University of Arizona Research Data Repository
Authors
Chun Ly; Jill McCleary; Cheryl Knott; Santiago Castiello-Gutiérrez
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
AbstractThe dataset provided here contains the efforts of independent data aggregation, quality control, and visualization of the University of Arizona (UofA) COVID-19 testing programs for the 2019 novel Coronavirus pandemic. The dataset is provided in the form of machine-readable tables in comma-separated value (.csv) and Microsoft Excel (.xlsx) formats.Additional InformationAs part of the UofA response to the 2019-20 Coronavirus pandemic, testing was conducted on students, staff, and faculty prior to start of the academic year and throughout the school year. These testings were done at the UofA Campus Health Center and through their instance program called "Test All Test Smart" (TATS). These tests identify active cases of SARS-nCoV-2 infections using the reverse transcription polymerase chain reaction (RT-PCR) test and the Antigen test. Because the Antigen test provided more rapid diagnosis, it was greatly used three weeks prior to the start of the Fall semester and throughout the academic year.As these tests were occurring, results were provided on the COVID-19 websites. First, beginning in early March, the Campus Health Alerts website reported the total number of positive cases. Later, numbers were provided for the total number of tests (March 12 and thereafter). According to the website, these numbers were updated daily for positive cases and weekly for total tests. These numbers were reported until early September where they were then included in the reporting for the TATS program.For the TATS program, numbers were provided through the UofA COVID-19 Update website. Initially on August 21, the numbers provided were the total number (July 31 and thereafter) of tests and positive cases. Later (August 25), additional information was provided where both PCR and Antigen testings were available. Here, the daily numbers were also included. On September 3, this website then provided both the Campus Health and TATS data. Here, PCR and Antigen were combined and referred to as "Total", and daily and cumulative numbers were provided.At this time, no official data dashboard was available until September 16, and aside from the information provided on these websites, the full dataset was not made publicly available. As such, the authors of this dataset independently aggregated data from multiple sources. These data were made publicly available through a Google Sheet with graphical illustration provided through the spreadsheet and on social media. The goal of providing the data and illustrations publicly was to provide factual information and to understand the infection rate of SARS-nCoV-2 in the UofA community.Because of differences in reported data between Campus Health and the TATS program, the dataset provides Campus Health numbers on September 3 and thereafter. TATS numbers are provided beginning on August 14, 2020.Description of Dataset ContentThe following terms are used in describing the dataset.1. "Report Date" is the date and time in which the website was updated to reflect the new numbers2. "Test Date" is to the date of testing/sample collection3. "Total" is the combination of Campus Health and TATS numbers4. "Daily" is to the new data associated with the Test Date5. "To Date (07/31--)" provides the cumulative numbers from 07/31 and thereafter6. "Sources" provides the source of information. The number prior to the colon refers to the number of sources. Here, "UACU" refers to the UA COVID-19 Update page, and "UARB" refers to the UA Weekly Re-Entry Briefing. "SS" and "WBM" refers to screenshot (manually acquired) and "Wayback Machine" (see Reference section for links) with initials provided to indicate which author recorded the values. These screenshots are available in the records.zip file.The dataset is distinguished where available by the testing program and the methods of testing. Where data are not available, calculations are made to fill in missing data (e.g., extrapolating backwards on the total number of tests based on daily numbers that are deemed reliable). Where errors are found (by comparing to previous numbers), those are reported on the above Google Sheet with specifics noted.For inquiries regarding the contents of this dataset, please contact the Corresponding Author listed in the README.txt file. Administrative inquiries (e.g., removal requests, trouble downloading, etc.) can be directed to data-management@arizona.edu
d
Relaxed Naïve Bayes Data
search.dataone.org
dataverse.harvard.edu
Updated Nov 8, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Relaxed Naïve Bayes Team (2023). Relaxed Naïve Bayes Data [Dataset]. http://doi.org/10.7910/DVN/7KNKLL
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/7KNKLL
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Relaxed Naïve Bayes Team
Description
NaiveBayes_R.xlsx: This Excel file includes information as to how probabilities of observed features are calculated given recidivism (P(x_ij│R)) in the training data. Each cell is embedded with an Excel function to render appropriate figures. P(Xi|R): This tab contains probabilities of feature attributes among recidivated offenders. NIJ_Recoded: This tab contains re-coded NIJ recidivism challenge data following our coding schema described in Table 1. Recidivated_Train: This tab contains re-coded features of recidivated offenders. Tabs from [Gender] through [Condition_Other]: Each tab contains probabilities of feature attributes given recidivism. We use these conditional probabilities to replace the raw values of each feature in P(Xi|R) tab. NaiveBayes_NR.xlsx: This Excel file includes information as to how probabilities of observed features are calculated given non-recidivism (P(x_ij│N)) in the training data. Each cell is embedded with an Excel function to render appropriate figures. P(Xi|N): This tab contains probabilities of feature attributes among non-recidivated offenders. NIJ_Recoded: This tab contains re-coded NIJ recidivism challenge data following our coding schema described in Table 1. NonRecidivated_Train: This tab contains re-coded features of non-recidivated offenders. Tabs from [Gender] through [Condition_Other]: Each tab contains probabilities of feature attributes given non-recidivism. We use these conditional probabilities to replace the raw values of each feature in P(Xi|N) tab. Training_LnTransformed.xlsx: Figures in each cell are log-transformed ratios of probabilities in NaiveBayes_R.xlsx (P(Xi|R)) to the probabilities in NaiveBayes_NR.xlsx (P(Xi|N)). TestData.xlsx: This Excel file includes the following tabs based on the test data: P(Xi|R), P(Xi|N), NIJ_Recoded, and Test_LnTransformed (log-transformed P(Xi|R)/ P(Xi|N)). Training_LnTransformed.dta: We transform Training_LnTransformed.xlsx to Stata data set. We use Stat/Transfer 13 software package to transfer the file format. StataLog.smcl: This file includes the results of the logistic regression analysis. Both estimated intercept and coefficient estimates in this Stata log correspond to the raw weights and standardized weights in Figure 1. Brier Score_Re-Check.xlsx: This Excel file recalculates Brier scores of Relaxed Naïve Bayes Classifier in Table 3, showing evidence that results displayed in Table 3 are correct. *****Full List***** NaiveBayes_R.xlsx NaiveBayes_NR.xlsx Training_LnTransformed.xlsx TestData.xlsx Training_LnTransformed.dta StataLog.smcl Brier Score_Re-Check.xlsx Data for Weka (Training Set): Bayes_2022_NoID Data for Weka (Test Set): BayesTest_2022_NoID Weka output for machine learning models (Conventional naïve Bayes, AdaBoost, Multilayer Perceptron, Logistic Regression, and Random Forest)
Regeneration Study Test Data
catalog.data.gov
datasets.ai
+1more
Updated Nov 12, 2020
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Regeneration Study Test Data [Dataset]. https://catalog.data.gov/dataset/regeneration-study-test-data
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agencyhttp://www.epa.gov/
Description
Data (2 excel files) consist of the analytical test results on water sample collected from the two adsorption media tanks of the arsenic removal system during the regeneration processes conducted multiply times over a five-year period. Data set also includes the companion bed volumes of water treated by the tank of media at the time the water samples were collected. This dataset is associated with the following publications: Sorg, T., A. Chen, L. Wang, and R. Kolich. Regeneration of a Full-Scale Arsenic Removal Adsorptive Media System,Part 1: The Regeneration Process. JOURNAL OF AMERICAN WATER WORKS ASSOCIATION. American Water Resources Association, Middleburg, VA, USA, 109(5): 13-24, (2017). Sorg, T., R. Kolich, A.S.C. Chen, and L. Wang. Regeneration of a Full-Scale Arsenic Removal Adsorptive Media System,Part 2: The Performance and Cost. JOURNAL OF THE AMERICAN WATER WORKS ASSOCIATION. American Water Works Association, Denver, CO, USA, 109(5): E122-E128, (2017).
H
Time-Series Matrix (TSMx): A visualization tool for plotting multiscale...
dataverse.harvard.edu
Updated Jul 8, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Georgios Boumis; Brad Peter (2024). Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends [Dataset]. http://doi.org/10.7910/DVN/ZZDYM9
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.7910/DVN/ZZDYM9
Dataset updated
Jul 8, 2024
Dataset provided by
Harvard Dataverse
Authors
Georgios Boumis; Brad Peter
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends TSMx is an R script that was developed to facilitate multi-temporal-scale visualizations of time-series data. The script requires only a two-column CSV of years and values to plot the slope of the linear regression line for all possible year combinations from the supplied temporal range. The outputs include a time-series matrix showing slope direction based on the linear regression, slope values plotted with colors indicating magnitude, and results of a Mann-Kendall test. The start year is indicated on the y-axis and the end year is indicated on the x-axis. In the example below, the cell in the top-right corner is the direction of the slope for the temporal range 2001–2019. The red line corresponds with the temporal range 2010–2019 and an arrow is drawn from the cell that represents that range. One cell is highlighted with a black border to demonstrate how to read the chart—that cell represents the slope for the temporal range 2004–2014. This publication entry also includes an excel template that produces the same visualizations without a need to interact with any code, though minor modifications will need to be made to accommodate year ranges other than what is provided. TSMx for R was developed by Georgios Boumis; TSMx was originally conceptualized and created by Brad G. Peter in Microsoft Excel. Please refer to the associated publication: Peter, B.G., Messina, J.P., Breeze, V., Fung, C.Y., Kapoor, A. and Fan, P., 2024. Perspectives on modifiable spatiotemporal unit problems in remote sensing of agriculture: evaluating rice production in Vietnam and tools for analysis. Frontiers in Remote Sensing, 5, p.1042624. https://www.frontiersin.org/journals/remote-sensing/articles/10.3389/frsen.2024.1042624 TSMx sample chart from the supplied Excel template. Data represent the productivity of rice agriculture in Vietnam as measured via EVI (enhanced vegetation index) from the NASA MODIS data product (MOD13Q1.V006). TSMx R script: # import packages library(dplyr) library(readr) library(ggplot2) library(tibble) library(tidyr) library(forcats) library(Kendall) options(warn = -1) # disable warnings # read data (.csv file with "Year" and "Value" columns) data <- read_csv("EVI.csv") # prepare row/column names for output matrices years <- data %>% pull("Year") r.names <- years[-length(years)] c.names <- years[-1] years <- years[-length(years)] # initialize output matrices sign.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years)) pval.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years)) slope.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years)) # function to return remaining years given a start year getRemain <- function(start.year) { years <- data %>% pull("Year") start.ind <- which(data[["Year"]] == start.year) + 1 remain <- years[start.ind:length(years)] return (remain) } # function to subset data for a start/end year combination splitData <- function(end.year, start.year) { keep <- which(data[['Year']] >= start.year & data[['Year']] <= end.year) batch <- data[keep,] return(batch) } # function to fit linear regression and return slope direction fitReg <- function(batch) { trend <- lm(Value ~ Year, data = batch) slope <- coefficients(trend)[[2]] return(sign(slope)) } # function to fit linear regression and return slope magnitude fitRegv2 <- function(batch) { trend <- lm(Value ~ Year, data = batch) slope <- coefficients(trend)[[2]] return(slope) } # function to implement Mann-Kendall (MK) trend test and return significance # the test is implemented only for n>=8 getMann <- function(batch) { if (nrow(batch) >= 8) { mk <- MannKendall(batch[['Value']]) pval <- mk[['sl']] } else { pval <- NA } return(pval) } # function to return slope direction for all combinations given a start year getSign <- function(start.year) { remaining <- getRemain(start.year) combs <- lapply(remaining, splitData, start.year = start.year) signs <- lapply(combs, fitReg) return(signs) } # function to return MK significance for all combinations given a start year getPval <- function(start.year) { remaining <- getRemain(start.year) combs <- lapply(remaining, splitData, start.year = start.year) pvals <- lapply(combs, getMann) return(pvals) } # function to return slope magnitude for all combinations given a start year getMagn <- function(start.year) { remaining <- getRemain(start.year) combs <- lapply(remaining, splitData, start.year = start.year) magns <- lapply(combs, fitRegv2) return(magns) } # retrieve slope direction, MK significance, and slope magnitude signs <- lapply(years, getSign) pvals <- lapply(years, getPval) magns <- lapply(years, getMagn) # fill-in output matrices dimension <- nrow(sign.matrix) for (i in 1:dimension) { sign.matrix[i, i:dimension] <- unlist(signs[i]) pval.matrix[i, i:dimension] <- unlist(pvals[i]) slope.matrix[i, i:dimension] <- unlist(magns[i]) } sign.matrix <-...
d
Data from: Utah FORGE: Milford Triaxial Test Data and Summary from EGI labs
catalog.data.gov
data.openei.org
+2more
Updated Jan 20, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Energy and Geoscience Institute at the University of Utah (2025). Utah FORGE: Milford Triaxial Test Data and Summary from EGI labs [Dataset]. https://catalog.data.gov/dataset/utah-forge-milford-triaxial-test-data-and-summary-from-egi-labs-99d6f
Explore at:
Dataset updated
Jan 20, 2025
Dataset provided by
Energy and Geoscience Institute at the University of Utah
Description
Six samples were evaluated in unconfined and triaxial compression, their data are included in separate excel spreadsheets, and summarized in the word document. Three samples were plugged along the axis of the core (presumed to be nominally vertical) and three samples were plugged perpendicular to the axis of the core. A designation of "V"indicates vertical or the long axis of the plugged sample is aligned with the axis of the core. Similarly, "H" indicates a sample that is nominally horizontal and cut orthogonal to the axis of the core. Stress-strain curves were made before and after the testing, and are included in the word doc. The confining pressure for this test was 2800 psi. A series of tests are being carried out on to define a failure envelope, to provide representative hydraulic fracture design parameters and for future geomechanical assessments. The samples are from well 52-21, which reaches a maximum depth of 3581 ft +/- 2 ft into a gneiss complex.
A
Illinois Drill Stem Tests
data.amerigeoss.org
data.wu.ac.at
esri rest, wfs, wms +1
Updated Jul 30, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
United States (2019). Illinois Drill Stem Tests [Dataset]. https://data.amerigeoss.org/da_DK/dataset/illinois-drill-stem-tests
Explore at:
esri rest, zip, wms, wfsAvailable download formats
Dataset updated
Jul 30, 2019
Dataset provided by
United States
Area covered
Illinois
Description
This dataset is a compilation of drill stem test observations, compiled by the Illinois State Geological Survey and published as a web feature service, an ESRI Service, a web map service, and as an Excel spreadsheet for the National Geothermal Data System. The downloadable Excel spreadsheets include information about the template, notes related to revisions of the template, Resource provider information, the data, a field list (data mapping view) and a worksheet with vocabularies for use in populating the spreadsheet (data valid terms).
d
Data from: Utah FORGE: Results of Direct Shear Tests on Saturated Joints in...
catalog.data.gov
data.openei.org
+2more
Updated Jan 20, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Purdue University (2025). Utah FORGE: Results of Direct Shear Tests on Saturated Joints in Sierra White Granite [Dataset]. https://catalog.data.gov/dataset/utah-forge-results-of-direct-shear-tests-on-saturated-joints-in-sierra-white-granite-871d8
Explore at:
Dataset updated
Jan 20, 2025
Dataset provided by
Purdue University
Description
This dataset includes an Excel file with the results of direct shear tests to investigate the mechanical and geophysical response of saturated joints. Tests were conducted on induced tension-fractured Sierra White granite joints in a custom water pressurized chamber. The preparation of granite joints is summarized in the PDF file. The PDF file also includes a description of the direct shear tests, experimental procedure, and results.
d
2013-14 Advanced Placement Test Taking
catalog.data.gov
Updated Sep 1, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Office for Civil Rights (OCR) (2023). 2013-14 Advanced Placement Test Taking [Dataset]. https://catalog.data.gov/dataset/2013-14-advanced-placement-test-taking-788bf
Explore at:
Dataset updated
Sep 1, 2023
Dataset provided by
Office for Civil Rights (OCR)
Description
This set of Excel files contains data on Advanced Placement test taking for all states. For each set, there are three three spreadsheets: total students, male students, and female students.
d
2011–2016 Single Well Aquifer Tests: Pumping Schedules, Water-Level Data in...
catalog.data.gov
data.usgs.gov
+1more
Updated Jul 6, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Geological Survey (2024). 2011–2016 Single Well Aquifer Tests: Pumping Schedules, Water-Level Data in Aquifer Test Wells, and Analysis Results from Tests Conducted near Long Canyon, Goshute Valley, Northeastern Nevada [Dataset]. https://catalog.data.gov/dataset/20112016-single-well-aquifer-tests-pumping-schedules-water-level-data-in-aquifer-test-well
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
U.S. Geological Survey
Area covered
Goshute Valley
Description
This dataset presents tabular data and Excel workbooks used to analyze single-well aquifer tests in pumping wells and slug tests in monitoring wells near Long Canyon. The data also include pdf outputs from the analysis program, Aqtesolv (Duffield, 2007). The data are presented in two zipped files, (1) single-well aquifer tests in pumping wells and (2) slug tests in monitoring wells. The slug-test data were supplied by Newmont Mining Corporation and collected by Golder and Associates in 2011. Reference Cited: Duffield, G.M., 2007, AQTESOLV for windows: Version 4.5 User’s Guide, HydroSOLV, Inc. Reston, VA, p. 530, at, http://www.aqtesolv.com/download/aqtw20070719.pdf.