Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Initial data analysis checklist for data screening in longitudinal studies.
As part of the planning for stimulation of the Newberry Volcano Enhanced Geothermal Systems (EGS) Demonstration project in Oregon, a high-resolution borehole televiewer (BHTV) log was acquired with the ALT ABI85 BHTV tool in the slightly deviated NWG 55-29 well. The image log reveals an extensive network of fractures in a conjugate set striking approximately N-S and dipping 50 deg that are well oriented for normal slip and are consistent with surface-breaking regional normal faults in the vicinity. Similarly, breakouts indicate a consistent minimum horizontal stress (Shmin) azimuth of 092.3 +/- 17.3 deg. In conjunction with a suite of geophysical logs, a model of the stress magnitudes, constrained by the width of breakouts at depth and by a model of rock strength, independently indicates a predominantly normal faulting stress regime.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The study is mixed-methods research. Quantitative data: datasets of sociodemographic data of women accessing cervical cancer screening at a women's clinic. The datasets and do-files can be opened in the analytic software Stata. Qualitative data: preliminary analysis tables and reflective notes from in-depth interviews with female patients and healthcare providers.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset for the paper "Are data papers cited as research data? Preliminary analysis on interdisciplinary data paper citations" submitted to iConference 2025.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sheet 1 (Raw-Data): The raw data of the study is provided, presenting the tagging results for the measures described in the paper. For each subject, it includes multiple columns:
A. a sequential student ID
B. an ID that defines a random group label and the notation
C. the notation used: User Story or Use Cases
D. the case they were assigned to: IFA, Sim, or Hos
E. the subject's exam grade (total points out of 100); empty cells mean that the subject did not take the first exam
F. a categorical representation of the grade (L/M/H), where H is a grade greater than or equal to 80, M is at least 65 and below 80, and L otherwise
G. the total number of classes in the student's conceptual model
H. the total number of relationships in the student's conceptual model
I. the total number of classes in the expert's conceptual model
J. the total number of relationships in the expert's conceptual model
K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, and missing (see tagging scheme below)
P. the researchers' judgement of how well the derivation process was explained by the student: well explained (a systematic mapping that can be easily reproduced), partially explained (vague indication of the mapping), or not present.
Tagging scheme:
Aligned (AL) - A concept is represented as a class in both models, either
with the same name or using synonyms or clearly linkable names;
Wrongly represented (WR) - A class in the domain expert model is
incorrectly represented in the student model, either (i) via an attribute,
method, or relationship rather than a class, or (ii) using a generic term
(e.g., "user" instead of "urban planner");
System-oriented (SO) - A class in CM-Stud that denotes a technical
implementation aspect, e.g., access control. Classes that represent a legacy
system or the system under design (portal, simulator) are legitimate;
Omitted (OM) - A class in CM-Expert that does not appear in any way in
CM-Stud;
Missing (MI) - A class in CM-Stud that does not appear in any way in
CM-Expert.
All the calculations and information provided in the following sheets
originate from that raw data.
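The grade banding in column F can be written out as a simple rule. A minimal sketch; the function name is ours, not part of the dataset:

```python
def grade_category(points):
    """Map an exam grade (0-100) to the bands used in column F:
    H if grade >= 80, M if 65 <= grade < 80, L otherwise."""
    if points >= 80:
        return "H"
    if points >= 65:
        return "M"
    return "L"
```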
Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection,
including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.
Sheet 3 (Size-Ratio):
The number of classes within the student model divided by the number of classes within the expert model is calculated (describing the size ratio). We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus of this study is on the number of classes; however, we also provide the size ratio for the number of relationships between the student and expert models.
Sheet 4 (Overall):
Provides an overview of all subjects regarding the encountered situations, completeness, and correctness, respectively. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model. It is calculated by dividing the number of aligned concepts (AL) by the sum of the number of aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model. Completeness is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of the number of aligned concepts (AL), wrong representations (WR), and omitted concepts (OM). The overview is complemented with general diverging stacked bar charts that illustrate correctness and completeness.
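The two ratios defined above translate directly into code. A minimal sketch; function and argument names are ours:

```python
def correctness(al, wr, so, om):
    """Correctness: aligned classes over all tagged situations,
    AL / (AL + OM + SO + WR)."""
    return al / (al + om + so + wr)

def completeness(al, wr, om):
    """Completeness: expert classes represented (correctly or not)
    over all expert classes, (AL + WR) / (AL + WR + OM)."""
    return (al + wr) / (al + wr + om)
```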
For sheet 4 as well as for the following four sheets, diverging stacked bar
charts are provided to visualize the effect of each of the independent and moderating variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (t-test) and effect size (Hedges' g) for both completeness and correctness are provided. Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html. The independent and moderating variables can be found as follows:
Sheet 5 (By-Notation):
Model correctness and model completeness are compared by notation: UC, US.
Sheet 6 (By-Case):
Model correctness and model completeness are compared by case: SIM, HOS, IFA.
Sheet 7 (By-Process):
Model correctness and model completeness are compared by how well the derivation process is explained: well explained, partially explained, not present.
Sheet 8 (By-Grade):
Model correctness and model completeness are compared by exam grade, converted to the categorical values High, Medium, and Low.
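The effect size reported at the bottom of these comparison sheets can be reproduced with the standard Hedges' g formula. A sketch only; the online tool cited above may apply a slightly different small-sample correction:

```python
from math import sqrt
from statistics import mean, variance  # variance() is the sample variance

def hedges_g(x, y):
    """Hedges' g: Cohen's d with the small-sample correction
    J = 1 - 3 / (4*(n1 + n2) - 9)."""
    n1, n2 = len(x), len(y)
    # pooled standard deviation from the two sample variances
    sp = sqrt(((n1 - 1) * variance(x) + (n2 - 1) * variance(y))
              / (n1 + n2 - 2))
    d = (mean(x) - mean(y)) / sp
    return d * (1 - 3 / (4 * (n1 + n2) - 9))
```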
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
BACKGROUND: The Health Insurance Institute of Slovenia (ZZZS) began publishing service-related data in May 2023, following a directive from the Ministry of Health (MoH). The ZZZS website provides easily accessible information about the services provided by individual doctors, including their names. The user is provided with relevant information about the doctor's employer, including whether it is a public or private institution. These data are useful for studying the public system's operations and for identifying errors or anomalies.
METHODS: The data for services provided in May 2023 were downloaded and analysed. The published data were cross-referenced, using the provider's RIZDDZ number, with the daily updated data on ambulatory workload from June 9, 2023, published by ZZZS. These data were initially found to be inaccurate and were corrected using alerts from the zdravniki.sledilnik.org portal; they now provide an accurate representation of the current situation. The total number of services provided by each provider in a given month was determined by summing the individual services and assigning them to the corresponding provider.
RESULTS: A pivot table was created to identify 307 unique operators, with 15 operators not appearing in both lists. There are 66 public providers, which make up about 72% of the contractual programme in the public system, and 241 private providers, accounting for about 28%. In May 2023, public providers accounted for 69% (n=646,236) of services in the family medicine system, while private providers contributed 31% (n=291,660); in total, 937,896 services were provided. Three linear correlations were analysed. The initial analysis of the entire sample yielded a high R-squared value of .998 (adjusted R-squared = .996) and a significance level below 0.001. The second analysis, of the data from private providers, showed a high R-squared value of .904 (adjusted R-squared = .886), indicating a strong correlation between the variables, with a significance level < 0.001. The third analysis, of the data from public providers, showed a strong level of explanatory power, with an R-squared value of 1.000 (adjusted R-squared = 1.000), again with p < 0.001.
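R-squared and adjusted R-squared values of this kind come from an ordinary least-squares fit of services rendered against contracted programme size. A minimal sketch with illustrative numbers, not the study's data:

```python
def r_squared(x, y):
    """R^2 of the least-squares line y ~ a + b*x,
    computed as 1 - SS_res / SS_tot."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k=1):
    """Adjusted R^2 for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)
```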
CONCLUSION: Our analysis shows a strong linear correlation between the size of the contracted programme and the number of services rendered by family medicine providers. A stronger linear correlation is observed among providers in the public system than among those in the private system. Our study found that private providers generally offer more services than public providers. However, it is important to acknowledge that the evaluation framework for assessing services may have inherent flaws: issuing a prescription and resuscitating a patient are both counted as one service. It is crucial to closely monitor trends and to identify comparable databases for pairing at the secondary and tertiary levels.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Preliminary results from an ongoing analysis of citation practices in quantitative studies that analyze data from Altmetric.com (supporting data for our submission to http://altmetrics.org/altmetrics20/).
Data sources: https://web.archive.org/web/20200929163109/https://www.altmetric.com/blog/altmetrics-research-2019/ and https://web.archive.org/web/20200929163146/https://www.altmetric.com/blog/altmetric-supported-research-2018-in-review/
The dataset shows that only 32% of quantitative studies that build upon Altmetric's attention score mention the day on which the data was collected, and that 50% mention no version information at all.
First satellite images of areas affected by flooding in El Salvador in November 2009. Using Formosat-2 satellite images*, this preliminary analysis identified the following populated areas as affected: San Vicente, Verapaz, Tepetitan and Guadalupe. Floods were detected along Quebrada Seca, Quebrada la Quebradona, Quebrada Pozo Caliente, Quebrada el Amante Blanco, Rio Acahuapa, Quebrada Paso Hondo, and Quebrada Ticuisa. Formosat images © 2010 Dr. Cheng-Chien Liu, National Cheng Kung University; Dr. An-Ming Wu, National Space Organization, Taiwan; Global Earth Observation and Data Analysis Center (GEODAC), Taiwan.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains article metadata and information about Open Science Indicators for approximately 139,000 research articles published in PLOS journals from 1 January 2018 to 30 March 2025, and a set of approximately 28,000 comparator articles published in non-PLOS journals. This is the tenth release of this dataset, which will be updated with new versions on an annual basis. This version of the Open Science Indicators dataset shares the indicators seen in the previous versions as well as fully operationalised protocols and study registration indicators, which were previously only shared in preliminary forms. The v10 dataset focuses on detection of five Open Science practices by analysing the XML of published research articles:
- sharing of research data, in particular data shared in data repositories
- sharing of code
- posting of preprints
- sharing of protocols
- sharing of study registrations
The dataset provides data and code generation and sharing rates, and the location of shared data and code (whether in Supporting Information or in an online repository). It also provides preprint, protocol, and study registration sharing rates, as well as details of the shared output, such as publication date, URL/DOI/Registration Identifier, and platform used. Additional data fields are also provided for each article analysed. This release was run using an updated preprint detection method (see OSI-Methods-Statement_v10_Jul25.pdf for details). Further information on the methods used to collect and analyse the data can be found in the Documentation. Further information on the principles and requirements for developing Open Science Indicators is available at https://doi.org/10.6084/m9.figshare.21640889.
Data folders/files:
Data Files folder: contains the main OSI dataset files, PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv, which contain descriptive metadata (e.g. article title, publication date, author countries) taken from the article .xml files, plus additional information around the Open Science Indicators derived algorithmically. The OSI-Summary-statistics_v10_Jul25.xlsx file contains the summary data for both PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv.
Documentation folder: contains documentation related to the main data files. OSI-Methods-Statement_v10_Jul25.pdf describes the methods underlying the data collection and analysis. OSI-Column-Descriptions_v10_Jul25.pdf describes the fields used in PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv. OSI-Repository-List_v1_Dec22.xlsx lists the repositories, and their characteristics, used to identify specific repositories in the repository fields of PLOS-Dataset_v10_Jul25.csv and Comparator-Dataset_v10_Jul25.csv. The folder also contains documentation originally shared alongside the preliminary versions of the protocols and study registration indicators, giving fuller details of their detection methods.
Contact details for further information:
Iain Hrynaszkiewicz, Director, Open Research Solutions, PLOS, ihrynaszkiewicz@plos.org / plos@plos.org
Lauren Cadwallader, Open Research Manager, PLOS, lcadwallader@plos.org / plos@plos.org
Acknowledgements: Thanks to Allegra Pearce, Tim Vines, Asura Enkhbayar, Scott Kerr and parth sarin of DataSeer for contributing to data acquisition and supporting information.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A supplementary report for the paper "A Preliminary Analysis on the Effect of Randomness in a CEGAR Framework" by Ákos Hajdu and Zoltán Micskei, presented at the 25th PhD Mini-Symposium (2018), organized by the Department of Measurement and Information Systems at the Budapest University of Technology and Economics.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Preliminary Parcels’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/86b4e1d0-2e40-4042-8e91-be7efca1015f on 27 January 2022.
--- Dataset description provided by original source is as follows ---
--- Original source retains full ownership of the source dataset ---
This project delves into the workflow and results of regression models on monthly and daily utility data (meter readings of electricity consumption), outlining a process for screening and gathering useful results from inverse models. Energy modeling predictions created in Building Energy Optimization software (BEopt) Version 2.0.0.3 (BEopt 2013) are used to infer causes of differences among similar homes. This simple data analysis is useful for the purposes of targeting audits and maximizing the accuracy of energy savings predictions with minimal costs. The data for this project are from two adjacent military housing communities of 1,166 houses in the southeastern United States. One community was built in the 1970s, and the other was built in the mid-2000s. Both communities are all electric; the houses in the older community were retrofitted with ground source heat pumps in the early 1990s, and the newer community was built to an early version of ENERGY STAR with air source heat pumps. The houses in the older community will receive phased retrofits (approximately 10 per month) in the coming years. All houses have had daily electricity metering readings since early 2011. This project explores a dataset at a simple level and describes applications of a utility data normalization. There are far more sophisticated ways to analyze a dataset of dynamic, high resolution data; however, this report focuses on simple processes to create big-picture overviews of building portfolios as an initial step in a community-scale analysis. TO4 9.1.2: Comm. Scale Military Housing Upgrades
The results of the class separation of Kentucky and Colorado shale oils indicate that the separation scheme developed is effective in separating whole shale oils into their component saturate, olefinic, aromatic, and polar fractions. The effectiveness of the separation is indicated by the proton NMR and FTIR analysis of the fractions, while the reproducibility is given by the agreement of duplicate runs. The types of problems encountered, such as column channeling, solvent schemes, elution rates, elution volumes, sample transfer, and adsorbent preparation, were corrected and/or modified to provide maximum separation while at the same time minimizing separation time. Nuclear magnetic resonance has proved valuable in analyzing the distribution of hydrogen in the individual fractions. This will allow optimization of operating conditions to yield the desired products. Carbon-13 (13C) NMR of the fractions will provide additional structural information on the fractions, such as average carbon chain lengths, branched/straight chain and aliphatic/aromatic ratios, which will complement the information gained through proton NMR. 3 refs., 4 figs., 3 tabs.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Unsupervised exploratory data analysis (EDA) is often the first step in understanding complex data sets. While summary statistics are among the most efficient and convenient tools for exploring and describing sets of data, they are often overlooked in EDA. In this paper, we show multiple case studies that compare the performance, including clustering, of a series of summary statistics in EDA. The summary statistics considered here are pattern recognition entropy (PRE), the mean, standard deviation (STD), 1-norm, range, sum of squares (SSQ), and X4, which are compared with principal component analysis (PCA), multivariate curve resolution (MCR), and/or cluster analysis. PRE and the other summary statistics are direct methods for analyzing data; they are not factor-based approaches. To quantify the performance of summary statistics, we use the concept of the "critical pair," which is employed in chromatography. The data analyzed here come from different analytical methods. Hyperspectral images, including one of a biological material, are also analyzed. In general, PRE outperforms the other summary statistics, especially in image analysis, although a suite of summary statistics is useful in exploring complex data sets. While PRE results were generally comparable to those from PCA and MCR, PRE is easier to apply. For example, there is no need to determine the number of factors that describe a data set. Finally, we introduce the concept of divided spectrum-PRE (DS-PRE) as a new EDA method. DS-PRE increases the discrimination power of PRE. We also show that DS-PRE can be used to provide the inputs for the k-nearest neighbor (kNN) algorithm. We recommend PRE and DS-PRE as rapid new tools for unsupervised EDA.
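Several of the summary statistics compared above are one-line computations per spectrum. A minimal sketch; PRE is approximated here as the Shannon entropy of the normalized intensities, which may differ in detail from the published PRE definition:

```python
from math import log2, sqrt

def summary_stats(signal):
    """Per-signal summary statistics of the kind compared in the paper:
    mean, sample STD, 1-norm, range, sum of squares, and an
    entropy-style statistic standing in for PRE."""
    n = len(signal)
    mean = sum(signal) / n
    std = sqrt(sum((v - mean) ** 2 for v in signal) / (n - 1))
    norm1 = sum(abs(v) for v in signal)
    rng = max(signal) - min(signal)
    ssq = sum(v * v for v in signal)
    # normalize absolute intensities to a distribution, then take entropy
    p = [abs(v) / norm1 for v in signal]
    pre = -sum(pi * log2(pi) for pi in p if pi > 0)
    return {"mean": mean, "std": std, "1-norm": norm1,
            "range": rng, "ssq": ssq, "pre": pre}
```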
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Diagram of the process: the survey data files were collated and pre-processed for analysis, as shown with an image of a white baby with respondent text as well as the demographic details; a snapshot shows how these were analysed using a Miro board with lines and sticky notes; this analysis was then visualised as data portraits, data quilts, and quilted bar charts.
https://paper.erudition.co.in/terms
Question paper solutions for the chapter "Data Collection and Data Pre-Processing" of Data Analytics Skills for Managers, 5th Semester, Bachelor in Business Administration, 2020-2021.