78 datasets found
  1. Statistical Comparison of Two ROC Curves

    • figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Yaacov Petscher (2023). Statistical Comparison of Two ROC Curves [Dataset]. http://doi.org/10.6084/m9.figshare.860448.v1
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Yaacov Petscher
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This Excel file performs a statistical test of whether two ROC curves differ from each other based on the Area Under the Curve (AUC). You'll need the correlation coefficient from the table presented in the following article to enter the correct values for the comparison: Hanley JA, McNeil BJ (1983) A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148:839-843.
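The Hanley-McNeil comparison the spreadsheet implements can be sketched in Python. This is a minimal sketch only; the AUCs, sample sizes, and correlation value below are illustrative, not taken from the dataset.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    # Standard error of a single AUC (Hanley & McNeil, 1982).
    q1 = auc / (2 - auc)
    q2 = 2 * auc**2 / (1 + auc)
    var = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc**2)
           + (n_neg - 1) * (q2 - auc**2)) / (n_pos * n_neg)
    return math.sqrt(var)

def compare_aucs(auc1, auc2, n_pos, n_neg, r):
    # z-test for two correlated AUCs derived from the same cases
    # (Hanley & McNeil, 1983); r is the correlation coefficient
    # read from the table in the article.
    se1 = hanley_mcneil_se(auc1, n_pos, n_neg)
    se2 = hanley_mcneil_se(auc2, n_pos, n_neg)
    se_diff = math.sqrt(se1**2 + se2**2 - 2 * r * se1 * se2)
    return (auc1 - auc2) / se_diff

# Illustrative inputs: 50 positive and 50 negative cases, r = 0.4.
z = compare_aucs(0.85, 0.80, n_pos=50, n_neg=50, r=0.4)
```

A two-sided p-value then follows from the standard normal distribution of z.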

  2. Time-Series Matrix (TSMx): A visualization tool for plotting multiscale...

    • dataverse.harvard.edu
    Updated Jul 8, 2024
    Cite
    Georgios Boumis; Brad Peter (2024). Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends [Dataset]. http://doi.org/10.7910/DVN/ZZDYM9
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 8, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Georgios Boumis; Brad Peter
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends

    TSMx is an R script that was developed to facilitate multi-temporal-scale visualizations of time-series data. The script requires only a two-column CSV of years and values to plot the slope of the linear regression line for all possible year combinations within the supplied temporal range. The outputs include a time-series matrix showing slope direction based on the linear regression, slope values plotted with colors indicating magnitude, and results of a Mann-Kendall test. The start year is indicated on the y-axis and the end year on the x-axis. In the example below, the cell in the top-right corner gives the direction of the slope for the temporal range 2001–2019. The red line corresponds to the temporal range 2010–2019, and an arrow is drawn from the cell that represents that range. One cell is highlighted with a black border to demonstrate how to read the chart: that cell represents the slope for the temporal range 2004–2014. This publication entry also includes an Excel template that produces the same visualizations without any need to interact with code, though minor modifications will need to be made to accommodate year ranges other than the one provided. TSMx for R was developed by Georgios Boumis; TSMx was originally conceptualized and created by Brad G. Peter in Microsoft Excel. Please refer to the associated publication: Peter, B.G., Messina, J.P., Breeze, V., Fung, C.Y., Kapoor, A. and Fan, P., 2024. Perspectives on modifiable spatiotemporal unit problems in remote sensing of agriculture: evaluating rice production in Vietnam and tools for analysis. Frontiers in Remote Sensing, 5, p.1042624. https://www.frontiersin.org/journals/remote-sensing/articles/10.3389/frsen.2024.1042624

    TSMx sample chart from the supplied Excel template. Data represent the productivity of rice agriculture in Vietnam as measured via EVI (enhanced vegetation index) from the NASA MODIS data product (MOD13Q1.V006).

    TSMx R script (as included in the entry; the script is truncated in the listing):

    ```r
    # import packages
    library(dplyr)
    library(readr)
    library(ggplot2)
    library(tibble)
    library(tidyr)
    library(forcats)
    library(Kendall)

    options(warn = -1) # disable warnings

    # read data (.csv file with "Year" and "Value" columns)
    data <- read_csv("EVI.csv")

    # prepare row/column names for output matrices
    years <- data %>% pull("Year")
    r.names <- years[-length(years)]
    c.names <- years[-1]
    years <- years[-length(years)]

    # initialize output matrices
    sign.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
    pval.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
    slope.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))

    # function to return remaining years given a start year
    getRemain <- function(start.year) {
      years <- data %>% pull("Year")
      start.ind <- which(data[["Year"]] == start.year) + 1
      remain <- years[start.ind:length(years)]
      return(remain)
    }

    # function to subset data for a start/end year combination
    splitData <- function(end.year, start.year) {
      keep <- which(data[["Year"]] >= start.year & data[["Year"]] <= end.year)
      batch <- data[keep, ]
      return(batch)
    }

    # function to fit linear regression and return slope direction
    fitReg <- function(batch) {
      trend <- lm(Value ~ Year, data = batch)
      slope <- coefficients(trend)[[2]]
      return(sign(slope))
    }

    # function to fit linear regression and return slope magnitude
    fitRegv2 <- function(batch) {
      trend <- lm(Value ~ Year, data = batch)
      slope <- coefficients(trend)[[2]]
      return(slope)
    }

    # function to implement Mann-Kendall (MK) trend test and return significance
    # the test is implemented only for n >= 8
    getMann <- function(batch) {
      if (nrow(batch) >= 8) {
        mk <- MannKendall(batch[["Value"]])
        pval <- mk[["sl"]]
      } else {
        pval <- NA
      }
      return(pval)
    }

    # function to return slope direction for all combinations given a start year
    getSign <- function(start.year) {
      remaining <- getRemain(start.year)
      combs <- lapply(remaining, splitData, start.year = start.year)
      signs <- lapply(combs, fitReg)
      return(signs)
    }

    # function to return MK significance for all combinations given a start year
    getPval <- function(start.year) {
      remaining <- getRemain(start.year)
      combs <- lapply(remaining, splitData, start.year = start.year)
      pvals <- lapply(combs, getMann)
      return(pvals)
    }

    # function to return slope magnitude for all combinations given a start year
    getMagn <- function(start.year) {
      remaining <- getRemain(start.year)
      combs <- lapply(remaining, splitData, start.year = start.year)
      magns <- lapply(combs, fitRegv2)
      return(magns)
    }

    # retrieve slope direction, MK significance, and slope magnitude
    signs <- lapply(years, getSign)
    pvals <- lapply(years, getPval)
    magns <- lapply(years, getMagn)

    # fill in output matrices
    dimension <- nrow(sign.matrix)
    for (i in 1:dimension) {
      sign.matrix[i, i:dimension] <- unlist(signs[i])
      pval.matrix[i, i:dimension] <- unlist(pvals[i])
      slope.matrix[i, i:dimension] <- unlist(magns[i])
    }
    sign.matrix <-...
    ```

  3. UC_vs_US Statistic Analysis.xlsx

    • figshare.com
    xlsx
    Updated Jul 9, 2020
    Cite
    F. (Fabiano) Dalpiaz (2020). UC_vs_US Statistic Analysis.xlsx [Dataset]. http://doi.org/10.23644/uu.12631628.v1
    Available download formats: xlsx
    Dataset updated
    Jul 9, 2020
    Dataset provided by
    Utrecht University
    Authors
    F. (Fabiano) Dalpiaz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sheet 1 (Raw-Data): The raw data of the study are provided, presenting the tagging results for the measures described in the paper. For each subject, it includes the following columns:

    A. a sequential student ID
    B. an ID that defines a random group label and the notation
    C. the notation used: User Story or Use Case
    D. the case they were assigned to: IFA, Sim, or Hos
    E. the subject's exam grade (total points out of 100); empty cells mean that the subject did not take the first exam
    F. a categorical representation of the grade (L/M/H), where H is greater than or equal to 80, M is between 65 (included) and 80 (excluded), and L otherwise
    G. the total number of classes in the student's conceptual model
    H. the total number of relationships in the student's conceptual model
    I. the total number of classes in the expert's conceptual model
    J. the total number of relationships in the expert's conceptual model
    K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, missing (see tagging scheme below)
    P. the researchers' judgement of how well the student explained the derivation process: well explained (a systematic mapping that can be easily reproduced), partially explained (vague indication of the mapping), or not present
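The grade banding in column F can be written as a small helper (a sketch only; the function name is hypothetical, not from the dataset):

```python
def grade_category(points):
    # Column F banding: H if grade >= 80, M if 65 <= grade < 80,
    # L otherwise; empty cells (no first exam) stay uncategorized.
    if points is None:
        return None
    if points >= 80:
        return "H"
    if points >= 65:
        return "M"
    return "L"
```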

    Tagging scheme:
    - Aligned (AL): a concept is represented as a class in both models, either with the same name or using synonyms or clearly linkable names;
    - Wrongly represented (WR): a class in the domain expert model is incorrectly represented in the student model, either (i) via an attribute, method, or relationship rather than a class, or (ii) using a generic term (e.g., "user" instead of "urban planner");
    - System-oriented (SO): a class in CM-Stud that denotes a technical implementation aspect, e.g., access control. Classes that represent a legacy system or the system under design (portal, simulator) are legitimate;
    - Omitted (OM): a class in CM-Expert that does not appear in any way in CM-Stud;
    - Missing (MI): a class in CM-Stud that does not appear in any way in CM-Expert.

    All the calculations and information provided in the following sheets originate from these raw data.

    Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection, including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.

    Sheet 3 (Size-Ratio): The number of classes within the student model divided by the number of classes within the expert model is calculated (the size ratio). We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus in this study is on the number of classes; however, we also provide the size ratio for the number of relationships between the student and expert models.

    Sheet 4 (Overall): Provides an overview of all subjects regarding the encountered situations, completeness, and correctness. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model; it is calculated by dividing the number of aligned concepts (AL) by the sum of aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model; it is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of aligned concepts (AL), wrong representations (WR), and omitted concepts (OM). The overview is complemented with diverging stacked bar charts that illustrate correctness and completeness.
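The correctness and completeness definitions above can be expressed directly (a sketch with hypothetical function names; the counts are the per-student tallies from Sheet 1):

```python
def correctness(al, wr, so, om):
    # Correctness = AL / (AL + OM + SO + WR), per the sheet definition.
    return al / (al + om + so + wr)

def completeness(al, wr, om):
    # Completeness = (AL + WR) / (AL + WR + OM).
    return (al + wr) / (al + wr + om)
```

For example, a student with 6 aligned, 2 wrongly represented, 1 system-oriented, and 1 omitted concept scores a correctness of 0.6 and a completeness of 8/9.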

    For Sheet 4, as well as for the following four sheets, diverging stacked bar charts are provided to visualize the effect of each of the independent and mediating variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (t-test) and effect size (Hedges' g) for both completeness and correctness are provided. Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html. The independent and moderating variables can be found as follows:

    Sheet 5 (By-Notation): Model correctness and model completeness are compared by notation - UC, US.

    Sheet 6 (By-Case): Model correctness and model completeness are compared by case - SIM, HOS, IFA.

    Sheet 7 (By-Process): Model correctness and model completeness are compared by how well the derivation process was explained - well explained, partially explained, not present.

    Sheet 8 (By-Grade): Model correctness and model completeness are compared by exam grade, converted to the categorical values High, Medium, and Low.
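The Hedges' g values reported at the bottom of Sheets 4-8 were obtained with the online tool above; the standard small-sample-corrected effect size it computes can be sketched as follows (illustrative numbers only, not values from the dataset):

```python
import math

def hedges_g(m1, s1, n1, m2, s2, n2):
    # Pooled standard deviation, then Cohen's d, then the
    # small-sample correction factor J that yields Hedges' g.
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    d = (m1 - m2) / sp
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    return j * d

# Two groups of 20 with means 10 and 8 and equal SD 2: d = 1,
# so g equals the correction factor J.
g = hedges_g(10, 2, 20, 8, 2, 20)
```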

  4. Data and program: Comparison between Machine Learning Models and...

    • zenodo.org
    zip
    Updated Jul 16, 2025
    Cite
    Jinxu Li; Xiang Song; Jiangjiang Xia; Wei Shangguan; Xiaodong Zeng (2025). Data and program: Comparison between Machine Learning Models and Conventional Statistical Models in Predicting Global Tree Canopy Height and Crown Radius [Dataset]. http://doi.org/10.5281/zenodo.15951974
    Available download formats: zip
    Dataset updated
    Jul 16, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jinxu Li; Xiang Song; Jiangjiang Xia; Wei Shangguan; Xiaodong Zeng
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The attachment includes three folders:

    The first folder, "Data classification (testing and training)", consists of two subfolders, crown_radius and height. The crown_radius folder contains Excel data for three plant functional types (PFTs): temperate needleleaf trees (MN), temperate broadleaf trees (MB), and tropical broadleaf trees (TB). Each of these Excel files contains data on 19 soil factors and 22 climate factors, plus information such as crown_radius_m, mask, and stem_diameter_cm. The height folder is organized similarly; its contents correspond to Table 1 (Data summary) and Figure 3 for each PFT in the article.

    The second folder, "Feather importance", contains two Excel spreadsheets, crown_radius-FI and height-FI. The crown_radius-FI spreadsheet contains the feature importance values for the three plant functional types (PFTs): temperate needleleaf trees (MN), temperate broadleaf trees (MB), and tropical broadleaf trees (TB). The height-FI spreadsheet is organized similarly; its information corresponds to Figure 5 and Figure S3 in the article.

    The third folder, "program", contains two packages (make_model1 and make_model2) and a calling program, "Source program". The make_model1 package is mainly used to obtain the best parameters for model selection; the make_model2 package builds on the selection made by make_model1 to further analyze the specific FI values of the factors in the best model. The Source program makes the specific calls to the packages as required.

  5. Input-Output Data Sets Used in the Evaluation of the Two-Layer Soil Moisture...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Mar 3, 2023
    Cite
    U.S. EPA Office of Research and Development (ORD) (2023). Input-Output Data Sets Used in the Evaluation of the Two-Layer Soil Moisture and Flux Model [Dataset]. https://catalog.data.gov/dataset/input-output-data-sets-used-in-the-evaluation-of-the-two-layer-soil-moisture-and-flux-mode
    Dataset updated
    Mar 3, 2023
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    The Excel file contains the model input-output data sets that were used to evaluate the two-layer soil moisture and flux dynamics model. The model is original and was developed by Dr. Hantush by integrating the well-known Richards equation over the root layer and the lower vadose zone. The input-output data are used for: 1) numerical scheme verification by comparison against the HYDRUS model as a benchmark; 2) model validation by comparison against real site data; and 3) estimation of model predictive uncertainty and sources of modeling error. This dataset is associated with the following publication: He, J., M.M. Hantush, L. Kalin, and S. Isik. Two-Layer Numerical Model of Soil Moisture Dynamics: Model Assessment and Bayesian Uncertainty Estimation. Journal of Hydrology. Elsevier Science Ltd, New York, NY, USA, 613 part A: 128327, (2022).

  6. Data-analysis-EXCEL-POWER-BI

    • kaggle.com
    Updated Jul 27, 2023
    Cite
    Ahmed Samir (2023). Data-analysis-EXCEL-POWER-BI [Dataset]. https://www.kaggle.com/datasets/ahmedsamir11111/data-analysis-excel-power-bi/discussion
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 27, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ahmed Samir
    Description

    In the beginning, the case was just company data that did not reveal any useful information to decision-makers. After collecting revenues and expenses over several months, answers to a number of questions were needed so that important decisions could be made on data rather than intuition.

    The questions, about revenue and expenses:
    - What are the total sales and profit for the whole period? What is the total number of products sold? What is the net profit?
    - In which month was the highest share of revenue achieved, and which day in that month had the largest revenue?
    - In which month was the highest share of expenses incurred, and which day in that month had the largest expenses?
    - How did expenditures change from month to month? What was the percentage change in net profit over the months?

    About distribution:
    - What is the number of products sold each month in the largest state?
    - Which are the top 3 states buying products during the two years?

    Comparisons:
    - Sales by sales method?
    - Sales of men's vs. women's products?
    - Profit by retailer?

    What I did:
    - Understood the data
    - Preprocessed and cleaned the data, solving problems such as missing data or incorrectly typed data
    - Queried the data and made calculations such as "COGS" with Power Query (Excel)
    - Modeled the data and created measures with Power Pivot (Excel)
    - After processing and preparation, built pivot tables to answer the questions
    - Finally, built a dashboard in Power BI to visualize the results

  7. Replication Package - How Do Requirements Evolve During Elicitation? An...

    • zenodo.org
    bin, zip
    Updated Apr 21, 2022
    Cite
    Alessio Ferrari; Paola Spoletini; Sourav Debnath (2022). Replication Package - How Do Requirements Evolve During Elicitation? An Empirical Study Combining Interviews and App Store Analysis [Dataset]. http://doi.org/10.5281/zenodo.6472498
    Available download formats: bin, zip
    Dataset updated
    Apr 21, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alessio Ferrari; Paola Spoletini; Sourav Debnath
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the replication package for the paper titled "How Do Requirements Evolve During Elicitation? An Empirical Study Combining Interviews and App Store Analysis", by Alessio Ferrari, Paola Spoletini and Sourav Debnath.

    The package contains the following folders and files.

    /R-analysis

    This folder contains the R implementations of the statistical tests included in the paper, together with the source .csv files used to produce the results. Each R file has the same title as the associated .csv file. The titles of the files reflect the RQs as they appear in the paper. The association between R files and tables in the paper is as follows:

    - RQ1-1-analyse-story-rates.R: Table 1, user story rates

    - RQ1-1-analyse-role-rates.R: Table 1, role rates

    - RQ1-2-analyse-story-category-phase-1.R: Table 3, user story category rates in phase 1 compared to original rates

    - RQ1-2-analyse-role-category-phase-1.R: Table 5, role category rates in phase 1 compared to original rates

    - RQ2.1-analysis-app-store-rates-phase-2.R: Table 8, user story and role rates in phase 2

    - RQ2.2-analysis-percent-three-CAT-groups-ph1-ph2.R: Table 9, comparison of the categories of user stories in phase 1 and 2

    - RQ2.2-analysis-percent-two-CAT-roles-ph1-ph2.R: Table 10, comparison of the categories of roles in phase 1 and 2.

    The .csv files used for the statistical tests are also used to produce boxplots. The association between boxplot figures and files is as follows.

    - RQ1-1-story-rates.csv: Figure 4

    - RQ1-1-role-rates.csv: Figure 5

    - RQ1-2-categories-phase-1.csv: Figure 8

    - RQ1-2-role-category-phase-1.csv: Figure 9

    - RQ2-1-user-story-and-roles-phase-2.csv: Figure 13

    - RQ2.2-percent-three-CAT-groups-ph1-ph2.csv: Figure 14

    - RQ2.2-percent-two-CAT-roles-ph1-ph2.csv: Figure 17

    - IMG-only-RQ2.2-us-category-comparison-ph1-ph2.csv: Figure 15

    - IMG-only-RQ2.2-frequent-roles.csv: Figure 18

    NOTE: The last two .csv files do not have an associated statistical test, but are used solely to produce boxplots.

    /Data-Analysis

    This folder contains all the data used to answer the research questions.

    RQ1.xlsx: includes all the data associated to RQ1 subquestions, two tabs for each subquestion (one for user stories and one for roles). The names of the tabs are self-explanatory of their content.

    RQ2.1.xlsx: includes all the data for the RQ2.1 subquestion. Specifically, it includes the following tabs:

    - Data Source-US-category: for each category of user story, and for each analyst, there are two lines. The first reports the number of user stories in that category for phase 1, and the second reports the number of user stories in that category for phase 2, considering the specific analyst.

    - Data Source-role: for each category of role, and for each analyst, there are two lines. The first reports the number of user stories in that role for phase 1, and the second reports the number of user stories in that role for phase 2, considering the specific analyst.

    - RQ2.1 rates: reports the final rates for RQ2.1.

    NOTE: The other tabs are used to support the computation of the final rates.

    RQ2.2.xlsx: includes all the data for the RQ2.2 subquestion. Specifically, it includes the following tabs:

    - Data Source-US-category: same as RQ2.1.xlsx

    - Data Source-role: same as RQ2.1.xlsx

    - RQ2.2-category-group: comparison between groups of categories in the different phases, used to produce Figure 14

    - RQ2.2-role-group: comparison between role groups in the different phases, used to produce Figure 17

    - RQ2.2-specific-roles-diff: difference between specific roles, used to produce Figure 18

    NOTE: the other tabs are used to support the computation of the values reported in the tabs above.

    RQ2.2-single-US-category.xlsx: includes the data for the RQ2.2 subquestion associated with single categories of user stories. A separate file is used given the complexity of the computations.

    - Data Source-US-category: same as RQ2.1.xlsx

    - Totals: total number of user stories for each analyst in phase 1 and phase 2

    - Results-Rate-Comparison: difference between rates of user stories in phase 1 and phase 2, used to produce the file "img/IMG-only-RQ2.2-us-category-comparison-ph1-ph2.csv", which is in turn used to produce Figure 15

    - Results-Analysts: number of analysts using each novel category produced in phase 2, used to produce Figure 16.

    NOTE: the other tabs are used to support the computation of the values reported in the tabs above.

    RQ2.3.xlsx: includes the data for the RQ2.3 subquestion. Specifically, it includes the following tabs:

    - Data Source-US-category: same as RQ2.1.xlsx

    - Data Source-role: same as RQ2.1.xlsx

    - RQ2.3-categories: novel categories produced in phase 2, used to produce Figure 19

    - RQ2-3-most-frequent-categories: most frequent novel categories

    /Raw-Data-Phase-I

    The folder contains one Excel file for each analyst, s1.xlsx...s30.xlsx, plus the file of the original user stories with annotations (original-us.xlsx). Each file contains two tabs:

    - Evaluation: includes the annotation of the user stories as existing user story in the original categories (annotated with "E"), novel user story in a certain category (refinement, annotated with "N"), and novel user story in novel category (Name of the category in column "New Feature"). **NOTE 1:** It should be noticed that in the paper the case "refinement" is said to be annotated with "R" (instead of "N", as in the files) to make the paper clearer and easy to read.

    - Roles: roles used in the user stories, and count of the user stories belonging to a certain role.

    /Raw-Data-Phaes-II

    The folder contains one Excel file for each analyst, s1.xlsx...s30.xlsx. Each file contains two tabs:

    - Analysis: includes the annotation of the user stories as belonging to an existing original category (X), or to categories introduced after interviews, or to categories introduced after app-store-inspired elicitation (name of category in "Cat. Created in PH1"), or to entirely novel categories (name of category in "New Category").

    - Roles: roles used in the user stories, and count of the user stories belonging to a certain role.

    /Figures

    This folder includes the figures reported in the paper. The boxplots are generated from the data using the tool http://shiny.chemgrid.org/boxplotr/. The histograms and other plots are produced with Excel, and are also reported in the Excel files listed above.

  8. Oldrieve Excel File Datebase for Computation 3018487 -- Teaching K-3...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Sep 24, 2024
    Cite
    Oldrieve, Richard (2024). Oldrieve Excel File Datebase for Computation 3018487 -- Teaching K-3 Multi-Digit Arithmetic Computation to Students with Slow Language Processing [Dataset]. http://doi.org/10.7910/DVN/PDHFKV
    Dataset updated
    Sep 24, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Oldrieve, Richard
    Description

    The attached Excel file contains the database for Studies A & B for the article submitted to the journal "Computation" with submission number 3018487. It also contains data from pilot testing the Blended Arithmetic Curriculum in two classrooms that participated in Study B. There is a pre-test that was administered in January and a post-test given in February covering Chapter 7, the chapter where students learn to compute 2-digit by 2-digit numbers focusing on the "Limited Facts" of adding on 1, adding on 0, adding 5+5, 9+1, as well as 7+7, 7+8, 8+7, and 8+8. The students who completed these seven chapters with accuracy and speed did quite well on the urban school district's 2nd grade math proficiency test. Unfortunately, the section of the paper containing the results of the Chapter 7 assessment had to be cut because it was confusing to explain and reviewers wanted the article shortened.

  9. Excel generated epidemic curves for the paper "A Simple, SIR-like but...

    • data.mendeley.com
    Updated Dec 12, 2020
    Cite
    Xiaoping Liu (2020). Excel generated epidemic curves for the paper "A Simple, SIR-like but Individual-Based Epidemic Model: Application in Comparison of COVID-19 in New York City and Wuhan" [Dataset]. http://doi.org/10.17632/3vg2r3ymgk.3
    Dataset updated
    Dec 12, 2020
    Authors
    Xiaoping Liu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    New York, Wuhan
    Description

    The author calculated and plotted all epidemic curves in Excel for the paper "A Simple, SIR-like but Individual-Based Epidemic Model: Application in Comparison of COVID-19 in New York City and Wuhan". The calculated curves are shown in Figures 2-11, each placed in a separate sheet of the Excel file. The values of the parameters l and c are placed in two cells marked in yellow, located in the top one or two rows on the left. After the two parameters are changed, the Excel file recalculates the four variables An, In, Rn and Tn from n=1 to N. The calculated values are listed in four columns of cells below the column labels An, In, Rn and Tn, respectively.

  10. Additional file 2: Table S2. of Comparison, alignment, and synchronization...

    • springernature.figshare.com
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated Jun 5, 2023
    Cite
    Edison Ong; Sirarat Sarntivijai; Simon Jupp; Helen Parkinson; Yongqun He (2023). Additional file 2: Table S2. of Comparison, alignment, and synchronization of cell line information between CLO and EFO [Dataset]. http://doi.org/10.6084/m9.figshare.5728968.v1
    Available download formats: xlsx
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    figshare
    Authors
    Edison Ong; Sirarat Sarntivijai; Simon Jupp; Helen Parkinson; Yongqun He
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Final EFO-CLO alignment result. The 874 EFO-CLO mapped cell lines were aligned and merged into CLO (Tab. 1 in the Excel file), and 344 EFO unique immortalized permanent cell lines were added to CLO (Tab. 2 in the Excel file). The file is stored in Microsoft Excel spreadsheet (xlsx) format. (XLSX 54 kb)

  11. Hospital Annual Financial Data - Selected Data & Pivot Tables

    • data.chhs.ca.gov
    • data.ca.gov
    • +5more
    csv, data, doc, html +4
    Updated Apr 23, 2025
    Cite
    Department of Health Care Access and Information (2025). Hospital Annual Financial Data - Selected Data & Pivot Tables [Dataset]. https://data.chhs.ca.gov/dataset/hospital-annual-financial-data-selected-data-pivot-tables
    Available download formats: csv, data, doc, html, pdf, xls, xlsx, zip (multiple files)
    Dataset updated
    Apr 23, 2025
    Dataset authored and provided by
    Department of Health Care Access and Information
    Description

    On an annual basis (individual hospital fiscal year), individual hospitals and hospital systems report detailed facility-level data on services capacity, inpatient/outpatient utilization, patients, revenues and expenses by type and payer, balance sheet and income statement.

Due to the large size of the complete dataset, a selected set of data representing a wide range of commonly used data items has been created that can be easily managed and downloaded. The selected data file includes general hospital information, utilization data by payer, revenue data by payer, expense data by natural expense category, financial ratios, and labor information.

    There are two groups of data contained in this dataset: 1) Selected Data - Calendar Year: To make it easier to compare hospitals by year, hospital reports with report periods ending within a given calendar year are grouped together. The Pivot Tables for a specific calendar year are also found here. 2) Selected Data - Fiscal Year: Hospital reports with report periods ending within a given fiscal year (July-June) are grouped together.
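A minimal sketch of the fiscal-year grouping described above (reports grouped by the July–June fiscal year in which their report period ends). The column names and values are illustrative placeholders, not the dataset's actual schema:

```python
import pandas as pd

# Hypothetical example: assign each hospital report to a July-June fiscal year
# based on its report-period end date.
reports = pd.DataFrame({
    "hospital": ["A", "B", "C"],
    "period_end": pd.to_datetime(["2023-06-30", "2023-07-15", "2024-03-31"]),
})

# A report ending July-December belongs to the fiscal year starting that July;
# one ending January-June belongs to the fiscal year that started the prior July.
reports["fiscal_year"] = reports["period_end"].apply(
    lambda d: f"FY{d.year}-{d.year + 1}" if d.month >= 7 else f"FY{d.year - 1}-{d.year}"
)
print(reports[["hospital", "fiscal_year"]])
```

Calendar-year grouping works the same way, keying on `period_end.year` instead.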

  12. d

    Spreadsheet of best models for each downscaled climate dataset and for all...

    • catalog.data.gov
    • data.usgs.gov
    Updated Jul 6, 2024
    + more versions
    U.S. Geological Survey (2024). Spreadsheet of best models for each downscaled climate dataset and for all downscaled climate datasets considered together (Best_model_lists.xlsx) [Dataset]. https://catalog.data.gov/dataset/spreadsheet-of-best-models-for-each-downscaled-climate-dataset-and-for-all-downscaled-clim
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Surveyhttp://www.usgs.gov/
    Description

    The South Florida Water Management District (SFWMD) and the U.S. Geological Survey have developed projected future change factors for precipitation depth-duration-frequency (DDF) curves at 174 National Oceanic and Atmospheric Administration (NOAA) Atlas 14 stations in central and south Florida. The change factors were computed as the ratio of projected future to historical extreme precipitation depths fitted to extreme precipitation data from various downscaled climate datasets using a constrained maximum likelihood (CML) approach. The change factors correspond to the period 2050-2089 (centered in the year 2070) as compared to the 1966-2005 historical period. A Microsoft Excel workbook is provided that tabulates best models for each downscaled climate dataset and for all downscaled climate datasets considered together. Best models were identified based on how well the models capture the climatology and interannual variability of four climate extreme indices using the Model Climatology Index (MCI) and the Model Variability Index (MVI) of Srivastava and others (2020). The four indices consist of annual maxima consecutive precipitation for durations of 1, 3, 5, and 7 days compared against the same indices computed based on the PRISM and SFWMD gridded precipitation datasets for two climate regions: climate region 4 in South Central Florida, and climate region 5 in South Florida. The PRISM dataset is based on the Parameter-elevation Relationships on Independent Slopes Model interpolation method of Daly and others (2008). The South Florida Water Management District’s (SFWMD) precipitation super-grid is a gridded precipitation dataset developed by modelers at the agency for use in hydrologic modeling (SFWMD, 2005). This dataset is considered by the SFWMD as the best available gridded rainfall dataset for south Florida. Best models were selected based on MCI and MVI evaluated within each individual downscaled dataset. 
In addition, best models were selected by comparison across datasets and are referred to as "ALL DATASETS" hereafter. Due to the small sample size, all models in the Weather Research and Forecasting Model (JupiterWRF) dataset were considered best models.
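The change-factor definition above reduces to a simple ratio once the extreme precipitation depths have been fitted. The depths below are made-up placeholders; the real values come from distributions fitted with the constrained maximum likelihood (CML) approach described in the text:

```python
# Illustrative sketch only: change factor = projected future (2050-2089) depth
# divided by historical (1966-2005) depth, per duration/return period.
historical_depth_in = [6.2, 8.1, 9.7]   # e.g. fitted 10-, 25-, 100-yr depths (invented)
future_depth_in = [7.1, 9.5, 11.8]      # projected depths for the same durations (invented)

change_factors = [f / h for f, h in zip(future_depth_in, historical_depth_in)]
print([round(cf, 3) for cf in change_factors])
```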

  13. o

    BY-COVID - WP5 - Baseline Use Case: SARS-CoV-2 vaccine effectiveness...

    • explore.openaire.eu
    Updated Jan 26, 2023
    Francisco Estupiñán-Romero; Nina Van Goethem; Marjan Meurisse; Javier González-Galindo; Enrique Bernal-Delgado (2023). BY-COVID - WP5 - Baseline Use Case: SARS-CoV-2 vaccine effectiveness assessment - Common Data Model Specification [Dataset]. http://doi.org/10.5281/zenodo.6913045
    Explore at:
    Dataset updated
    Jan 26, 2023
    Authors
    Francisco Estupiñán-Romero; Nina Van Goethem; Marjan Meurisse; Javier González-Galindo; Enrique Bernal-Delgado
    Description

This publication corresponds to the Common Data Model (CDM) specification of the Baseline Use Case proposed in T.5.2 (WP5) in the BY-COVID project on “SARS-CoV-2 Vaccine(s) effectiveness in preventing SARS-CoV-2 infection.”

Research Question: “How effective have the SARS-CoV-2 vaccination programmes been in preventing SARS-CoV-2 infections?”

Intervention (exposure): COVID-19 vaccine(s)

Outcome: SARS-CoV-2 infection

Subgroup analysis: Vaccination schedule (type of vaccine)

Study Design: An observational retrospective longitudinal study to assess the effectiveness of the SARS-CoV-2 vaccine in preventing SARS-CoV-2 infections using routinely collected social, health and care data from several countries. A causal model was established using Directed Acyclic Graphs (DAGs) to map domain knowledge, theories and assumptions about the causal relationship between exposure and outcome. The DAG developed for the research question of interest is shown below.

Cohort definition: All people eligible to be vaccinated (from 5 to 115 years old, included) or with at least one dose of a SARS-CoV-2 vaccine (any of the available brands), having or not a previous SARS-CoV-2 infection.

Inclusion criteria: All people vaccinated with at least one dose of the COVID-19 vaccine (any available brand) in an area of residence. Any person eligible to be vaccinated (from 5 to 115 years old, included) with a positive diagnosis (irrespective of the type of test) for SARS-CoV-2 infection (COVID-19) during the period of study.

Exclusion criteria: People not eligible for the vaccine (from 0 to 4 years old, included).

Study period: From the date of the first documented SARS-CoV-2 infection in each country to the most recent date on which data are available at the time of analysis; roughly from 01-03-2020 to 30-06-2022, depending on the country.
Files included in this publication:

Causal model (responding to the research question):
SARS-CoV-2 vaccine effectiveness causal model v.1.0.0 (HTML) - Interactive report showcasing the structural causal model (DAG) to answer the research question
SARS-CoV-2 vaccine effectiveness causal model v.1.0.0 (QMD) - Quarto RMarkdown script to produce the structural causal model

Common data model specification (following the causal model):
SARS-CoV-2 vaccine effectiveness data model specification (XLSX) - Human-readable version (Excel)
SARS-CoV-2 vaccine effectiveness data model specification dataspice (HTML) - Human-readable version (interactive report)
SARS-CoV-2 vaccine effectiveness data model specification dataspice (JSON) - Machine-readable version

Synthetic dataset (complying with the common data model specification):
SARS-CoV-2 vaccine effectiveness synthetic dataset (CSV) [UTF-8, pipe | separated, N~650,000 registries]
SARS-CoV-2 vaccine effectiveness synthetic dataset EDA (HTML) - Interactive report of the exploratory data analysis (EDA) of the synthetic dataset
SARS-CoV-2 vaccine effectiveness synthetic dataset EDA (JSON) - Machine-readable version of the exploratory data analysis (EDA) of the synthetic dataset
SARS-CoV-2 vaccine effectiveness synthetic dataset generation script (IPYNB) - Jupyter notebook with Python scripting and commenting to generate the synthetic dataset

#### Baseline Use Case: SARS-CoV-2 vaccine effectiveness assessment - Common Data Model Specification v.1.1.0 change log ####

Updated the causal model to eliminate the consideration of 'vaccination_schedule_cd' as a mediator
Adjusted the study period to be consistent with the Study Protocol
Updated 'sex_cd' as a required variable
Added 'chronic_liver_disease_bl' as a comorbidity at the individual level
Updated 'socecon_lvl_cd' at the area level as a recommended variable
Added crosswalks for the definition of 'chronic_liver_disease_bl' in a separate sheet
Updated the 'vaccination_schedule_cd' reference to the 'Vaccine' node in the updated DAG
Updated the description of the 'confirmed_case_dt' and 'previous_infection_dt' variables to clarify the definition and the need for a single registry per person

The scripts (software) accompanying the data model specification are offered "as is", without warranty and disclaiming liability for damages resulting from using them. The software is released under the CC-BY-4.0 licence, which permits you to use the content for almost any purpose (but does not grant you any trademark permissions), so long as you note the licence and give credit.
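The synthetic dataset is described as a UTF-8, pipe-separated CSV. A minimal sketch of loading such a file with pandas, using an inline sample in place of the real ~650,000-row file; the column names are placeholders except 'sex_cd' and 'chronic_liver_disease_bl', which the change log mentions:

```python
import io

import pandas as pd

# Inline stand-in for the pipe-separated synthetic dataset (invented rows).
sample = io.StringIO(
    "person_id|sex_cd|chronic_liver_disease_bl|confirmed_case_dt\n"
    "1|F|0|2021-02-11\n"
    "2|M|1|\n"           # empty date -> parsed as NaT (no confirmed case)
)
df = pd.read_csv(sample, sep="|", parse_dates=["confirmed_case_dt"])
print(df.dtypes)
```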

  14. Store Data Analysis using MS excel

    • kaggle.com
    Updated Mar 10, 2024
    NisshaaChoudhary (2024). Store Data Analysis using MS excel [Dataset]. https://www.kaggle.com/datasets/nisshaachoudhary/store-data-analysis-using-ms-excel/code
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 10, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    NisshaaChoudhary
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

Vrinda Store: Interactive MS Excel dashboard (Feb 2024 - Mar 2024). The owner of Vrinda Store wants to create an annual sales report for 2022 so that employees can understand their customers and grow sales further. Questions asked by the owner of Vrinda Store are as follows: 1) Compare the sales and orders using a single chart. 2) Which month got the highest sales and orders? 3) Who purchased more in 2022 - women or men? 4) What are the different order statuses in 2022?

And some other questions related to the business. The owner of Vrinda Store wanted a visual story of their data, one that depicts the real-time progress and sales insights of the store. This project is an MS Excel dashboard which presents an interactive visual story to help the owner and employees increase their sales. Tasks performed: data cleaning, data processing, data analysis, data visualization, reporting. Tool used: MS Excel. Skills: Data Analysis · Data Analytics · MS Excel · Pivot Tables
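The dashboard answers these questions with pivot tables; the same aggregation can be sketched in code. The mini-table below is invented for illustration (the workbook's actual schema is not published here):

```python
import pandas as pd

# Hypothetical mini-version of the store data: answering "which month got the
# highest sales?" with a group-by, as an Excel pivot table would.
orders = pd.DataFrame({
    "month": ["Jan", "Jan", "Feb", "Mar", "Mar", "Mar"],
    "amount": [120, 80, 150, 90, 110, 130],
})
monthly_sales = orders.groupby("month", sort=False)["amount"].sum()
print(monthly_sales.idxmax(), monthly_sales.max())  # month with highest sales
```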

  15. r

    Data from: Event conceptualisation and aspect in L2 English and Persian: An...

    • researchdata.se
    • demo.researchdata.se
    Updated Nov 7, 2019
    Somaje Abdollahian Barough (2019). Event conceptualisation and aspect in L2 English and Persian: An application of the Heidelberg-Paris model [Dataset]. http://doi.org/10.5878/wz3s-wt38
    Explore at:
    (10147845)Available download formats
    Dataset updated
    Nov 7, 2019
    Dataset provided by
    Stockholm University
    Authors
    Somaje Abdollahian Barough
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0)https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Time period covered
    Aug 1, 2010 - Jul 31, 2013
    Area covered
    Islamic Republic of, Iran, Sweden, United States, United Kingdom
    Description

    The data have been used in an investigation for a PhD thesis in English Linguistics on similarities and differences in the use of the progressive aspect in two different language systems, English and Persian, both of which have the grammaticalised progressive. It is an application of the Heidelberg-Paris model of investigation into the impact of the progressive aspect on event conceptualisation. It builds on an analysis of single event descriptions at sentence level and re-narrations of a film clip at discourse level, as presented in von Stutterheim and Lambert (2005) DOI: 10.1515/9783110909593.203; Carroll and Lambert (2006: 54–73) http://libris.kb.se/bib/10266700; and von Stutterheim, Andermann, Carroll, Flecken & Schmiedtová (2012) DOI: 10.1515/ling-2012-0026. However, there are system-based typological differences between these two language systems due to the absence/presence of the imperfective-perfective categories, respectively. Thus, in addition to the description of the status of the progressive aspect in English and Persian and its impact on event conceptualisation, an important part of the investigation is the analysis of the L2 English speakers’ language production as the progressives in the first languages, L1s, exhibit differences in their principles of use due to the typological differences. The question of importance in the L2 context concerns the way they conceptualise ongoing events when the language systems are different, i.e. whether their language production is conceptually driven by their first language Persian.

The data consist of two data sets, as the study includes two linguistic experiments, Experiment 1 and Experiment 2. The data for both experiments were collected by email. Separate forms of instructions and language background questions were prepared for the six different informant groups (three speaker groups and two experimental tasks), and a Nelson English test (https://www.worldcat.org/isbn/9780175551972) on English proficiency was selected and modified for the L2 English speaker group in Experiment 2. Nelson English tests are published in Fowler, W.S. & Coe, N. (1976). Nelson English tests. Middlesex: Nelson and Sons. The test battery provides tests for all levels of proficiency. The graded tests are compiled in ten sets from elementary to very advanced level. Each set includes four graded tests, i.e. A, B, C, and D, resulting in 40 separate tests, each with 50 multiple-choice questions. The test entitled 250C was selected for this project; it belongs to slot 19 of the 40 slots in the total battery. The multiple-choice questions were checked with a native English professional, and 5 inadequate questions relevant to pronunciation were omitted. In addition, a few modifications of the grammar questions were made, aiming to include questions that involve a contrast for the Persian L2 English learner with respect to the grammars of the two languages. The omissions and modifications provide an appropriate grammar test for very advanced Iranian learners of L2 English who have learnt the language in a classroom setting. The data set collected from the informants is characterised as follows: the data from Experiment 1 function as the basis for the description of the progressive aspect in English, Persian and L2 English, while the data from Experiment 2 form the basis for the analysis of its use in a long stretch of discourse/language production for the three speaker groups.
The parameters selected for the investigation comprised, first, phasal decomposition, which involves the use of the progressive in unrelated single motion events and narratives, and uses of begin/start in narratives. Second, granularity in narratives, which relates to the overall amount of language production in narratives. Third, event boundedness (encoded in the use of 2-state verbs and 1-state verbs with an endpoint adjunct) partly in single motion events and partly in temporal shift in narratives. Temporal shift is defined as follows: Events in the narrative which are bounded shift the time line via a right boundary; events with a left boundary also shift the time line, even if they are unbounded. Fourth, left boundary comprising the use of begin/start and try in narratives. Finally, temporal structuring, which involves the use of bounded versus unbounded events preceding the temporal adverbial then in narratives (The tests are described in the documentation files aspectL2English_Persian_Exp2Chi-square-tests-in-SPSS.docx and aspectL2English_Persian_Exp2Chi-square-tests-in-SPSS.rtf). In both experiments the participants watched a video, one relevant for single event descriptions, the other relevant for re-narration of a series of events. Thus, two different videos with stimuli for the different kinds of experimental tasks were used. For Experiment 1, a video of 63 short film clips presenting unrelated single events was provided by Professor Christiane von Stutterheim, Heidelberg University Language & Cognition (HULC) Lab, at Heidelberg University, German, https://www.hulclab.eu/. For Experiment 2, an animation called Quest produced by Thomas Stellmach 1996 was used. It is available online at http://www.youtube.com/watch?v=uTyev6OaThg. Both stimuli have been used in the previous investigations on different languages by the research groups associated with the HULC Lab. 
The informants were asked to describe the events seen in the stimuli videos, to record their language production and to send it to the researcher. For Experiment 2, most of the L1 English data were provided by Prof. von Stutterheim, Heidelberg University, who made available 34 re-narrations of the film Quest in English; 24 of them were selected for the present investigation. The project used six different informant groups, i.e. fully separate groups for the two experiments. The data from single event descriptions in Experiment 1 were analysed quantitatively in Excel. The re-narrations of Experiment 2 were coded in NVivo 10 (2014), providing frequencies of various parametrical features (Ltd, Nv. (2014). NVivo QSR International Pty Ltd, Version 10. Doncaster, Australia: QSR International). The numbers from NVivo 10 were analysed statistically in Excel and SPSS (2017). The tools are appropriate for this research: Excel is well suited to the smaller data load in Experiment 1, while NVivo 10 is practical for the large amount of data and parameters in Experiment 2. Notably, NVivo 10 enabled the analysis of the three data sets to take place in the same manner once the categories of analysis and parameters had been defined under different nodes. As the results were to be extracted in the same fashion from each data set, the L1 English data received from Heidelberg for Experiment 2 were re-analysed according to the criteria employed in this project. Yet, the analysis in the project conforms to the criteria used earlier in the model.
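The chi-square tests mentioned above (run in SPSS) compare frequency counts across groups, e.g. bounded vs unbounded events per speaker group. A minimal sketch of that computation; the 2x2 counts below are invented to show the shape of such a test, not the study's data:

```python
# Invented 2x2 contingency table of event codings per speaker group.
table = [
    [40, 25],  # e.g. group 1: bounded, unbounded
    [22, 43],  # e.g. group 2: bounded, unbounded
]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected,
# where expected = row_total * col_total / n.
chi2 = sum(
    (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
    / (row_totals[i] * col_totals[j] / n)
    for i in range(2)
    for j in range(2)
)
# For a 2x2 table (1 degree of freedom), chi2 > 3.841 means p < 0.05.
print(round(chi2, 2))
```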

  16. Data from: "Ecophysiological variation in two provenances of Pinus flexilis...

    • osti.gov
    Updated Dec 31, 2020
    + more versions
    Castanha, Cristina; Germino, Matthew J.; Kueppers, Lara M.; Reinhardt, Keith (2020). Data from: "Ecophysiological variation in two provenances of Pinus flexilis seedlings across an elevation gradient from forest to alpine" [Dataset]. https://www.osti.gov/dataexplorer/biblio/1804122-data-from-ecophysiological-variation-two-provenances-pinus-flexilis-seedlings-across-elevation-gradient-from-forest-alpine
    Explore at:
    Dataset updated
    Dec 31, 2020
    Dataset provided by
    United States Department of Energyhttp://energy.gov/
    Environmental System Science Data Infrastructure for a Virtual Ecosystem; Subalpine and Alpine Species Range Shifts with Climate Change: Temperature and Soil Moisture Manipulations to Test Species and Population Responses (Alpine Treeline Warming Experiment)
    Authors
    Castanha, Cristina; Germino, Matthew J.; Kueppers, Lara M.; Reinhardt, Keith
    Description

This archive contains data used to support conclusions drawn in “Ecophysiological variation in two provenances of Pinus flexilis seedlings across an elevation gradient from forest to alpine”, by Reinhardt et al., 2011. Data were collected over one summer season in plots within the Alpine Treeline Warming Experiment (ATWE), before climate manipulations began. The experiment was located on Niwot Ridge, in the Front Range of the Colorado Rocky Mountains. This data package includes five comma-separated-values (.csv) files, five Microsoft Excel (.xlsx) files, one .pdf file, and two types of geospatial files: keyhole markup language (.kml) and ESRI shapefiles (.shp). The .csv files can be opened using any simple text-editing software (such as Notepad and TextEdit), R, and Microsoft Excel. The .xlsx files can only be opened using Microsoft Excel. The .pdf file can be opened using Adobe Acrobat Reader or any other compatible file viewing software. The .kml file can be opened using Google Earth and Google Maps, and shapefiles can be opened using any software compatible with the file type, such as ESRI’s ArcGIS suite and QGIS. The archived data contain gas exchange and plant physiology measurements and non-structural carbohydrate data, among others. Geospatial files are also provided for additional locational context. The files and their contents in this data package are summarized under "Data Summary" in the included Data User's Guide. All files (excluding geospatial) are available in both Microsoft Excel and .csv format, as indicated in the Data Summary list.

Climate change is predicted to cause upward shifts in forest tree distributions, which will require seedling recruitment beyond current forest boundaries.
However, predicting the likelihood of successful plant establishment beyond current species’ ranges under changing climate is complicated by the interaction of genetic and environmental controls on seedling establishment. To determine how genetics and climate may interact to affect seedling establishment, we transplanted recently germinated seedlings from high- and low-elevation provenances (HI and LO, respectively) of Pinus flexilis in common gardens arrayed along an elevation and canopy gradient from subalpine forest into the alpine zone and examined differences in physiology and morphology between provenances and among sites. Plant dry mass, projected leaf area and shoot:root ratios were 12–40% greater in LO compared with HI seedlings at each elevation. There were no significant changes in these variables among sites except for decreased dry mass of LO seedlings in the alpine site. Photosynthesis, carbon balance (photosynthesis/respiration) and conductance increased >2× with elevation for both provenances, and were 35–77% greater in LO seedlings compared with HI seedlings. There were no differences in dark-adapted chlorophyll fluorescence (Fv/Fm) among sites or between provenances. Our results suggest that for P. flexilis seedlings, provenances selected for above-ground growth may outperform those selected for stress resistance in the absence of harsh climatic conditions, even well above the species’ range limits in the alpine zone. This indicates that forest genetics may be important to understanding and managing species’ range adjustments due to climate change.

  17. Z

    A dataset from a survey investigating disciplinary differences in data...

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    Updated Jul 12, 2024
    Gregory, Kathleen (2024). A dataset from a survey investigating disciplinary differences in data citation [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7555362
    Explore at:
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Haustein, Stefanie
    Ninkov, Anton Boudreau
    Peters, Isabella
    Ripp, Chantal
    Gregory, Kathleen
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    GENERAL INFORMATION

    Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation

    Date of data collection: January to March 2022

    Collection instrument: SurveyMonkey

    Funding: Alfred P. Sloan Foundation

    SHARING/ACCESS INFORMATION

    Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license

    Links to publications that cite or use the data:

    Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437

    Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data: A survey investigating disciplinary differences in data citation. Zenodo. https://doi.org/10.5281/zenodo.7555266

    DATA & FILE OVERVIEW

    File List

    Filename: MDCDatacitationReuse2021Codebookv2.pdf Codebook

    Filename: MDCDataCitationReuse2021surveydatav2.csv Dataset format in csv

    Filename: MDCDataCitationReuse2021surveydatav2.sav Dataset format in SPSS

    Filename: MDCDataCitationReuseSurvey2021QNR.pdf Questionnaire

    Additional related data collected that was not included in the current data package: Open ended questions asked to respondents

    METHODOLOGICAL INFORMATION

    Description of methods used for collection/generation of data:

    The development of the questionnaire (Gregory et al., 2022) was centered around the creation of two main branches of questions for the primary groups of interest in our study: researchers that reuse data (33 questions in total) and researchers that do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.

We received 3,632 responses, 2,509 of which were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final total contains 2,492 complete responses and an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails and opt-outs (n=5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).

    Methods for processing the data:

    Results were downloaded from SurveyMonkey in CSV format and were prepared for analysis using Excel and SPSS by recoding ordinal and multiple choice questions and by removing missing values.

    Instrument- or software-specific information needed to interpret the data:

The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded format in CSV. The Codebook is required to interpret the values.

    DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata

    Number of variables: 95

    Number of cases/rows: 2,492

    Missing data codes: 999 Not asked

    Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
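A minimal sketch of applying the documented missing-data convention: per the codebook, 999 means "Not asked", so it should be recoded to NaN before analysis. The column names here are placeholders, not the survey's actual variables:

```python
import numpy as np
import pandas as pd

# Invented two-question excerpt using the survey's missing-data code.
df = pd.DataFrame({"q1": [1, 999, 3], "q2": [999, 2, 2]})

# Recode the documented "Not asked" value (999) as missing.
df = df.replace(999, np.nan)
print(df.isna().sum())
```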

  18. Data for: A systematic review showed no performance benefit of machine...

    • search.datacite.org
    • data.mendeley.com
    Updated Mar 14, 2019
    Ben Van Calster (2019). Data for: A systematic review showed no performance benefit of machine learning over logistic regression for clinical prediction models [Dataset]. http://doi.org/10.17632/sypyt6c2mc
    Explore at:
    Dataset updated
    Mar 14, 2019
    Dataset provided by
    DataCitehttps://www.datacite.org/
    Mendeley
    Authors
    Ben Van Calster
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

The uploaded files are: 1) An Excel file containing 6 sheets, in the following order: "Data Extraction" (summarized final data extractions from the three reviewers involved), "Comparison Data" (data related to the comparisons investigated), "Paper level data" (summaries at paper level), "Outcome Event Data" (information on the number of events for every outcome investigated within a paper), "Tuning Classification" (data related to the manner of hyperparameter tuning of machine learning algorithms). 2) The R script used for the analysis. (To read the data, save the "Comparison Data", "Paper level data", and "Outcome Event Data" Excel sheets as txt files. In the R script, srpap refers to the "Paper level data" sheet, srevents refers to the "Outcome Event Data" sheet, and srcompx refers to the "Comparison Data" sheet.) 3) Supplementary Material, including the search string, tables of data, and figures. 4) PRISMA checklist items.

  19. f

    Data from: Consolidating and Managing Data for Drug Development within a...

    • figshare.com
    xlsx
    Updated May 30, 2023
    Arvin Moser; Alexander E. Waked; Joseph DiMartino (2023). Consolidating and Managing Data for Drug Development within a Pharmaceutical Laboratory: Comparing the Mapping and Reporting Tools from Software Applications [Dataset]. http://doi.org/10.1021/acs.oprd.1c00082.s002
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    ACS Publications
    Authors
    Arvin Moser; Alexander E. Waked; Joseph DiMartino
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    We present a perspective on drug development for the synthesis of an active pharmaceutical ingredient (e.g., agomelatine) within a commercial technology called Luminata and compare the results to the current method of consolidating the reaction data into Microsoft Excel. The Excel document becomes the ultimate repository of information extracted from multiple sources such as the electronic lab notebook, the laboratory information management system, the chromatography data system, in-house databases, and external data. The major needs of a pharmaceutical company are tracking the stages of multiple reactions, calculating the impurity carryover across the stages, and performing structure dereplication for an unknown impurity. As there is no standardized software available to link the different needs throughout the life cycle of process development, there is a demand for mapping tools to consolidate the route for an API synthesis and link it with analytical data while reducing transcription errors and maintaining an audit trail.

  20.

    Directed network analysis of 2-year-old and 4-year-old children

    • search.dataone.org
    • datasetcatalog.nlm.nih.gov
    Updated Dec 31, 2024
    Norikazu Hirose; Masanori Kato; Ayumi Maruyama (2024). Directed network analysis of 2-year-old and 4-year-old children [Dataset]. http://doi.org/10.5061/dryad.63xsj3vbx
    Explore at:
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Norikazu Hirose; Masanori Kato; Ayumi Maruyama
    Description

    This dataset includes triaxial acceleration data collected from 60 children (2- and 4-year-olds) across three childcare facilities during a 15-minute free play session. Dyadic (two-person) and triadic (three-person) peer relationships were analyzed using a novel directed network analysis method, offering insights into age and sex differences in peer interactions. The dataset includes normalized connection counts, demographic details, and detailed analyses of interaction directionality. This research validates the application of network analysis in early childhood studies, reducing observational biases and labor-intensive manual coding, and provides a framework for exploring complex social dynamics in naturalistic play settings.

    Participants: The study involved 60 children, with equal representation of 2- and 4-year-olds, across three childcare facilities. Informed consent was obtained from the participants' legal guardians following ethical guidelines.

    Data Collection: Participants wore wristwatch-style triaxial accelerometers (Silmee W22, TDK) during a 15-minute free play session. Acceleration data were recorded at 20 Hz to measure individual movement intensity and to derive dyadic and triadic peer relationships.

    Data Processing:

    Normalization: Acceleration data were aggregated into 1-second intervals and normalized using Sturges’ formula to account for individual variability.

    Network Analysis: Connections were quantified using directed graph analysis, identifying dyads and triads based on movement entropy thresholds. Entropy values ranged from 0 to 1, representing interaction strength.
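    The two processing steps just described, Sturges' formula for choosing bin counts and an entropy score scaled to [0, 1], can be sketched as follows. This is an illustration of the generic formulas only, not a reproduction of the authors' pipeline; the sample sizes are hypothetical.

    ```python
    import math
    from collections import Counter

    def sturges_bins(n_samples):
        """Number of histogram bins by Sturges' formula: ceil(log2(n)) + 1."""
        return math.ceil(math.log2(n_samples)) + 1

    def normalized_entropy(labels):
        """Shannon entropy of a discrete sequence, scaled to [0, 1]."""
        counts = Counter(labels)
        total = sum(counts.values())
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        max_h = math.log2(len(counts)) if len(counts) > 1 else 1.0
        return h / max_h

    # A 15-minute session aggregated to 1-second intervals yields 900 samples per child:
    print(sturges_bins(900))  # 11

    # A sequence alternating evenly between two intensity bins has maximal entropy:
    print(normalized_entropy([0, 1] * 450))  # 1.0
    ```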

    Statistical Analysis: Chi-square tests and ANOVAs were performed to analyze age, sex, and directional differences in peer r...

    Directed network analysis of 2-year-old and 4-year-old children

    https://doi.org/10.5061/dryad.63xsj3vbx

    Description of the data and file structure

    Dataset Overview: This dataset contains information about directed networks among 2-year-old and 4-year-old children. The data reflects interactions in terms of directed connections from one child to another, categorized by their unique identifiers and demographics.

    Files and variables

    File: Directed_network_of_2_and_4_YO_childlen.xlsx

    The Excel file consists of four sheets as follows:

    Sheet1: Dyad

    Sheet2: Triad

    Sheet3: Directed NW

    Sheet4: Directed NW among individuals

    Sheet description and variables

    Sheet 1: Dyad

    This sheet includes each child's age, gender, and the number of dyads for each facility. Using these data, the number of dyads was compared across ages and genders.

    Variables in each column

    • Facility...