32 datasets found
  1. Statistical Comparison of Two ROC Curves

    • figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Yaacov Petscher (2023). Statistical Comparison of Two ROC Curves [Dataset]. http://doi.org/10.6084/m9.figshare.860448.v1
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    figshare
    Authors
    Yaacov Petscher
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This Excel file performs a statistical test of whether two ROC curves differ from each other, based on the area under the curve (AUC). You'll need the coefficient from the table presented in the following article to enter the correct AUC value for the comparison: Hanley JA, McNeil BJ (1983) A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148:839-843.
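    As a sketch of the underlying method (assuming the two AUCs were derived from the same cases, and that their standard errors are already known, e.g. from the Hanley-McNeil 1982 formula), the comparison reduces to a z-test. The function and parameter names below are illustrative, not the spreadsheet's actual implementation:

    ```python
    import math

    def compare_auc(auc1, se1, auc2, se2, r):
        """Two-sided z-test for a difference between two correlated AUCs
        (Hanley & McNeil, 1983). r is the correlation between the two
        areas, read from the table in the cited article."""
        z = (auc1 - auc2) / math.sqrt(se1**2 + se2**2 - 2 * r * se1 * se2)
        p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p from the standard normal
        return z, p
    ```

    The `- 2*r*se1*se2` term is what distinguishes the same-cases comparison from a test of two independent AUCs.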

  2. UC_vs_US Statistic Analysis.xlsx

    • figshare.com
    xlsx
    Updated Jul 9, 2020
    Cite
    F. (Fabiano) Dalpiaz (2020). UC_vs_US Statistic Analysis.xlsx [Dataset]. http://doi.org/10.23644/uu.12631628.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jul 9, 2020
    Dataset provided by
    Utrecht University
    Authors
    F. (Fabiano) Dalpiaz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sheet 1 (Raw-Data): The raw data of the study, presenting the tagging results for the measures described in the paper. For each subject, it includes multiple columns:
    A. a sequential student ID
    B. an ID that defines a random group label and the notation
    C. the notation used: User Story or Use Cases
    D. the case they were assigned to: IFA, Sim, or Hos
    E. the subject's exam grade (total points out of 100); empty cells mean that the subject did not take the first exam
    F. a categorical representation of the grade (L/M/H), where H is greater than or equal to 80, M is at least 65 and below 80, and L otherwise
    G. the total number of classes in the student's conceptual model
    H. the total number of relationships in the student's conceptual model
    I. the total number of classes in the expert's conceptual model
    J. the total number of relationships in the expert's conceptual model
    K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, missing (see tagging scheme below)
    P. the researchers' judgement of how well the derivation process was explained by the student: well explained (a systematic mapping that can be easily reproduced), partially explained (vague indication of the mapping), or not present

    Tagging scheme:
    Aligned (AL) - A concept is represented as a class in both models, either with the same name or using synonyms or clearly linkable names;
    Wrongly represented (WR) - A class in the domain expert model is incorrectly represented in the student model, either (i) via an attribute, method, or relationship rather than a class, or (ii) using a generic term (e.g., "user" instead of "urban planner");
    System-oriented (SO) - A class in CM-Stud that denotes a technical implementation aspect, e.g., access control. Classes that represent a legacy system or the system under design (portal, simulator) are legitimate;
    Omitted (OM) - A class in CM-Expert that does not appear in any way in CM-Stud;
    Missing (MI) - A class in CM-Stud that does not appear in any way in CM-Expert.

    All the calculations and information provided in the following sheets originate from that raw data.

    Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection, including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.

    Sheet 3 (Size-Ratio): The number of classes within the student model divided by the number of classes within the expert model is calculated (describing the size ratio). We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus in this study is on the number of classes; however, we also provide the size ratio for the number of relationships between student and expert model.

    Sheet 4 (Overall): Provides an overview of all subjects regarding the encountered situations, completeness, and correctness. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model. It is calculated by dividing the number of aligned concepts (AL) by the sum of the number of aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model. It is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of the number of aligned concepts (AL), wrong representations (WR), and omitted concepts (OM). The overview is complemented with diverging stacked bar charts that illustrate correctness and completeness.
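    The two ratios defined above can be written as a minimal sketch (function and variable names are mine, not the spreadsheet's):

    ```python
    def correctness(al, wr, so, om):
        # aligned / (aligned + omitted + system-oriented + wrongly represented)
        return al / (al + om + so + wr)

    def completeness(al, wr, om):
        # (aligned + wrongly represented) / (aligned + wrongly represented + omitted)
        return (al + wr) / (al + wr + om)
    ```

    Note that Missing (MI) classes appear in neither denominator: both measures are computed against the expert model's classes.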

    For sheet 4, as well as for the following four sheets, diverging stacked bar charts are provided to visualize the effect of each of the independent and mediated variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (t-test) and effect size (Hedges' g) for both completeness and correctness are provided. Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html. The independent and moderating variables can be found as follows:
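    For readers who prefer code to the online tool, here is a plain-Python sketch of Hedges' g (pooled-SD Cohen's d with the usual small-sample correction); this mirrors the statistic itself, not the psychometrica.de implementation:

    ```python
    import math

    def hedges_g(x, y):
        """Hedges' g: standardized mean difference with small-sample
        bias correction, for two independent samples x and y."""
        n1, n2 = len(x), len(y)
        m1, m2 = sum(x) / n1, sum(y) / n2
        v1 = sum((v - m1) ** 2 for v in x) / (n1 - 1)  # sample variances
        v2 = sum((v - m2) ** 2 for v in y) / (n2 - 1)
        s_pooled = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
        d = (m1 - m2) / s_pooled
        j = 1 - 3 / (4 * (n1 + n2) - 9)  # bias-correction factor
        return j * d
    ```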

    Sheet 5 (By-Notation): Model correctness and model completeness are compared by notation - UC, US.

    Sheet 6 (By-Case): Model correctness and model completeness are compared by case - SIM, HOS, IFA.

    Sheet 7 (By-Process): Model correctness and model completeness are compared by how well the derivation process is explained - well explained, partially explained, not present.

    Sheet 8 (By-Grade): Model correctness and model completeness are compared by the exam grades, converted to the categorical values High, Medium, and Low.

  3. Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends

    • dataverse.harvard.edu
    Updated Jul 8, 2024
    Cite
    Georgios Boumis; Brad Peter (2024). Time-Series Matrix (TSMx): A visualization tool for plotting multiscale temporal trends [Dataset]. http://doi.org/10.7910/DVN/ZZDYM9
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 8, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Georgios Boumis; Brad Peter
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    TSMx is an R script that was developed to facilitate multi-temporal-scale visualizations of time-series data. The script requires only a two-column CSV of years and values to plot the slope of the linear regression line for all possible year combinations from the supplied temporal range. The outputs include a time-series matrix showing slope direction based on the linear regression, slope values plotted with colors indicating magnitude, and results of a Mann-Kendall test. The start year is indicated on the y-axis and the end year is indicated on the x-axis. In the example below, the cell in the top-right corner is the direction of the slope for the temporal range 2001–2019. The red line corresponds with the temporal range 2010–2019, and an arrow is drawn from the cell that represents that range. One cell is highlighted with a black border to demonstrate how to read the chart: that cell represents the slope for the temporal range 2004–2014. This publication entry also includes an Excel template that produces the same visualizations without a need to interact with any code, though minor modifications will need to be made to accommodate year ranges other than what is provided. TSMx for R was developed by Georgios Boumis; TSMx was originally conceptualized and created by Brad G. Peter in Microsoft Excel. Please refer to the associated publication: Peter, B.G., Messina, J.P., Breeze, V., Fung, C.Y., Kapoor, A. and Fan, P., 2024. Perspectives on modifiable spatiotemporal unit problems in remote sensing of agriculture: evaluating rice production in Vietnam and tools for analysis. Frontiers in Remote Sensing, 5, p.1042624. https://www.frontiersin.org/journals/remote-sensing/articles/10.3389/frsen.2024.1042624

    TSMx sample chart from the supplied Excel template.
    Data represent the productivity of rice agriculture in Vietnam as measured via EVI (enhanced vegetation index) from the NASA MODIS data product (MOD13Q1.V006).

    TSMx R script:

    # import packages
    library(dplyr)
    library(readr)
    library(ggplot2)
    library(tibble)
    library(tidyr)
    library(forcats)
    library(Kendall)

    options(warn = -1) # disable warnings

    # read data (.csv file with "Year" and "Value" columns)
    data <- read_csv("EVI.csv")

    # prepare row/column names for output matrices
    years <- data %>% pull("Year")
    r.names <- years[-length(years)]
    c.names <- years[-1]
    years <- years[-length(years)]

    # initialize output matrices
    sign.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
    pval.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))
    slope.matrix <- matrix(data = NA, nrow = length(years), ncol = length(years))

    # function to return remaining years given a start year
    getRemain <- function(start.year) {
      years <- data %>% pull("Year")
      start.ind <- which(data[["Year"]] == start.year) + 1
      remain <- years[start.ind:length(years)]
      return(remain)
    }

    # function to subset data for a start/end year combination
    splitData <- function(end.year, start.year) {
      keep <- which(data[["Year"]] >= start.year & data[["Year"]] <= end.year)
      batch <- data[keep, ]
      return(batch)
    }

    # function to fit linear regression and return slope direction
    fitReg <- function(batch) {
      trend <- lm(Value ~ Year, data = batch)
      slope <- coefficients(trend)[[2]]
      return(sign(slope))
    }

    # function to fit linear regression and return slope magnitude
    fitRegv2 <- function(batch) {
      trend <- lm(Value ~ Year, data = batch)
      slope <- coefficients(trend)[[2]]
      return(slope)
    }

    # function to implement Mann-Kendall (MK) trend test and return significance
    # the test is implemented only for n >= 8
    getMann <- function(batch) {
      if (nrow(batch) >= 8) {
        mk <- MannKendall(batch[["Value"]])
        pval <- mk[["sl"]]
      } else {
        pval <- NA
      }
      return(pval)
    }

    # function to return slope direction for all combinations given a start year
    getSign <- function(start.year) {
      remaining <- getRemain(start.year)
      combs <- lapply(remaining, splitData, start.year = start.year)
      signs <- lapply(combs, fitReg)
      return(signs)
    }

    # function to return MK significance for all combinations given a start year
    getPval <- function(start.year) {
      remaining <- getRemain(start.year)
      combs <- lapply(remaining, splitData, start.year = start.year)
      pvals <- lapply(combs, getMann)
      return(pvals)
    }

    # function to return slope magnitude for all combinations given a start year
    getMagn <- function(start.year) {
      remaining <- getRemain(start.year)
      combs <- lapply(remaining, splitData, start.year = start.year)
      magns <- lapply(combs, fitRegv2)
      return(magns)
    }

    # retrieve slope direction, MK significance, and slope magnitude
    signs <- lapply(years, getSign)
    pvals <- lapply(years, getPval)
    magns <- lapply(years, getMagn)

    # fill in output matrices
    dimension <- nrow(sign.matrix)
    for (i in 1:dimension) {
      sign.matrix[i, i:dimension] <- unlist(signs[i])
      pval.matrix[i, i:dimension] <- unlist(pvals[i])
      slope.matrix[i, i:dimension] <- unlist(magns[i])
    }
    sign.matrix <-...

  4. GHS Safety Fingerprints

    • figshare.com
    xlsx
    Updated Oct 25, 2018
    Cite
    Brian Murphy (2018). GHS Safety Fingerprints [Dataset]. http://doi.org/10.6084/m9.figshare.7210019.v3
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Oct 25, 2018
    Dataset provided by
    figshare
    Authors
    Brian Murphy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Spreadsheets targeted at the analysis of GHS safety fingerprints.

    Abstract: Over a 20-year period, the UN developed the Globally Harmonized System (GHS) to address international variation in chemical safety information standards. By 2014, the GHS had become widely accepted internationally and has become the cornerstone of OSHA's Hazard Communication Standard. Despite this progress, today we observe inconsistent results when different sources apply the GHS to specific chemicals, in terms of the GHS pictograms, hazard statements, precautionary statements, and signal words assigned to those chemicals. In order to assess the magnitude of this problem, this research extends the "chemical fingerprints" used in 2D chemical structure similarity analysis to GHS classifications. By generating a chemical safety fingerprint, the consistency of the GHS information for specific chemicals can be assessed. The problem is that sources for GHS information can differ. For example, the SDS for sodium hydroxide pellets found on Fisher Scientific's website displays two pictograms, while the GHS information for sodium hydroxide pellets on Sigma-Aldrich's website has only one pictogram. A chemical information tool which identifies such discrepancies within a specific chemical inventory can assist in maintaining the quality of the safety information needed to support safe work in the laboratory. The tools for this analysis will be scaled to the size of a moderately large research lab or a small chemistry department as a whole (between 1000 and 3000 chemical entities) so that labelling expectations within these universes can be established as consistently as possible. Most chemists are familiar with spreadsheet programs such as Excel and Google Sheets, which many chemists use daily. Through a monadal programming approach with these tools, the analysis of GHS information can be made possible for non-programmers. This monadal approach employs single spreadsheet functions to analyze the collected data rather than long programs, which can be difficult to debug and maintain. Another advantage of this approach is that the single monadal functions can be mixed and matched to meet new goals as information needs about the chemical inventory evolve over time. These monadal functions are used to convert GHS information into binary strings of data called "bitstrings". This approach is also used when comparing chemical structures. The binary approach makes data analysis more manageable, as GHS information comes in a variety of formats, such as pictures or alphanumeric strings, which are difficult to compare on their face. Bitstrings generated from the GHS information can be compared using an operator such as the Tanimoto coefficient to yield values ranging from 0, for strings that have no similarity, to 1, for strings that are the same. Once a particular set of information is analyzed, the hope is that the same techniques can be extended to more information. For example, if GHS hazard statements are analyzed through a spreadsheet approach, the same techniques with minor modifications could be used to tackle more GHS information, such as pictograms.

    Intellectual Merit: This research indicates that the cheminformatic technique of structural fingerprints can be used to create safety fingerprints. Structural fingerprints are binary bit strings obtained from the non-numeric entity of 2D structure, and they allow comparison of 2D structures through the use of the Tanimoto coefficient. The same approach can be extended to safety fingerprints, which can be created by converting a non-numeric entity such as GHS information into a binary bit string and comparing data through the use of the Tanimoto coefficient.

    Broader Impact: Extension of this research can be applied to many aspects of GHS information. This research focused on comparing GHS hazard statements, but it could be further applied to other pieces of GHS information, such as pictograms and GHS precautionary statements. Another facet of this research is allowing the chemist who uses the data to compare large datasets using spreadsheet programs such as Excel, without needing a large programming background. Development of this technique will also benefit the Chemical Health and Safety and Chemical Information communities by better defining the quality of GHS information available, and by providing a scalable and transferable tool to manipulate this information to meet a variety of other organizational needs.
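    The bitstring comparison described above can be sketched in a few lines; the function below is a generic illustration of the Tanimoto coefficient (names are mine), not the spreadsheet's formula:

    ```python
    def tanimoto(a, b):
        """Tanimoto coefficient for two equal-length bitstrings,
        e.g. GHS safety fingerprints: shared on-bits / union of on-bits."""
        on_a = sum(int(x) for x in a)
        on_b = sum(int(x) for x in b)
        common = sum(1 for x, y in zip(a, b) if x == y == "1")
        return common / (on_a + on_b - common)
    ```

    Identical fingerprints score 1.0; fingerprints with no shared on-bits score 0, matching the 0-to-1 range described in the abstract.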

  5. Hospital Annual Financial Data - Selected Data & Pivot Tables

    • data.chhs.ca.gov
    • data.ca.gov
    • +4more
    csv, data, doc, html +4
    Updated Apr 23, 2025
    Cite
    Department of Health Care Access and Information (2025). Hospital Annual Financial Data - Selected Data & Pivot Tables [Dataset]. https://data.chhs.ca.gov/dataset/hospital-annual-financial-data-selected-data-pivot-tables
    Explore at:
    Available download formats: pdf(383996), pdf(333268), xls(51424256), xlsx, xlsx(782546), xlsx(765216), pdf(310420), xls(18301440), xls, xlsx(750199), xlsx(756356), zip, xlsx(779866), pdf(121968), xls(18445312), xls(19577856), xls(51554816), xls(44967936), pdf(258239), xlsx(769128), xlsx(763636), xlsx(771275), xlsx(752914), xlsx(768036), xlsx(790979), xls(16002048), xls(19599360), data, xlsx(754073), xls(44933632), xls(14657536), xlsx(758376), xls(920576), xlsx(758089), xls(19650048), xlsx(14714368), html, csv(205488092), pdf(303198), doc, xls(19625472), xlsx(770931), xlsx(777616)
    Dataset updated
    Apr 23, 2025
    Dataset authored and provided by
    Department of Health Care Access and Information
    Description

    On an annual basis (individual hospital fiscal year), individual hospitals and hospital systems report detailed facility-level data on services capacity, inpatient/outpatient utilization, patients, revenues and expenses by type and payer, balance sheet and income statement.

    Due to the large size of the complete dataset, a selected set of data representing a wide range of commonly used data items has been created that can be easily managed and downloaded. The selected data file includes general hospital information, utilization data by payer, revenue data by payer, expense data by natural expense category, financial ratios, and labor information.

    There are two groups of data contained in this dataset: 1) Selected Data - Calendar Year: To make it easier to compare hospitals by year, hospital reports with report periods ending within a given calendar year are grouped together. The Pivot Tables for a specific calendar year are also found here. 2) Selected Data - Fiscal Year: Hospital reports with report periods ending within a given fiscal year (July-June) are grouped together.

  6. Excel Township, Minnesota Annual Population and Growth Analysis Dataset: A Comprehensive Overview of Population Changes and Yearly Growth Rates in Excel township from 2000 to 2023 // 2024 Edition

    • neilsberg.com
    csv, json
    Updated Jul 30, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Excel Township, Minnesota Annual Population and Growth Analysis Dataset: A Comprehensive Overview of Population Changes and Yearly Growth Rates in Excel township from 2000 to 2023 // 2024 Edition [Dataset]. https://www.neilsberg.com/insights/excel-township-mn-population-by-year/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Jul 30, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Excel Township, Minnesota
    Variables measured
    Annual Population Growth Rate, Population Between 2000 and 2023, Annual Population Growth Rate Percent
    Measurement technique
    The data presented in this dataset is derived from 20 years of data from the U.S. Census Bureau Population Estimates Program (PEP), 2000 - 2023. To measure the variables, namely (a) population and (b) population change (absolute and as a percentage), we initially analyzed and tabulated the data for each of the years between 2000 and 2023. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Excel township population over the last 20-plus years. It lists the population for each year, along with the year-on-year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of Excel township across the last two decades. For example, using this dataset, we can identify whether the population is declining or increasing, when the population peaked, and whether it is still growing or has passed its peak. We can also compare the trend with the overall trend of the United States population over the same period of time.

    Key observations

    In 2023, the population of Excel township was 300, a 0.99% year-over-year decrease from 2022. Previously, in 2022, the Excel township population was 303, a decline of 0.98% compared to a population of 306 in 2021. Over the last 20-plus years, between 2000 and 2023, the population of Excel township increased by 17. In this period, the peak population was 308, in the year 2020. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).

    Content

    When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).

    Data Coverage:

    • From 2000 to 2023

    Variables / Data Columns

    • Year: This column displays the data year (Measured annually and for years 2000 to 2023)
    • Population: The population for the specific year for the Excel township is shown in this column.
    • Year on Year Change: This column displays the change in Excel township population for each year compared to the previous year.
    • Change in Percent: This column displays the year on year change as a percentage. Please note that the sum of all percentages may not equal one due to rounding of values.
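    The change columns can be reproduced from the population column alone; the sketch below is hypothetical (function name and data layout are mine), using the figures quoted in this entry:

    ```python
    def year_on_year(populations):
        """Given {year: population}, return (year, population,
        absolute change, percent change) rows, mirroring the
        dataset's Year-on-Year Change and Change in Percent columns."""
        years = sorted(populations)
        rows = []
        for prev, cur in zip(years, years[1:]):
            change = populations[cur] - populations[prev]
            pct = 100 * change / populations[prev]
            rows.append((cur, populations[cur], change, round(pct, 2)))
        return rows
    ```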

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability, and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for your research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Excel township Population by Year. You can refer to it here.

  7. Data from: Delta Produce Sources Study

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Delta Produce Sources Study [Dataset]. https://catalog.data.gov/dataset/delta-produce-sources-study-51a7a
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service: https://www.ars.usda.gov/
    Description

    The Delta Produce Sources Study was an observational study designed to measure and compare the food environments of farmers markets (n=3) and grocery stores (n=12) in 5 rural towns located in the Lower Mississippi Delta region of Mississippi. Data were collected via electronic surveys from June 2019 to March 2020 using a modified version of the Nutrition Environment Measures Survey (NEMS) Farmers Market Audit tool. The tool was modified to collect information pertaining to the source of fresh produce and also for use with both farmers markets and grocery stores. Availability, source, quality, and price information were collected and compared between farmers markets and grocery stores for 13 fresh fruits and 32 fresh vegetables via SAS software programming. Because the towns were not randomly selected and the sample sizes are relatively small, the data may not be generalizable to all rural towns in the Lower Mississippi Delta region of Mississippi.

    Resources in this dataset:

    Resource Title: Delta Produce Sources Study dataset. File Name: DPS Data Public.csv
    Resource Description: The dataset contains variables corresponding to availability, source (country, state, and town if the country is the United States), quality, and price (by weight or volume) of 13 fresh fruits and 32 fresh vegetables sold in farmers markets and grocery stores located in 5 Lower Mississippi Delta towns.
    Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

    Resource Title: Delta Produce Sources Study data dictionary. File Name: DPS Data Dictionary Public.csv
    Resource Description: This file is the data dictionary corresponding to the Delta Produce Sources Study dataset.
    Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

  8. Financial Performance Indicators for Canadian Business [Excel]

    • borealisdata.ca
    • search.dataone.org
    Updated Sep 29, 2023
    Cite
    Statistics Canada (2023). Financial Performance Indicators for Canadian Business [Excel] [Dataset]. http://doi.org/10.5683/SP3/SZHJFY
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 29, 2023
    Dataset provided by
    Borealis
    Authors
    Statistics Canada
    License

    Custom license: https://borealisdata.ca/api/datasets/:persistentId/versions/2.1/customlicense?persistentId=doi:10.5683/SP3/SZHJFY

    Time period covered
    1994 - 2011
    Area covered
    Canada
    Description

    This CD-ROM product is an authoritative reference source of 15 key financial ratios by industry groupings compiled from the North American Industry Classification System (NAICS 2007). It is based on up-to-date, reliable and comprehensive data on Canadian businesses, derived from Statistics Canada databases of financial statements for three reference years. The CD-ROM enables users to compare their enterprise's performance to that of their industry and to address issues such as profitability, efficiency and business risk. Financial Performance Indicators can also be used for inter-industry comparisons. Volume 1 covers large enterprises in both the financial and non-financial sectors, at the national level, with annual operating revenue of $25 million or more. Volume 2 covers medium-sized enterprises in the non-financial sector, at the national level, with annual operating revenue of $5 million to less than $25 million. Volume 3 covers small enterprises in the non-financial sector, at the national, provincial, territorial, Atlantic region and Prairie region levels, with annual operating revenue of $30,000 to less than $5 million. Note: FPICB has been discontinued as of 2/23/2015. Statistics Canada continues to provide information on Canadian businesses through alternative data sources. Information on specific financial ratios will continue to be available through the annual Financial and Taxation Statistics for Enterprises program: CANSIM table 180-0003 ; the Quarterly Survey of Financial Statements: CANSIM tables 187-0001 and 187-0002 ; and the Small Business Profiles, which present financial data for small businesses in Canada, available on Industry Canada's website: Financial Performance Data.

  9. Supporting data for PhD thesis “Investigating the Impact of Argument-Driven Inquiry and Academically Productive Talk on Critical Thinking and Learning Motivation in Post-Pandemic Hong Kong Science Education”

    • datahub.hku.hk
    Updated Jul 20, 2023
    Cite
    Kin Yi Leung (2023). Supporting data for PhD thesis “Investigating the Impact of Argument-Driven Inquiry and Academically Productive Talk on Critical Thinking and Learning Motivation in Post-Pandemic Hong Kong Science Education” [Dataset]. http://doi.org/10.25442/hku.23648130.v1
    Explore at:
    Dataset updated
    Jul 20, 2023
    Dataset provided by
    HKU Data Repository
    Authors
    Kin Yi Leung
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Directory of Files: A. Filename: Combine_CCTDI.zip
    Short description: Quantitative Data. The zip file contains 6 Excel files which store students' raw data; this raw data set consists of each student's input on each CCTDI item. The pre-data were collected through an online survey, while the post-data were collected via pen and paper. The data will be analysed by ANOVA to compare the effectiveness of the intervention. (The California Critical Thinking Disposition Inventory (CCTDI) has been widely employed in the field of education to investigate changes in students' Critical Thinking (CT) attitudes resulting from teaching interventions by comparing pre- and post-tests. This 6-point self-reporting instrument requires respondents to rate themselves, from "1" for not describing them at all to "6" for describing them extremely well. The instrument has 40 questions categorized into seven subsets covering various CT disposition dimensions, namely: i) truth-seeking, ii) open-mindedness, iii) analyticity, iv) systematicity, v) inquisitiveness, vi) maturity, and vii) self-confidence.)
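    The planned ANOVA comparison amounts to a one-way F test on group scores. As a minimal pure-Python sketch of that statistic (scipy.stats.f_oneway computes the same value; the data here are made-up placeholders, not the thesis data):

    ```python
    def one_way_anova_f(*groups):
        """F statistic for a one-way ANOVA over two or more groups:
        between-group mean square divided by within-group mean square."""
        k = len(groups)
        n = sum(len(g) for g in groups)
        grand = sum(sum(g) for g in groups) / n
        ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
        ss_within = sum(sum((v - sum(g) / len(g)) ** 2 for v in g) for g in groups)
        return (ss_between / (k - 1)) / (ss_within / (n - k))
    ```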

    B. Filename: Combine_TCTSPS.zip
    Short description: Quantitative Data. The zip file contains 6 Excel files which store students' raw data, consisting of students' input on each TCTSPS item. The pre-data were collected through an online survey, while post-data were collected through pen and paper. The data will be analysed by ANOVA to compare the effectiveness of the intervention. (The Test of Critical Thinking Skills for Primary and Secondary School Students (TCTS-PS) consists of 24 items divided into five subscales measuring distinct yet correlated aspects of CT skills, namely: (I) differentiating theory from assumptions, (II) deciding evidence, (III) inference, (IV) finding an alternative theory, and (V) evaluation of arguments. The instrument yields a possible total score of 72. The instrument is intended for use in measuring gains in CT skills resulting from instruction, predicting success in programs where CT is crucial, and examining relationships between CT skills and other abilities or traits.)

    C. Filename: Combine_SMTSL.zip
    Short description: Quantitative Data. The zip file contains 5 Excel files which store students' raw data, consisting of students' input on each SMTSL item. The pre-data were collected through an online survey, while post-data were collected through pen and paper. The data will be analysed by ANOVA to compare the effectiveness of the intervention. (The Students' Motivation Towards Science Learning (SMTSL) instrument defines six factors related to motivation in science learning, used to measure participants' motivation towards science learning: A. Self-efficacy, B. Active learning strategies, C. Science learning value, D. Performance goal, E. Achievement goal, and F. Learning environment stimulation.)

    D. Filename: Combine_Discourse Transcription_1.zip and Combine_Discourse Transcription_2.zip
    Short description: Qualitative Data. The zip files contain 6 Excel files holding 6 teachers' classroom teaching discourse transcriptions. The data will be analysed by thematic analysis to compare the effectiveness of the intervention. (38 science classroom discourse videos of 8th graders were transcribed and coded using the Academically Productive Talk framework (APT). APT, drawing from sociological, linguistic, and anthropological perspectives, comprises four primary constructs or objectives.)

    E. Filename: Combine_Inquiry Report.zip
    Short description: Qualitative Data. The zip file contains 2 Excel files holding 2 schools' inquiry report scores according to rubrics. The data will be analysed by thematic analysis to compare the effectiveness of the intervention. (To assess the quality of students' arguments, a validated scoring rubric was employed to evaluate the students' written arguments. These aspects primarily concentrated on the students' proficiency in five perspectives (Walker & Sampson, 2013, p. 573): (AR1) Provide a well-articulated, adequate, and accurate claim that answers the research question, (AR2) Use genuine evidence to support the claim and present the evidence in an appropriate manner, (AR3) Provide enough valid and reliable evidence to support the claim, (AR4) Provide a rationale that is sufficient and appropriate, and (AR5) Compare his or her findings with other groups in the project.)

    F. Filename: Combined_Interview Transcription.xlsx
    Short description: Qualitative Data. The file contains all the students' interview transcriptions. The data will be analysed by thematic analysis to compare the effectiveness of the intervention. (Semi-structured interviews were conducted to gather interviewees' motivation for CT and learning motivation in the context of science. The interview data are used to complement the quantitative results (i.e., TCTS-PS, CCTDI, and SMTSL scores).)
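The pre/post comparisons above rely on ANOVA. As a minimal illustration of what that computation involves, here is a one-way ANOVA F statistic in plain Python; the group labels and score values are invented for illustration and are not taken from this dataset:

```python
from statistics import mean

def one_way_anova_f(*groups):
    """One-way ANOVA F statistic for two or more groups of scores."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    # between-group mean square (df = k - 1)
    ms_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups) / (k - 1)
    # within-group mean square (df = n - k)
    ms_within = sum((x - mean(g)) ** 2 for g in groups for x in g) / (n - k)
    return ms_between / ms_within

# Invented pre/post CCTDI-style totals (hypothetical, not the study's data)
pre = [210, 215, 208, 220, 212]
post = [225, 230, 219, 228, 224]
f_stat = one_way_anova_f(pre, post)
```

In practice the F statistic would be compared against an F distribution with (k − 1, n − k) degrees of freedom to obtain a p-value; statistical packages such as SPSS or R handle that step.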

  10. Data from: Consolidating and Managing Data for Drug Development within a...

    • figshare.com
    xlsx
    Updated May 30, 2023
    Arvin Moser; Alexander E. Waked; Joseph DiMartino (2023). Consolidating and Managing Data for Drug Development within a Pharmaceutical Laboratory: Comparing the Mapping and Reporting Tools from Software Applications [Dataset]. http://doi.org/10.1021/acs.oprd.1c00082.s002
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    May 30, 2023
    Dataset provided by
    ACS Publications
    Authors
    Arvin Moser; Alexander E. Waked; Joseph DiMartino
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    We present a perspective on drug development for the synthesis of an active pharmaceutical ingredient (e.g., agomelatine) within a commercial technology called Luminata and compare the results to the current method of consolidating the reaction data into Microsoft Excel. The Excel document becomes the ultimate repository of information extracted from multiple sources such as the electronic lab notebook, the laboratory information management system, the chromatography data system, in-house databases, and external data. The major needs of a pharmaceutical company are tracking the stages of multiple reactions, calculating the impurity carryover across the stages, and performing structure dereplication for an unknown impurity. As there is no standardized software available to link the different needs throughout the life cycle of process development, there is a demand for mapping tools to consolidate the route for an API synthesis and link it with analytical data while reducing transcription errors and maintaining an audit trail.
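The impurity-carryover calculation mentioned above can be sketched as a simple propagation across stages. This is a deliberate simplification assuming each stage rejects a fixed fraction of an incoming impurity; the stage count, initial level, and rejection factors are hypothetical, not values from the paper:

```python
def carryover(initial_ppm, rejections):
    """Propagate an impurity level across synthesis stages.

    rejections: per-stage fractional rejection (0.9 means 90% purged).
    Returns the impurity level after each stage, in ppm.
    """
    levels = []
    level = initial_ppm
    for r in rejections:
        level *= (1.0 - r)
        levels.append(level)
    return levels

# Hypothetical 3-stage route: an impurity entering at 5000 ppm
stages = carryover(5000, [0.90, 0.75, 0.50])
# approximately [500.0, 125.0, 62.5] ppm after each stage
```

A tool like the one compared in the paper would track such carryover per impurity and per stage automatically, rather than by hand in a spreadsheet.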

  11. Dataset _ The influence of social context on the perception of assistive...

    • repository.lboro.ac.uk
    pdf
    Updated Oct 9, 2019
    Salman Asghar; George Torrens; Hassan Iftikhar; Ruth Welsh; Robert G. Harland (2019). Dataset _ The influence of social context on the perception of assistive technology: Using a semantic differential scale to compare young adults’ views from the UK and Pakistan [Dataset]. http://doi.org/10.17028/rd.lboro.7982006.v1
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Oct 9, 2019
    Dataset provided by
    Loughborough University
    Authors
    Salman Asghar; George Torrens; Hassan Iftikhar; Ruth Welsh; Robert G. Harland
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Area covered
    Pakistan, United Kingdom
    Description

    This dataset contains raw data and the corresponding results files associated with a recent study. Each MS Excel spreadsheet contains the data for one aspect of the study, as specified by the file name. The participants' personal and demographic information, responses to the first SD scale, the second SD scale, and the personal evaluation are presented in each spreadsheet. The supplemental material (participant information sheet, informed consent form, online questionnaire, risk assessment form) is also enclosed with this dataset. Lastly, for the analysis of the raw data, statistical tests such as the independent-samples t-test were performed. The original SPSS data files are also included.
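The independent-samples t-test used in the analysis can be computed without SPSS. Below is a self-contained sketch of Welch's t statistic (the unequal-variances form) in plain Python; the two groups of ratings are invented stand-ins, not the study's UK/Pakistan data:

```python
from statistics import mean, variance

def welch_t(a, b):
    """Welch's independent-samples t statistic and degrees of freedom."""
    va, vb = variance(a), variance(b)
    na, nb = len(a), len(b)
    se2_a, se2_b = va / na, vb / nb
    t = (mean(a) - mean(b)) / (se2_a + se2_b) ** 0.5
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df

# Invented semantic-differential ratings from two hypothetical groups
group_a = [4.2, 3.8, 4.5, 4.0, 3.9]
group_b = [3.1, 3.6, 2.9, 3.4, 3.2]
t, df = welch_t(group_a, group_b)
```

The t value and degrees of freedom would then be looked up against a t distribution for a p-value, which is what SPSS reports directly.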

  12. Poseidon 2.0 - Decision Support Tool for Water Reuse (Microsoft Excel) and...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 22, 2024
    Oertlé, Emmanuel (2024). Poseidon 2.0 - Decision Support Tool for Water Reuse (Microsoft Excel) and Handbook [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3755379
    Explore at:
    Dataset updated
    Jul 22, 2024
    Dataset authored and provided by
    Oertlé, Emmanuel
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Poseidon 2.0 is a user-oriented, simple and fast Excel tool that compares different wastewater treatment techniques based on their pollutant removal efficiencies, their costs and additional assessment criteria. Poseidon can be applied in pre-feasibility studies to assess possible water reuse options and can show decision makers and other stakeholders that implementable solutions are available to comply with local requirements. This upload consists of:

    Poseidon 2.0 Excel File that can be used with Microsoft Excel - XLSM

    Handbook presenting main features of the decision support tool - PDF

    This dataset is linked to following additional open access resources:
    Oertlé E, Hugi C, Wintgens T, Karavitis C, Oertlé E, Hugi C, Wintgens T, Karavitis CA. 2019. Poseidon—Decision Support Tool for Water Reuse. Water. 11(1):153. doi:10.3390/w11010153. [accessed 2019 Jan 22]. http://www.mdpi.com/2073-4441/11/1/153 .

    Externally hosted supplementary file 1, Oertlé, Emmanuel. (2018, December 5). Poseidon - Decision Support Tool for Water Reuse (Microsoft Excel) and Handbook (Version 1.1.1). Zenodo. http://doi.org/10.5281/zenodo.3341573

    Externally hosted supplementary file 2, Oertlé, Emmanuel. (2018). Wastewater Treatment Unit Processes Datasets: Pollutant removal efficiencies, evaluation criteria and cost estimations (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1247434

    Externally hosted supplementary file 3, Oertlé, Emmanuel. (2018). Treatment Trains for Water Reclamation (Dataset) (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1972627

    Externally hosted supplementary file 4, Oertlé, Emmanuel. (2018). Water Quality Classes - Recommended Water Quality Based on Guideline and Typical Wastewater Qualities (Version 1.0.2) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3341570
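Comparing treatment techniques on removal efficiency and cost, as the tool above does, is at heart a multi-criteria scoring problem. A toy weighted-sum sketch follows; the technique names, removal efficiencies, cost figures, and weights are all made up for illustration and are not Poseidon's data or method:

```python
def rank_options(options, weights):
    """Weighted-sum ranking: higher removal and lower cost score better."""
    max_cost = max(o["cost"] for o in options.values())
    scores = {}
    for name, o in options.items():
        # cost is normalized and inverted so that cheaper options score higher
        scores[name] = (weights["removal"] * o["removal"]
                        + weights["cost"] * (1 - o["cost"] / max_cost))
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical techniques: fractional pollutant removal and relative cost
options = {
    "sand_filter":         {"removal": 0.60, "cost": 0.2},
    "membrane":            {"removal": 0.95, "cost": 1.0},
    "constructed_wetland": {"removal": 0.75, "cost": 0.4},
}
ranking = rank_options(options, {"removal": 0.7, "cost": 0.3})
```

Changing the weights shifts the ranking, which is why decision-support tools let stakeholders set criteria weights explicitly.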

  13. A Test to Compare Interval Time Series - Supplementary Material

    • data.mendeley.com
    Updated Jan 11, 2021
    Elizabeth Ann Maharaj (2021). A Test to Compare Interval Time Series - Supplementary Material [Dataset]. http://doi.org/10.17632/f35nry7hjz.1
    Explore at:
    Dataset updated
    Jan 11, 2021
    Authors
    Elizabeth Ann Maharaj
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplementary material for the manuscript "A Test to Compare Interval Time Series". This includes figures and tables referred to in the manuscript, as well as details of the scripts and data files used for the simulation studies and the application. All scripts are in MATLAB (.m) format, and data files are in MATLAB (.mat) and Excel (.xlsx) formats.

  14. Fluid electrical conductivity data

    • catalog.data.gov
    • data.usgs.gov
    Updated Jul 6, 2024
    + more versions
    U.S. Geological Survey (2024). Fluid electrical conductivity data [Dataset]. https://catalog.data.gov/dataset/fluid-electrical-conductivity-data
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    When water is pumped slowly from saturated sediment-water interface sediments, the more highly connected, mobile porosity domain is preferentially sampled, compared to less-mobile pore spaces. Changes in fluid electrical conductivity (EC) during controlled downward ionic tracer injections into interface sediments can be assumed to represent mobile porosity dynamics, which are therefore distinguished from less-mobile porosity dynamics that are measured using bulk EC geoelectrical methods. Fluid EC samples were drawn at flow rates similar to tracer injection rates to prevent inducing preferential flow. The data were collected using a stainless steel tube with slits cut into the bottom (USGS MINIPOINT style) connected to an EC meter via C-flex or neoprene tubing, and drawn up through the system via a peristaltic pump. The data were compiled into an Excel spreadsheet and time-corrected for comparison to bulk EC data that were collected simultaneously and are contained in another section of this data release. Controlled, downward flow experiments were conducted in a dual-domain porosity apparatus (DDPA). Downward flow rates ranged from 1.2 to 1.4 m/d in DDPA1, and were 1 m/d, 3 m/d, 5 m/d, and 0.9 m/d as described in the publication: Briggs, M.A., Day-Lewis, F.D., Dehkordy, F.M.P., Hampton, T., Zarnetske, J.P., Singha, K., Harvey, J.W. and Lane, J.W., 2018, Direct observations of hydrologic exchange occurring with less-mobile porosity and the development of anoxic microzones in sandy lakebed sediments, Water Resources Research, DOI:10.1029/2018WR022823.

  15. Stock Market Analysis using Power BI

    • kaggle.com
    Updated Aug 12, 2024
    DileepKumarVemali (2024). Stock Market Analysis using Power BI [Dataset]. https://www.kaggle.com/datasets/dileepkumarvemali/stock-market-analysis-using-power-bi/data?select=StocksListNSETest.xlsx
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)Available download formats
    Dataset updated
    Aug 12, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    DileepKumarVemali
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the essential files for conducting a dynamic stock market analysis using Power BI. The data is sourced from Yahoo Finance and includes historical stock prices, which can be dynamically updated by adding new stock codes to the provided Excel sheet.

    Files Included: Power BI Report (.pbix): The interactive Power BI report that includes various visualizations such as Candle Charts, Line Charts for Support and Resistance, and Technical Indicators like SMA, EMA, Bollinger Bands, and RSI. The report is designed to provide a comprehensive analysis of stock performance over time.

    Stock Data Excel Sheet (.xlsx): This Excel sheet is connected to the Power BI report and allows for dynamic data loading. By adding new stock codes to this sheet, the Power BI report automatically refreshes to include the new data, enabling continuous updates without manual intervention.

    Overview and Chart Pages Snapshots for better understanding about the Report.

    Key Features:
    Dynamic Data Loading: Easily update the dataset by adding new stock codes to the Excel sheet. The Power BI report will automatically pull the corresponding data from Yahoo Finance.
    Comprehensive Visualizations: Analyze stock trends using Candle Charts, identify key price levels with Support and Resistance lines, and explore market behavior through various technical indicators.
    Interactive Analysis: The Power BI report includes slicers and navigation buttons to switch between different time periods and visualizations, providing a tailored analysis experience.
    Use Cases: Ideal for financial analysts, traders, or anyone interested in conducting a detailed stock market analysis. Can be used to monitor the performance of individual stocks or compare trends across multiple stocks over time.
    Tags: Stock Market, Power BI, Financial Analysis, Yahoo Finance, Data Visualization
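The SMA and Bollinger Band indicators the report visualizes have standard definitions and can be reproduced outside Power BI. A minimal sketch with made-up prices (not data from this dataset):

```python
from statistics import mean, pstdev

def sma(prices, window):
    """Simple moving average; None until the window fills."""
    return [None if i + 1 < window else mean(prices[i + 1 - window:i + 1])
            for i in range(len(prices))]

def bollinger(prices, window, k=2.0):
    """(lower, middle, upper) bands: SMA +/- k population std devs."""
    mid = sma(prices, window)
    bands = []
    for i, m in enumerate(mid):
        if m is None:
            bands.append((None, None, None))
        else:
            sd = pstdev(prices[i + 1 - window:i + 1])
            bands.append((m - k * sd, m, m + k * sd))
    return bands

# Invented closing prices for illustration
prices = [100, 102, 101, 105, 107, 106, 108]
ma3 = sma(prices, 3)       # first two entries are None
bands = bollinger(prices, 3)
```

EMA and RSI follow the same pattern (recursive smoothing and average gain/loss ratios, respectively); in the Power BI report these are computed over the Yahoo Finance data instead.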

  16. CATCH-EyoU Work Package 2 Dataset 2.1a - Full Consortium Collection of...

    • zenodo.org
    Updated Jan 24, 2020
    + more versions
    Shakuntala Banaji; Frosso Motti-Stefanidi; Elvira Cicognani; Erik Amna; Peter Noack; Veronika Kalmus; Isabel Menezes; Petr Macek (2020). CATCH-EyoU Work Package 2 Dataset 2.1a - Full Consortium Collection of Literature Matrix [Dataset]. http://doi.org/10.5281/zenodo.886325
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Shakuntala Banaji; Frosso Motti-Stefanidi; Elvira Cicognani; Erik Amna; Peter Noack; Veronika Kalmus; Isabel Menezes; Petr Macek
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset includes one master Excel spreadsheet containing literatures searched, catalogued and summarised in the fields of Cultural Studies, Education, History, Media and Communication, Philosophy, Political Science, Psychology and Sociology, which contains 770 selected texts.

    The aims of the data collected are to produce an integrated theory that builds on the findings of different disciplines (Cultural Studies, Education, History, Media and Communication, Philosophy, Political Science, Psychology and Sociology) focused on the understanding of factors and processes (from the macro social level to the social and psychological level), within the different life contexts, that promote or hinder youth active citizenship in EU.

    It is possible that similar databases of literature around Europe, Young People and Active Citizenship across the fields of Cultural Studies, Education, History, Media and Communication, Philosophy, Political Science, Psychology and Sociology exist in other forms, perhaps collected for studies on one or more of the included disciplines, but we do not currently have access to a similar repository.

    With that said, it is highly unlikely that an exact dataset corresponding to the specifics of this study exists in any form elsewhere, thus justifying the creation of new data for this study in the absence of suitable existing data. The data collected here bridge the gap between global aggregated literatures on youth and citizenship, separated by discipline, and a new dataset offering an integrated literature analysis across different fields of study.

    The data sources are available in bibliographic format and attached via Excel document.

    The dataset relies on the following information taken from the data sources: specific identifying information about the text itself (title/author/year/publisher); abstract or summarizing information either taken directly from the text or summarized by the researcher; and keywords either taken directly from the text or summarized by the researcher.

    Finally, the aggregated literature review spreadsheet constitutes raw data which can be reused by researchers who want to compare our data with similar data collected in different countries, or to perform textual analysis (content analysis and/or data mining) on our data.
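The textual analysis the authors invite (content analysis or data mining over the spreadsheet's abstracts and keywords) can start as simple term counting. A sketch follows; the two example rows are invented stand-ins for the spreadsheet's records, and the `keywords` column name is an assumption:

```python
from collections import Counter
import re

def keyword_counts(records, field="keywords"):
    """Count normalized keywords across catalogued texts."""
    counts = Counter()
    for rec in records:
        for kw in rec[field].split(";"):
            # normalize whitespace and case so variants merge
            kw = re.sub(r"\s+", " ", kw).strip().lower()
            if kw:
                counts[kw] += 1
    return counts

# Invented rows standing in for entries of the literature matrix
rows = [
    {"title": "A", "keywords": "youth; active citizenship; EU"},
    {"title": "B", "keywords": "Active Citizenship;  media"},
]
counts = keyword_counts(rows)
```

From such counts one could compare keyword prevalence across the eight disciplines, which is the kind of cross-discipline comparison the dataset was built to enable.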

  17. Budget and results of the municipality and city districts

    • ckan.mobidatalab.eu
    Updated Apr 11, 2023
    + more versions
    OverheidNl (2023). Budget and results of the municipality and city districts [Dataset]. https://ckan.mobidatalab.eu/dataset/qrzdwvf8jdh7fw
    Explore at:
    http://publications.europa.eu/resource/authority/file-type/html, http://publications.europa.eu/resource/authority/file-type/tar_xzAvailable download formats
    Dataset updated
    Apr 11, 2023
    Dataset provided by
    OverheidNl
    License

    http://standaarden.overheid.nl/owms/terms/licentieonbekend

    Description

    Financial data of the municipality and city districts are published via Openspending.nl. The OpenSpending platform of the Open State Foundation makes it possible to digitally disclose and compare government expenditure and income. The source data can also be downloaded in a (uniform) Excel format and are available through an API.

  18. Data from: Data and code from: Mycotoxin contamination & the nutritional...

    • catalog.data.gov
    • gimi9.com
    • +1more
    Updated Apr 21, 2025
    + more versions
    Agricultural Research Service (2025). Data and code from: Mycotoxin contamination & the nutritional content of corn targeted for animal feed [Dataset]. https://catalog.data.gov/dataset/data-and-code-from-mycotoxin-contamination-the-nutritional-content-of-corn-targeted-for-an
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    This dataset contains raw data (Excel spreadsheet, .xlsx), R statistical code (RMarkdown notebook, .Rmd), and rendered output of the R notebook (HTML). This comprises all raw data and code needed to reproduce the analyses in the manuscript: Pokoo-Aikins, A., C. M. McDonough, T. R. Mitchell, J. A. Hawkins, L. F. Adams, Q. D. Read, X. Li, R. Shanmugasundaram, E. Rodewald, P. Acharya, A. E. Glenn, and S. E. Gold. 2024. Mycotoxin contamination and the nutritional content of corn targeted for animal feed. Poultry Science, 104303. DOI: 10.1016/j.psj.2024.104303.

    The data consist of the mycotoxin concentration, nutrient content, and color of different samples of corn (maize). We model the effect of mycotoxin concentration on the concentration of several different nutrients in corn. We include main effects of the different mycotoxins as well as two-way interactions between each pair of mycotoxins. We also include analysis of mycotoxin effects on the L variable from the color analysis, because it seems to be the one most important for determining the overall color of the corn. We use AIC to compare the models with and without interaction terms. We find that the models without interaction terms are better, so we omit the interactions. We present adjusted R-squared values for each model as well as the p-values associated with the average slopes (effect of each mycotoxin on each nutrient). Finally, we produce the figures that appear in the above-cited manuscript. Column metadata can be found in the Excel spreadsheet.

    Included files:
    Combined LCMS NIR Color data.xlsx: Excel file with all raw data (sheet 1) and column metadata (sheet 2).
    corn_mycotoxin_analysis_archived.Rmd: RMarkdown notebook with all analysis code.
    corn_mycotoxin_analysis_archived.html: rendered output of the R notebook.
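The manuscript's AIC-based model comparison is implemented in R; as a rough illustration of the idea, here is a Python sketch comparing a one-predictor linear model against an intercept-only model via a Gaussian AIC (up to an additive constant). The toxin and nutrient values are invented and do not come from this dataset:

```python
import math

def aic_linear(y, x=None):
    """AIC (up to a constant) of an OLS fit: intercept-only if x is None."""
    n = len(y)
    if x is None:
        yhat = [sum(y) / n] * n
        k = 1  # intercept only
    else:
        mx, my = sum(x) / n, sum(y) / n
        beta = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
                / sum((xi - mx) ** 2 for xi in x))
        yhat = [my + beta * (xi - mx) for xi in x]
        k = 2  # intercept and slope
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))
    return n * math.log(rss / n) + 2 * k

# Invented values: nutrient concentration vs. mycotoxin level
toxin    = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
nutrient = [9.8, 9.1, 8.7, 8.2, 7.6, 7.1]
better_with_slope = aic_linear(nutrient, toxin) < aic_linear(nutrient)
```

The same logic, extended to multiple predictors and their interactions, is what lets the authors conclude that the no-interaction models fit better.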

  19. Automatic number plate recognition (ANPR) project

    • data.wu.ac.at
    • findtransportdata.dft.gov.uk
    • +1more
    csv, html, xlsx, zip
    Updated Aug 1, 2017
    + more versions
    Leeds City Council (2017). Automatic number plate recognition (ANPR) project [Dataset]. https://data.wu.ac.at/odso/data_gov_uk/ZjkwZGI3NmUtZTcyZi00YWI2LTk5MjctNzY1MTAxYjdkOTk3
    Explore at:
    zip(19971156.0), xlsx(14474.0), html, csv(349.0)Available download formats
    Dataset updated
    Aug 1, 2017
    Dataset provided by
    Leeds City Council
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    A dataset providing information on the vehicle types and counts at several locations in Leeds.

    Purpose of the project: The aim of this work was to examine the profile of vehicle types in Leeds, in order to compare local emissions with national predictions. Traffic was monitored for a period of one week at two Inner Ring Road locations in April 2016 and at seven sites around the city in June 2016. The vehicle registration data was then sent to the Department for Transport (DfT), who combined it with their vehicle type data, replacing the registration number with an anonymised ‘Unique ID’.

    The data is provided in three folders:
    Raw Data – contains the data in the format it was received, and a sample of each format.
    Processed Data – the data after processing by LCC, lookup tables, and sample data.
    Outputs – Excel spreadsheets summarising the data for each site, for various times/dates.

    Initially a dataset was received for the Inner Ring Road (see file “IRR ANPR matched to DFT vehicle type list.csv”), with vehicle details, but with missing/uncertain data on the vehicles’ emissions Eurostandard class. Of the 820,809 recorded journeys, from the pseudo registration number field (UniqueID) it was determined that there were 229,891 unique vehicles, and 31,912 unique “vehicle types” based on the unique concatenated vehicle description fields. It was therefore decided to import the data into an MS Access database, create a table of vehicle types, and add the necessary fields/data so that, combined with the year of manufacture / vehicle registration, the appropriate Eurostandard could be determined for each particular vehicle. The criteria for the Eurostandards were derived mainly from www.dieselnet.com and summarised in a spreadsheet (“EuroStandards.xlsx”).

    Vehicle types were assigned to a “VehicleClass” (see “Lookup Tables.xlsx”) and an “EU class”, with additional fields being added for any modified data (Gross Vehicle Weight – “GVM_Mod”; engine capacity – “EngineCC_mod”; number of passenger seats – “PassSeats”; and kerb weight – “KerbWt”). Missing data was added from internet lookups, extrapolation from known data, and by association – eg 99% of cars with an engine size … Additional data was then received from the Inner Ring Road site, giving journey date/time and incorporating the taxi data for licensed taxis in Leeds. Similar data for Sites 1-7 was also then received, and processed to determine the “VehicleClass” and “EU class”. A mixture of update queries and VBA processing was then used to provide the Level 1-6 breakdown of vehicle types (see “Lookup Tables.xlsx”). The data was then combined into one database, so that the required Excel spreadsheets could be exported for the required time/date periods (see “outputs” folder).
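The Eurostandard assignment described (vehicle class plus year of first registration mapped to an emissions class) amounts to a banded lookup. A sketch follows; the first-registration years below are approximate diesel-car dates and are illustrative only, not the project's exact criteria, which were derived from www.dieselnet.com:

```python
# Approximate first-registration years for diesel cars (illustrative only)
DIESEL_CAR_BANDS = [
    (1993, "Euro 1"), (1997, "Euro 2"), (2001, "Euro 3"),
    (2006, "Euro 4"), (2011, "Euro 5"), (2015, "Euro 6"),
]

def euro_class(year, bands=DIESEL_CAR_BANDS):
    """Return the emissions class for a registration year, or 'pre-Euro'."""
    label = "pre-Euro"
    for start, name in bands:  # bands are in ascending order of start year
        if year >= start:
            label = name
    return label
```

In the project this lookup was effectively implemented as MS Access update queries joining the vehicle-type table to the EuroStandards spreadsheet, keyed on vehicle class and year.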

  20. Data from: Homelessness in England

    • data.europa.eu
    csv, excel xls
    Updated Jul 30, 2018
    Cambridgeshire Insight (2018). Homelessness in England [Dataset]. https://data.europa.eu/data/datasets/homelessness-in-england1?locale=en
    Explore at:
    excel xls, csvAvailable download formats
    Dataset updated
    Jul 30, 2018
    Dataset authored and provided by
    Cambridgeshire Insight
    Area covered
    England
    Description

    Statistics about homelessness for every local authority in England.

    This includes annual data covering 2009-10 to 2017-18 based on CLG live table 784, known as the P1E returns.

    There are also quarterly returns (live table 784a), covering April to June, July to September, October to December and January to March, since April 2013, available on the CLG webpage (see links).

    Both are provided in Excel and CSV format.

    These data help us compare trends across the country for the decisions local authorities make when people apply to them as homeless and each district's use of temporary accommodation.
