License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Excel file performs a statistical test of whether two ROC curves differ from each other, based on the area under the curve (AUC). You'll need the coefficient from the table presented in the following article to enter the correct value for the comparison: Hanley JA, McNeil BJ (1983) A method of comparing the areas under receiver operating characteristic curves derived from the same cases. Radiology 148:839-843.
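The Hanley–McNeil comparison such a spreadsheet implements reduces to a z statistic on the difference between two correlated AUCs, where r is the correlation coefficient looked up from the paper's table. A minimal sketch (the function name and example values are illustrative, not taken from the spreadsheet):

```python
import math

def hanley_mcneil_z(auc1, se1, auc2, se2, r):
    """z statistic for comparing two correlated AUCs (Hanley & McNeil, 1983).

    r is the correlation coefficient looked up from the table in the
    1983 Radiology paper, based on the average AUC and the correlation
    between the two ratings on the same cases.
    """
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2 - 2 * r * se1 * se2)
    return (auc1 - auc2) / se_diff

# Illustrative (made-up) values: two AUCs with their standard errors
z = hanley_mcneil_z(0.85, 0.04, 0.78, 0.05, r=0.45)
```

The resulting z is referred to the standard normal distribution to obtain a p-value.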
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Sheet 1 (Raw-Data): The raw data of the study, presenting the tagging results for the measures described in the paper. For each subject, it includes the following columns:
A. a sequential student ID
B. an ID that encodes a random group label and the notation
C. the notation used: user stories or use cases
D. the case they were assigned to: IFA, Sim, or Hos
E. the subject's exam grade (total points out of 100); empty cells mean the subject did not take the first exam
F. a categorical representation of the grade (L/M/H), where H is greater than or equal to 80, M is at least 65 and below 80, and L is otherwise
G. the total number of classes in the student's conceptual model
H. the total number of relationships in the student's conceptual model
I. the total number of classes in the expert's conceptual model
J. the total number of relationships in the expert's conceptual model
K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, and missing (see tagging scheme below)
P. the researchers' judgement of how well the derivation process was explained by the student: well explained (a systematic mapping that can be easily reproduced), partially explained (vague indication of the mapping), or not present.
Tagging scheme:
Aligned (AL) - A concept is represented as a class in both models, either
with the same name or using synonyms or clearly linkable names;
Wrongly represented (WR) - A class in the domain expert model is
incorrectly represented in the student model, either (i) via an
attribute, method, or relationship rather than a class, or (ii) using a
generic term (e.g., "user" instead of "urban planner");
System-oriented (SO) - A class in CM-Stud that denotes a technical
implementation aspect, e.g., access control. Classes that represent a
legacy system or the system under design (portal, simulator) are legitimate;
Omitted (OM) - A class in CM-Expert that does not appear in any way in
CM-Stud;
Missing (MI) - A class in CM-Stud that does not appear in any way in
CM-Expert.
All the calculations and information provided in the following sheets
originate from that raw data.
Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection,
including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.
Sheet 3 (Size-Ratio):
The number of classes in the student model divided by the number of classes in the expert model is calculated (describing the size ratio). We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus in this study is on the number of classes; however, we also provide the size ratio for the number of relationships between the student and expert models.
Sheet 4 (Overall):
Provides an overview of all subjects regarding the encountered situations, completeness, and correctness, respectively. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model. It is calculated by dividing the number of aligned concepts (AL) by the sum of the number of aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model. Completeness is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of the number of aligned concepts (AL), wrong representations (WR), and omitted concepts (OM). The overview is complemented with general diverging stacked bar charts that illustrate correctness and completeness.
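The correctness and completeness definitions above can be written directly as functions of the tagging counts; a minimal sketch with hypothetical counts:

```python
def correctness(al, wr, so, om):
    # Aligned classes over all situations involving expert-model classes
    # plus system-oriented and wrongly represented student classes
    return al / (al + om + so + wr)

def completeness(al, wr, om):
    # Classes represented (correctly or not) over all expert-model classes
    return (al + wr) / (al + wr + om)

# Hypothetical tagging counts for one subject (not from the dataset)
c1 = correctness(al=8, wr=2, so=1, om=3)   # 8 / 14
c2 = completeness(al=8, wr=2, om=3)        # 10 / 13
```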
For Sheet 4, as well as for the following four sheets, diverging stacked bar charts are provided to visualize the effect of each of the independent and moderating variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (t-test) and effect size (Hedges' g) for both completeness and correctness are provided. Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html. The independent and moderating variables can be found as follows:
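Hedges' g, as reported at the bottom of each sheet, is Cohen's d with a small-sample bias correction. A sketch of the standard formula (the online tool's exact implementation may differ slightly):

```python
import math

def hedges_g(x, y):
    """Hedges' g: pooled-SD standardized mean difference with the
    small-sample bias correction 1 - 3 / (4*(nx + ny) - 9)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    s_pooled = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    d = (mx - my) / s_pooled
    return d * (1 - 3 / (4 * (nx + ny) - 9))
```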
Sheet 5 (By-Notation):
Model correctness and model completeness are compared by notation: UC, US.
Sheet 6 (By-Case):
Model correctness and model completeness are compared by case: SIM, HOS, IFA.
Sheet 7 (By-Process):
Model correctness and model completeness are compared by how well the derivation process is explained: well explained, partially explained, not present.
Sheet 8 (By-Grade):
Model correctness and model completeness are compared by exam grade, converted to the categorical values Low, Medium, and High.
SHRP 2 initiated the L38 project to pilot test products from five of the program's completed projects. The products support reliability estimation and use based on data analyses, analytical techniques, and a decision-making framework. The L38 project has two main objectives: (1) to assist agencies in using travel time reliability as a measure in their business practices and (2) to receive feedback from the project research teams on the applicability and usefulness of the products tested, along with their suggested possible refinements. SHRP 2 selected four teams from California, Minnesota, Florida, and Washington. Project L38C tested elements from Projects L02, L05, L07, and L08. Project L02 identified methods to collect, archive, and integrate required data for reliability estimation, and methods for analyzing and visualizing the causes of unreliability based on the collected data. Projects L07 and L08 produced analytical techniques and tools for estimating reliability based on developed models, allowing the estimation of reliability and of the impacts on reliability of alternative mitigating strategies. Project L05 provided guidance on how to use reliability assessments to support the business processes of transportation agencies. The datasets in this zip file, which is 7.83 MB in size, support the SHRP 2 reliability project L38C, "Pilot testing of SHRP 2 reliability data and analytical products: Florida." The accompanying report can be accessed at the following URL: https://rosap.ntl.bts.gov/view/dot/3609 There are 12 datasets in this zip file, including 2 Microsoft Excel worksheets (XLSX) and 10 Comma Separated Values (CSV) files. The Microsoft Excel worksheets can be opened using the 2010 and 2016 versions of Microsoft Excel; the CSV files can be opened using most text editors.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This article describes a free, open-source collection of templates for the popular Excel (2013 and later versions) spreadsheet program. These templates are spreadsheet files that allow easy and intuitive learning and the implementation of practical examples concerning descriptive statistics, random variables, confidence intervals, and hypothesis testing. Although they are designed to be used with Excel, they can also be employed with other free spreadsheet programs (changing some particular formulas). Moreover, we exploit some possibilities of the ActiveX controls of the Excel Developer Menu to produce interactive Gaussian density charts. Finally, it is important to note that they can often be embedded in a web page, so it is not necessary to employ the Excel software to use them. These templates have been designed as a useful tool to teach basic statistics and to carry out data analysis even when the students are not familiar with Excel. Additionally, they can be used as a complement to other analytical software packages. They aim to assist students in learning statistics within an intuitive working environment. Supplementary materials with the Excel templates are available online.
Two bootstrap tools are provided in the form of Excel spreadsheets. One tool computes means and confidence intervals from user-provided data. The other tool computes p-values for significance testing of the difference between two user-provided data sets. All means are weighted with weights provided by the user. Instructions are provided for each Excel spreadsheet tool. Download the tools as "Original Format".
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This database contains the results from questionnaires gathered during user testing of the SELFEX solution, a training system utilizing motion-tracking gloves, augmented reality (AR), and screen-based interfaces. Participants were asked to complete paper- and tablet-based questionnaires after interacting with both AR and screen-guided training environments. The data provided allows for a comparative analysis between the two training methods (AR vs. screen) and assesses the suitability of the MAGOS hand-tracking gloves for this application. Additionally, it facilitates the exploration of correlations between various user experience factors, such as ease of use, usefulness, satisfaction, and ease of learning.
The folder is divided into two types of files:
- PDF files: These contain the three questionnaires administered during testing.
- "dataset.xlsx": This file includes the questionnaire results.
Within the Excel file, the data is organized across three sheets:
- "Results with AR glasses": Displays data from the experiment conducted using HoloLens 2 AR glasses. Participants are anonymized and coded by gender (e.g., M01 for the first male participant).
- "Results without AR glasses": Shows data from the experiment conducted with five participants using a TV screen instead of HoloLens 2 to follow the assembly training instructions.
- "Demographic data": Contains demographic information related to the participants.
This dataset enables comprehensive evaluation and comparison of the training methods and user experiences.
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
This dataset concerns student achievement in secondary education at two Portuguese schools. The data attributes include student grades and demographic, social, and school-related features, and were collected using school reports and questionnaires. Two datasets are provided regarding performance in two distinct subjects: Mathematics (mat) and Portuguese language (por). In [Cortez and Silva, 2008], the two datasets were modeled under binary/five-level classification and regression tasks. Important note: the target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade (issued in the 3rd period), while G1 and G2 correspond to the 1st- and 2nd-period grades. It is more difficult to predict G3 without G2 and G1, but such prediction is much more useful (see the paper source for more details).
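The G2-G3 correlation noted above is a plain Pearson correlation; a self-contained sketch with invented toy grades (not values from the dataset):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy grades illustrating the expected pattern: G3 tracks G2 closely
g2 = [10, 12, 8, 15, 11, 9]
g3 = [11, 12, 7, 16, 10, 9]
r = pearson(g2, g3)
```

On the real data one would compute this on the G1/G2/G3 columns after loading the semicolon-separated CSV files.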
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This comprehensive dataset provides a wealth of information about all countries worldwide, covering a wide range of indicators and attributes. It encompasses demographic statistics, economic indicators, environmental factors, healthcare metrics, education statistics, and much more. With every country represented, this dataset offers a complete global perspective on various aspects of nations, enabling in-depth analyses and cross-country comparisons.
- Country: Name of the country.
- Density (P/Km2): Population density measured in persons per square kilometer.
- Abbreviation: Abbreviation or code representing the country.
- Agricultural Land (%): Percentage of land area used for agricultural purposes.
- Land Area (Km2): Total land area of the country in square kilometers.
- Armed Forces Size: Size of the armed forces in the country.
- Birth Rate: Number of births per 1,000 population per year.
- Calling Code: International calling code for the country.
- Capital/Major City: Name of the capital or major city.
- CO2 Emissions: Carbon dioxide emissions in tons.
- CPI: Consumer Price Index, a measure of inflation and purchasing power.
- CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.
- Currency_Code: Currency code used in the country.
- Fertility Rate: Average number of children born to a woman during her lifetime.
- Forested Area (%): Percentage of land area covered by forests.
- Gasoline_Price: Price of gasoline per liter in local currency.
- GDP: Gross Domestic Product, the total value of goods and services produced in the country.
- Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.
- Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.
- Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.
- Largest City: Name of the country's largest city.
- Life Expectancy: Average number of years a newborn is expected to live.
- Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.
- Minimum Wage: Minimum wage level in local currency.
- Official Language: Official language(s) spoken in the country.
- Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.
- Physicians per Thousand: Number of physicians per thousand people.
- Population: Total population of the country.
- Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.
- Tax Revenue (%): Tax revenue as a percentage of GDP.
- Total Tax Rate: Overall tax burden as a percentage of commercial profits.
- Unemployment Rate: Percentage of the labor force that is unemployed.
- Urban Population: Percentage of the population living in urban areas.
- Latitude: Latitude coordinate of the country's location.
- Longitude: Longitude coordinate of the country's location.
- Analyze population density and land area to study spatial distribution patterns.
- Investigate the relationship between agricultural land and food security.
- Examine carbon dioxide emissions and their impact on climate change.
- Explore correlations between economic indicators such as GDP and various socio-economic factors.
- Investigate educational enrollment rates and their implications for human capital development.
- Analyze healthcare metrics such as infant mortality and life expectancy to assess overall well-being.
- Study labor market dynamics through indicators such as labor force participation and unemployment rates.
- Investigate the role of taxation and its impact on economic development.
- Explore urbanization trends and their social and environmental consequences.
Data Source: This dataset was compiled from multiple data sources.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This data consists of generation, storage, and transmission data for 8 different test cases, split into two groups.
Group 1: IEEE test cases
6bus, 14bus and 24bus.
All data for these cases is stored within their respective excel files.
Group 2: NEM
5bus Central 2036
5bus Step Change 2036
5bus Slow Change 2036
5bus High DER 2036
5bus Fast Change 2036
Data for the traces, demand, and each asset type is stored separately within excel files under each subfolder.
For code on how this data is ingested for modelling, or for further information, please contact Drew Mitchell (Drew.mitchell1@monash.edu).
By Health [source]
This dataset contains mortality statistics for 122 U.S. cities in 2016, providing detailed information about all deaths that occurred due to any cause, including pneumonia and influenza. The data is voluntarily reported by cities with populations of 100,000 or more, and it includes the place of death and the week during which the death certificate was filed. Data is broken down by age group and includes a flag indicating the reliability of each data set to help inform analysis. Each row also provides longitude and latitude information for each reporting area to make further analysis easier. These comprehensive mortality statistics are invaluable for tracking disease trends, as well as for making comparisons between different areas across the country, in order to identify public health risks quickly and effectively.
This dataset contains mortality rates for 122 U.S. cities in 2016, including deaths by age group and cause of death. The data can be used to study various trends in mortality and contribute to the understanding of how different diseases impact different age groups across the country.
To use the data, first identify which variables you would like to use from this dataset. These include: reporting area; MMWR week; all causes, age greater than 65 years; all causes, age 45-64 years; all causes, age 25-44 years; all causes, age 1-24 years; all causes, less than 1 year old; total pneumonia and influenza fatalities; location (1 & 2); and a flag indicating the reliability of the data.
Once you have identified the variables of interest, filter the dataset so that it only includes the information relevant to your analysis or research purposes. For example, if you are looking at trends between different age groups, you only need the relevant cause-by-age columns (e.g., greater than 65, 45-64, and 25-44). You can do this with a selection tool that picks only certain columns from your data set, or with a spreadsheet filter if your data is stored as a CSV file.
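The column-filtering step can be sketched with Python's csv module; the header labels below are hypothetical stand-ins for the real column names:

```python
import csv
import io

# Hypothetical column labels standing in for the real CSV header
wanted = ["reporting_area", "age_ge65", "age_45_64", "age_25_44"]

# Inline sample standing in for rows.csv (invented values)
sample = io.StringIO(
    "reporting_area,mmwr_week,age_ge65,age_45_64,age_25_44\n"
    "Boston MA,1,120,45,10\n"
    "Denver CO,1,80,30,8\n"
)

# Keep only the wanted columns from each row
rows = [{k: row[k] for k in wanted} for row in csv.DictReader(sample)]
```

With the real file, replace the StringIO object with `open("rows.csv", newline="")` and the hypothetical labels with the actual header names.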
Next, prepare your data: eliminate unnecessary columns and rename column labels where needed, since too many variables can cloud the analysis. In addition, clean up any missing values, outliers, and incorrect entries before further investigation; outliers and corrupt entries may lead to incorrect conclusions. Once the cleaning steps are complete, you can move on to drawing insights.
The last step is to apply statistical methods, such as linear regression with multiple predictors or descriptive measures such as the mean and median, to draw key insights from the analysis and generate actionable points.
With these steps taken care of, anyone who decides to dive into another project involving this dataset can build on the existing work.
- Creating population health profiles for cities in the U.S.
- Tracking public health trends across different age groups
- Analyzing correlations between mortality and geographical locations
If you use this dataset in your research, please credit the original authors.
License: Dataset copyright by authors. You are free to: Share (copy and redistribute the material in any medium or format for any purpose, even commercially) and Adapt (remix, transform, and build upon the material for any purpose, even commercially). You must: give appropriate credit (provide a link to the license and indicate if changes were made); ShareAlike (distribute your contributions under the same license as the original); and keep intact all notices that refer to this license, including copyright notices.
File: rows.csv | Column name | Description | ...
140 male migrant workers aged 20–51 years of either Bangladeshi or Indian ethnicity from a single dormitory in Singapore volunteered to participate in the study. In total, 133 blood samples were taken at the start of the study and were used to assess vitamin B12, hemoglobin, ferritin, folate, and zinc status; a sub-sample underwent homocysteine testing. Anthropometric measurements and vital signs, including height, weight, and blood pressure, were recorded before and after the intervention. A Clinical Research Organisation (CRO) collected the data during the nutrition intervention study. The CRO entered the dataset into a database (Excel file). Protocol violations and deviations, study product consumption, and study completion data were evaluated during the data review to define the per-protocol dataset before database lock. An independent statistician performed the statistics on the data. For all analyses, statistical significance was defined as a p-value <0.05. SPSS version 26 software (IBM Corp., released 2019, Armonk, NY, USA) was used for all analyses. An intent-to-treat analysis was considered for the data analysis. Analysis of outcomes was done with the original continuous measurement values. To minimize the inter-individual component of covariance, the difference between baseline and endline measurements for each individual survey subject was calculated. P-values were calculated using a one-sample t-test comparing the average difference to zero. A two-sample... Excel is needed to open the data files, and SPSS or Word is needed to open the statistical outcomes.
Terms and conditions: https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Latest monthly statistics on Learning Disabilities and Autism (LDA) from the Assuring Transformation (AT) collection and the MHSDS collection. For the first time, this publication brings together the LDA data from the Assuring Transformation collection and the LDA service-specific statistics from the Mental Health Services Data Set (MHSDS). There are differences in the inpatient figures between the MHSDS and AT data sets, and work is underway to better understand these. NHS Digital plans to publish additional monthly comparator data from this work in future publications. The MHSDS LDA data are currently labelled experimental as they are undergoing evaluation. Further information on the quality of these statistics is available in the Data Quality section of the main report. It is planned that the MHSDS will become the sole source of inpatient LDA data in the future, replacing Assuring Transformation. There is a slight difference in scope between the two data collections: the MHSDS data come from providers based in England and cover care provided in England that may be commissioned outside England, whereas the Assuring Transformation data are provided by English commissioners and typically cover healthcare provided in England, but also include data on care commissioned in England and provided elsewhere in the UK. The release comprises: the Assuring Transformation publication, a statistical release published by NHS Digital which makes available the most recent data relating to patients with learning disabilities and/or autistic spectrum disorder receiving inpatient care commissioned by the NHS in England; and the MHSDS LDA publication, which provides statistics relating to NHS-funded secondary mental health, learning disabilities and autism services in England. These statistics are derived from submissions made using version 2.0 of the Mental Health Services Data Set (MHSDS). Prior to May 2018 the LDA service-specific statistics were included in the main MHSDS publication.
Each publication consists of the following documents:
- A report which presents England-level analysis of key measures.
- A monthly CSV file which presents key measures at England level.
- A metadata file to accompany the CSV file, which provides contextual information for each measure.
- Excel reference data tables showing data as reported and total patient counts. For AT these are retrospectively updated from March 2015 onwards.
- For Assuring Transformation, an easy-read version of the main report highlighting key findings in an easy-to-understand way.
We hope this information is helpful and would be grateful if you could spare a couple of minutes to complete a short customer satisfaction survey. Please use the link to the form at the bottom of this page to provide us with any feedback or suggestions for improving the report. Please note the MHSDS Main Report, Metadata and Comparators files with February 2018 data have been revised and replaced as of 14th June 2018. This was mainly due to a calculation error within the 'provider comparison' tab in the Comparators file. We apologise for any inconvenience this has caused; processes have now been implemented to prevent this from occurring again in the future.
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
We share a complete aerosol optical depth (AOD) dataset with high spatial (1x1 km^2) and temporal (daily) resolution in the Beijing 1954 projection (https://epsg.io/2412) for mainland China (2015-2018). The original aerosol optical depth images are from the Multi-Angle Implementation of Atmospheric Correction Aerosol Optical Depth (MAIAC AOD) product (https://lpdaac.usgs.gov/products/mcd19a2v006/), which has a similar spatiotemporal resolution and the sinusoidal projection (https://en.wikipedia.org/wiki/Sinusoidal_projection). After projection conversion, eighteen tiles of MAIAC AOD were merged to obtain a large AOD image covering the entire area of mainland China. Due to clouds and high surface reflectance, each original MAIAC AOD image usually has many missing values, and the average missing percentage per image may exceed 60%. Such a high percentage of missing values severely limits the applicability of the original MAIAC AOD product. We used the method of full residual deep networks (Li et al., 2020, https://ieeexplore.ieee.org/document/9186306) to impute the daily missing MAIAC AOD, thus obtaining a complete (no missing values) high-resolution AOD data product covering mainland China. The covariates used in imputation included coordinates, elevation, MERRA-2 coarse-resolution PBLH and AOD variables, cloud fraction, high-resolution meteorological variables (air pressure, air temperature, relative humidity, and wind speed), and a time index. Ground monitoring data were used to generate the high-resolution meteorological variables to ensure the reliability of the interpolation. Overall, our daily imputation models achieved an average training R^2 of 0.90 with a range of 0.75 to 0.97 (average RMSE: 0.075, with a range of 0.026 to 0.32) and an average test R^2 of 0.90 with a range of 0.75 to 0.97 (average RMSE: 0.075, with a range of 0.026 to 0.32).
With almost no difference between training and test metrics, the high test R^2 and low test RMSE show the reliability of the AOD imputation. In an evaluation using ground AOD data from the monitoring stations of the Aerosol Robotic Network (AERONET) in mainland China, our method obtained an R^2 of 0.78 and an RMSE of 0.27, which further illustrates the reliability of the method. This database contains four datasets:
- The daily complete high-resolution AOD image dataset for mainland China from January 1, 2015 to December 31, 2018. The archived resources contain 1461 images stored in 1461 files, plus 3 summary Excel files.
- The table "CHN_AOD_INFO.xlsx", describing the properties of the 1461 images, including projection, training R^2 and RMSE, testing R^2 and RMSE, and the minimum, mean, median, and maximum predicted AOD.
- The table "Model_and_Accuracy_of_Meteorological_Elements.xlsx", describing the performance metrics of the interpolation of the high-resolution meteorological dataset.
- The table "Evaluation_Using_AERONET_AOD.xlsx", showing the evaluation results against AERONET, including R^2, RMSE, and the monitoring information used in this study.
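The R^2 and RMSE metrics quoted above are standard; for reference, they can be computed for any observed/predicted pair as follows (a generic sketch, not the authors' code):

```python
import math

def r2_rmse(obs, pred):
    """Coefficient of determination and root-mean-square error of
    predictions against observed values."""
    n = len(obs)
    mean_obs = sum(obs) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1 - ss_res / ss_tot, math.sqrt(ss_res / n)
```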
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
The objective of this research was to analyze curated content published via newsletters and find out what software testing knowledge is present in those resources. Software testing newsletters were analyzed, and the ATLAS.ti software was used with a grounded theory approach to tag the resources mentioned in these newsletters. The resources were obtained by visiting multiple curated software-testing-related newsletters and downloading articles as PDFs. After downloading, open/axial coding was used to code each file according to various categories. The attached Excel files provide a detailed view of the common software testing technologies, techniques, problems, and more that are mentioned in newsletter resources. This data set is linked to a bachelor thesis completed at the EEMCS faculty at TU Delft. A link will be added after publication.
List of the data tables released as part of the Immigration system statistics Home Office release. Summary and detailed data tables covering the immigration system, including out-of-country and in-country visas, asylum, detention, and returns.
If you have any feedback, please email MigrationStatsEnquiries@homeoffice.gov.uk.
The Microsoft Excel .xlsx files may not be suitable for users of assistive technology.
If you use assistive technology (such as a screen reader) and need a version of these documents in a more accessible format, please email MigrationStatsEnquiries@homeoffice.gov.uk
Please tell us what format you need. It will help us if you say what assistive technology you use.
Immigration system statistics, year ending September 2025
Immigration system statistics quarterly release
Immigration system statistics user guide
Publishing detailed data tables in migration statistics
Policy and legislative changes affecting migration to the UK: timeline
Immigration statistics data archives
Passenger arrivals summary tables, year ending September 2025 (ODS, 31.5 KB): https://assets.publishing.service.gov.uk/media/691afc82e39a085bda43edd8/passenger-arrivals-summary-sep-2025-tables.ods
‘Passengers refused entry at the border summary tables’ and ‘Passengers refused entry at the border detailed datasets’ have been discontinued. The latest published versions of these tables are from February 2025 and are available in the ‘Passenger refusals – release discontinued’ section. A similar data series, ‘Refused entry at port and subsequently departed’, is available within the Returns detailed and summary tables.
Electronic travel authorisation detailed datasets, year ending September 2025 (MS Excel Spreadsheet, 58.6 KB): https://assets.publishing.service.gov.uk/media/691b03595a253e2c40d705b9/electronic-travel-authorisation-datasets-sep-2025.xlsx
ETA_D01: Applications for electronic travel authorisations, by nationality
ETA_D02: Outcomes of applications for electronic travel authorisations, by nationality
Entry clearance visas summary tables, year ending September 2025 (ODS, 53.3 KB): https://assets.publishing.service.gov.uk/media/6924812a367485ea116a56bd/visas-summary-sep-2025-tables.ods
Entry clearance visa applications and outcomes detailed datasets, year ending September 2025 (MS Excel Spreadsheet, 30.2 MB): https://assets.publishing.service.gov.uk/media/691aebbf5a253e2c40d70598/entry-clearance-visa-outcomes-datasets-sep-2025.xlsx
Vis_D01: Entry clearance visa applications, by nationality and visa type
Vis_D02: Outcomes of entry clearance visa applications, by nationality, visa type, and outcome
Additional data relating to in country and overse
We welcome any feedback on the structure of our data files, their usability, or any suggestions for improvements; please contact the vehicles statistics team.
The Department for Transport is committed to continuously improving the quality and transparency of our outputs, in line with the Code of Practice for Statistics. Accordingly, we have recently concluded a planned review of the processes and methodologies used to produce Vehicle licensing statistics data. The review sought to introduce further improvements and efficiencies in the coding technologies we use to produce our data. As part of it, we identified several historical errors across the published data tables, affecting different historical periods. These errors are the result of mistakes in past production processes that we have now identified, corrected, and taken steps to eliminate going forward.
Most of the revisions to our published figures are small, typically changing values by no more than 1% to 3%. The key revisions are:
Licensed Vehicles (2014 Q3 to 2016 Q3)
We found that some unlicensed vehicles during this period were mistakenly counted as licensed. This caused a slight overstatement, about 0.54% on average, in the number of licensed vehicles during this period.
3.5 - 4.25 tonnes Zero Emission Vehicles (ZEVs) Classification
Since 2023, ZEVs weighing between 3.5 and 4.25 tonnes have been classified as light goods vehicles (LGVs) instead of heavy goods vehicles (HGVs). We have now applied this change to earlier data and corrected an error in table VEH0150. As a result, the number of newly registered HGVs has been reduced by:
3.1% in 2024
2.3% in 2023
1.4% in 2022
Table VEH0156 (2018 to 2023)
Table VEH0156, which reports average CO₂ emissions for newly registered vehicles, has been updated for the years 2018 to 2023. Most changes are minor (under 3%), but the e-NEDC measure saw a larger correction, up to 15.8%, due to a calculation error. Revisions to the other measures (WLTP and Reported) were less notable, except for April 2020, when COVID-19 led to very few new registrations and therefore greater volatility in the resulting percentages.
Neither these specific revisions, nor any of the others introduced, have had a material impact on the overall statistics, the direction of trends, or the key messages they previously conveyed.
Specific details of each revision have been included in the relevant data table notes to ensure transparency and clarity. Users are advised to review these notes as part of their regular use of the data so that their analysis accounts for these changes.
If you have questions regarding any of these changes, please contact the Vehicle statistics team.
Data tables containing aggregated information about vehicles in the UK are also available.
CSV files can be used either as a spreadsheet (in Microsoft Excel or a similar spreadsheet package) or programmatically with software packages and languages (for example, R or Python).
When used as a spreadsheet, there will be no formatting, but the file can still be explored like our publication tables. Due to their size, older software might not be able to open the entire file.
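For the larger files, reading in chunks avoids loading everything into memory at once. A minimal pandas sketch, using a tiny synthetic stand-in for the real CSV; the column names follow the schema documented for df_VEH0120_GB, while the quarterly column labels are an assumption about the file's layout:

```python
import io
import pandas as pd

# Tiny synthetic stand-in for df_VEH0120_GB.csv (the real file is ~60 MB).
# Column names follow the documented schema; the quarter labels are assumed.
csv_text = """BodyType,Make,GenModel,Model,Fuel,LicenceStatus,2024 Q1,2024 Q2
Cars,FORD,FORD FIESTA,FIESTA ZETEC,Petrol,Licensed,1200,1180
Cars,FORD,FORD FIESTA,FIESTA ZETEC,Petrol,SORN,90,95
"""

chunks = []
# chunksize lets limited machines process the file piece by piece
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=1):
    # keep only licensed vehicles, as an example filter
    chunks.append(chunk[chunk["LicenceStatus"] == "Licensed"])

licensed = pd.concat(chunks, ignore_index=True)
print(len(licensed))                   # number of licensed rows kept
print(int(licensed["2024 Q1"].sum()))  # licensed vehicles in that quarter
```

For the real file, replace `io.StringIO(csv_text)` with the downloaded file path and pick a chunk size in the tens of thousands of rows.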
df_VEH0120_GB: Vehicles at the end of the quarter by licence status, body type, make, generic model and model, Great Britain (CSV, 59.8 MB): https://assets.publishing.service.gov.uk/media/68ed0c52f159f887526bbda6/df_VEH0120_GB.csv
Scope: All registered vehicles in Great Britain; from 1994 Quarter 4 (end December)
Schema: BodyType, Make, GenModel, Model, Fuel, LicenceStatus, [number of vehicles; 1 column per quarter]
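Because the schema stores one count column per quarter, reshaping to a long format is often more convenient for analysis. A sketch with made-up rows matching the documented schema (the quarter labels are an assumption):

```python
import pandas as pd

# Mimic the documented schema: descriptive columns plus one count column
# per quarter (the quarter labels used here are assumptions).
df = pd.DataFrame({
    "BodyType": ["Cars", "Motorcycles"],
    "Make": ["FORD", "HONDA"],
    "GenModel": ["FORD FIESTA", "HONDA PCX"],
    "Model": ["FIESTA ZETEC", "PCX 125"],
    "Fuel": ["Petrol", "Petrol"],
    "LicenceStatus": ["Licensed", "Licensed"],
    "2024 Q1": [1200, 300],
    "2024 Q2": [1180, 310],
})

id_cols = ["BodyType", "Make", "GenModel", "Model", "Fuel", "LicenceStatus"]
# One row per (vehicle group, quarter) instead of one column per quarter
long = df.melt(id_vars=id_cols, var_name="Quarter", value_name="Vehicles")
print(long.shape)  # (4, 8): 2 rows x 2 quarters, 6 id columns + 2 new ones
```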
df_VEH0120_UK: https://assets.publishing.service.gov.uk/media/68ed0c2
These datasets are framed around predicting short-term electricity load, a forecasting problem known in the research field as short-term load forecasting (STLF). They address the STLF problem for the Panama power system, where the forecasting horizon is one week with hourly steps, for a total of 168 hours. The datasets are useful for training and testing forecasting models and comparing their results with the power system operator's official forecast (see the real-time electricity load). They include historical load, a large set of weather variables, holidays, and historical weekly load forecast features. More information on the datasets' context, a literature review of forecasting techniques suitable for this data, and results after testing a set of machine learning models are available in the article Short-Term Electricity Load Forecasting with Machine Learning (Aguilar Madrid, E.; Antonio, N. Short-Term Electricity Load Forecasting with Machine Learning. Information 2021, 12, 50. https://doi.org/10.3390/info12020050).
The main objectives of these datasets are: 1. Evaluate the power system operator's official forecasts (weekly pre-dispatch forecast) against the real load, on a weekly basis. 2. Develop, train, and test forecasting models that improve on the operator's official weekly forecasts (168 hours) in different scenarios.
The following considerations should be kept in mind when comparing forecasting results with the weekly pre-dispatch forecast: 1. Saturday is the first day of each weekly forecast; accordingly, Friday is the last day. 2. The first full week starting on Saturday should be counted as the first week of the year when numbering weeks. 3. A 72-hour gap of unseen records should be left before the first day to forecast. In other words, each next-week forecast should be made with records up to the last hour of the preceding Tuesday. 4. Make sure to train and test keeping the chronological order of records.
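The split rules above can be sketched as follows; this is an illustrative Python fragment with a synthetic hourly series, not code from the dataset authors:

```python
import pandas as pd

# Synthetic hourly load series standing in for the dataset
idx = pd.date_range("2019-06-01 00:00", periods=24 * 28, freq="h")
load = pd.Series(range(len(idx)), index=idx, name="load")

# The week to forecast starts on a Saturday (consideration 1)
test_start = pd.Timestamp("2019-06-15 00:00")    # 2019-06-15 is a Saturday
test_end = test_start + pd.Timedelta(hours=168)  # one full week, hourly

# 72-hour gap of unseen records before the forecast week (consideration 3):
# training data ends at the last hour of the preceding Tuesday.
train_end = test_start - pd.Timedelta(hours=72)

# Chronological order is preserved by slicing on the time index
train = load[load.index < train_end]
test = load[(load.index >= test_start) & (load.index < test_end)]

print(len(test))          # 168 hourly steps
print(train.index.max())  # last training timestamp: Tuesday 23:00
```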
Data sources provide hourly records from January 2015 until June 2020. The data is composed as follows: 1. Historical electricity load, available in daily post-dispatch reports from the grid operator (ETESA, CND). 2. Historical weekly forecasts, available in weekly pre-dispatch reports, also from ETESA, CND. 3. Calendar information on school periods, from Panama's Ministry of Education, published in the official gazette. 4. Calendar information on holidays, from the "When on Earth?" website. 5. Weather variables, such as temperature, relative humidity, precipitation, and wind speed, for three main cities in Panama, from Earthdata.
The original data sources provide the post-dispatch electricity load in individual daily Excel files and the weekly pre-dispatch electricity load forecast in individual weekly Excel files, both with hourly granularity. Holiday and school period data is sparse, spread across websites and PDF files. Weather data is available in daily NetCDF files.
For simplicity, the published datasets are already pre-processed by merging all data sources on the date-time index: 1. A CSV file containing all records in a single continuous dataset with all variables. 2. A CSV file containing the load forecast from weekly pre-dispatch reports. 3. Two Excel files containing suggested regressors and 14 pairs of training/testing datasets as described in the PDF file.
These 14 pairs of training/testing datasets are selected according to these testing criteria: 1. A testing week for each month before the COVID-19 lockdown. 2. Testing weeks containing holidays. 3. Two additional testing weeks during the lockdown.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set shows the results of a study investigating whether writing self-explanations has a stronger effect than interpolated testing on reducing task-unrelated thoughts and improving learning outcomes.
The data comes from 138 participants distributed across three groups who were all tasked with reviewing the same video. Each participant completed a knowledge test before and after watching the video to compare learning outcomes between the three groups. The first group was a control group; the second group answered interpolated tests; and the third group wrote self-explanations at pauses in the videos.
This dataset contains the following:
A source file with the programming code used for the data analysis (analysis.qmd) and a PDF showing the results of the analysis (analysis.pdf).
Two Excel files. One contains the participants' scores from the knowledge tests as well as their group assignments (knowledge_tests.xlsx). The other contains the participants' coded thought reports (thought_reports.xlsx).
An accompanying PDF file that provides an overview of the materials used in this study, including the specific questions the participants were asked (Materials.pdf).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Usability testing results of the bachelor's thesis titled "The International Image Interoperability Framework (IIIF): raising awareness of the user benefits for scholarly editions".
Remote and in-person usability tests on the Universal Viewer and Mirador, two IIIF-compliant clients, took place between March and May 2017. The tests were conducted with Loop11 (remote testing) and Morae (in-person testing).
The dataset is composed of Excel files, screenshots (tasks and heat maps) as well as videos.
CC0 1.0: https://spdx.org/licenses/CC0-1.0.html
The data presented here were collected in the context of the EU LIFE EuroLargeCarnivores Outreach Project (LIFE16 GIE/DE/000661). The data set provided is part of a much larger set assembled during two online stakeholder surveys conducted in late 2018/early 2019 (Baseline Survey) and in 2021 (Outcome Survey, the last year of the project) in 14 countries participating in the project. The data selected are the basis for the analysis and results presented and discussed in the research article "Did we achieve what we aimed for? Assessing the outcomes of a human-carnivore conflict mitigation and coexistence project in Europe" by Carol Grossmann and Laszló Pátkó, published in Wildlife Biology in 2024. The dataset is provided as an Excel sheet displaying anonymized numerical respondent IDs (rows) and coded answers to selected questions (columns) from the two surveys. The table includes full explanatory wording for all codes used. The data set contains n = 1262 individual data subsets from the Baseline Survey and n = 1056 individual data subsets from the Outcome Survey in 2021. Some of the questions are identical in both surveys, for direct comparison. Cross-references are provided for questions posed in both surveys but denominated with different numbers in the respective surveys. Some questions were posed only in the 2021 survey. Some questions/answers serve as filters for a differentiated analysis according to stakeholder categories, engagement in networking activities, or stakeholder participation and non-participation in project interventions. For more details about the methods of data collection and analysis, see Grossmann et al. 2020 and Grossmann and Patko 2024. The reuse potential of this data set lies in the opportunity to assess project outcomes with further stakeholder categories in correlation with respondents' (non-)participation in project interventions.
No further legal or ethical considerations apply, as all individual respondent sets have been fully anonymized. Methods: We conducted two online stakeholder surveys in the 14 project partner countries within the European outreach project "EuroLargeCarnivores". We used Google Forms for the questionnaires, as mandated by the ELC project lead. In late 2018 and early 2019, we conducted a baseline survey (t0), and in 2021 (t0+3) an 'endline' survey to assess changes at the stakeholder level over the project's lifetime. The baseline survey 'Large Carnivores in Europe 2018' took place during the first year of the project in all fourteen countries. In 2021, the second comparative stakeholder perception survey, 'Monitoring the Impact', was launched during the final year of the outreach project across the same distribution range, applying the same distribution method. The Forest Research Institute of Baden-Württemberg (FVA) designed, provided, and coordinated both survey questionnaires and the data collection procedures, while staff of the regional project partners handled additional preparations, such as translating the English master questionnaires into the twelve regional languages, as well as the actual data collection. We used a prearranged multi-channel and pyramid distribution system (Atkinson and Flint 2004, Dillman et al. 2014, Grossmann et al. 2020). The links to the surveys were distributed via the partners' systematically updated distribution lists, individual in-person interviews, websites, and social media, offering survey respondents the option of further distributing the survey through a snowball system, thereby reaching as many stakeholders in the 14 project partner countries as possible (Atkinson and Flint 2004, Dillman et al. 2014, Grossmann et al. 2020). After the closure of the surveys, the country datasets were aggregated, re-translated, cleaned, and fully coded for analysis.
The 2018 survey received n = 1262 returns; the 2021 survey resulted in n = 1056, a decrease of 16%. Due to the strict enforcement of the European Union's General Data Protection Regulation (GDPR), we could not directly re-contact the respondents of the 2018 survey. Additionally, due to the open accessibility of the survey on social media, no concise distribution list of the recipient population is available. We still assumed comparability of the two datasets for the research questions at hand (Grossmann et al. 2019, 2020). The statistical analysis used descriptive statistics and the χ² test, including Cramér's V and post hoc tests (differences in standardized residuals) (Cohen 1988, Agresti 2007), for comparing the samples from 2018 and 2021, as well as subsamples of the 2021 sample for a more focused analysis. The analyses were performed using SPSS and Microsoft Excel. For more details about the methods of data collection and analysis, see Grossmann et al. 2020 and Grossmann and Patko 2024.
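As an illustration of this kind of analysis (not the study's own code or data), a χ² test with Cramér's V and standardized residuals can be computed with SciPy. The counts below are hypothetical, although the row totals match the reported sample sizes:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x3 contingency table: survey year (2018/2021) vs. a coded
# answer with three categories; illustrative counts, not the study's data.
table = np.array([
    [420, 530, 312],   # 2018 responses (sums to the reported n = 1262)
    [380, 410, 266],   # 2021 responses (sums to the reported n = 1056)
])

chi2, p, dof, expected = chi2_contingency(table)

# Cramér's V: effect size for a chi-squared test of independence
n = table.sum()
k = min(table.shape) - 1
cramers_v = np.sqrt(chi2 / (n * k))
print(round(cramers_v, 3))

# Standardized residuals (observed - expected) / sqrt(expected), used in
# post hoc inspection of which cells drive a significant result
residuals = (table - expected) / np.sqrt(expected)
```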