100+ datasets found
  1. UC_vs_US Statistic Analysis.xlsx

    • figshare.com
    xlsx
    Updated Jul 9, 2020
    Cite
    F. (Fabiano) Dalpiaz (2020). UC_vs_US Statistic Analysis.xlsx [Dataset]. http://doi.org/10.23644/uu.12631628.v1
    Available download formats: xlsx
    Dataset updated
    Jul 9, 2020
    Dataset provided by
    Utrecht University
    Authors
    F. (Fabiano) Dalpiaz
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sheet 1 (Raw-Data): Provides the raw data of the study, presenting the tagging results for the measures described in the paper. For each subject, it includes the following columns:
    A. a sequential student ID
    B. an ID that defines a random group label and the notation
    C. the notation used: user stories or use cases
    D. the case they were assigned to: IFA, Sim, or Hos
    E. the subject's exam grade (total points out of 100); empty cells mean that the subject did not take the first exam
    F. a categorical representation of the grade (L/M/H), where H is greater than or equal to 80, M is between 65 (inclusive) and 80 (exclusive), and L otherwise
    G. the total number of classes in the student's conceptual model
    H. the total number of relationships in the student's conceptual model
    I. the total number of classes in the expert's conceptual model
    J. the total number of relationships in the expert's conceptual model
    K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, missing (see tagging scheme below)
    P. the researchers' judgement on how well the student explained the derivation process: well explained (a systematic mapping that can be easily reproduced), partially explained (vague indication of the mapping), or not present
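
    The grade thresholds for column F can be written as a small function. This is a minimal sketch in Python, assuming the grade is a number out of 100 and that an empty cell is read as None; the function name grade_category is illustrative and does not come from the spreadsheet.

```python
def grade_category(points):
    """Map an exam grade (0-100) to L/M/H using the thresholds described for column F.

    An empty cell (subject did not take the first exam) is assumed to arrive as None.
    """
    if points is None:
        return None
    if points >= 80:
        return "H"  # greater than or equal to 80
    if points >= 65:
        return "M"  # 65 included, 80 excluded
    return "L"      # everything below 65


if __name__ == "__main__":
    for value in (92, 80, 79.5, 65, 40, None):
        print(value, grade_category(value))
```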

    Tagging scheme:
    Aligned (AL) - A concept is represented as a class in both models, either with the same name or using synonyms or clearly linkable names;
    Wrongly represented (WR) - A class in the domain expert model is incorrectly represented in the student model, either (i) via an attribute, method, or relationship rather than class, or (ii) using a generic term (e.g., "user" instead of "urban planner");
    System-oriented (SO) - A class in CM-Stud that denotes a technical implementation aspect, e.g., access control. Classes that represent legacy system or the system under design (portal, simulator) are legitimate;
    Omitted (OM) - A class in CM-Expert that does not appear in any way in CM-Stud;
    Missing (MI) - A class in CM-Stud that does not appear in any way in CM-Expert.

    All the calculations and information provided in the following sheets originate from that raw data.

    Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection, including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.

    Sheet 3 (Size-Ratio): Calculates the size ratio, that is, the number of classes in the student model divided by the number of classes in the expert model. We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus in this study is on the number of classes. However, we also provide the size ratio for the number of relationships between student and expert model.

    Sheet 4 (Overall): Provides an overview of all subjects regarding the encountered situations, completeness, and correctness. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model. It is calculated by dividing the number of aligned concepts (AL) by the sum of the number of aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model. It is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of the number of aligned concepts (AL), wrong representations (WR), and omitted concepts (OM). The overview is complemented with general diverging stacked bar charts that illustrate correctness and completeness.
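
    The two ratios described for sheet 4 can be reproduced outside Excel. The sketch below follows the formulas exactly as stated in the description; the function names and the example counts are illustrative assumptions, not values from the dataset.

```python
import math


def correctness(al, wr, so, om):
    """AL / (AL + OM + SO + WR), as stated in the sheet 4 description."""
    denominator = al + om + so + wr
    return al / denominator if denominator else math.nan


def completeness(al, wr, om):
    """(AL + WR) / (AL + WR + OM), as stated in the sheet 4 description."""
    denominator = al + wr + om
    return (al + wr) / denominator if denominator else math.nan


if __name__ == "__main__":
    # Hypothetical tag counts for one student model.
    print(correctness(al=6, wr=2, so=1, om=1))   # 6 / 10
    print(completeness(al=6, wr=2, om=1))        # 8 / 9
```

    Returning NaN for an empty denominator is a design choice for this sketch; the spreadsheet may handle that case differently.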

    For sheet 4 as well as for the following four sheets, diverging stacked bar charts are provided to visualize the effect of each of the independent and mediated variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated, which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (T-test) and effect size (Hedges' g) for both completeness and correctness are provided. Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html. The independent and moderating variables can be found as follows:

    Sheet 5 (By-Notation): Model correctness and model completeness are compared by notation - UC, US.

    Sheet 6 (By-Case): Model correctness and model completeness are compared by case - SIM, HOS, IFA.

    Sheet 7 (By-Process): Model correctness and model completeness are compared by how well the derivation process is explained - well explained, partially explained, not present.

    Sheet 8 (By-Grade): Model correctness and model completeness are compared by the exam grades, converted to the categorical values High, Medium, and Low.
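
    The description says Hedges' g was obtained from an online tool, so the exact variant used there is not known. As a rough cross-check, the sketch below uses the common textbook form (Cohen's d with a pooled standard deviation, multiplied by the small-sample correction 1 - 3 / (4(n1 + n2) - 9)); treat that choice, the function name, and the sample values as assumptions.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Hedges' g for two independent samples (textbook approximation)."""
    n1, n2 = len(group_a), len(group_b)
    # Pooled standard deviation from the two sample variances.
    pooled_sd = sqrt(
        ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / (n1 + n2 - 2)
    )
    cohens_d = (mean(group_a) - mean(group_b)) / pooled_sd
    # Small-sample bias correction factor.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohens_d * correction


if __name__ == "__main__":
    # Hypothetical scores for two groups, for illustration only.
    print(hedges_g([2, 4, 6], [1, 3, 5]))
```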

  2. Classroom-Data

    • kaggle.com
    zip
    Updated Dec 26, 2024
    Cite
    Harinivas Ganjarla (2024). Classroom-Data [Dataset]. https://www.kaggle.com/datasets/harinivasganjarla/classroom-data
    Available download formats: zip (64213029 bytes)
    Dataset updated
    Dec 26, 2024
    Authors
    Harinivas Ganjarla
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The dataset consists of 6 directories. Each directory has an image folder and a labels.csv file. The image folder contains captures of students in the classroom, and labels.csv lists each image file name with the corresponding number of people in that image. labels.csv has the columns image_file_name and count. The count column may contain NULL values.
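
    Because the count column may contain NULL values, a loader should separate labelled from unlabelled images rather than fail on conversion. The following is a minimal sketch using Python's standard csv module; it assumes missing counts appear as an empty cell, "NULL", or "NaN", and the sample file names are hypothetical.

```python
import csv
import io

MISSING_MARKERS = {"", "NULL", "NAN"}  # assumed spellings of a missing count


def read_counts(csv_file):
    """Return ({image_file_name: count}, [image names with a missing count])."""
    labelled, unlabelled = {}, []
    for row in csv.DictReader(csv_file):
        raw = (row.get("count") or "").strip()
        if raw.upper() in MISSING_MARKERS:
            unlabelled.append(row["image_file_name"])
        else:
            # Counts may be stored as "12" or "12.0"; normalise to int.
            labelled[row["image_file_name"]] = int(float(raw))
    return labelled, unlabelled


# Hypothetical stand-in for one directory's labels.csv.
sample = io.StringIO(
    "image_file_name,count\n"
    "img_001.jpg,12\n"
    "img_002.jpg,\n"
    "img_003.jpg,NULL\n"
)
labelled, unlabelled = read_counts(sample)
```

    For a real directory, the same function can be called on an opened file, for example with open(path, newline="") as handle.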

  3. Data from: Aspects of University Students' Graph Sense in a Virtual Learning...

    • scielo.figshare.com
    jpeg
    Updated Jun 3, 2023
    Cite
    Fabiana Chagas de Andrade; Carolina Vieira Schiller; Dione Aparecido Ferreira da Silva; Larissa Pereira Menezes; Alexandre Sousa da Silva (2023). Aspects of University Students' Graph Sense in a Virtual Learning Environment [Dataset]. http://doi.org/10.6084/m9.figshare.14304727.v1
    Available download formats: jpeg
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    SciELO journals
    Authors
    Fabiana Chagas de Andrade; Carolina Vieira Schiller; Dione Aparecido Ferreira da Silva; Larissa Pereira Menezes; Alexandre Sousa da Silva
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: To break with the traditional model of Basic Statistics classes in Higher Education, we drew on Statistical Literacy and Critical Education to develop an activity about graph interpretation, which took place in a Virtual Learning Environment (VLE) as a complement to classroom meetings. Twenty-three engineering students from a public higher education institution in Rio de Janeiro took part in the research. Our objective was to analyze elements of graph comprehension in an activity that consisted of identifying incorrect statistical graphs conveyed by the media, followed by argumentation and interaction among students about these errors. The main results showed that elements of Graph Sense were present in the discussions and were the target of the students' critical analysis. The VLE facilitated communication and fostered student participation and writing, so digital technologies and activities that favor collaboration and interaction are important for statistical development, although such development is a gradual process.

  4. Ad hoc Statistical Analysis for surveys: 2020/21 Quarter 3

    • gov.uk
    • s3.amazonaws.com
    Updated Dec 4, 2020
    Cite
    Department for Digital, Culture, Media & Sport (2020). Ad hoc Statistical Analysis for surveys: 2020/21 Quarter 3 [Dataset]. https://www.gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-202021-quarter-3
    Dataset updated
    Dec 4, 2020
    Dataset provided by
    GOV.UK http://gov.uk/
    Authors
    Department for Digital, Culture, Media & Sport
    Description

    This page lists ad-hoc statistics released during the period October to December 2020. These are additional analyses not included in any of the Department for Digital, Culture, Media and Sport’s standard publications.

    If you would like any further information please contact evidence@dcms.gov.uk.

    October 2020 - Taking Part: Lotteries request

    This piece of analysis covers:

    1. The proportion of adults who had played a National Lottery Game, who also had played any society lotteries in the last 12 months
    2. The proportion of adults who had played a Society Lottery Game, who also had played any National Lottery game in the last 12 months.

    Here is a link to the lotteries and gambling page for the annual Taking Part survey.

    https://assets.publishing.service.gov.uk/media/5f7c439dd3bf7f2d4df83aeb/Lottery_data_table.xlsx - National Lottery and Society Lottery Participation (MS Excel Spreadsheet, 70.2 KB)

    October 2020 - Community Life Survey: Loneliness request

    This piece of analysis covers how often people feel they lack companionship, feel left out and feel isolated. This analysis also provides demographic breakdowns of the loneliness indicators.

    Here is a link to the wellbeing and loneliness page for the annual Community Life survey.

  5. CBSE Result Statistics Class XII - 2023

    • kaggle.com
    zip
    Updated Aug 22, 2023
    Cite
    Anas Khan (2023). CBSE Result Statistics Class XII - 2023 [Dataset]. https://www.kaggle.com/datasets/fiq423ubf/cbse-result-statistics-class-xii-2023
    Available download formats: zip (881 bytes)
    Dataset updated
    Aug 22, 2023
    Authors
    Anas Khan
    License

    http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html

    Description

    In the year 2023, the Central Board of Secondary Education (CBSE) conducted the Class XI examinations across various regions. The dataset presents a comprehensive overview of the results, categorized by different types of schools and regions. The data includes the number of students registered, the number of students who appeared for the exams, and the performance status of each category.

    The results encompass a diverse range of schools, including those under the Central Tibetan School Administration (CTSA), Jawahar Navodaya Vidyalayas (JNV), and Kendriya Vidyalayas (KV), as well as government and government-aided schools, and independent institutions.

    The "Status" column provides insights into the outcome of the exams, highlighting the number of students who successfully cleared the examinations. The "Region" column denotes the geographic distribution of the schools, allowing for a comprehensive analysis of performance across different areas.

    The dataset is a valuable resource for understanding the educational landscape and performance trends within the CBSE Class XI examinations for the year 2023. It offers an in-depth view of student participation, success rates, and the performance of different types of schools across various regions, contributing to a holistic assessment of the CBSE educational system's effectiveness and impact. Researchers, educators, and policymakers can leverage this data to identify patterns, make informed decisions, and implement targeted interventions to enhance the overall quality of education.

  6. Protected Areas Database of the United States (PAD-US) 3.0 Vector Analysis...

    • catalog.data.gov
    • data.usgs.gov
    • +1 more
    Updated Oct 22, 2025
    Cite
    U.S. Geological Survey (2025). Protected Areas Database of the United States (PAD-US) 3.0 Vector Analysis and Summary Statistics [Dataset]. https://catalog.data.gov/dataset/protected-areas-database-of-the-united-states-pad-us-3-0-vector-analysis-and-summary-stati
    Dataset updated
    Oct 22, 2025
    Dataset provided by
    United States Geological Survey http://www.usgs.gov/
    Area covered
    United States
    Description

    Spatial analysis and statistical summaries of the Protected Areas Database of the United States (PAD-US) provide land managers and decision makers with a general assessment of management intent for biodiversity protection, natural resource management, and recreation access across the nation. The PAD-US 3.0 Combined Fee, Designation, Easement feature class (with Military Lands and Tribal Areas from the Proclamation and Other Planning Boundaries feature class) was modified to remove overlaps, avoiding overestimation in protected area statistics and to support user needs. A Python scripted process ("PADUS3_0_CreateVectorAnalysisFileScript.zip") associated with this data release prioritized overlapping designations (e.g. Wilderness within a National Forest) based upon their relative biodiversity conservation status (e.g. GAP Status Code 1 over 2), public access values (in the order of Closed, Restricted, Open, Unknown), and geodatabase load order (records are deliberately organized in the PAD-US full inventory with fee owned lands loaded before overlapping management designations, and easements). The Vector Analysis File ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") associated item of PAD-US 3.0 Spatial Analysis and Statistics ( https://doi.org/10.5066/P9KLBB5D ) was clipped to the Census state boundary file to define the extent and serve as a common denominator for statistical summaries. 
Boundaries of interest to stakeholders (State, Department of the Interior Region, Congressional District, County, EcoRegions I-IV, Urban Areas, Landscape Conservation Cooperative) were incorporated into separate geodatabase feature classes to support various data summaries ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip") and Comma-separated Value (CSV) tables ("PADUS3_0SummaryStatistics_TabularData_CSV.zip") summarizing "PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip" are provided as an alternative format and enable users to explore and download summary statistics of interest (Comma-separated Table [CSV], Microsoft Excel Workbook [.XLSX], Portable Document Format [.PDF] Report) from the PAD-US Lands and Inland Water Statistics Dashboard ( https://www.usgs.gov/programs/gap-analysis-project/science/pad-us-statistics ). In addition, a "flattened" version of the PAD-US 3.0 combined file without other extent boundaries ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") allow for other applications that require a representation of overall protection status without overlapping designation boundaries. The "PADUS3_0VectorAnalysis_State_Clip_CENSUS2020" feature class ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.gdb") is the source of the PAD-US 3.0 raster files (associated item of PAD-US 3.0 Spatial Analysis and Statistics, https://doi.org/10.5066/P9KLBB5D ). Note, the PAD-US inventory is now considered functionally complete with the vast majority of land protection types represented in some manner, while work continues to maintain updates and improve data quality (see inventory completeness estimates at: http://www.protectedlands.net/data-stewards/ ). In addition, changes in protected area status between versions of the PAD-US may be attributed to improving the completeness and accuracy of the spatial data more than actual management actions or new acquisitions. USGS provides no legal warranty for the use of this data. 
While PAD-US is the official aggregation of protected areas ( https://www.fgdc.gov/ngda-reports/NGDA_Datasets.html ), agencies are the best source of their lands data.
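
    The overlap prioritization described above (GAP Status Code, then public access in the order Closed, Restricted, Open, Unknown, then geodatabase load order) amounts to a lexicographic ranking. The sketch below is not the USGS script; the field names, the example GAP codes, and the assumption that an earlier load order wins a tie are all hypothetical and only illustrate the idea.

```python
# Rank of each public access value, in the order given in the description.
ACCESS_RANK = {"Closed": 0, "Restricted": 1, "Open": 2, "Unknown": 3}


def priority_key(record):
    """Lower tuples win: GAP status first, then access rank, then load order."""
    return (
        record["gap_status"],                 # e.g. GAP Status Code 1 over 2
        ACCESS_RANK[record["public_access"]],
        record["load_order"],                 # assumption: earlier-loaded record wins ties
    )


def resolve_overlap(records):
    """Pick the single record that should represent an overlapping area."""
    return min(records, key=priority_key)


# Illustrative overlapping designations (attribute values are made up).
overlapping = [
    {"name": "National Forest", "gap_status": 3, "public_access": "Open", "load_order": 1},
    {"name": "Wilderness", "gap_status": 1, "public_access": "Open", "load_order": 2},
]
winner = resolve_overlap(overlapping)
```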

  7. Average government primary school class sizes by year (1997, 2002-2024)

    • data.nsw.gov.au
    • researchdata.edu.au
    csv, pdf
    Updated Oct 14, 2025
    Cite
    NSW Department of Education (2025). Average government primary school class sizes by year (1997, 2002-2024) [Dataset]. https://data.nsw.gov.au/data/dataset/nsw-education-average-government-primary-school-class-sizes-by-year
    Available download formats: pdf(211447), pdf(124328), pdf(41671), pdf(199264), csv(1309), pdf(64253), pdf(153579), pdf(158355), pdf(78212)
    Dataset updated
    Oct 14, 2025
    Dataset authored and provided by
    NSW Department of Education https://education.nsw.gov.au/
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data Notes

    • Class size audits are conducted by CESE (Centre for Education Statistics and Evaluation) in March each year. Audits were not conducted in 1998, 1999, 2000 and 2001.

    • Data for 2020 should be treated with caution. The collection took place in March when schools were impacted by COVID-19, so fewer data checks were carried out.

    • Students attending schools for specific purposes (SSPs), students in support classes in regular schools and distance education students are excluded from average class size calculations.

    • The average class size for each grade is calculated by taking the number of students in all classes that a student from that grade is in (including composite/multi age classes) divided by the total number of classes that includes a student from that grade. This can result in a lower Kindergarten to Year 6 average class size than any individual year level.

    • From 2017, school size is based on primary enrolment rather than school classification.

    • Schools change size, so data in Table 2 is not necessarily comparable to previous iterations in earlier fact sheets.
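
    The per-grade averaging rule in the data notes above explains why the Kindergarten to Year 6 average can be lower than every individual year level: a composite class is counted once per grade it contains, but only once overall. The sketch below illustrates this with made-up numbers; representing a class as a mapping from grade to student count, and computing the overall average as total students divided by total classes, are assumptions of this example.

```python
def average_class_size(classes, grade):
    """Average size of all classes that contain at least one student from `grade`."""
    relevant = [c for c in classes if c.get(grade, 0) > 0]
    if not relevant:
        return None
    # Every student in a relevant class counts, including those from other grades.
    return sum(sum(c.values()) for c in relevant) / len(relevant)


def overall_average(classes):
    """Assumed overall figure: total students divided by total classes."""
    return sum(sum(c.values()) for c in classes) / len(classes)


# Hypothetical school: two single-grade classes and one composite class.
classes = [
    {"K": 20},
    {"Y1": 20},
    {"K": 15, "Y1": 15},  # composite class of 30 students
]
k_avg = average_class_size(classes, "K")    # (20 + 30) / 2
y1_avg = average_class_size(classes, "Y1")  # (20 + 30) / 2
all_avg = overall_average(classes)          # 70 / 3
```

    In this example both grade-level averages are 25, while the overall average is about 23.3.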

    Data Source

    Education Statistics and Measurement, Centre for Education Statistics and Evaluation.

    Data quality statement

    The Class Size Audit Data Quality Statement addresses the quality of the Class Size Audit dataset using the dimensions outlined in the NSW Department of Education's data quality management framework: institutional environment, relevance, timeliness, accuracy, coherence, interpretability and accessibility. It provides an overview of the dataset's quality and highlights any known data quality issues.

  8. Education and training

    • gov.uk
    • tnaqa.mirrorweb.com
    • +1 more
    Updated Jul 16, 2020
    Cite
    Department for Education (2020). Education and training [Dataset]. https://www.gov.uk/government/statistical-data-sets/fe-data-library-education-and-training
    Dataset updated
    Jul 16, 2020
    Dataset provided by
    GOV.UK http://gov.uk/
    Authors
    Department for Education
    Description

    This statistical data set includes information on education and training participation and achievements broken down into a number of reports including sector subject areas, participation by gender, age, ethnicity, disability participation.

    It also includes data on offender learning.

    Can’t find what you’re looking for?

    If you need help finding data please refer to the table finder tool to search for specific breakdowns available for FE statistics.

    Academic year 2019 to 2020 (reported to date)

    https://assets.publishing.service.gov.uk/media/5f0c1995e90e0703146d2393/201920-July_PT_ET_part_ach_demog_LAD.xlsx - Education and training aim participation and achievement demographics by sector subject area and local authority district: academic year 2019 to 2020 Q3 (August 2019 to April 2020) (MS Excel Spreadsheet, 33 MB)

  9. Vehicle licensing statistics data tables

    • gov.uk
    • s3.amazonaws.com
    Updated Oct 15, 2025
    Cite
    Department for Transport (2025). Vehicle licensing statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/vehicle-licensing-statistics-data-tables
    Dataset updated
    Oct 15, 2025
    Dataset provided by
    GOV.UK
    Authors
    Department for Transport
    Description

    Data files containing detailed information about vehicles in the UK are also available, including make and model data.

    Some tables have been withdrawn and replaced. The table index for this statistical series has been updated to provide a full map between the old and new numbering systems used in this page.

    The Department for Transport is committed to continuously improving the quality and transparency of our outputs, in line with the Code of Practice for Statistics. In line with this, we have recently concluded a planned review of the processes and methodologies used in the production of Vehicle licensing statistics data. The review sought to identify and introduce further improvements and efficiencies in the coding technologies we use to produce our data and, as part of that, we have identified several historical errors across the published data tables affecting different historical periods. These errors are the result of mistakes in past production processes that we have now identified, corrected and taken steps to eliminate going forward.

    Most of the revisions to our published figures are small, typically changing values by less than 1% to 3%. The key revisions are:

    Licensed Vehicles (2014 Q3 to 2016 Q3)

    We found that some unlicensed vehicles during this period were mistakenly counted as licensed. This caused a slight overstatement, about 0.54% on average, in the number of licensed vehicles during this period.

    3.5 - 4.25 tonnes Zero Emission Vehicles (ZEVs) Classification

    Since 2023, ZEVs weighing between 3.5 and 4.25 tonnes have been classified as light goods vehicles (LGVs) instead of heavy goods vehicles (HGVs). We have now applied this change to earlier data and corrected an error in table VEH0150. As a result, the number of newly registered HGVs has been reduced by:

    • 3.1% in 2024

    • 2.3% in 2023

    • 1.4% in 2022

    Table VEH0156 (2018 to 2023)

    Table VEH0156, which reports average CO₂ emissions for newly registered vehicles, has been updated for the years 2018 to 2023. Most changes are minor (under 3%), but the e-NEDC measure saw a larger correction, up to 15.8%, due to a calculation error. Changes to the other measures (WLTP and Reported) were less notable, except for April 2020, when COVID-19 led to very few new registrations and therefore greater volatility in the resulting percentages.

    Neither these specific revisions, nor any of the others introduced, have had a material impact on the statistics overall, the direction of trends nor the key messages that they previously conveyed.

    Specific details of each revision have been included in the relevant data table notes to ensure transparency and clarity. Users are advised to review these notes as part of their regular use of the data to ensure their analysis accounts for these changes.

    If you have questions regarding any of these changes, please contact the Vehicle statistics team.

    All vehicles

    Licensed vehicles

    Overview

    VEH0101: https://assets.publishing.service.gov.uk/media/68ecf5acf159f887526bbd7c/veh0101.ods - Vehicles at the end of the quarter by licence status and body type: Great Britain and United Kingdom (ODS, 99.7 KB)

    Detailed breakdowns

    VEH0103: https://assets.publishing.service.gov.uk/media/68ecf5abf159f887526bbd7b/veh0103.ods - Licensed vehicles at the end of the year by tax class: Great Britain and United Kingdom (ODS, 23.8 KB)

    VEH0105: https://assets.publishing.service.gov.uk/media/68ecf5ac2adc28a81b4acfc8/veh0105.ods - Licensed vehicles at

  10. Data from: Data files used to study change dynamics in software systems

    • figshare.swinburne.edu.au
    pdf
    Updated Jul 22, 2024
    Cite
    Rajesh Vasa (2024). Data files used to study change dynamics in software systems [Dataset]. http://doi.org/10.25916/sut.26288227.v1
    Available download formats: pdf
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    Swinburne
    Authors
    Rajesh Vasa
    License

    Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    It is a widely accepted fact that evolving software systems change and grow. However, it is less well-understood how change is distributed over time, specifically in object oriented software systems. The patterns and techniques used to measure growth permit developers to identify specific releases where significant change took place as well as to inform them of the longer term trend in the distribution profile. This knowledge assists developers in recording systemic and substantial changes to a release, as well as to provide useful information as input into a potential release retrospective. However, these analysis methods can only be applied after a mature release of the code has been developed. But in order to manage the evolution of complex software systems effectively, it is important to identify change-prone classes as early as possible. Specifically, developers need to know where they can expect change, the likelihood of a change, and the magnitude of these modifications in order to take proactive steps and mitigate any potential risks arising from these changes. Previous research into change-prone classes has identified some common aspects, with different studies suggesting that complex and large classes tend to undergo more changes and classes that changed recently are likely to undergo modifications in the near future. Though the guidance provided is helpful, developers need more specific guidance in order for it to be applicable in practice. Furthermore, the information needs to be available at a level that can help in developing tools that highlight and monitor evolution prone parts of a system as well as support effort estimation activities. The specific research questions that we address in this chapter are: (1) What is the likelihood that a class will change from a given version to the next? (a) Does this probability change over time? (b) Is this likelihood project specific, or general? 
(2) How is modification frequency distributed for classes that change? (3) What is the distribution of the magnitude of change? Are most modifications minor adjustments, or substantive modifications? (4) Does structural complexity make a class susceptible to change? (5) Does popularity make a class more change-prone? We make recommendations that can help developers to proactively monitor and manage change. These are derived from a statistical analysis of change in approximately 55000 unique classes across all projects under investigation. The analysis methods that we applied took into consideration the highly skewed nature of the metric data distributions. The raw metric data (4 .txt files and 4 .log files in a .zip file measuring ~2MB in total) is provided as a comma separated values (CSV) file, and the first line of the CSV file contains the header. A detailed output of the statistical analysis undertaken is provided as log files generated directly from Stata (statistical analysis software).

  11. Ad hoc statistical analysis: 2024/25 quarter 3

    • gov.uk
    Updated Dec 19, 2024
    Cite
    Department for Culture, Media and Sport (2024). Ad hoc statistical analysis: 2024/25 quarter 3 [Dataset]. https://www.gov.uk/government/statistical-data-sets/ad-hoc-statistical-analysis-202425-quarter-3
    Dataset updated
    Dec 19, 2024
    Dataset provided by
    GOV.UK http://gov.uk/
    Authors
    Department for Culture, Media and Sport
    Description

    This page lists ad-hoc statistics released during the period October - December 2024. These are additional analyses not currently included in any of the Department for Culture, Media and Sport’s standard publications.

    If you would like any further information please contact evidence@dcms.gov.uk

    December 2024 - DCMS Sectors Economic Estimates: Art and Antiques Market

    This is an ad-hoc release that provides economic estimates for the art and antiques market. This release includes estimates for the art and antiques market for:

    • Gross value added (GVA), 2010 to 2022, and provisional estimates for 2023. This includes estimates in current prices and in chained volume measures (data in real terms) for comparisons over time.
    • Employment (number of filled jobs), 2011 to 2023: this includes a breakdown by employment type (employed or self-employed)
    • Imports and exports of goods, 2016 to 2021

    These statistics for the art and antiques market show that:

    • GVA was provisionally estimated to be £0.8 billion in 2023.

    • There were 39,000 filled jobs in 2023.

    • Exports of goods totalled £3.5 billion and imports of goods totalled £1.3 billion in 2021.

    DCMS Sectors Economic Estimates: Art and Antiques Market GVA 2010 to 2022, and 2023 (provisional) (ODS, 10.5 KB): https://assets.publishing.service.gov.uk/media/6762de2bff2c870561bde7e8/DCMS_Economic_Estimates_GVA_Art_Antiques_market_2010_2023.ods

    This file is in an OpenDocument format (https://www.gov.uk/guidance/using-open-document-formats-odf-in-your-organisation).

    DCMS Sectors Economic Estimates: Art and Antiques Market Employment 2011 to 2023: https://assets.publishing.service.gov.uk/media/6762de51ff2c870561bde7e9/DCMS_Economic_Estimates_Employment_Art_and_Antiques_market_2011_2023.ods
    
  12. Services by employment size class (NACE Rev. 2, H-N, S95) (2005-2020)

    • ec.europa.eu
    Updated Jun 27, 2019
    Cite
    Eurostat (2019). Services by employment size class (NACE Rev. 2, H-N, S95) (2005-2020) [Dataset]. http://doi.org/10.2908/SBS_SC_1B_SE_R2
    Explore at:
    Available download formats: application/vnd.sdmx.data+xml;version=3.0.0, json, application/vnd.sdmx.data+csv;version=1.0.0, application/vnd.sdmx.genericdata+xml;version=2.1, application/vnd.sdmx.data+csv;version=2.0.0, tsv
    Dataset updated
    Jun 27, 2019
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2005 - 2020
    Area covered
    Lithuania, Spain, Albania, Finland, Estonia, Norway, Netherlands, European Union, Slovenia, European Union
    Description

    Structural business statistics (SBS) describes the structure, conduct and performance of economic activities, down to the most detailed activity level (several hundred economic sectors).

    SBS are transmitted annually by the EU Member States on the basis of a legal obligation from 1995 onwards.

    SBS covers all activities of the business economy with the exception of agricultural activities and personal services. The data are provided by all EU Member States, Iceland, Norway and Switzerland, and some candidate and potential candidate countries. The data are collected by domain of activity (annex) and by datasets:

    • Annex I - Services,
    • Annex II - Industry,
    • Annex III - Trade, and
    • Annex IV - Constructions.

    Each annex contains several datasets as indicated in the SBS Regulation.

    The majority of the data is collected by National Statistical Institutes (NSIs) by means of statistical surveys, business registers or from various administrative sources. Regulatory or controlling national offices for financial institutions or central banks often provide the information required for the financial sector (NACE Rev 2 Section K / NACE Rev 1.1 Section J).

    Member States apply various statistical methods, according to the data source, such as grossing up, model based estimation or different forms of imputation, to ensure the quality of SBSs produced.

    Main characteristics (variables) of the SBS data category:

    • Business Demographic variables (e.g. Number of enterprises),
    • "Output related" variables (e.g. Turnover, Value added),
    • "Input related" variables: labour input (e.g. Employment, Hours worked); goods and services input (e.g. Total of purchases); capital input (e.g. Material investments).

    All SBS characteristics are published on Eurostat’s website as tables; examples of the existing tables are presented below:

    • Annual enterprise statistics: Characteristics collected are published by country and detailed on NACE Rev 2 and NACE Rev 1.1 class level (4-digits). Some classes or groups in 'services' section have been aggregated.
    • Annual enterprise statistics broken down by size classes: Characteristics are published by country and detailed down to NACE Rev 2 and NACE Rev 1.1 group level (3-digits) and employment size class. For trade (NACE Rev 2 and NACE Rev 1.1 Section G) a supplementary breakdown by turnover size class is available.
    • Annual regional statistics: Four characteristics are published by NUTS-2 country region and detailed on NACE Rev 2 and NACE Rev 1.1 division level (2-digits) (but to group level (3-digits) for the trade section).

    More information on the contents of different tables: the detail level and breakdowns required starting with the reference year 2008 is defined in Commission Regulation N° 251/2009. For previous reference years it is included in Commission Regulations (EC) N° 2701/98 and amended by Commission Regulation N°1614/2002 and Commission Regulation N°1669/2003.

    Several important derived indicators are generated in the form of ratios of certain monetary characteristics or per head values. A list with the available derived indicators is available below in the Annex.

  13. FIRE1121: previous data tables

    • gov.uk
    Updated Oct 18, 2018
    Cite
    Home Office (2018). FIRE1121: previous data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire1121-previous-data-tables
    Explore at:
    Dataset updated
    Oct 18, 2018
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Home Office
    Description

    FIRE1121: Staff joining fire authorities, by fire and rescue authority, ethnicity and role (17 October 2024)

    FIRE1121: Staff joining fire authorities, by fire and rescue authority, ethnicity and role (19 October 2023) (MS Excel Spreadsheet, 568 KB): https://assets.publishing.service.gov.uk/media/670782963b919067bb482f33/fire-statistics-data-tables-fire1121-191023.xlsx

    FIRE1121: Staff joining fire authorities, by fire and rescue authority, ethnicity and role (20 October 2022) (MS Excel Spreadsheet, 583 KB): https://assets.publishing.service.gov.uk/media/652d3b15697260000dccf87a/fire-statistics-data-tables-fire1121-201022.xlsx

    FIRE1121: Staff joining fire authorities, by fire and rescue authority, ethnicity and role (21 October 2021) (MS Excel Spreadsheet, 449 KB): https://assets.publishing.service.gov.uk/media/634e809d8fa8f53463dcb9bb/fire-statistics-data-tables-fire1121-211021.xlsx

    FIRE1121: Staff joining fire authorities, by fire and rescue authority, ethnicity and role (22 October 2020) (MS Excel Spreadsheet, 349 KB): https://assets.publishing.service.gov.uk/media/616d82f28fa8f529840622a0/fire-statistics-data-tables-fire1121-221020.xlsx

    FIRE1121: Staff joining fire authorities, by fire and rescue authority, ethnicity and role (31 October 2019) (MS Excel Spreadsheet, 253 KB): https://assets.publishing.service.gov.uk/media/5f86b4d6d3bf7f633bd5225c/fire-statistics-data-tables-fire1121-311019.xlsx

    FIRE1121: Staff joining fire authorities, by fire and rescue authority, ethnicity and role (18 October 2018) (MS Excel Spreadsheet, 150 KB): https://assets.publishing.service.gov.uk/media/5db712c140f0b637a38efa9b/fire-statistics-data-tables-fire1121-181018.xlsx

    FIRE1121: Staff joining fire authorities, by fire and rescue authority, ethnicity and role (26 October 2017) (MS Excel Spreadsheet, 28.2 KB): https://assets.publishing.service.gov.uk/media/5bbcccd940f0b6384861138e/fire-statistics-data-tables-fire1121.xlsx

    Related content

    Fire statistics data tables
    Fire statistics guidance
    Fire statistics

  14. FIRE0601: previous data tables

    • gov.uk
    Updated Sep 6, 2018
    Cite
    Home Office (2018). FIRE0601: previous data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire0601-previous-data-tables
    Explore at:
    Dataset updated
    Sep 6, 2018
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Home Office
    Description

    FIRE0601: Primary fires by cause of fire and incident type (19 September 2029)

    FIRE0601: Primary fires in dwellings and other buildings by cause of fire (21 September 2023) (MS Excel Spreadsheet, 104 KB): https://assets.publishing.service.gov.uk/media/66e3ee36718edd81771316da/fire-statistics-data-tables-fire0601-210923.xlsx

    FIRE0601: Primary fires in dwellings and other buildings by cause of fire (29 September 2022) (MS Excel Spreadsheet, 45.1 KB): https://assets.publishing.service.gov.uk/media/650ac9aa27d43b001491c2b3/fire-statistics-data-tables-fire0601-290922.xlsx

    FIRE0601: Primary fires in dwellings and other buildings by cause of fire (30 September 2021) (MS Excel Spreadsheet, 53.3 KB): https://assets.publishing.service.gov.uk/media/633170b08fa8f51d21dbbf30/fire-statistics-data-tables-fire0601-300921.xlsx

    FIRE0601: Primary fires in dwellings and other buildings by cause of fire (1 October 2020) (MS Excel Spreadsheet, 44.2 KB): https://assets.publishing.service.gov.uk/media/6151abec8fa8f5610ab86301/fire-statistics-data-tables-fire0601-011020.xlsx

    FIRE0601: Primary fires in dwellings and other buildings by cause of fire (12 September 2019) (MS Excel Spreadsheet, 31.9 KB): https://assets.publishing.service.gov.uk/media/5f71db7d8fa8f5188883f29a/fire-statistics-data-tables-fire0601-120919.xlsx

    FIRE0601: Primary fires in dwellings and other buildings by cause of fire (6 September 2018) (MS Excel Spreadsheet, 34.9 KB): https://assets.publishing.service.gov.uk/media/5d762945ed915d08f7111e37/fire-statistics-data-tables-fire0601-060918.xlsx

    FIRE0601: Primary fires in dwellings and other buildings by cause of fire (12 October 2017) (MS Excel Spreadsheet, 43 KB): https://assets.publishing.service.gov.uk/media/5b8d3f7ee5274a0bdab54b2e/fire-statistics-data-tables-fire0601.xlsx

    Related content

    Fire statistics data tables
    Fire statistics guidance
    Fire statistics

  15. Immigration statistics data tables, year ending December 2021

    • gov.uk
    Updated Feb 24, 2022
    Cite
    Home Office (2022). Immigration statistics data tables, year ending December 2021 [Dataset]. https://www.gov.uk/government/statistical-data-sets/immigration-statistics-data-tables-year-ending-december-2021
    Explore at:
    Dataset updated
    Feb 24, 2022
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Home Office
    Description

    The Home Office has changed the format of the published data tables for a number of areas (asylum and resettlement, entry clearance visas, extensions, citizenship, returns, detention, and sponsorship). These now include summary tables, and more detailed datasets (available on a separate page, link below). A list of all available datasets on a given topic can be found in the ‘Contents’ sheet in the ‘summary’ tables. Information on where to find historic data in the ‘old’ format is in the ‘Notes’ page of the ‘summary’ tables.

    The Home Office intends to make these changes in other areas in the coming publications. If you have any feedback, please email MigrationStatsEnquiries@homeoffice.gov.uk.

    Related content

    Immigration statistics, year ending December 2021
    Immigration Statistics Quarterly Release
    Immigration Statistics User Guide
    Publishing detailed data tables in migration statistics
    Policy and legislative changes affecting migration to the UK: timeline
    Immigration statistics data archives

    Asylum and resettlement

    Asylum and resettlement summary tables, year ending December 2021 (ODS, 79.8 KB): https://assets.publishing.service.gov.uk/media/620f7790d3bf7f4f0981a13b/asylum-summary-dec-2021-tables.ods

    Detailed asylum and resettlement datasets

    Sponsorship

    Sponsorship summary tables, year ending December 2021 (ODS, 45.8 KB): https://assets.publishing.service.gov.uk/media/620baaef8fa8f54911e2213d/sponsorship-summary-dec-2021-tables.ods

    Detailed sponsorship datasets

    Entry clearance visas granted outside the UK

    Entry clearance visas summary tables, year ending December 2021 (ODS, 50.7 KB): https://assets.publishing.service.gov.uk/media/620d09bcd3bf7f4f0743db21/visas-summary-dec-2021-tables.ods

    Detailed entry clearance visas datasets

    Passenger arrivals (admissions)

    Passenger arrivals (admissions) summary tables, year ending December 2021 (ODS, 38.1 KB): https://assets.publishing.service.gov.uk/media/620a6c40d3bf7f4f0adec6eb/passenger-arrivals-admissions-summary-dec-2021-tables.ods

    Detailed Passengers initially refused entry at port datasets

    Extensions

    <a class="govuk-link" href="https://assets.publishing.service.gov.uk/media/620a7995e90e0710abe648c1/extentions-summary-dec-2021-tables

  16. Number of students in Ivy League schools in Class of 2028

    • statista.com
    Updated Nov 28, 2025
    Cite
    Statista (2025). Number of students in Ivy League schools in Class of 2028 [Dataset]. https://www.statista.com/statistics/941545/ivy-league-students-class/
    Explore at:
    Dataset updated
    Nov 28, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    The number of students starting in Ivy League schools for the Class of 2028 (those beginning in the Fall of 2024), varied from school to school. Cornell University had the largest Class of 2028 among the Ivy League schools, with ***** enrolled students.

  17. Data from: Distributed Anomaly Detection using 1-class SVM for Vertically...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Apr 11, 2025
    Cite
    Dashlink (2025). Distributed Anomaly Detection using 1-class SVM for Vertically Partitioned Data [Dataset]. https://catalog.data.gov/dataset/distributed-anomaly-detection-using-1-class-svm-for-vertically-partitioned-data
    Explore at:
    Dataset updated
    Apr 11, 2025
    Dataset provided by
    Dashlink
    Description

    There has been a tremendous increase in the volume of sensor data collected over the last decade for different monitoring tasks. For example, petabytes of earth science data are collected from modern satellites, in-situ sensors and different climate models. Similarly, huge amounts of flight operational data are downloaded for different commercial airlines. These different types of datasets need to be analyzed to find outliers. Information extraction from such rich data sources using advanced data mining methodologies is a challenging task, not only due to the massive volume of data, but also because these datasets are physically stored at different geographical locations with only a subset of features available at any location. Moving these petabytes of data to a single location may waste a lot of bandwidth. To solve this problem, in this paper, we present a novel algorithm which can identify outliers in the entire data without moving all the data to a single location. The method we propose only centralizes a very small sample from the different data subsets at different locations. We analytically prove and experimentally verify that the algorithm offers high accuracy compared to complete centralization with only a fraction of the communication cost. We show that our algorithm is highly relevant to both earth sciences and aeronautics by describing applications in these domains. The performance of the algorithm is demonstrated on two large publicly available datasets: (1) the NASA MODIS satellite images and (2) a simulated aviation dataset generated by the ‘Commercial Modular Aero-Propulsion System Simulation’ (CMAPSS).

  18. Evaluation through follow-up - pupils born in 1953

    • researchdata.se
    Updated Aug 15, 2024
    Cite
    Kjell Härnqvist; Sven-Erik Reuterberg; Allan Svensson; Airi Rovio-Johansson (2024). Evaluation through follow-up - pupils born in 1953 [Dataset]. https://researchdata.se/en/catalogue/dataset/snd0480-2
    Explore at:
    Dataset updated
    Aug 15, 2024
    Dataset provided by
    University of Gothenburg
    Authors
    Kjell Härnqvist; Sven-Erik Reuterberg; Allan Svensson; Airi Rovio-Johansson
    Time period covered
    1966 - 1973
    Area covered
    Sweden
    Description

    Since the beginning of the 1960s, Statistics Sweden, in collaboration with various research institutions, has carried out follow-up surveys in the school system. These surveys have taken place within the framework of the IS project (Individual Statistics Project) at the University of Gothenburg and the UGU project (Evaluation through follow-up of students) at the University of Teacher Education in Stockholm, which since 1990 have been merged into a research project called 'Evaluation through Follow-up'. The follow-up surveys are part of the central evaluation of the school and are based on large nationally representative samples from different cohorts of students.

    Evaluation through follow-up (UGU) is one of the country's largest research databases in the field of education. UGU is part of the central evaluation of the school and is based on large nationally representative samples from different cohorts of students. The longitudinal database contains information on nationally representative samples of school pupils from ten cohorts, born between 1948 and 2004. The sampling process was based on the student's birthday for the first two and on the school class for the other cohorts.

    For each cohort, data of mainly two types are collected. School administrative data is collected annually by Statistics Sweden during the time that pupils are in the general school system (primary and secondary school), for most cohorts starting in compulsory school year 3. This information is provided by the school offices and, among other things, includes characteristics of school, class, special support, study choices and grades. Information obtained has varied somewhat, e.g. due to changes in curricula. A more detailed description of this data collection can be found in reports published by Statistics Sweden and linked to datasets for each cohort.

    Survey data from the pupils are collected for the first time in compulsory school year 6 (for most cohorts). The year 6 questionnaire includes questions related to self-perception and interest in learning, attitudes to school, hobbies, school motivation and future plans. For some cohorts, questionnaire data are also collected in year 3 and year 9 in compulsory school and in upper secondary school.

    Furthermore, results from various intelligence tests and standardized knowledge tests are included in the year 6 data collection. The intelligence tests have been identical for all cohorts (except the cohort born in 1987, from which questionnaire data were first collected in year 9). The intelligence test consists of a verbal, a spatial and an inductive test, each containing 40 tasks and specially designed for the UGU project. The verbal test is a vocabulary test of the opposites type. The spatial test is a so-called ‘sheet metal folding test’ and the inductive test is made up of number series. The reliability of the test, intercorrelations and connection with school grades are reported by Svensson (1971).

    For the first three cohorts (1948, 1953 and 1967), the standardized knowledge tests in year 6 consist of the standard tests in Swedish, mathematics and English that, up to and including the beginning of the 1980s, were offered to all pupils in compulsory school year 6. For the 1972 cohort, specially prepared tests in reading and mathematics were used. The reading test consists of 27 tasks and aimed to identify students with reading difficulties. The mathematics test, which was also offered to the fifth cohort (1977), includes 19 assignments. A revised version of the test, introduced because the earlier test was judged to be somewhat too simple, was used for the cohort born in 1982. Results on the mathematics test are not available for the 1987 cohort. The mathematics test was not offered to the students in the 1992 cohort, as the test did not seem to fully correspond with current curriculum intentions in mathematics. For further information, see the description of the dataset for each cohort.

    For several of the samples, questionnaires were also collected from the students' parents and teachers in year 6. The teacher questionnaire contains questions about the teacher, class size and composition, the teacher's assessments of the class's knowledge level, etc., school resources, working methods and parental involvement, and questions about the existence of evaluations. The questionnaire for the guardians includes questions about the child's upbringing conditions, ambitions and wishes regarding the child's education, views on the school's objectives and the parents' own educational and professional situation.

    The students are followed up even after they have left primary school. Among other things, data are collected during the time they are in high school. School administrative data, such as choice of upper secondary school line / program and grades after completing studies, are then collected. For some of the cohorts, in addition to school administrative data, questionnaire data were also collected from the students.

    The sample consisted of students born on the 5th, 15th and 25th of any month in 1953, a total of 10,723 students.

    The data obtained in 1966 were: 1. School administrative data (school form, class type, year and grades). 2. Information about the parents' profession and education, number of siblings, the distance between home and school, etc.

    This information was collected for 93% of all pupils born on the days in question. The reason for this is reduced resources for Statistics Sweden for follow-up work - reminders etc. Annual data for the 1953 cohort were collected by Statistics Sweden up to and including academic year 1972/73.

    1. Answers to certain questions that shed light on students' school motivation, leisure activities and study and career plans. Some of the questions changed significantly compared to the cohort in 1948 due to the fact that they did not function satisfactorily from a metrological point of view.
    2. Results on three aptitude tests, one verbal, one spatial and one inductive.
    3. Standard test results in reading, writing, mathematics and English, which were offered to the students who belonged to year 6.

    The response rate for test and questionnaire data is 88%. Standard test results were received for just over 85% of those who took the tests.

    The sample included a total of 9955 students, for whom some form of information was obtained.
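
    The sampling rule and the coverage figure quoted above can be checked with a short sketch. It assumes that the 9955 pupils are those for whom any information was obtained out of the 10,723 sampled; the function names are illustrative and not part of the dataset.

```python
from datetime import date

# Birth days of the month that define the sample, as stated in the description.
SAMPLE_DAYS = {5, 15, 25}


def in_birthday_sample(birth: date, cohort_year: int = 1953) -> bool:
    """Return True if a pupil born on `birth` falls in the cohort sample."""
    return birth.year == cohort_year and birth.day in SAMPLE_DAYS


def coverage(obtained: int, sampled: int) -> float:
    """Share of the sampled pupils for whom some information was obtained."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    return obtained / sampled


# Figures taken from the description: 9955 pupils with data out of 10,723 sampled.
print(round(coverage(9955, 10723), 2))
print(in_birthday_sample(date(1953, 3, 15)))
```

    Under that assumption the ratio rounds to the 93% mentioned in the description.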

    Part of the "Individual Statistics Project" together with cohort 1953.

  19. Data from: Sparse Solutions for Single Class SVMs: A Bi-Criterion Approach

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 14, 2025
    Cite
    Dashlink (2025). Sparse Solutions for Single Class SVMs: A Bi-Criterion Approach [Dataset]. https://catalog.data.gov/dataset/sparse-solutions-for-single-class-svms-a-bi-criterion-approach
    Explore at:
    Dataset updated
    Nov 14, 2025
    Dataset provided by
    Dashlink
    Description

    In this paper we propose an innovative learning algorithm, a variation of the one-class ν Support Vector Machines (SVMs) learning algorithm, to produce sparser solutions with much reduced computational complexity. The proposed technique returns an approximate solution, nearly as good as the solution set obtained by the classical approach, by minimizing the original risk function along with a regularization term. We introduce a bi-criterion optimization that helps guide the search towards the optimal set in much reduced time. The outcome of the proposed learning technique was compared with the benchmark one-class Support Vector Machines algorithm, which more often leads to solutions with redundant support vectors. Throughout the analysis, the problem size for both optimization routines was kept consistent. We have tested the proposed algorithm on a variety of data sources under different conditions to demonstrate its effectiveness. In all cases the proposed algorithm closely preserves the accuracy of standard one-class ν SVMs while reducing both training time and test time by several factors.

  20. Graduate labour market statistics - Graduate Employment Rates by Degree...

    • explore-education-statistics.service.gov.uk
    Updated Jun 29, 2023
    Cite
    Department for Education (2023). Graduate labour market statistics - Graduate Employment Rates by Degree Class [Dataset]. https://explore-education-statistics.service.gov.uk/data-catalogue/data-set/184d580c-8d67-4ea9-852e-b21c2da4fba5
    Explore at:
    Dataset updated
    Jun 29, 2023
    Dataset authored and provided by
    Department for Education (https://gov.uk/dfe)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Time period covered
    2022
    Description

    Graduate employment rates by degree class in 2022

Cite
F. (Fabiano) Dalpiaz (2020). UC_vs_US Statistic Analysis.xlsx [Dataset]. http://doi.org/10.23644/uu.12631628.v1

UC_vs_US Statistic Analysis.xlsx

Explore at:
Available download formats: xlsx
Dataset updated
Jul 9, 2020
Dataset provided by
Utrecht University
Authors
F. (Fabiano) Dalpiaz
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Sheet 1 (Raw-Data): The raw data of the study, presenting the tagging results for the measures described in the paper. For each subject, it includes the following columns:
A. a sequential student ID
B. an ID that defines a random group label and the notation
C. the notation used: User Stories or Use Cases
D. the case they were assigned to: IFA, Sim, or Hos
E. the subject's exam grade (total points out of 100); empty cells mean that the subject did not take the first exam
F. a categorical representation of the grade (L/M/H), where H is greater than or equal to 80, M is from 65 (included) to 80 (excluded), and L otherwise
G. the total number of classes in the student's conceptual model
H. the total number of relationships in the student's conceptual model
I. the total number of classes in the expert's conceptual model
J. the total number of relationships in the expert's conceptual model
K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, missing (see tagging scheme below)
P. the researchers' judgement on how well the derivation process was explained by the student: well explained (a systematic mapping that can be easily reproduced), partially explained (vague indication of the mapping), or not present.
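
The grade thresholds given for column F can be written as a small function. This is a minimal sketch; the function name is hypothetical, and treating an empty cell as None is an assumption about how the spreadsheet would be loaded.

```python
from typing import Optional


def grade_category(points: Optional[float]) -> Optional[str]:
    """Map an exam grade (0-100) to L, M or H using the column F thresholds."""
    if points is None:
        return None  # empty cell: the subject did not take the first exam
    if points >= 80:
        return "H"
    if points >= 65:
        return "M"  # 65 included, 80 excluded
    return "L"


print([grade_category(p) for p in (92, 80, 79.5, 65, 40, None)])
```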

Tagging scheme:
Aligned (AL) - A concept is represented as a class in both models, either with the same name or using synonyms or clearly linkable names;
Wrongly represented (WR) - A class in the domain expert model is incorrectly represented in the student model, either (i) via an attribute, method, or relationship rather than class, or (ii) using a generic term (e.g., "user" instead of "urban planner");
System-oriented (SO) - A class in CM-Stud that denotes a technical implementation aspect, e.g., access control. Classes that represent legacy system or the system under design (portal, simulator) are legitimate;
Omitted (OM) - A class in CM-Expert that does not appear in any way in CM-Stud;
Missing (MI) - A class in CM-Stud that does not appear in any way in CM-Expert.

All the calculations and information provided in the following sheets originate from that raw data.

Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection, including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.

Sheet 3 (Size-Ratio):

The size ratio is calculated as the number of classes within the student model divided by the number of classes within the expert model. We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus in this study is on the number of classes. However, we also provide the size ratio for the number of relationships between student and expert model.
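
As a minimal sketch of the size ratio, the helper below divides a student count by the corresponding expert count. The function name and the sample numbers are hypothetical; the same helper applies to classes (columns G and I) and to relationships (columns H and J).

```python
def size_ratio(student_count: int, expert_count: int) -> float:
    """Student model size divided by expert model size."""
    if expert_count <= 0:
        raise ValueError("expert_count must be positive")
    return student_count / expert_count


# Hypothetical subject: 12 classes against 16 in the expert model.
print(size_ratio(12, 16))
```

A value below 1 means the student model is smaller than the expert model, and a value above 1 means it is larger.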

Sheet 4 (Overall):

Provides an overview of all subjects regarding the encountered situations, completeness, and correctness, respectively. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model. It is calculated by dividing the number of aligned concepts (AL) by the sum of the number of aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model. Completeness is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of the number of aligned concepts (AL), wrong representations (WR) and omitted concepts (OM). The overview is complemented with general diverging stacked bar charts that illustrate correctness and completeness.
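
The two definitions above translate directly into code. The sketch below follows the formulas exactly as worded in this description; the function names and the example counts are hypothetical.

```python
def correctness(al: int, wr: int, so: int, om: int) -> float:
    """AL / (AL + OM + SO + WR), as defined in the description."""
    total = al + om + so + wr
    if total == 0:
        raise ValueError("no tagged concepts")
    return al / total


def completeness(al: int, wr: int, om: int) -> float:
    """(AL + WR) / (AL + WR + OM), as defined in the description."""
    total = al + wr + om
    if total == 0:
        raise ValueError("no tagged concepts")
    return (al + wr) / total


# Hypothetical subject: 6 aligned, 2 wrongly represented, 1 system-oriented, 1 omitted.
print(correctness(al=6, wr=2, so=1, om=1))
print(completeness(al=6, wr=2, om=1))
```

Missing concepts (MI) do not enter either formula as stated, so the sketch does not take them as input.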

For sheet 4 as well as for the following four sheets, diverging stacked bar charts are provided to visualize the effect of each of the independent and mediated variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (T-test) and effect size (Hedges' g) for both completeness and correctness are provided. Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html. The independent and moderating variables can be found as follows:

Sheet 5 (By-Notation):

Model correctness and model completeness are compared by notation - UC, US.

Sheet 6 (By-Case):

Model correctness and model completeness are compared by case - SIM, HOS, IFA.

Sheet 7 (By-Process):

Model correctness and model completeness are compared by how well the derivation process is explained - well explained, partially explained, not present.

Sheet 8 (By-Grade):

Model correctness and model completeness are compared by the exam grades, converted to the categorical values High, Medium, and Low.
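
The effect size reported at the bottom of sheets 4 to 8 was obtained with an online tool. The sketch below uses the textbook form of Hedges' g (Cohen's d with a pooled standard deviation, multiplied by the usual small-sample correction). It is an assumption that the tool uses the same formula, and the scores are made up for illustration.

```python
import math
import statistics
from typing import Sequence


def hedges_g(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Bias-corrected standardized mean difference between two groups."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # Pooled variance from the two sample variances.
    pooled_var = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")
    cohens_d = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)
    # Small-sample correction factor applied to Cohen's d.
    correction = 1 - 3 / (4 * (n_a + n_b) - 9)
    return cohens_d * correction


# Hypothetical completeness scores for the two notation groups.
uc_scores = [0.70, 0.80, 0.75, 0.90, 0.85]
us_scores = [0.60, 0.65, 0.70, 0.80, 0.75]
print(round(hedges_g(uc_scores, us_scores), 3))
```

A positive value means the first group scored higher on average; the sign flips if the arguments are swapped.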
