100+ datasets found
  1. Data quality and methodology (TSM 2024)

    • gov.uk
    Updated Nov 26, 2024
    Cite
    Regulator of Social Housing (2024). Data quality and methodology (TSM 2024) [Dataset]. https://www.gov.uk/government/statistics/data-quality-and-methodology-tsm-2024
    Explore at:
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Regulator of Social Housing
    Description

    Contents

    Introduction

    This report describes the quality assurance arrangements for the registered provider (RP) Tenant Satisfaction Measures statistics, providing more detail on the regulatory and operational context for data collections which feed these statistics and the safeguards that aim to maximise data quality.

    Background

    The statistics we publish are based on data collected directly from local authority registered providers (LARPs) and from private registered providers (PRPs) through the Tenant Satisfaction Measures (TSM) return. We use the data collected through these returns extensively as a source of administrative data. The United Kingdom Statistics Authority (UKSA) encourages public bodies to use administrative data for statistical purposes and, as such, we publish these data.

    These data are first being published in 2024, following the first collection and publication of the TSM.

    Official Statistics in development status

    In February 2018, the UKSA published the Code of Practice for Statistics. This sets standards for organisations producing and publishing statistics, ensuring quality, trustworthiness and value.

    These statistics are drawn from our TSM data collection and are being published for the first time in 2024 as official statistics in development.

    Official statistics in development are official statistics that are undergoing development. Over the next year we will review these statistics and consider areas for improvement to guidance, validations, data processing and analysis. We will also seek user feedback with a view to improving these statistics to meet user needs and to explore issues of data quality and consistency.

    Change of designation name

    Until September 2023, ‘official statistics in development’ were called ‘experimental statistics’. Further information can be found on the Office for Statistics Regulation website: https://www.ons.gov.uk/methodology/methodologytopicsandstatisticalconcepts/guidetoofficialstatisticsindevelopment

    User feedback

    We are keen to improve understanding of these data, including their accuracy, reliability, and value to users. Please complete the feedback form (https://forms.office.com/e/cetNnYkHfL) or email feedback, including suggestions for improvements or queries about the source data or processing, to enquiries@rsh.gov.uk.

    Publication schedule

    We intend to publish these statistics in Autumn each year, with the data pre-announced in the release calendar.

    All data and additional information (including a list of individuals (if any) with 24 hour pre-release access) are published on our statistics pages.

    Quality assurance of administrative data

    The data used in the production of these statistics are classed as administrative data. In 2015 the UKSA published a regulatory standard for the quality assurance of administrative data. As part of our compliance with the Code of Practice, and in the context of other statistics published by the UK Government and its agencies, we have determined that the statistics drawn from the TSMs are likely to be categorised as low quality risk – medium public interest (with a requirement for basic/enhanced assurance).

    The publication of these statistics can be considered as medium publi

  2. Data from: DATA QUALITY ON THE WEB: INTEGRATIVE REVIEW OF PUBLICATION...

    • scielo.figshare.com
    tiff
    Updated May 30, 2023
    Cite
    Morgana Carneiro de Andrade; Maria José Baños Moreno; Juan-Antonio Pastor-Sánchez (2023). DATA QUALITY ON THE WEB: INTEGRATIVE REVIEW OF PUBLICATION GUIDELINES [Dataset]. http://doi.org/10.6084/m9.figshare.22815541.v1
    Explore at:
    Available download formats: tiff
    Dataset updated
    May 30, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Morgana Carneiro de Andrade; Maria José Baños Moreno; Juan-Antonio Pastor-Sánchez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT The exponential increase of published data and the diversity of systems require the adoption of good practices to achieve quality indexes that enable discovery, access, and reuse. To identify good practices, an integrative review was used, along with procedures from the ProKnow-C methodology. After applying the ProKnow-C procedures to the documents retrieved from the Web of Science, Scopus, and Library, Information Science & Technology Abstracts databases, an analysis of 31 items was performed. This analysis showed that over the last 20 years the guidelines for publishing open government data had a great impact on implementation of the Linked Data model in several domains, and that the FAIR principles and the Data on the Web Best Practices are currently the most prominent in the literature. These guidelines offer direction on various aspects of data publication, contributing to the optimization of quality independent of the context in which they are applied. The CARE and FACT principles, on the other hand, although not formulated with the same objective as FAIR and the Best Practices, pose great challenges for information and technology scientists regarding ethics, responsibility, confidentiality, impartiality, security, and transparency of data.

  3. data-quality-assessment-datasets

    • kaggle.com
    zip
    Updated Dec 23, 2022
    Cite
    shamiul islam shifat (2022). data-quality-assessment-datasets [Dataset]. https://www.kaggle.com/datasets/shamiulislamshifat/dataqualityassessmentdatasets
    Explore at:
    Available download formats: zip (407602 bytes)
    Dataset updated
    Dec 23, 2022
    Authors
    shamiul islam shifat
    Description

    Dataset

    This dataset was created by shamiul islam shifat

    Contents

  4. Data from: Questions and responses to USGS-wide poll on quality assurance...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Nov 27, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Questions and responses to USGS-wide poll on quality assurance practices for timeseries data, 2021 [Dataset]. https://catalog.data.gov/dataset/questions-and-responses-to-usgs-wide-poll-on-quality-assurance-practices-for-timeseries-da
    Explore at:
    Dataset updated
    Nov 27, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    This data record contains questions and responses to a USGS-wide survey conducted to identify issues and needs associated with quality assurance and quality control (QA/QC) of USGS timeseries data streams. This research was funded by the USGS Community for Data Integration as part of a project titled “From reactive- to condition-based maintenance: Artificial intelligence for anomaly predictions and operational decision-making”. The poll targeted monitoring network managers and technicians and asked questions about operational data streams and timeseries data collection in order to identify opportunities to streamline data access, expedite the response to data quality issues, improve QA/QC procedures, reduce operations costs, and uncover other maintenance needs. The poll was created using an online survey platform. It was sent to 2326 systematically selected USGS email addresses and received 175 responses in 11 days before it was closed to respondents. The poll contained 48 questions of various types, including long answer, multiple choice, and ranking questions. The survey contained a mix of mandatory and optional questions. These distinctions, as well as full descriptions of the survey questions, are noted in the metadata.

  5. Data from: Statistical Process Control as a Tool for Quality Improvement A...

    • figshare.com
    docx
    Updated Feb 23, 2023
    Cite
    Canberk Elmalı; Özge Ural (2023). Statistical Process Control as a Tool for Quality Improvement A Case Study in Denim Pant Production [Dataset]. http://doi.org/10.6084/m9.figshare.22147508.v2
    Explore at:
    Available download formats: docx
    Dataset updated
    Feb 23, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Canberk Elmalı; Özge Ural
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this paper, the concept of Statistical Process Control (SPC) tools is thoroughly examined and definitions of quality control concepts are presented. This is significant because it is anticipated that this study will contribute to the literature as an exemplary application demonstrating the role of SPC tools in quality improvement during the evaluation and decision-making phase.

    The aim of this study is to investigate applications of quality control, to clarify statistical control methods and problem-solving procedures, to generate proposals for problem-solving approaches, and to disseminate improvement studies in the ready-to-wear industry. Using basic SPC tools, the most frequently recurring faults were detected and divided into sub-headings for more detailed analysis. In this way, repetition of faults was prevented by addressing the root causes of each detected fault. With this perspective, the study is expected to contribute to other fields as well.

    We give consent for the publication of identifiable details, which can include photograph(s), case history, and details within the text (“Material”), to be published in the Journal of Quality Technology. We confirm that we have seen and been given the opportunity to read both the Material and the Article (as attached) to be published by Taylor & Francis.

  6. Data from: Data quality assurance and quality control measures in large...

    • data.virginia.gov
    • catalog.data.gov
    html
    Updated Sep 6, 2025
    + more versions
    Cite
    National Institutes of Health (2025). Data quality assurance and quality control measures in large multicenter stroke trials: the African-American Antiplatelet Stroke Prevention Study experience [Dataset]. https://data.virginia.gov/dataset/data-quality-assurance-and-quality-control-measures-in-large-multicenter-stroke-trials-the-afri
    Explore at:
    Available download formats: html
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Data quality assurance and quality control are critical to the effective conduct of a clinical trial. In the present commentary, we discuss our experience in a large, multicenter stroke trial. In addition to standard data quality control techniques, we have developed novel methods to enhance the entire process. Central to our methods is the use of clinical monitors who are trained in the techniques of data monitoring.

  7. Replication Data for: Quality control and correction method for air...

    • rdr.kuleuven.be
    bin +2
    Updated Feb 19, 2025
    Cite
    Eva Beele; Eva Beele; Maarten Reyniers; Maarten Reyniers; Raf Aerts; Raf Aerts; Ben Somers; Ben Somers (2025). Replication Data for: Quality control and correction method for air temperature data from a citizen science weather station network in Leuven, Belgium [Dataset]. http://doi.org/10.48804/SSRN3F
    Explore at:
    Available download formats: bin(16164), bin(58376), text/comma-separated-values(154963594), text/comma-separated-values(151156007), text/comma-separated-values(147784385), text/comma-separated-values(159878194), text/comma-separated-values(161578484), bin(10949), bin(24757), bin(33310), bin(10604), txt(1257), text/comma-separated-values(120773313), text/comma-separated-values(142177765), text/comma-separated-values(155399626), text/comma-separated-values(155703873), text/comma-separated-values(146828198), text/comma-separated-values(100874809), text/comma-separated-values(161480206), text/comma-separated-values(143072791), text/comma-separated-values(119230114), text/comma-separated-values(168289190), text/comma-separated-values(162099130), text/comma-separated-values(149774828), text/comma-separated-values(8338), txt(13712), text/comma-separated-values(150623331), text/comma-separated-values(163761533), text/comma-separated-values(163490322), text/comma-separated-values(130717796), text/comma-separated-values(154512726)
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    KU Leuven RDR
    Authors
    Eva Beele; Eva Beele; Maarten Reyniers; Maarten Reyniers; Raf Aerts; Raf Aerts; Ben Somers; Ben Somers
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Time period covered
    Jul 1, 2019 - Dec 31, 2024
    Area covered
    Leuven, Belgium
    Description

    This dataset presents crowdsourced data from the Leuven.cool network, a citizen science network of around 100 low-cost weather stations (Fine Offset WH2600) distributed across Leuven, Belgium. The data were quality controlled and corrected by a newly developed station-specific temperature quality control (QC) and correction procedure. The procedure consists of three levels that remove implausible measurements, while also correcting for inter-station (between stations) and intra-station (station-specific) temperature biases by means of a random-forest approach.
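
    The first level of such a procedure, screening out implausible readings, can be sketched as follows. This is a minimal illustration only: the function name and thresholds are hypothetical, not taken from the published QC procedure.

```python
import numpy as np

def remove_implausible(temps, lower=-20.0, upper=45.0, max_step=5.0):
    """Flag implausible air-temperature readings (hypothetical thresholds):
    values outside a gross plausibility range, or jumps between consecutive
    readings larger than max_step degrees C. Flagged values become NaN."""
    temps = np.asarray(temps, dtype=float)
    flags = (temps < lower) | (temps > upper)
    # Jump check on the raw series; the first reading has a zero step.
    step = np.abs(np.diff(temps, prepend=temps[0]))
    flags |= step > max_step
    cleaned = temps.copy()
    cleaned[flags] = np.nan
    return cleaned, flags
```

    A real implementation would follow this with the inter- and intra-station bias correction the authors describe; the random-forest step is omitted here.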

  8. Manufacturing Defects

    • kaggle.com
    zip
    Updated Jul 1, 2024
    Cite
    Fahmida (2024). Manufacturing Defects [Dataset]. https://www.kaggle.com/datasets/fahmidachowdhury/manufacturing-defects
    Explore at:
    Available download formats: zip (13320 bytes)
    Dataset updated
    Jul 1, 2024
    Authors
    Fahmida
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains simulated data related to manufacturing defects observed during quality control processes. It includes information such as defect type, detection date, location within the product, severity level, inspection method used, and repair costs. This dataset can be used for analyzing defect patterns, improving quality control processes, and assessing the impact of defects on product quality and production costs.

    Columns:
    - defect_id: Unique identifier for each defect.
    - product_id: Identifier for the product associated with the defect.
    - defect_type: Type or category of the defect (e.g., cosmetic, functional, structural).
    - defect_description: Description of the defect.
    - defect_date: Date when the defect was detected.
    - defect_location: Location within the product where the defect was found (e.g., surface, component).
    - severity: Severity level of the defect (e.g., minor, moderate, critical).
    - inspection_method: Method used to detect the defect (e.g., visual inspection, automated testing).
    - repair_action: Action taken to repair or address the defect.
    - repair_cost: Cost incurred to repair the defect (in local currency).

    Potential uses:
    - Quality Control Analysis: Analyze defect patterns and trends in manufacturing processes.
    - Process Improvement: Identify areas for process optimization to reduce defect rates.
    - Cost Analysis: Evaluate the financial impact of defects on production costs and profitability.
    - Product Quality Assurance: Enhance product quality assurance strategies based on defect data analysis.

    This dataset is entirely synthetic and generated for educational and research purposes. It can be a valuable resource for manufacturing engineers, quality assurance professionals, and researchers interested in defect analysis and quality control.
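
    As a toy illustration of the cost-analysis use, the documented columns can be aggregated like this. The rows are made-up sample values, not taken from the dataset itself:

```python
from collections import defaultdict

# Toy rows using a subset of the documented columns (values are invented).
rows = [
    {"defect_id": 1, "defect_type": "cosmetic",   "severity": "minor",    "repair_cost": 20.0},
    {"defect_id": 2, "defect_type": "functional", "severity": "critical", "repair_cost": 350.0},
    {"defect_id": 3, "defect_type": "cosmetic",   "severity": "minor",    "repair_cost": 15.0},
    {"defect_id": 4, "defect_type": "structural", "severity": "moderate", "repair_cost": 120.0},
]

def cost_by_type(rows):
    """Total repair cost per defect_type -- a typical first QC analysis."""
    totals = defaultdict(float)
    for r in rows:
        totals[r["defect_type"]] += r["repair_cost"]
    return dict(totals)
```

    The same grouping pattern extends to severity or inspection_method for the other analyses listed above.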

  9. Data quality assurance at research data repositories: Survey data

    • data.niaid.nih.gov
    Updated Jul 16, 2024
    Cite
    Kindling, Maxi; Strecker, Dorothea; Wang, Yi (2024). Data quality assurance at research data repositories: Survey data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6457848
    Explore at:
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Berlin School of Library and Information Science, Humboldt-Universität zu Berlin
    Authors
    Kindling, Maxi; Strecker, Dorothea; Wang, Yi
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset documents findings from a survey on the status quo of data quality assurance practices at research data repositories.

    The personalized online survey was conducted among repositories indexed in re3data in 2021. It covered the scope of the repository, types of data quality assessment, quality criteria, responsibilities, details of the review process, and data quality information, and yielded 332 complete responses.

    The dataset comprises a documentation file, the data file, a codebook, and the survey instrument.

    The documentation file (documentation.pdf) outlines details of the survey design and administration, survey response, and data processing. The data file (01_survey_data.csv) contains all 332 complete responses to 19 survey questions, fully anonymized. The codebook (02_codebook.csv) describes the variables, and the survey instrument (03_survey_instrument.pdf) comprises the questionnaire that was distributed to survey participants.

  10. Data from: Assessment of positional accuracy in spatial data using...

    • scielo.figshare.com
    png
    Updated Jun 5, 2023
    Cite
    Afonso de Paula dos Santos; Dalto Domingos Rodrigues; Nerilson Terra Santos; Joel Gripp Junior (2023). Assessment of positional accuracy in spatial data using techniques of spatial statistics: proposal of a method and an example using the Brazilian standard [Dataset]. http://doi.org/10.6084/m9.figshare.14327671.v1
    Explore at:
    Available download formats: png
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    SciELO (http://www.scielo.org/)
    Authors
    Afonso de Paula dos Santos; Dalto Domingos Rodrigues; Nerilson Terra Santos; Joel Gripp Junior
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper presents the importance of simple spatial statistics techniques applied to the positional quality control of spatial data. To this end, methods for analyzing the spatial distribution pattern of point data are presented, as well as bias analysis of the positional discrepancy samples. To evaluate the spatial distribution of points, the Nearest Neighbor method and Ripley's K function were used. For bias analysis, the mean directional vectors of the discrepancies and the circular variance were used. A methodology for positional quality control of spatial data is proposed, which includes sampling planning and evaluation of its spatial distribution pattern, analysis of data normality through the application of bias tests, and positional accuracy classification according to a standard. For the practical experiment, an orthoimage generated from a PRISM scene of the ALOS satellite was evaluated. Results showed that the orthoimage is accurate at a scale of 1:25,000, being classified as Class A according to the Brazilian positional accuracy standard, with no bias in the coordinates. The main contribution of this work is the incorporation of spatial statistics techniques into cartographic quality control.
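
    Nearest-neighbour analysis of a point pattern is commonly summarized by the Clark-Evans ratio; a minimal sketch is below. The function name and interface are illustrative, not taken from the paper:

```python
import math

def nn_ratio(points, area):
    """Clark-Evans nearest-neighbour ratio: observed mean nearest-neighbour
    distance divided by the value expected under complete spatial
    randomness, 0.5 / sqrt(n / area).
    R close to 1 suggests randomness, R < 1 clustering, R > 1 dispersion."""
    n = len(points)
    observed = []
    for i, (xi, yi) in enumerate(points):
        d = min(math.hypot(xi - xj, yi - yj)
                for j, (xj, yj) in enumerate(points) if j != i)
        observed.append(d)
    expected = 0.5 / math.sqrt(n / area)
    return (sum(observed) / n) / expected
```

    For a regular grid of points the ratio exceeds 1 (dispersed), which is the kind of check the proposed methodology applies to the discrepancy sample before classifying accuracy.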

  11. Hydroinformatics Instruction Module Example Code: Sensor Data Quality...

    • hydroshare.org
    • beta.hydroshare.org
    • +1more
    zip
    Updated Mar 3, 2022
    Cite
    Amber Spackman Jones (2022). Hydroinformatics Instruction Module Example Code: Sensor Data Quality Control with pyhydroqc [Dataset]. https://www.hydroshare.org/resource/451c4f9697654b1682d87ee619cd7924
    Explore at:
    Available download formats: zip (159.5 MB)
    Dataset updated
    Mar 3, 2022
    Dataset provided by
    HydroShare
    Authors
    Amber Spackman Jones
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This resource contains Jupyter Notebooks with examples for conducting quality control post processing for in situ aquatic sensor data. The code uses the Python pyhydroqc package. The resource is part of set of materials for hydroinformatics and water data science instruction. Complete learning module materials are found in HydroLearn: Jones, A.S., Horsburgh, J.S., Bastidas Pacheco, C.J. (2022). Hydroinformatics and Water Data Science. HydroLearn. https://edx.hydrolearn.org/courses/course-v1:USU+CEE6110+2022/about.

    This resource consists of three example notebooks and associated data files.

    Notebooks:
    1. Example 1: Import and plot data
    2. Example 2: Perform rules-based quality control
    3. Example 3: Perform model-based quality control (ARIMA)

    Data files: Data files are available for 6 aquatic sites in the Logan River Observatory. Each file contains a single year of data for one site. The files are named according to monitoring site (FranklinBasin, TonyGrove, WaterLab, MainStreet, Mendon, BlackSmithFork) and year. The files were sourced by querying the Logan River Observatory relational database, and equivalent data could be obtained from the LRO website or on HydroShare. Additional information on sites, variables, and methods can be found on the LRO website (http://lrodata.usu.edu/tsa/) or HydroShare (https://www.hydroshare.org/search/?q=logan%20river%20observatory). Each file has the same structure, indexed with a datetime column (mountain standard time) and three columns per variable. Variable abbreviations and units are:
    - temp: water temperature, degrees C
    - cond: specific conductance, μS/cm
    - ph: pH, standard units
    - do: dissolved oxygen, mg/L
    - turb: turbidity, NTU
    - stage: stage height, cm

    For each variable, there are 3 columns:
    - Raw data value measured by the sensor (column header is the variable abbreviation).
    - Technician quality controlled (corrected) value (column header is the variable abbreviation appended with '_cor').
    - Technician labels/qualifiers (column header is the variable abbreviation appended with '_qual').
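
    The column-naming convention described here can be captured in a small helper, useful when loading a site-year file programmatically. This is an illustrative sketch assuming only the '_cor'/'_qual' suffix convention stated above:

```python
# Variable abbreviations documented for the Logan River Observatory files.
VARIABLES = ["temp", "cond", "ph", "do", "turb", "stage"]

def column_names(variable):
    """Return the (raw, corrected, qualifier) column headers for one
    variable, following the suffix convention described above."""
    return variable, f"{variable}_cor", f"{variable}_qual"

# Expected header set for one site-year file (plus the datetime index).
ALL_COLUMNS = [name for v in VARIABLES for name in column_names(v)]
```

    Passing ALL_COLUMNS to a CSV reader's column selection is a simple way to validate that a downloaded file matches the documented structure.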

  12. Data from: Encoding laboratory testing data: case studies of the national...

    • data-staging.niaid.nih.gov
    • search.dataone.org
    • +3more
    zip
    Updated May 10, 2022
    Cite
    Raja Cholan; Gregory Pappas; Greg Rehwoldt; Andrew Sills; Elizabeth Korte; I. Khalil Appleton; Natalie Scott; Wendy Rubinstein; Sara Brenner; Riki Merrick; Wilbur Hadden; Keith Campbell; Michael Waters (2022). Encoding laboratory testing data: case studies of the national implementation of HHS requirements and related standards in five laboratories [Dataset]. http://doi.org/10.5061/dryad.0cfxpnw55
    Explore at:
    Available download formats: zip
    Dataset updated
    May 10, 2022
    Dataset provided by
    United States Department of Health and Human Services (http://www.hhs.gov/)
    Food and Drug Administration (http://www.fda.gov/)
    Association of Public Health Laboratories (https://www.aphl.org/)
    Office of the National Coordinator for Health Information Technology (http://healthit.gov/)
    University of Maryland, College Park
    Deloitte (United States)
    Authors
    Raja Cholan; Gregory Pappas; Greg Rehwoldt; Andrew Sills; Elizabeth Korte; I. Khalil Appleton; Natalie Scott; Wendy Rubinstein; Sara Brenner; Riki Merrick; Wilbur Hadden; Keith Campbell; Michael Waters
    License

    CC0 1.0 (https://spdx.org/licenses/CC0-1.0.html)

    Description

    Objective: Assess the effectiveness of providing the Logical Observation Identifiers Names and Codes (LOINC®)-to-In Vitro Diagnostic (LIVD) coding specification, required by the United States Department of Health and Human Services for SARS-CoV-2 reporting, in medical center laboratories, and utilize findings to inform future United States Food and Drug Administration policy on the use of real-world evidence in regulatory decisions. Materials and Methods: We compared gaps and similarities between diagnostic test manufacturers’ recommended LOINC® codes and the LOINC® codes used in medical center laboratories for the same tests. Results: Five medical centers and three test manufacturers extracted data from laboratory information systems (LIS) for prioritized tests of interest. The data submissions ranged from 74 to 532 LOINC® codes per site. Three test manufacturers submitted 15 LIVD catalogs representing 26 distinct devices, 6,956 tests, and 686 LOINC® codes. We identified mismatches in how medical centers use LOINC® to encode laboratory tests compared to how test manufacturers encode the same laboratory tests. Of 331 tests available in the LIVD files, 136 (41%) were represented by a mismatched LOINC® code at the medical centers (chi-square 45.0, 4 df, P < .0001). Discussion: The five medical centers and three test manufacturers vary in how they organize, categorize, and store LIS catalog information. This variation impacts data quality and interoperability. Conclusion: The results of the study indicate that providing the LIVD mappings was not sufficient to support laboratory data interoperability. National implementation of LIVD and further efforts to promote laboratory interoperability will require a more comprehensive effort and continuing evaluation and quality control.

    Methods: Data Collection from Medical Center Laboratory Pilot Sites: Each medical center was asked to extract about 100 LOINC® codes from its LIS for prioritized tests of interest, focused on high-risk conditions and SARS-CoV-2. For each selected test (e.g., SARS-CoV-2 RNA COVID-19), we collected the following data elements: test names/descriptions (e.g., SARS coronavirus 2 RNA [Presence] in Respiratory specimen by NAA with probe detection), associated instruments (e.g., IVD Vendor Model), and LOINC® codes (e.g., 94500-6). High-risk conditions were defined by referencing the CDC’s published list of Underlying Medical Conditions Associated with High Risk for Severe COVID-19.[29] A data collection template spreadsheet was created and disseminated to the medical centers to provide consistency and reporting clarity for data elements from sites. Data Collection from IVD Manufacturers: We coordinated with SHIELD stakeholders and the IICC to request manufacturer LIVD catalogs containing the LOINC® codes per IVD instrument per test.
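
    The headline mismatch figure can be reproduced from the counts stated in the abstract (a trivial arithmetic check; the function name is illustrative):

```python
def mismatch_percent(mismatched, total):
    """Share of tests whose medical-center LOINC code differed from the
    manufacturer's LIVD recommendation, as a rounded percentage."""
    return round(100 * mismatched / total)
```

    With the reported counts, mismatch_percent(136, 331) returns the 41% quoted above.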

  13. Test Data Management Market Analysis, Size, and Forecast 2025-2029: North...

    • technavio.com
    pdf
    Updated May 1, 2025
    Cite
    Technavio (2025). Test Data Management Market Analysis, Size, and Forecast 2025-2029: North America (US and Canada), Europe (France, Germany, Italy, and UK), APAC (Australia, China, India, and Japan), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/test-data-management-market-industry-analysis
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 1, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2025 - 2029
    Area covered
    United States
    Description


    Test Data Management Market Size 2025-2029

    The test data management market size is forecast to increase by USD 727.3 million, at a CAGR of 10.5% between 2024 and 2029.

    The market is experiencing significant growth, driven by the increasing adoption of automation by enterprises to streamline their testing processes. The automation trend is fueled by the growing consumer spending on technological solutions, as businesses seek to improve efficiency and reduce costs. However, the market faces challenges, including the lack of awareness and standardization in test data management practices. This obstacle hinders the effective implementation of test data management solutions, requiring companies to invest in education and training to ensure successful integration. To capitalize on market opportunities and navigate challenges effectively, businesses must stay informed about emerging trends and best practices in test data management. By doing so, they can optimize their testing processes, reduce risks, and enhance overall quality.

    What will be the Size of the Test Data Management Market during the forecast period?

    The market continues to evolve, driven by the ever-increasing volume and complexity of data. Data exploration and analysis are at the forefront of this dynamic landscape, with data ethics and governance frameworks ensuring data transparency and integrity. Data masking, cleansing, and validation are crucial components of data management, enabling data warehousing, orchestration, and pipeline development. Data security and privacy remain paramount, with encryption, access control, and anonymization key strategies. Data governance, lineage, and cataloging facilitate data management software automation and reporting. Hybrid data management solutions, including artificial intelligence and machine learning, are transforming data insights and analytics. Data regulations and compliance are shaping the market, driving the need for data accountability and stewardship. Data visualization, mining, and reporting provide valuable insights, while data quality management, archiving, and backup ensure data availability and recovery. Data modeling, data integrity, and data transformation are essential for data warehousing and data lake implementations. Data management platforms are seamlessly integrated into these evolving patterns, enabling organizations to effectively manage their data assets and gain valuable insights. Data management services, cloud and on-premise, are essential for organizations to adapt to the continuous changes in the market and effectively leverage their data resources.

    How is this Test Data Management Industry segmented?

    The test data management industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Application: On-premises, Cloud-based
    Component: Solutions, Services
    End-user: Information technology, Telecom, BFSI, Healthcare and life sciences, Others
    Sector: Large enterprise, SMEs
    Geography: North America (US, Canada), Europe (France, Germany, Italy, UK), APAC (Australia, China, India, Japan), Rest of World (ROW)

    By Application Insights

    The on-premises segment is estimated to witness significant growth during the forecast period. In the realm of data management, on-premises testing represents a popular approach for businesses seeking control over their infrastructure and testing process. This approach involves establishing testing facilities within an office or data center, necessitating a dedicated team with the necessary skills. The benefits of on-premises testing extend beyond control, as it enables organizations to upgrade and configure hardware and software at their discretion, providing opportunities for exploratory testing. Furthermore, data security is a significant concern for many businesses, and on-premises testing avoids the risk of exposing sensitive information to third-party companies.

    Data exploration, a crucial aspect of data analysis, can be carried out more effectively with on-premises testing, ensuring data integrity and security. Data masking, cleansing, and validation are essential data preparation techniques that can be executed efficiently in an on-premises environment. Data warehousing, data pipelines, and data orchestration are integral components of data management, and on-premises testing allows for seamless integration and management of these elements. Data governance frameworks, lineage, catalogs, and metadata are essential for maintaining data transparency and compliance. Data security, encryption, and access control are paramount, and on-premises testing offers greater control over these aspects. Data reporting, visualization, and insigh

  14. COVID Testing and Testing-Related Services Provided to Medicaid and CHIP...

    • healthdata.gov
    • data.virginia.gov
    • +3 more
    csv, xlsx, xml
    Updated Mar 28, 2023
    Cite
    data.medicaid.gov (2023). COVID Testing and Testing-Related Services Provided to Medicaid and CHIP Beneficiaries [Dataset]. https://healthdata.gov/d/x6kx-6hpr
    Explore at:
    Available download formats: xml, csv, xlsx
    Dataset updated
    Mar 28, 2023
    Dataset provided by
    data.medicaid.gov
    Description

    This data set includes monthly counts and rates (per 1,000 beneficiaries) of COVID-19 testing services provided to Medicaid and CHIP beneficiaries, by state.

    These metrics are based on data in the T-MSIS Analytic Files (TAF). Some states have serious data quality issues for one or more months, making the data unusable for calculating COVID-19 testing services measures. To assess data quality, analysts adapted measures featured in the DQ Atlas. Data for a state and month are considered unusable if at least one of the following topics meets the DQ Atlas threshold for unusable: Total Medicaid and CHIP Enrollment, Procedure Codes - OT Professional, Claims Volume - OT. Please refer to the DQ Atlas at http://medicaid.gov/dq-atlas for more information about data quality assessment methods. Cells with a value of “DQ” indicate that data were suppressed due to unusable data.

    Some cells have a value of “DS”. This indicates that data were suppressed for confidentiality reasons because the group included fewer than 11 beneficiaries.

  15. Data from: Standard Quality Controlled Research Weather Data – USDA-ARS,...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Standard Quality Controlled Research Weather Data – USDA-ARS, Bushland, Texas [Dataset]. https://catalog.data.gov/dataset/standard-quality-controlled-research-weather-data-usda-ars-bushland-texas-f4f0b
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Area covered
    Texas, Bushland
    Description

    [ NOTE – 2022/05/06: this dataset supersedes the earlier versions https://doi.org/10.15482/USDA.ADC/1482548 and https://doi.org/10.15482/USDA.ADC/1526329 ]. This dataset contains 15-minute mean weather data from the USDA-ARS Conservation and Production Laboratory (CPRL), Soil and Water Management Research Unit (SWMRU) research weather station, Bushland, Texas (Lat. 35.186714°, Long. -102.094189°, elevation 1170 m above MSL) for all days in each year. The data are from sensors placed at 2-m height over a level, grass surface mowed to not exceed 12 cm height and irrigated and fertilized to maintain reference conditions as promulgated by Allen et al. (2005, 1998). Irrigation was by surface flood in 1989 through 1994, and by subsurface drip irrigation after 1994. Sensors were replicated and intercompared between replicates and with data from nearby weather stations, which were sometimes used for gap filling. Quality control and assurance methods are described by Evett et al. (2018). Data from a duplicate sensor were used to fill gaps in data from the primary sensor using appropriate regression relationships. Gap filling was also accomplished using sensors deployed at one of the four large weighing lysimeters immediately west of the weather station, or using sensors at other nearby stations when reliable regression relationships could be developed. The primary paper describes details of the sensors used and methods of testing, calibration, inter-comparison, and use. The weather data include air temperature (C) and relative humidity (%), wind speed (m/s), solar irradiance (W m-2), barometric pressure (kPa), and precipitation (rain and snow in mm). Because the large (3 m by 3 m surface area) weighing lysimeters are better rain gages than are tipping bucket gages, the 15-minute precipitation data are derived for each lysimeter from changes in lysimeter mass. The land slope is <0.3% and flat. 
The mean annual precipitation is ~470 mm, the 20-year pan evaporation record indicates ~2,600 mm Class A pan evaporation per year, and winds are typically from the South and Southwest. The climate is semi-arid with ~70% (350 mm) of the annual precipitation occurring from May to September, during which period the pan evaporation averages ~1520 mm. These datasets originate from research aimed at determining crop water use (ET), crop coefficients for use in ET-based irrigation scheduling based on a reference ET, crop growth, yield, harvest index, and crop water productivity as affected by irrigation method, timing, amount (full or some degree of deficit), agronomic practices, cultivar, and weather. The data have utility for testing simulation models of crop ET, growth, and yield and have been used by the Agricultural Model Intercomparison and Improvement Project (AgMIP), by OPENET, and by many others for testing, and calibrating models of ET that use satellite and/or weather data. See the README for details of each data resource.
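The duplicate-sensor gap filling described above amounts to fitting a regression between the two sensors over the periods where both report, then predicting the primary series through its gaps. A toy sketch with simple ordinary least squares (illustrative only; the actual QA procedure is the one described in Evett et al., 2018):

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for paired readings."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def fill_gaps(primary, backup):
    """Fill None gaps in the primary series from a co-located backup
    sensor, using a regression fit where both sensors reported."""
    paired = [(b, p) for p, b in zip(primary, backup)
              if p is not None and b is not None]
    slope, intercept = fit_line([b for b, _ in paired],
                                [p for _, p in paired])
    return [p if p is not None else round(slope * b + intercept, 2)
            for p, b in zip(primary, backup)]

# 15-minute air temperatures (deg C); the primary sensor has one gap.
primary = [21.0, 21.5, None, 22.4, 22.9]
backup  = [20.8, 21.3, 21.8, 22.2, 22.7]
print(fill_gaps(primary, backup))
```

The same pattern extends to filling from a nearby station, provided a reliable regression relationship can be established first.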

  16. Relevant number of data points, discrepancy type, number of discrepancies...

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Vivienne X. Guan; Yasmine C. Probst; Elizabeth P. Neale; Linda C. Tapsell (2023). Relevant number of data points, discrepancy type, number of discrepancies and discrepancy rate. [Dataset]. http://doi.org/10.1371/journal.pone.0221047.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Vivienne X. Guan; Yasmine C. Probst; Elizabeth P. Neale; Linda C. Tapsell
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Relevant number of data points, discrepancy type, number of discrepancies and discrepancy rate.

  17. Water Quality Data from the Yukon River Basin in Alaska and Canada Data Quality Assurance Field Blanks

    • data.usgs.gov
    • search.dataone.org
    • +2 more
    Updated Nov 19, 2021
    Cite
    Nicole Herman-Mercer (2021). Water Quality Data from the Yukon River Basin in Alaska and Canada Data Quality Assurance Field Blanks [Dataset]. http://doi.org/10.5066/F77D2S7B
    Explore at:
    Dataset updated
    Nov 19, 2021
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Nicole Herman-Mercer
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Area covered
    Yukon River, Canada, Alaska
    Description

    This dataset contains data collected from field blanks. Field blanks are deionized water processed in the field by community technicians using processing methods identical to those for surface water samples. Field blanks are then analyzed in the laboratory following procedures identical to those for surface water samples.

  18. Data from: Behavioral Health Workforce: Quality Assurance Practices in...

    • data.virginia.gov
    • healthdata.gov
    • +1 more
    html
    Updated Sep 6, 2025
    + more versions
    Cite
    Substance Abuse and Mental Health Services Administration (2025). Behavioral Health Workforce: Quality Assurance Practices in Mental Health Treatment Facilities [Dataset]. https://data.virginia.gov/dataset/behavioral-health-workforce-quality-assurance-practices-in-mental-health-treatment-facilities
    Explore at:
    htmlAvailable download formats
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    Substance Abuse and Mental Health Services Administration (https://www.samhsa.gov/)
    Description

    This report examines the number, percentage, and characteristics of specialty mental health treatment facilities in the United States that use three quality assurance practices related to the behavioral health workforce as part of their standard operating procedures.

  19. Data from: Comparison of statistical methods and the use of quality control samples for batch effect correction in human transcriptome data

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 30, 2018
    Cite
    Portier, Chris; Espín-Pérez, Almudena; de Kok, Theo M. C. M.; van Veldhoven, Karin; Chadeau-Hyam, Marc; Kleinjans, Jos C. S. (2018). Comparison of statistical methods and the use of quality control samples for batch effect correction in human transcriptome data [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000664759
    Explore at:
    Dataset updated
    Aug 30, 2018
    Authors
    Portier, Chris; Espín-Pérez, Almudena; de Kok, Theo M. C. M.; van Veldhoven, Karin; Chadeau-Hyam, Marc; Kleinjans, Jos C. S.
    Description

    Batch effects are technical sources of variation introduced by the necessity of conducting gene expression analyses on different dates due to the large number of biological samples in population-based studies. The aim of this study is to evaluate the performances of linear mixed models (LMM) and Combat in batch effect removal. We also assessed the utility of adding quality control samples in the study design as technical replicates. In order to do so, we simulated gene expression data by adding “treatment” and batch effects to a real gene expression dataset. The performances of LMM and Combat, with and without quality control samples, are assessed in terms of sensitivity and specificity while correcting for the batch effect using a wide range of effect sizes, statistical noise, sample sizes and level of balanced/unbalanced designs. The simulations showed small differences among LMM and Combat. LMM identifies stronger relationships between big effect sizes and gene expression than Combat, while Combat identifies in general more true and false positives than LMM. However, these small differences can still be relevant depending on the research goal. When any of these methods are applied, quality control samples did not reduce the batch effect, showing no added value for including them in the study design.
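The additive batch effect in that simulation design can be illustrated in a few lines. Per-batch mean-centering below is a deliberately simplified stand-in for LMM/ComBat, which additionally model covariates and pool variance estimates:

```python
import random

random.seed(1)

def mean(xs):
    return sum(xs) / len(xs)

# One gene's expression in two batches: shared biological signal plus
# a batch-specific technical shift, as in the simulation design above.
signal = [random.gauss(5.0, 0.5) for _ in range(20)]
batch = [0] * 10 + [1] * 10
shift = {0: 0.0, 1: 1.5}  # batch 1 runs systematically high
observed = [s + shift[b] for s, b in zip(signal, batch)]

def center_by_batch(values, batches):
    """Remove additive batch effects by re-centering each batch on the
    overall mean (a simplified stand-in for LMM/ComBat adjustment)."""
    overall = mean(values)
    corrected = []
    for v, b in zip(values, batches):
        batch_vals = [x for x, bb in zip(values, batches) if bb == b]
        corrected.append(v - mean(batch_vals) + overall)
    return corrected

corrected = center_by_batch(observed, batch)
gap_before = abs(mean(observed[:10]) - mean(observed[10:]))
gap_after = abs(mean(corrected[:10]) - mean(corrected[10:]))
print(gap_before, gap_after)  # the between-batch gap shrinks to ~0
```

Note that mean-centering also removes any real group difference confounded with batch, which is exactly why the study's balanced/unbalanced designs matter.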

  20. Data from: Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology

    • resodate.org
    Updated May 19, 2021
    Cite
    Stefan Studer; Thanh Binh Bui; Christian Drescher; Alexander Hanuschkin; Ludwig Winkler; Steven Peters; Klaus-Robert Müller (2021). Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology [Dataset]. http://doi.org/10.14279/depositonce-11926
    Explore at:
    Dataset updated
    May 19, 2021
    Dataset provided by
    DepositOnce
    Technische Universität Berlin
    Authors
    Stefan Studer; Thanh Binh Bui; Christian Drescher; Alexander Hanuschkin; Ludwig Winkler; Steven Peters; Klaus-Robert Müller
    Description

    Machine learning is an established and frequently used technique in industry and academia, but a standard process model to improve success and efficiency of machine learning applications is still missing. Project organizations and machine learning practitioners face manifold challenges and risks when developing machine learning applications and have a need for guidance to meet business expectations. This paper therefore proposes a process model for the development of machine learning applications, covering six phases from defining the scope to maintaining the deployed machine learning application. Business and data understanding are executed simultaneously in the first phase, as both have considerable impact on the feasibility of the project. The next phases are comprised of data preparation, modeling, evaluation, and deployment. Special focus is applied to the last phase, as a model running in changing real-time environments requires close monitoring and maintenance to reduce the risk of performance degradation over time. With each task of the process, this work proposes quality assurance methodology that is suitable to address challenges in machine learning development that are identified in the form of risks. The methodology is drawn from practical experience and scientific literature, and has proven to be general and stable. The process model expands on CRISP-DM, a data mining process model that enjoys strong industry support, but fails to address machine learning specific tasks. The presented work proposes an industry- and application-neutral process model tailored for machine learning applications with a focus on technical tasks for quality assurance.
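The six phases named in the abstract can be written down as a simple ordered structure (the phase list is taken from the text above; the helper function is purely illustrative):

```python
# The six CRISP-ML(Q) phases as described above; quality assurance
# applies to the tasks within each phase.
CRISP_ML_Q_PHASES = [
    "Business and data understanding",  # executed together: both gate feasibility
    "Data preparation",
    "Modeling",
    "Evaluation",
    "Deployment",
    "Monitoring and maintenance",       # guards against performance degradation
]

def next_phase(current):
    """Return the phase that follows `current`, or None after the last."""
    i = CRISP_ML_Q_PHASES.index(current)
    return CRISP_ML_Q_PHASES[i + 1] if i + 1 < len(CRISP_ML_Q_PHASES) else None

print(next_phase("Modeling"))  # -> Evaluation
```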


Data quality and methodology (TSM 2024)


Change of designation name

Until September 2023, ‘official statistics in development’ were called ‘experimental statistics’. Further information can be found on the Office for Statistics Regulation website (https://www.ons.gov.uk/methodology/methodologytopicsandstatisticalconcepts/guidetoofficialstatisticsindevelopment).

User feedback

We are keen to increase understanding of the data, including their accuracy, reliability, and value to users. Please complete the feedback form (https://forms.office.com/e/cetNnYkHfL) or email feedback, including suggestions for improvements or queries about the source data or processing, to enquiries@rsh.gov.uk.

Publication schedule

We intend to publish these statistics in Autumn each year, with the data pre-announced in the release calendar.

All data and additional information (including a list of any individuals with 24-hour pre-release access) are published on our statistics pages.

Quality assurance of administrative data

The data used in the production of these statistics are classed as administrative data. In 2015 the UKSA published a regulatory standard for the quality assurance of administrative data. As part of our compliance with the Code of Practice, and in the context of other statistics published by the UK Government and its agencies, we have determined that the statistics drawn from the TSMs are likely to be categorised as low quality risk – medium public interest (with a requirement for basic/enhanced assurance).

The publication of these statistics can be considered to be of medium public interest.
