This data table provides the detailed data quality assessment scores for the Long Term Development Statement dataset. The quality assessment was carried out on 31st March. At SPEN, we are dedicated to sharing high-quality data with our stakeholders and being transparent about its quality. This is why we openly share the results of our data quality assessments. We collaborate closely with Data Owners to address any identified issues and enhance our overall data quality; to demonstrate our progress, we conduct annual assessments of our data quality in line with the dataset refresh rate. To learn more about our approach to how we assess data quality, visit Data Quality - SP Energy Networks.

We welcome feedback and questions from our stakeholders regarding this process. Our Open Data Team is available to answer any enquiries or receive feedback on the assessments. You can contact them via our Open Data mailbox at opendata@spenergynetworks.co.uk.

The first phase of our comprehensive data quality assessment measures the quality of our datasets across three dimensions. Please refer to the data table schema for the definitions of these dimensions. We are now expanding our quality assessments to include additional dimensions to provide a more comprehensive evaluation, and will update the data tables with the results when available.

Disclaimer: The data quality assessment may not represent the quality of the current dataset published on the Open Data Portal. Please check the date of the latest quality assessment and compare it to the 'Modified' date of the corresponding dataset. The data quality assessments will be updated on either a quarterly or annual basis, depending on the update frequency of the dataset. This information can be found in the dataset metadata, within the Information tab. If you require a more up-to-date quality assessment, please contact the Open Data Team at opendata@spenergynetworks.co.uk and a member of the team will be in contact.
This data table provides the detailed data quality assessment scores for the Single Digital View dataset. The quality assessment was carried out on the 31st of March. At SPEN, we are dedicated to sharing high-quality data with our stakeholders and being transparent about its quality. This is why we openly share the results of our data quality assessments. We collaborate closely with Data Owners to address any identified issues and enhance our overall data quality. To demonstrate our progress, we conduct, at a minimum, bi-annual assessments of our data quality; for datasets that are refreshed more frequently than this, please note that the quality assessment may be based on an earlier version of the dataset. To learn more about our approach to how we assess data quality, visit Data Quality - SP Energy Networks.

We welcome feedback and questions from our stakeholders regarding this process. Our Open Data Team is available to answer any enquiries or receive feedback on the assessments. You can contact them via our Open Data mailbox at opendata@spenergynetworks.co.uk.

The first phase of our comprehensive data quality assessment measures the quality of our datasets across three dimensions. Please refer to the data table schema for the definitions of these dimensions. We are now expanding our quality assessments to include additional dimensions to provide a more comprehensive evaluation, and will update the data tables with the results when available.

Disclaimer: The data quality assessment may not represent the quality of the current dataset published on the Open Data Portal. Please check the date of the latest quality assessment and compare it to the 'Modified' date of the corresponding dataset. The data quality assessments will be updated on either a quarterly or annual basis, depending on the update frequency of the dataset. This information can be found in the dataset metadata, within the Information tab. If you require a more up-to-date quality assessment, please contact the Open Data Team at opendata@spenergynetworks.co.uk and a member of the team will be in contact.
Data Quality identifies FMCSA resources for evaluating, monitoring, and improving the quality of data submitted by States to the Motor Carrier Management Information System (MCMIS).
This dataset was created by Isa Zeynalov
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objective: Linkage of longitudinal administrative data for mothers and babies supports research and service evaluation in several populations around the world. We established a linked mother-baby cohort using pseudonymised, population-level data for England.
Design and Setting: Retrospective linkage study using electronic hospital records of mothers and babies admitted to NHS hospitals in England, captured in Hospital Episode Statistics between April 2001 and March 2013.
Results: Of 672,955 baby records in 2012/13, 280,470 (42%) linked deterministically to a maternal record using hospital, GP practice, maternal age, birthweight, gestation, birth order and sex. A further 380,164 (56%) records linked using probabilistic methods incorporating additional variables that could differ between mother/baby records (admission dates, ethnicity, 3/4-character postcode district) or that include missing values (delivery variables). The false-match rate was estimated at 0.15% using synthetic data. Data quality improved over time: for 2001/02, 91% of baby records were linked (holding the estimated false-match rate at 0.15%). The linked cohort was representative of national distributions of gender, gestation, birth weight and maternal age, and captured approximately 97% of births in England.
Conclusion: Probabilistic linkage of maternal and baby healthcare characteristics offers an efficient way to enrich maternity data, improve data quality, and create longitudinal cohorts for research and service evaluation. This approach could be extended to linkage of other datasets that have non-disclosive characteristics in common.
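The deterministic step described above amounts to an exact join on the shared fields, with the remaining records passed to probabilistic matching. The sketch below illustrates only that deterministic step in Python with pandas; the data frames and column names are hypothetical stand-ins, not the Hospital Episode Statistics schema or the study's actual linkage code.

```python
# Minimal sketch of the deterministic linkage step: an exact join on shared
# fields, with unmatched baby records left over for probabilistic matching.
# Data frames and column names are hypothetical, not the HES schema.
import pandas as pd

LINK_KEYS = ["hospital", "gp_practice", "maternal_age",
             "birthweight", "gestation", "birth_order", "sex"]

def deterministic_link(babies: pd.DataFrame, mothers: pd.DataFrame):
    """Return (linked mother-baby pairs, baby records still to be linked)."""
    linked = babies.merge(mothers, on=LINK_KEYS, how="inner",
                          suffixes=("_baby", "_mother"))
    # In practice, key combinations matching more than one mother would need
    # tie-breaking or deferral to the probabilistic stage.
    unlinked = babies[~babies["baby_id"].isin(linked["baby_id"])]
    return linked, unlinked
```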
This data table provides the detailed data quality assessment scores for the Curtailment dataset. The quality assessment was carried out on the 31st of March. At SPEN, we are dedicated to sharing high-quality data with our stakeholders and being transparent about its quality. This is why we openly share the results of our data quality assessments. We collaborate closely with Data Owners to address any identified issues and enhance our overall data quality. To demonstrate our progress, we conduct, at a minimum, bi-annual assessments of our data quality; for datasets that are refreshed more frequently than this, please note that the quality assessment may be based on an earlier version of the dataset. To learn more about our approach to how we assess data quality, visit Data Quality - SP Energy Networks.

We welcome feedback and questions from our stakeholders regarding this process. Our Open Data Team is available to answer any enquiries or receive feedback on the assessments. You can contact them via our Open Data mailbox at opendata@spenergynetworks.co.uk.

The first phase of our comprehensive data quality assessment measures the quality of our datasets across three dimensions. Please refer to the data table schema for the definitions of these dimensions. We are now expanding our quality assessments to include additional dimensions to provide a more comprehensive evaluation, and will update the data tables with the results when available.

Disclaimer: The data quality assessment may not represent the quality of the current dataset published on the Open Data Portal. Please check the date of the latest quality assessment and compare it to the 'Modified' date of the corresponding dataset. The data quality assessments will be updated on either a quarterly or annual basis, depending on the update frequency of the dataset. This information can be found in the dataset metadata, within the Information tab. If you require a more up-to-date quality assessment, please contact the Open Data Team at opendata@spenergynetworks.co.uk and a member of the team will be in contact.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Evaluation of data quality in large healthcare datasets.
Abstract: Data quality and fitness for analysis are crucial if the outputs of big data analyses are to be trusted by the public and the research community. Here we analyze the output from a data quality tool called Achilles Heel as it was applied to 24 datasets across seven different organizations. We highlight 12 data quality rules that identified issues in at least 10 of the 24 datasets and provide the full set of 71 rules that identified issues in at least one dataset. Achilles Heel is developed by the Observational Health Data Sciences and Informatics (OHDSI) community and is freely available software that provides a useful starter set of data quality rules. Our analysis represents the first data quality comparison of multiple datasets across several countries in America, Europe and Asia.
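Rules of this kind are typically simple predicates over a table that flag implausible or inconsistent records. The fragment below shows one generic rule of that flavour (an implausible year of birth); it is an illustration only, not one of the actual Achilles Heel rules and not part of the OHDSI tooling, and the column names are assumed.

```python
# Generic example of a table-level data quality rule: flag rows whose year
# of birth falls outside a plausible range. Illustration only; not an
# Achilles Heel rule. Column names are hypothetical.
import pandas as pd

def implausible_year_of_birth(person: pd.DataFrame,
                              low: int = 1900, high: int = 2025) -> pd.DataFrame:
    """Return the rows that violate the rule."""
    return person[(person["year_of_birth"] < low) |
                  (person["year_of_birth"] > high)]

# Tiny made-up table to show the rule in use; it flags person_id 2.
demo = pd.DataFrame({"person_id": [1, 2, 3],
                     "year_of_birth": [1985, 1850, 2002]})
print(implausible_year_of_birth(demo))
```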
The Data Quality Utility performs comprehensive checks on AFCARS data to help Title IV-E agencies assess and improve data quality. This is a metadata-only record linking to the original dataset.
The dataset contains the questions asked in the survey through which quantitative data were collected to evaluate the effects of data quality, system quality, and service quality on citizens' trust.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Administrative data are increasingly important in statistics, but, like other types of data, may contain measurement errors. To prevent such errors from invalidating analyses of scientific interest, it is therefore essential to estimate the extent of measurement errors in administrative data. Currently, however, most approaches to evaluate such errors involve either prohibitively expensive audits or comparison with a survey that is assumed perfect. We introduce the “generalized multitrait-multimethod” (GMTMM) model, which can be seen as a general framework for evaluating the quality of administrative and survey data simultaneously. This framework allows both survey and administrative data to contain random and systematic measurement errors. Moreover, it accommodates common features of administrative data such as discreteness, nonlinearity, and nonnormality, improving on similar existing models. The use of the GMTMM model is demonstrated by application to linked survey-administrative data from the German Federal Employment Agency on income from employment, and a simulation study evaluates the estimates obtained and their robustness to model misspecification. Supplementary materials for this article are available online.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Organizations are increasingly accepting data quality (DQ) as a major key to their success. In order to assess and improve DQ, methods have been devised. Many of these methods attempt to raise DQ by directly manipulating low-quality data. Such methods operate reactively and are suitable for organizations with highly developed integrated systems. However, there is a lack of a proactive DQ method for businesses with weak IT infrastructure, where data quality is largely affected by tasks performed by human agents. This study aims to develop and evaluate a new method for structured data that is simple and practical, so that it can easily be applied to real-world situations. The new method detects potentially risky tasks within a process and adds new improving tasks to counter them. To achieve continuous improvement, an award system is also developed to help with better selection of the proposed improving tasks. The task-based DQ method (TBDQ) is most appropriate for small and medium organizations, and simplicity of implementation is one of its most prominent features. TBDQ was evaluated in a case study at an international trade company. The case study shows that TBDQ is effective in selecting optimal activities for DQ improvement in terms of cost and improvement.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this seminar, the presenter introduces essential concepts of ArcGIS Data Reviewer and highlights automated and semi-automated methods to streamline and expedite data validation. This seminar was developed to support the following:
• ArcGIS Desktop 10.3 (Basic, Standard, or Advanced)
• ArcGIS Server 10.3 Workgroup (Standard or Advanced)
• ArcGIS Data Reviewer for Desktop
• ArcGIS Data Reviewer for Server
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This dataset is a metadata evaluation of the public datasets on the Open Data portal. Each public dataset is evaluated based on a variety of topics and assigned a score between 0 and 100.
The datasets are assigned a metadata attribute based on the following scores:
• 0-70: Bronze
• 71-80: Silver
• 81-100: Gold
For more information about the method by which the score is calculated, please visit the following PDF: http://wpgopendata.blob.core.windows.net/documents/Data-Quality-Score-Documentation.pdf
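As a rough illustration of how such threshold-based tiers might be applied programmatically, the sketch below maps a 0-100 score to its tier using the thresholds listed above; the function and its use of Python are our own illustration, not part of the portal's scoring tooling.

```python
def metadata_tier(score: float) -> str:
    """Map a 0-100 metadata quality score to a tier (thresholds as listed above)."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 70:
        return "Bronze"
    if score <= 80:
        return "Silver"
    return "Gold"

# Example: a dataset scoring 84 would be rated Gold.
print(metadata_tier(84))
```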
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A key aim of the FNS-Cloud project (grant agreement no. 863059) was to overcome fragmentation within food, nutrition and health data through the development of tools and services that facilitate matching and merging of data to promote increased reuse. However, in an era of increasing data reuse, it is imperative that the scientific quality of data analysis is maintained. Whilst it is true that many datasets can be reused, questions remain regarding whether they should be; thus, there is a need to support researchers in making such a decision. This paper describes the development and evaluation of the FNS-Cloud data quality assessment tool for dietary intake datasets. Markers of quality were identified from the literature for dietary intake, lifestyle, demographic, anthropometric, and consumer behavior data at all levels of data generation (data collection, underlying data sources used, dataset management and data analysis). These markers informed the development of a quality assessment framework comprising decision trees and feedback messages relating to each quality parameter. These fed into a report provided to the researcher on completion of the assessment, with considerations to support them in deciding whether the dataset is appropriate for reuse. This quality assessment framework was transformed into an online tool and a user evaluation study was undertaken. Participants recruited from three centres (N = 13) were observed and interviewed while using the tool to assess the quality of a dataset they were familiar with. Participants positively rated the assessment format and feedback messages in helping them assess the quality of a dataset. Several participants described the tool as potentially useful in training students and inexperienced researchers in the use of secondary datasets. This quality assessment tool, deployed within FNS-Cloud, is openly accessible to users as one of the first steps in identifying datasets suitable for use in their specific analyses. It is intended to support researchers in deciding whether previously collected datasets under consideration for reuse are fit for their new intended research purposes. While it has been developed and evaluated, further testing and refinement of this resource would improve its applicability to a broader range of users.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
For more up-to-date quality metadata, please visit https://w3id.org/lodquator
This dataset is a collection of TRiG files with quality metadata for different datasets on the LOD cloud. Each dataset was assessed for:
• The length of URIs
• Usage of RDF primitives
• Re-use of existing terms
• Usage of undefined terms
• Usage of blank nodes
• Indication for different serialisation formats
• Usage of multiple languages
This data dump is part of the empirical study conducted for the paper "Are LOD Cloud Datasets Well Represented? A Data Representation Quality Survey."
For more information visit http://jerdeb.github.io/lodqa
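As a rough sketch of how two of the listed aspects (URI length and blank-node usage) could be computed for a single RDF file, the fragment below uses rdflib; it is not the metric implementation behind the survey, and the input file name is an assumption.

```python
# Illustrative computation of average URI length and blank-node share for
# one RDF graph. Not the survey's implementation; the input file is assumed.
from rdflib import Graph, URIRef, BNode

g = Graph()
g.parse("example.ttl", format="turtle")  # hypothetical local Turtle file

uri_lengths = []
blank_nodes = 0
total_terms = 0

for s, p, o in g:
    for term in (s, p, o):
        total_terms += 1
        if isinstance(term, URIRef):
            uri_lengths.append(len(str(term)))
        elif isinstance(term, BNode):
            blank_nodes += 1

if uri_lengths:
    print("average URI length:", sum(uri_lengths) / len(uri_lengths))
print("blank-node share of terms:", blank_nodes / max(total_terms, 1))
```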
https://creativecommons.org/share-your-work/public-domain/pdm
The USACE IENCs coverage area consists of 7,260 miles across 21 rivers, primarily located in the Central United States. IENCs apply to inland waterways that are maintained for navigation by USACE for shallow-draft vessels (e.g., maintained at a depth of 9-14 feet, dependent upon the waterway project authorization). Generally, IENCs are produced for commercially navigable waterways for which the National Oceanic and Atmospheric Administration (NOAA) does not produce Electronic Navigational Charts (ENCs). However, Special Purpose IENCs may be produced in agreement with NOAA. IENC POC: IENC_POC@usace.army.mil
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.7910/DVN/BXV4AT
Political scientists routinely face the challenge of assessing the quality (validity and reliability) of measures in order to use them in substantive research. While stand-alone assessment tools exist, researchers rarely combine them comprehensively. Further, while a large literature informs data producers, data consumers lack guidance on how to assess existing measures for use in substantive research. We delineate a three-component practical approach to data quality assessment that integrates complementary multi-method tools to assess: 1) content validity; 2) the validity and reliability of the data generation process; and 3) convergent validity. We apply our quality assessment approach to the corruption measures from the Varieties of Democracy (V-Dem) project, both illustrating our rubric and unearthing several quality advantages and disadvantages of the V-Dem measures, compared to other existing measures of corruption.
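Of the three components, convergent validity is the most mechanical to check: a measure is correlated against existing alternatives that target the same concept. The sketch below is a generic illustration of that step with made-up values and column names; it is not the V-Dem procedure or data.

```python
# Generic convergent-validity check: correlate a corruption measure with
# alternative measures of the same concept. Values and column names are
# made up for illustration; this is not the V-Dem data or method.
import pandas as pd

scores = pd.DataFrame({
    "measure_a": [0.10, 0.40, 0.35, 0.80, 0.60],
    "measure_b": [0.20, 0.50, 0.30, 0.90, 0.55],
    "measure_c": [0.15, 0.45, 0.40, 0.85, 0.65],
})

# Pairwise Pearson correlations; strong positive correlations are one piece
# of evidence for convergent validity, not proof on their own.
print(scores.corr(method="pearson"))
```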
Explore the data quality of the 2014-2019 National Survey on Drug Use and Health (NSDUH) Public Use Files (PUFs) and its comparability with the NSDUH Restricted Use Files (RUFs). This report demonstrates the overall quality of the NSDUH PUFs and the statistical disclosure control techniques used to create them.
Chapters:
• Describes NSDUH and lays out the objective of the report.
• Presents an overview of the NSDUH disclosure concerns, briefly discusses the disclosure technique known as Micro Agglomeration, Substitution, Subsampling, and Calibration (MASSC), and provides a summary of this study's quality assessment and research methods.
• Discusses how some of the detailed tables based on RUF data were selected and replicated using PUF data.
• Describes the data quality assessment results and findings.
• Summarizes the conclusions.
There are also five appendices that support the analysis further. A previous report analyzed the PUF data from 2002-2013.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
In 2013, the first of several Regional Stream Quality Assessments (RSQA) was done in the Midwest United States. The Midwest Stream Quality Assessment (MSQA) was a collaborative study by the U.S. Geological Survey National Water Quality Assessment and the U.S. Environmental Protection Agency National Rivers and Streams Assessment. One of the objectives of the RSQA, and thus the MSQA, is to characterize relations between stream ecology and water-quality stressors to determine the relative effects of these stressors on aquatic biota in streams. Data required to meet this objective included fish species and abundance data and physical and chemical water-quality characteristics of the ecological reaches of the sites that were sampled. This dataset comprises 135 fish species, 39,920 fish, 10 selected water-quality stressor metrics, and six selected fish community stressor response variables for 98 sites sampled for the MSQA.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset of the paper: Reliability and applicability of the revised Cochrane risk-of-bias tool for randomised trials (RoB 2): low inter-rater reliability and challenges in application