https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.15139/S3/JDLVZ8
We compare two widely publicized measures of state electoral integrity in the United States: the Electoral Integrity Project’s 2016 U.S. Perceptions of Electoral Integrity Survey and the Pew 2014 Elections Performance Index. First, we review the theoretical and empirical differences between the two measures and find that they correlate at a surprisingly low level across the states. Second, given this low correlation, we examine the component parts of these indices and find that both are capturing multiple dimensions. Third, we examine how the components and the individual indicators that comprise each measure are linked to citizens’ stated perceptions about electoral integrity. Throughout the paper, we articulate a set of preemptive recommendations that urge researchers to be cautious and deliberate when choosing among measures of electoral integrity to use in future empirical studies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Number of encrypted data blocks and their associated tags: With/without the data deduplication approach.
Strategic Direction 2023 (SD23) Measure HE.D.3 reports the number and percentage of creeks and lakes in good or excellent health. This measure is calculated every two years using the monitoring data from the Watershed Protection Department's Environmental Integrity Index (EII) and Austin Lakes Index (ALI) programs. These programs monitor and assess the chemical, biological, and physical integrity of Austin’s creeks and lakes. Note: Due to software limitations, the scores for one biennial reporting period (e.g., FY2013/2014) are repeated twice in the dataset in order to enable the creation of data visualizations that require annual reporting. View more details and insights related to this measure on the story page: https://data.austintexas.gov/stories/s/d5yi-gac8
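The note about repeating each biennial score can be illustrated with a minimal Python sketch. It assumes period labels shaped like the "FY2013/2014" example above; the function name and the output field names are hypothetical and are not taken from the published dataset.

```python
def expand_biennial(records):
    """Repeat each biennial score once per fiscal year it covers.

    `records` is a list of (period_label, score) pairs, where the label
    looks like "FY2013/2014" (assumed format, based on the example above).
    """
    rows = []
    for label, score in records:
        # Drop the "FY" prefix if present, then split the two years.
        years = label[2:] if label.startswith("FY") else label
        first, second = years.split("/")
        for year in (int(first), int(second)):
            rows.append({"fiscal_year": year, "period": label, "score": score})
    return rows
```

Each input period yields two rows with the same score, which is the shape an annual chart needs.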
https://creativecommons.org/publicdomain/zero/1.0/
The AQS Data Mart is a database containing all of the information from AQS. It has every measured value the EPA has collected via the national ambient air monitoring program. It also includes the associated aggregate values calculated by EPA (8-hour, daily, annual, etc.). The AQS Data Mart is a copy of AQS made once per week and made accessible to the public through web-based applications. The intended users of the Data Mart are air quality data analysts in the regulatory, academic, and health research communities. It is intended for those who need to download large volumes of detailed technical data stored at EPA and does not provide any interactive analytical tools. It serves as the back-end database for several Agency interactive tools that could not fully function without it: AirData, AirCompare, The Remote Sensing Information Gateway, the Map Monitoring Sites KML page, etc.
AQS must maintain constant readiness to accept data and meet high data integrity requirements, and is therefore limited in the number of users and queries to which it can respond. The Data Mart, as a read-only copy, can allow wider access.
The most commonly requested aggregation levels of data (and key metrics in each) are:
Sample Values (2.4 billion values back as far as 1957, national consistency begins in 1980, data for 500 substances routinely collected)
- The sample value converted to standard units of measure (generally 1-hour averages as reported to EPA, sometimes 24-hour averages)
- Local Standard Time (LST) and GMT timestamps
- Measurement method
- Measurement uncertainty, where known
- Any exceptional events affecting the data
NAAQS Averages
- NAAQS average values (8-hour averages for ozone and CO, 24-hour averages for PM2.5)
Daily Summary Values (each monitor has the following calculated each day)
- Observation count
- Observation per cent (of expected observations)
- Arithmetic mean of observations
- Max observation and time of max
- AQI (air quality index) where applicable
- Number of observations > Standard where applicable
Annual Summary Values (each monitor has the following calculated each year)
- Observation count and per cent
- Valid days
- Required observation count
- Null observation count
- Exceptional values count
- Arithmetic Mean and Standard Deviation
- 1st - 4th maximum (highest) observations
- Percentiles (99, 98, 95, 90, 75, 50)
- Number of observations > Standard
Site and Monitor Information
- FIPS State Code (the first 5 items on this list make up the AQS Monitor Identifier)
- FIPS County Code
- Site Number (unique within the county)
- Parameter Code (what is measured)
- POC (Parameter Occurrence Code) to distinguish from different samplers at the same site
- Latitude
- Longitude
- Measurement method information
- Owner / operator / data-submitter information
- Monitoring Network to which the monitor belongs
- Exemptions from regulatory requirements
- Operational dates
- City and CBSA where the monitor is located
Quality Assurance Information
- Various data fields related to the 19 different QA assessments possible
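To make the monitor identifier and the daily summary fields above more concrete, here is a minimal Python sketch. It is an illustration, not EPA's implementation: the class and function names, the hyphen-joined key, and the default of 24 expected hourly observations are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MonitorId:
    """The five fields that together identify an AQS monitor."""
    state_code: str      # FIPS State Code
    county_code: str     # FIPS County Code
    site_number: str     # unique within the county
    parameter_code: str  # what is measured
    poc: int             # Parameter Occurrence Code

    def as_key(self) -> str:
        # Hyphen-joined rendering is an illustrative choice, not an EPA format.
        return "-".join([self.state_code, self.county_code,
                         self.site_number, self.parameter_code, str(self.poc)])


def daily_summary(hourly: Sequence[Optional[float]], expected: int = 24) -> dict:
    """Summarise one day; hourly[i] is the value for hour i, or None if absent."""
    observed = [(hour, v) for hour, v in enumerate(hourly) if v is not None]
    count = len(observed)
    if count == 0:
        return {"count": 0, "percent": 0.0, "mean": None,
                "max": None, "hour_of_max": None}
    values = [v for _, v in observed]
    # On ties, the earliest hour holding the maximum is reported.
    hour_of_max, max_value = max(observed, key=lambda pair: pair[1])
    return {
        "count": count,                        # observation count
        "percent": 100.0 * count / expected,   # share of expected observations
        "mean": sum(values) / count,           # arithmetic mean
        "max": max_value,
        "hour_of_max": hour_of_max,
    }
```

Missing hours are kept as `None` rather than zero so that they lower the observation per cent without distorting the mean.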
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.epa_historical_air_quality.[TABLENAME]. Fork this kernel to get started.
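A query against one of these tables might look like the following Python sketch. It assumes the `google-cloud-bigquery` package is installed and that credentials are available in the environment; the helper names are hypothetical, and the table name used in the usage note is an assumed example that should be checked against the dataset's table list.

```python
PROJECT = "bigquery-public-data"
DATASET = "epa_historical_air_quality"


def table_id(table_name: str) -> str:
    """Return the fully qualified ID for a table in the EPA dataset."""
    return f"{PROJECT}.{DATASET}.{table_name}"


def build_query(table_name: str, limit: int = 10) -> str:
    """Build a small read-only query; LIMIT keeps the returned result set small."""
    return f"SELECT * FROM `{table_id(table_name)}` LIMIT {int(limit)}"


def run_query(table_name: str, limit: int = 10) -> list:
    """Run the query and return the rows as plain dictionaries."""
    # Imported lazily so the helpers above work without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses the environment's default credentials
    rows = client.query(build_query(table_name, limit)).result()
    return [dict(row) for row in rows]
```

For example, `run_query("o3_daily_summary", limit=5)` would return five rows if a table with that (assumed) name exists in the dataset.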
Data provided by the US Environmental Protection Agency Air Quality System Data Mart.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
This dataset provides measurements of raw water storage levels in reservoirs crucial for public water supply. The reservoirs included in this dataset are natural bodies of water that have been dammed to store untreated water.
Key Definitions
Aggregation
The process of summarizing or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes.
Capacity
The maximum volume of water a reservoir can hold above the natural level of the surrounding land, with thresholds for regulation at 10,000 cubic meters in England, Wales and Northern Ireland and a modified threshold of 25,000 cubic meters in Scotland pending full implementation of the Reservoirs (Scotland) Act 2011.
Current Level
The present volume of water held in a reservoir, measured above a set baseline, crucial for safety and regulatory compliance.
Current Percentage
The current water volume in a reservoir as a percentage of its total capacity, indicating how full the reservoir is at any given time.
Dataset
Structured and organized collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Granularity
Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours.
ID
Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.
Open Data Triage
The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
Reservoir
Large natural lake used for storing raw water intended for human consumption. Its volume is measurable, allowing for careful management and monitoring to meet demand for clean, safe water.
Reservoir Type
The classification of a reservoir based on the method of construction, the purpose it serves or the source of water it stores.
Schema
Structure for organizing and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.
Units
Standard measurements used to quantify and compare different physical quantities.
Data History
Data Origin
Reservoir level data is sourced from water companies who may also update this information on their website and government publications such as the Water situation reports provided by the UK government.
Data Triage Considerations
Identification of Critical Infrastructure
Special attention is given to safeguard data on essential reservoirs in line with the National Infrastructure Act, to mitigate security risks and ensure resilience of public water systems. Currently, it is agreed that only reservoirs with a location already available in the public domain are included in this dataset.
Commercial Risks and Anonymisation
The risk of personal information exposure is minimal to none since the data concerns reservoir levels, which are not linked to individuals or households.
Data Freshness
It is not currently possible to make the dataset live. Some companies have digital monitoring, and some measure reservoir levels using analogue methods. This dataset may not be used to determine reservoir level in place of visual checks where these are advised.
Data Triage Review Frequency
Annually unless otherwise requested
Data Specifications
Data specifications define what is included and excluded in the dataset to maintain clarity and focus. For this dataset:
· Each dataset covers measurements taken by the publisher.
· This dataset is published periodically in line with the publisher’s capabilities.
· Historical datasets may be provided for comparison but are not required.
· The location data provided may be a point from anywhere within the body of water or on its boundary.
Reservoirs included in the dataset must be:
· Open bodies of water used to store raw/untreated water
· Filled naturally
· Measurable
· Contain water that may go on to be used for public supply
Context
This dataset must not be used to determine the implementation of low supply or high supply measures such as hose pipe bans being put in place or removed. Please await guidance from your water supplier regarding any changes required to your usage of water.
Particularly high or low reservoir levels may be considered normal or as expected given the season or recent weather.
This dataset does not remove the requirement for visual checks on reservoir level that are in place for caving/pot holing safety.
Some water companies calculate the capacity of reservoirs differently than others. The capacity can mean the useable volume of the reservoir or the overall volume that can be held in the reservoir including water below the water table.
Data Publish Frequency
Annually
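The Current Percentage definition given for this dataset amounts to a simple ratio, sketched below in Python. The function name and the input checks are assumptions; the published schema may name or validate these fields differently.

```python
def current_percentage(current_level: float, capacity: float) -> float:
    """Return the current volume as a percentage of total capacity.

    Both arguments must be in the same unit (for example, cubic metres).
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if current_level < 0:
        raise ValueError("current_level must not be negative")
    return 100.0 * current_level / capacity
```

Because publishers may define capacity either as useable volume or as overall volume, percentages computed this way are not necessarily comparable between water companies.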
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cryptographic operations and their computational time (in seconds).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Twelve RNA-seq datasets generated from human brain Glioblastoma (GBM) cell line2. Accession number, RNA Integrity Numbers (RIN), the median Transcript Integrity Numbers (medTIN), total read pairs, read pairs with mapping quality > 30, and number of genes with at least 10 reads are listed. (XLS 7 kb)
The contents of this dataset are divided into two main sections:
· Files documenting the literature review process: search query development, screening results, and the form used during data extraction. These files are found under "data_supplementary".
· Data files used in the analysis, which uses the R programming language in the form of R Markdown files. These files are found under "R-project".
The aim of the related publication was to review the indicators used in peer-reviewed research to measure impacts on biosphere integrity from renewable energy generation and utilization, and categorize them into biosphere approaches defined by the authors.
Abstract
Background
Frogeye leaf spot is a disease of soybean, and there are limited sources of crop genetic resistance. Accurate quantification of resistance is necessary for the discovery of novel resistance sources, which can be accelerated by using a low-cost and easy-to-use image analysis system to phenotype the disease. The objective herein was to develop an automated image analysis phenotyping pipeline to measure and count frogeye leaf spot lesions on soybean leaves with high precision and resolution while ensuring data integrity.
Results
The image analysis program developed measures two traits: the percent of diseased leaf area and the number of lesions on a leaf. Percent of diseased leaf area is calculated by dividing the number of diseased pixels by the total number of leaf pixels, which are segmented through a series of color space transformations and pixel value thresholding. Lesion number is determined by counting the number of objects remaining in the image when the lesions are segmented. Automated measurement of the percent of diseased leaf area deviates from the manually measured value by less than 0.05% on average. Automatic lesion counting deviates by an average of 1.6 lesions from the manually counted value. The proposed method is highly correlated with a conventional method using a 1–5 ordinal scale based on a standard area diagram. Input image compression was optimal at a resolution of 1500 × 1000 pixels. At this resolution, the image analysis method proposed can process an image in less than 10 s and is highly concordant with uncompressed images.
Conclusion
Image analysis provides improved resolution over conventional methods of frogeye leaf spot disease phenotyping. This method can improve the precision and resolution of phenotyping frogeye leaf spot, which can be used in genetic mapping to identify QTLs for crop genetic resistance and in breeding efforts for resistance to the disease.
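The two traits described in this abstract can be sketched in Python once the segmentation masks exist. This is not the authors' pipeline: it assumes the leaf and lesion masks have already been produced (the color space transformations and thresholding are out of scope), represents them as nested lists of 0/1 values, and uses 4-connectivity to define an object, which is an assumption.

```python
from collections import deque


def percent_diseased(leaf_mask, lesion_mask):
    """Diseased pixels divided by total leaf pixels, as a percentage."""
    leaf_pixels = 0
    diseased_pixels = 0
    for leaf_row, lesion_row in zip(leaf_mask, lesion_mask):
        for is_leaf, is_lesion in zip(leaf_row, lesion_row):
            if is_leaf:
                leaf_pixels += 1
                if is_lesion:  # only lesion pixels inside the leaf count
                    diseased_pixels += 1
    if leaf_pixels == 0:
        raise ValueError("leaf mask contains no leaf pixels")
    return 100.0 * diseased_pixels / leaf_pixels


def count_lesions(lesion_mask):
    """Count connected groups of lesion pixels (4-connectivity assumed)."""
    height = len(lesion_mask)
    width = len(lesion_mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    lesions = 0
    for r in range(height):
        for c in range(width):
            if not lesion_mask[r][c] or seen[r][c]:
                continue
            lesions += 1  # found an unvisited object; flood-fill it
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and lesion_mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return lesions
```

In practice an image library would supply both the thresholding and the connected-component labelling; the pure-Python version only spells out the counting logic.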
edgeR detected top 1000 differentially expressed genes (FDR cutoff = 0.01) in GBM samples without TIN correction. FC = Fold Change; CPM = Count Per Million; FDR = False Discovery Rate. (XLS 143 kb)
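For readers unfamiliar with the abbreviations in this caption, the Python sketch below shows how counts per million and an FDR cutoff are typically computed. It is not the edgeR implementation: the Benjamini-Hochberg adjustment is used here as a common FDR procedure, and whether the original analysis used exactly this adjustment is an assumption. All function names are hypothetical.

```python
def cpm(counts, library_size):
    """Counts per million: scale raw read counts by the library size."""
    return [c * 1_000_000 / library_size for c in counts]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(p_values, cutoff=0.01):
    """Indices of tests whose adjusted p-value is at or below the cutoff."""
    return [i for i, q in enumerate(benjamini_hochberg(p_values)) if q <= cutoff]
```

With the cutoff of 0.01 quoted in the caption, `significant` keeps only the tests whose adjusted p-value is at most 0.01.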
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
This dataset offers valuable insights into yearly domestic water consumption across various Lower Super Output Areas (LSOAs) or Data Zones, accompanied by the count of water meters within each area. It is instrumental for analysing residential water use patterns, facilitating water conservation efforts, and guiding infrastructure development and policy making at a localised level.
Key Definitions
Aggregation
The process of summarising or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes.
AMR Meter
Automatic meter reading (AMR) is the technology of automatically collecting consumption, diagnostic, and status data from a water meter remotely and periodically.
Dataset
Structured and organised collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Zone
Data zones are the key geography for the dissemination of small area statistics in Scotland.
Dumb Meter
A dumb meter or analogue meter is read manually. It does not have any external connectivity.
Granularity
Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours.
ID
Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.
LSOA
Lower Layer Super Output Areas (LSOA) are a geographic hierarchy designed to improve the reporting of small area statistics in England and Wales.
Open Data Triage
The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
Schema
Structure for organising and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.
Smart Meter
A smart meter is an electronic device that records information and communicates it to the consumer and the supplier. It differs from automatic meter reading (AMR) in that it enables two-way communication between the meter and the supplier.
Units
Standard measurements used to quantify and compare different physical quantities.
Water Meter
Water metering is the practice of measuring water use. Water meters measure the volume of water used by residential and commercial building units that are supplied with water by a public water supply system.
Data History
Data Origin
Domestic consumption data is recorded using water meters. The consumption recorded is then sent back to water companies. This dataset is extracted from the water companies.
Data Triage Considerations
This section discusses the careful handling of data to maintain anonymity and addresses the challenges associated with data updates, such as identifying household changes or meter replacements.
Identification of Critical Infrastructure
This aspect is not applicable for the dataset, as the focus is on domestic water consumption and does not contain any information that reveals critical infrastructure details.
Commercial Risks and Anonymisation
Individual Identification Risks
There is a potential risk of identifying individuals or households if the consumption data is updated irregularly (e.g., every 6 months) and an out-of-cycle update occurs (e.g., after 2 months), which could signal a change in occupancy or ownership. Such patterns need careful handling to avoid accidental exposure of sensitive information.
Meter and Property Association
Challenges arise in maintaining historical data integrity when meters are replaced but the property remains the same. Ensuring continuity in the data without revealing personal information is crucial.
Interpretation of Null Consumption
Instances of null consumption could be misunderstood as a lack of water use, whereas they might simply indicate missing data. Distinguishing between these scenarios is vital to prevent misleading conclusions.
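The distinction between null and zero consumption can be kept explicit in code. The Python sketch below assumes that a missing reading is represented as `None` and a genuine zero as `0.0`; the function name and the output fields are hypothetical and not part of the published schema.

```python
def summarise_consumption(readings):
    """Total the recorded readings while reporting how many are missing.

    `readings` is a list of consumption values where None means
    "no data", which is different from a recorded value of 0.0.
    """
    recorded = [r for r in readings if r is not None]
    return {
        # None signals that nothing was recorded at all, not zero use.
        "total": sum(recorded) if recorded else None,
        "recorded": len(recorded),
        "missing": len(readings) - len(recorded),
    }
```

Reporting the missing count next to the total lets a reader see whether a low figure reflects low use or incomplete data.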
Meter Re-reads
The dataset must account for instances where meters are read multiple times for accuracy.
Joint Supplies & Multiple Meters per Household
Special consideration is required for households with multiple meters as well as multiple households that share a meter as this could complicate data aggregation.
Schema Consistency with the Energy Industry
In formulating the schema for the domestic water consumption dataset, careful consideration was given to the potential risks to individual privacy. This evaluation included examining the frequency of data updates, the handling of property and meter associations, interpretations of null consumption, meter re-reads, joint supplies, and the presence of multiple meters within a single household as described above.
After a thorough assessment of these factors and their implications for individual privacy, it was decided to align the dataset's schema with the standards established within the energy industry. This decision was influenced by the energy sector's experience and established practices in managing similar risks associated with smart meters. This ensures a high level of data integrity and privacy protection.
Schema
The dataset schema is aligned with those used in the energy industry, which has encountered similar challenges with smart meters. However, it is important to note that the energy industry has a much higher density of meter distribution, especially smart meters.
Aggregation to Mitigate Risks
The dataset employs an elevated level of data aggregation to minimise the risk of individual identification. This approach is crucial in maintaining the utility of the dataset while ensuring individual privacy. The aggregation level is carefully chosen to remove identifiable risks without excluding valuable data, thus balancing data utility with privacy concerns.
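Aggregation of this kind might be sketched in Python as follows. The input layout, the function name, and in particular the `min_meters` suppression threshold are assumptions made for illustration; the source does not state what aggregation rule or threshold the publishers apply.

```python
from collections import defaultdict


def aggregate_by_area(meter_rows, min_meters=5):
    """Sum consumption per area and count distinct meters.

    `meter_rows` is an iterable of (area_code, meter_id, consumption)
    tuples. Areas with fewer than `min_meters` distinct meters are
    dropped; this threshold is a hypothetical privacy safeguard.
    """
    totals = defaultdict(float)
    meters = defaultdict(set)
    for area_code, meter_id, consumption in meter_rows:
        meters[area_code].add(meter_id)  # a re-read meter is counted once
        totals[area_code] += consumption
    return {
        area: {"meter_count": len(ids), "total_consumption": totals[area]}
        for area, ids in meters.items()
        if len(ids) >= min_meters
    }
```

Publishing only the area total and the meter count, and dropping sparsely metered areas, is one way to reduce the chance that a single household's usage can be inferred.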
Data Freshness
Users should be aware that this dataset reflects historical consumption patterns and does not represent real-time data.
Publish Frequency
Annually
Data Triage Review Frequency
An annual review is conducted to ensure the dataset's relevance and accuracy, with adjustments made based on specific requests or evolving data trends.
Data Specifications
For the domestic water consumption dataset, the data specifications are designed to ensure comprehensiveness and relevance, while maintaining clarity and focus. The specifications for this dataset include:
· Each dataset encompasses recordings of domestic water consumption as measured and reported by the data publisher. It excludes commercial consumption.
· Where it is necessary to estimate consumption, this is calculated based on actual meter readings.
· Meters of all types (smart, dumb, AMR) are included in this dataset.
· The dataset is updated and published annually.
· Historical data may be made available to facilitate trend analysis and comparative studies, although it is not mandatory for each dataset release.
Context
Users are cautioned against using the dataset for immediate operational decisions regarding water supply management. The data should be interpreted considering potential seasonal and weather-related influences on water consumption patterns.
The geographical data provided does not pinpoint locations of water meters within an LSOA.
The dataset aims to cover a broad spectrum of households, from single-meter homes to those with multiple meters, to accurately reflect the diversity of water use within an LSOA.
Supplementary Information
Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.
Ofwat guidance on water meters
https://www.ofwat.gov.uk/wp-content/uploads/2015/11/prs_lft_101117meters.pdf
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
List of 10 low RIN/medTIN mCRPC and 10 higher RIN/medTIN mCRPC samples used for differential expression analysis. “N” = Metastatic bone site, “V1” = Visit 1. Whole datasets are available with accession # GSM1722952. (XLS 97 kb)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
edgeR detected top 117 differentially expressed genes (FDR cutoff = 0.01) in GBM samples using 3′ count method (3TC). FC = Fold Change; CPM = Count Per Million; FDR = False Discovery Rate. (XLS 22 kb)
DRAKO is a leader in providing Device Graph Data, focusing on understanding the relationships between consumer devices and identities. Our data allows businesses to create holistic profiles of users, track engagement across platforms, and measure the effectiveness of advertising efforts.
Device Graph Data is essential for accurate audience targeting, cross-device attribution, and understanding consumer journeys. By integrating data from multiple sources, we provide a unified view of user interactions, helping businesses make informed decisions.
Key Features:
- Comprehensive device mapping to understand user behaviour across multiple platforms
- Detailed Identity Graph Data for cross-device identification and engagement tracking
- Integration with Connected TV Data for enhanced insights into video consumption habits
- Mobile Attribution Data to measure the effectiveness of mobile campaigns
- Customizable analytics to segment audiences based on device usage and demographics
- Some ID types offered: AAID, idfa, Unified ID 2.0, AFAI, MSAI, RIDA, AAID_CTV, IDFA_CTV
Use Cases:
- Cross-device marketing strategies
- Attribution modelling and campaign performance measurement
- Audience segmentation and targeting
- Enhanced insights for Connected TV advertising
- Comprehensive consumer journey mapping
Data Compliance: All of our Device Graph Data is sourced responsibly and adheres to industry standards for data privacy and protection. We ensure that user identities are handled with care, providing insights without compromising individual privacy.
Data Quality: DRAKO employs robust validation techniques to ensure the accuracy and reliability of our Device Graph Data. Our quality assurance processes include continuous monitoring and updates to maintain data integrity and relevance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Math equations for public verification.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview This dataset offers valuable insights into yearly domestic water consumption across various Lower Super Output Areas (LSOAs) or Data Zones, accompanied by the count of water meters within each area. It is instrumental for analysing residential water use patterns, facilitating water conservation efforts, and guiding infrastructure development and policy making at a localised level. Key Definitions Aggregation The process of summarising or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes. AMR Meter Automatic meter reading (AMR) is the technology of automatically collecting consumption, diagnostic, and status data from a water meter remotely and periodically. Dataset Structured and organised collection of related elements, often stored digitally, used for analysis and interpretation in various fields. Data Zone Data zones are the key geography for the dissemination of small area statistics in Scotland Dumb Meter A dumb meter or analogue meter is read manually. It does not have any external connectivity. Granularity Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours ID Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance. LSOA Lower Layer Super Output Areas (LSOA) are a geographic hierarchy designed to improve the reporting of small area statistics in England and Wales. Open Data Triage The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data. 
Schema Structure for organising and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute. Smart Meter A smart meter is an electronic device that records information and communicates it to the consumer and the supplier. It differs from automatic meter reading (AMR) in that it enables two-way communication between the meter and the supplier. Units Standard measurements used to quantify and compare different physical quantities. Water Meter Water metering is the practice of measuring water use. Water meters measure the volume of water used by residential and commercial building units that are supplied with water by a public water supply system. Data History Data Origin Domestic consumption data is recorded using water meters. The consumption recorded is then sent back to water companies. This dataset is extracted from the water companies. Data Triage Considerations This section discusses the careful handling of data to maintain anonymity and addresses the challenges associated with data updates, such as identifying household changes or meter replacements. Identification of Critical Infrastructure This aspect is not applicable for the dataset, as the focus is on domestic water consumption and does not contain any information that reveals critical infrastructure details. Commercial Risks and Anonymisation Individual Identification Risks There is a potential risk of identifying individuals or households if the consumption data is updated irregularly (e.g., every 6 months) and an out-of-cycle update occurs (e.g., after 2 months), which could signal a change in occupancy or ownership. Such patterns need careful handling to avoid accidental exposure of sensitive information. 
Meter and Property Association Challenges arise in maintaining historical data integrity when meters are replaced but the property remains the same. Ensuring continuity in the data without revealing personal information is crucial. Interpretation of Null Consumption Instances of null consumption could be misunderstood as a lack of water use, whereas they might simply indicate missing data. Distinguishing between these scenarios is vital to prevent misleading conclusions. Meter Re-reads The dataset must account for instances where meters are read multiple times for accuracy. Joint Supplies & Multiple Meters per Household Special consideration is required for households with multiple meters as well as multiple households that share a meter as this could complicate data aggregation. Schema Consistency with the Energy Industry: In formulating the schema for the domestic water consumption dataset, careful consideration was given to the potential risks to individual privacy. This evaluation included examining the frequency of data updates, the handling of property and meter associations, interpretations of null consumption, meter re-reads, joint suppliers, and the presence of multiple meters within a single household as described above. After a thorough assessment of these factors and their implications for individual privacy, it was decided to align the dataset's schema with the standards established within the energy industry. This decision was influenced by the energy sector's experience and established practices in managing similar risks associated with smart meters. This ensures a high level of data integrity and privacy protection. Schema The dataset schema is aligned with those used in the energy industry, which has encountered similar challenges with smart meters. However, it is important to note that the energy industry has a much higher density of meter distribution, especially smart meters. 
Aggregation to Mitigate Risks The dataset employs an elevated level of data aggregation to minimise the risk of individual identification. This approach is crucial in maintaining the utility of the dataset while ensuring individual privacy. The aggregation level is carefully chosen to remove identifiable risks without excluding valuable data, thus balancing data utility with privacy concerns. Data Freshness Users should be aware that this dataset reflects historical consumption patterns and does not represent real-time data. Publish Frequency Annually Data Triage Review Frequency An annual review is conducted to ensure the dataset's relevance and accuracy, with adjustments made based on specific requests or evolving data trends. Data Specifications For the domestic water consumption dataset, the data specifications are designed to ensure comprehensiveness and relevance, while maintaining clarity and focus. The specifications for this dataset include: Each dataset encompasses recordings of domestic water consumption as measured and reported by the data publisher. It excludes commercial consumption. Where it is necessary to estimate consumption, this is calculated based on actual meter readings. Meters of all types (smart, dumb, AMR) are included in this dataset. The dataset is updated and published annually. Historical data may be made available to facilitate trend analysis and comparative studies, although it is not mandatory for each dataset release. Context Users are cautioned against using the dataset for immediate operational decisions regarding water supply management. The data should be interpreted considering potential seasonal and weather-related influences on water consumption patterns. The geographical data provided does not pinpoint locations of water meters within an LSOA. The dataset aims to cover a broad spectrum of households, from single-meter homes to those with multiple meters, to accurately reflect the diversity of water use within an LSOA.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
This dataset offers valuable insights into yearly domestic water consumption across various Lower Super Output Areas (LSOAs) or Data Zones, accompanied by the count of water meters within each area. It is instrumental for analysing residential water use patterns, facilitating water conservation efforts, and guiding infrastructure development and policy making at a localised level.

Key Definitions
Aggregation: The process of summarising or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes.
AMR Meter: Automatic meter reading (AMR) is the technology of automatically collecting consumption, diagnostic, and status data from a water meter remotely and periodically.
Dataset: Structured and organised collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Zone: Data zones are the key geography for the dissemination of small area statistics in Scotland.
Dumb Meter: A dumb meter or analogue meter is read manually. It does not have any external connectivity.
Granularity: Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours.
ID: Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.
LSOA: Lower Layer Super Output Areas (LSOA) are a geographic hierarchy designed to improve the reporting of small area statistics in England and Wales.
Open Data Triage: The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
Schema: Structure for organising and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.
Smart Meter: A smart meter is an electronic device that records information and communicates it to the consumer and the supplier. It differs from automatic meter reading (AMR) in that it enables two-way communication between the meter and the supplier.
Units: Standard measurements used to quantify and compare different physical quantities.
Water Meter: Water metering is the practice of measuring water use. Water meters measure the volume of water used by residential and commercial building units that are supplied with water by a public water supply system.

Data History

Data Origin
Domestic consumption data is recorded using water meters. The consumption recorded is then sent back to water companies. This dataset is extracted from the water companies.

Data Triage Considerations
This section discusses the careful handling of data to maintain anonymity and addresses the challenges associated with data updates, such as identifying household changes or meter replacements.

Identification of Critical Infrastructure
This aspect is not applicable for the dataset, as the focus is on domestic water consumption and does not contain any information that reveals critical infrastructure details.

Commercial Risks and Anonymisation

Individual Identification Risks
There is a potential risk of identifying individuals or households if the consumption data is updated irregularly (e.g., every 6 months) and an out-of-cycle update occurs (e.g., after 2 months), which could signal a change in occupancy or ownership. Such patterns need careful handling to avoid accidental exposure of sensitive information.

Meter and Property Association
Challenges arise in maintaining historical data integrity when meters are replaced but the property remains the same. Ensuring continuity in the data without revealing personal information is crucial.

Interpretation of Null Consumption
Instances of null consumption could be misunderstood as a lack of water use, whereas they might simply indicate missing data. Distinguishing between these scenarios is vital to prevent misleading conclusions.

Meter Re-reads
The dataset must account for instances where meters are read multiple times for accuracy.

Joint Supplies & Multiple Meters per Household
Special consideration is required for households with multiple meters, as well as for multiple households that share a meter, as this could complicate data aggregation.

Schema Consistency with the Energy Industry
In formulating the schema for the domestic water consumption dataset, careful consideration was given to the potential risks to individual privacy. This evaluation included examining the frequency of data updates, the handling of property and meter associations, interpretations of null consumption, meter re-reads, joint supplies, and the presence of multiple meters within a single household, as described above. After a thorough assessment of these factors and their implications for individual privacy, it was decided to align the dataset's schema with the standards established within the energy industry. This decision was influenced by the energy sector's experience and established practices in managing similar risks associated with smart meters. This ensures a high level of data integrity and privacy protection.

Schema
The dataset schema is aligned with those used in the energy industry, which has encountered similar challenges with smart meters. However, it is important to note that the energy industry has a much higher density of meter distribution, especially smart meters.

Aggregation to Mitigate Risks
The dataset employs an elevated level of data aggregation to minimise the risk of individual identification. This approach is crucial in maintaining the utility of the dataset while ensuring individual privacy. The aggregation level is carefully chosen to remove identifiable risks without excluding valuable data, thus balancing data utility with privacy concerns.

Data Triage Review Frequency
An annual review is conducted to ensure the dataset's relevance and accuracy, with adjustments made based on specific requests or evolving data trends.

Data Freshness
Users should be aware that this dataset reflects historical consumption patterns and does not represent real-time data.

Publish Frequency
Annually

Data Specifications
For the domestic water consumption dataset, the data specifications are designed to ensure comprehensiveness and relevance, while maintaining clarity and focus. The specifications for this dataset include:
• Each dataset encompasses recordings of domestic water consumption as measured and reported by the data publisher. It excludes commercial consumption.
• Where it is necessary to estimate consumption, this is calculated based on actual meter readings.
• Meters of all types (smart, dumb, AMR) are included in this dataset.
• The dataset is updated and published annually.
• Historical data may be made available to facilitate trend analysis and comparative studies, although it is not mandatory for each dataset release.
• The dataset includes LSOAs with 2 or more meters. Any LSOAs with fewer than 2 meters have been excluded.

Context
Users are cautioned against using the dataset for immediate operational decisions regarding water supply management. The data should be interpreted considering potential seasonal and weather-related influences on water consumption patterns.

The geographical data provided does not pinpoint locations of water meters within an LSOA. The dataset aims to cover a broad spectrum of households, from single-meter homes to those with multiple meters, or a single meter for multiple domestic units, to accurately reflect the diversity of water use within an LSOA.

This dataset has been aggregated from actual read data and does not use estimated values to align reads to calendar years. Our approach subtracts the latest meter read from the preceding year from the latest meter read in the reported year; the difference is divided by the number of days between the two reads to obtain an average daily consumption. Data are removed from meters considered as void for seven or more months during the year. Void properties are those within the company's supply area which are connected for a water service only, a wastewater service only, or both services, but do not receive a charge because there are no occupants.

Supplementary Information
Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.
1. Ofwat guidance on water meters: https://www.ofwat.gov.uk/wp-content/uploads/2015/11/prs_lft_101117meters.pdf
2. Wessex Water performance commitment data on void sites: https://marketplace.wessexwater.co.uk/dataset/void-sites-performance-commitment-data
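The average daily consumption calculation and the two exclusion rules described in this entry (void meters and LSOAs with fewer than 2 meters) can be sketched as follows. This is a minimal illustration, not the publisher's actual processing code: the function names, the example reads, and the unit (cubic metres) are all assumptions made for the example.

```python
from datetime import date


def average_daily_consumption(prev_read, prev_date, curr_read, curr_date):
    """Average daily consumption between two cumulative meter reads.

    prev_read / prev_date: latest read (and its date) in the preceding year.
    curr_read / curr_date: latest read (and its date) in the reported year.
    """
    days = (curr_date - prev_date).days
    if days <= 0:
        raise ValueError("the reported-year read must be later than the preceding-year read")
    return (curr_read - prev_read) / days


def is_excluded_as_void(void_months):
    """Meters considered void for seven or more months of the year are removed."""
    return void_months >= 7


def lsoa_is_published(meter_count):
    """Only LSOAs with 2 or more meters are included in the dataset."""
    return meter_count >= 2


# Hypothetical reads; the unit (cubic metres) is an assumption for illustration.
adc = average_daily_consumption(
    prev_read=1200.0, prev_date=date(2022, 12, 15),
    curr_read=1310.0, curr_date=date(2023, 12, 20),
)
print(round(adc, 4))
```

Because the two reads are rarely exactly one year apart, dividing by the actual number of days between them gives a rate that is comparable across meters without estimating reads at calendar-year boundaries.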
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
This dataset offers valuable insights into yearly domestic water consumption across various Lower Super Output Areas (LSOAs) or Data Zones, accompanied by the count of water meters within each area. It is instrumental for analysing residential water use patterns, facilitating water conservation efforts, and guiding infrastructure development and policy making at a localised level.

Key Definitions
Aggregation: The process of summarising or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes.
AMR Meter: Automatic meter reading (AMR) is the technology of automatically collecting consumption, diagnostic, and status data from a water meter remotely and periodically.
Dataset: Structured and organised collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Zone: Data zones are the key geography for the dissemination of small area statistics in Scotland.
Dumb Meter: A dumb meter or analogue meter is read manually. It does not have any external connectivity.
Granularity: Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours.
ID: Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.
LSOA: Lower Layer Super Output Areas (LSOA) are a geographic hierarchy designed to improve the reporting of small area statistics in England and Wales.
Open Data Triage: The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
Schema: Structure for organising and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.
Smart Meter: A smart meter is an electronic device that records information and communicates it to the consumer and the supplier. It differs from automatic meter reading (AMR) in that it enables two-way communication between the meter and the supplier.
Units: Standard measurements used to quantify and compare different physical quantities.
Water Meter: Water metering is the practice of measuring water use. Water meters measure the volume of water used by residential and commercial building units that are supplied with water by a public water supply system.

Data History

Data Origin
Domestic consumption data is recorded using water meters. The consumption recorded is then sent back to water companies. This dataset is extracted from the water companies.

Data Triage Considerations
This section discusses the careful handling of data to maintain anonymity and addresses the challenges associated with data updates, such as identifying household changes or meter replacements.

Identification of Critical Infrastructure
This aspect is not applicable for the dataset, as the focus is on domestic water consumption and does not contain any information that reveals critical infrastructure details.

Commercial Risks and Anonymisation

Individual Identification Risks
There is a potential risk of identifying individuals or households if the consumption data is updated irregularly (e.g., every 6 months) and an out-of-cycle update occurs (e.g., after 2 months), which could signal a change in occupancy or ownership. Such patterns need careful handling to avoid accidental exposure of sensitive information.

Meter and Property Association
Challenges arise in maintaining historical data integrity when meters are replaced but the property remains the same. Ensuring continuity in the data without revealing personal information is crucial.

Interpretation of Null Consumption
Instances of null consumption could be misunderstood as a lack of water use, whereas they might simply indicate missing data. Distinguishing between these scenarios is vital to prevent misleading conclusions.

Meter Re-reads
The dataset must account for instances where meters are read multiple times for accuracy.

Joint Supplies & Multiple Meters per Household
Special consideration is required for households with multiple meters, as well as for multiple households that share a meter, as this could complicate data aggregation.

Schema Consistency with the Energy Industry
In formulating the schema for the domestic water consumption dataset, careful consideration was given to the potential risks to individual privacy. This evaluation included examining the frequency of data updates, the handling of property and meter associations, interpretations of null consumption, meter re-reads, joint supplies, and the presence of multiple meters within a single household, as described above. After a thorough assessment of these factors and their implications for individual privacy, it was decided to align the dataset's schema with the standards established within the energy industry. This decision was influenced by the energy sector's experience and established practices in managing similar risks associated with smart meters. This ensures a high level of data integrity and privacy protection.

Schema
The dataset schema is aligned with those used in the energy industry, which has encountered similar challenges with smart meters. However, it is important to note that the energy industry has a much higher density of meter distribution, especially smart meters.

Aggregation to Mitigate Risks
The dataset employs an elevated level of data aggregation to minimise the risk of individual identification. This approach is crucial in maintaining the utility of the dataset while ensuring individual privacy. The aggregation level is carefully chosen to remove identifiable risks without excluding valuable data, thus balancing data utility with privacy concerns.

Data Freshness
Users should be aware that this dataset reflects historical consumption patterns and does not represent real-time data.

Publish Frequency
Annually
https://dataintelo.com/privacy-and-policy
The global penetration testing (pen-testing) market is estimated at USD 3.5 billion in 2023 and is projected to grow to USD 9.8 billion by 2032, reflecting a compound annual growth rate (CAGR) of 11.8% over the forecast period. The demand for pen-testing arises from the increasing necessity for cybersecurity amid a rise in sophisticated cyber threats. Organizations across diverse sectors are becoming more vigilant about their cyber defense mechanisms, which is driving the adoption of penetration testing services and solutions.
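For readers who want to check how a CAGR relates to the start and end figures, the standard formula is (end / start)^(1 / years) - 1. The sketch below applies it to the figures quoted above, assuming nine compounding years (2023 to 2032); that period length is an assumption, since the report does not state its base year. Under that assumption the implied rate comes out near 12%, slightly above the quoted 11.8%, which may reflect a different base year or rounding in the published figures.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text; the nine-year period is an assumption.
implied = cagr(start_value=3.5, end_value=9.8, years=2032 - 2023)
print(f"{implied:.1%}")
```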
The growth of the pen-testing market is significantly influenced by the escalating frequency and complexity of cyberattacks. With businesses increasingly relying on digital platforms, there is an urgent need for robust cybersecurity measures. Pen-testing allows organizations to proactively identify vulnerabilities within their systems, thereby minimizing the risk of data breaches and ensuring data integrity. This proactive security measure has gained traction, especially among sectors handling sensitive data, such as BFSI and healthcare, which are prime targets for cybercriminals. As the landscape of cyber threats evolves, the demand for advanced pen-testing solutions is anticipated to rise, thereby contributing to the market's growth.
Technological advancements and the integration of artificial intelligence (AI) and machine learning (ML) in pen-testing solutions have catalyzed market growth. These technologies enhance the efficiency and accuracy of penetration testing by automating routine tasks and providing deeper insights into potential vulnerabilities. The use of AI and ML in pen-testing tools enables the rapid analysis of large datasets and improves the identification of sophisticated threats. Moreover, these advanced solutions offer predictive analytics, allowing businesses to anticipate potential attacks before they occur. Consequently, the integration of AI and ML in pen-testing is expected to bolster the market's expansion over the coming years.
Another pivotal factor driving market growth is the increasing regulatory requirements for data protection and cybersecurity across various industries. Governments worldwide are implementing stringent regulations to protect consumer data, compelling organizations to adopt comprehensive security measures, including penetration testing. Compliance with standards such as the General Data Protection Regulation (GDPR) in Europe and the Cybersecurity Maturity Model Certification (CMMC) in the United States necessitates regular security assessments, thereby fueling the demand for pen-testing. As regulatory landscapes continue to evolve, businesses will seek pen-testing services to ensure compliance, further propelling market growth.
Regionally, North America is expected to hold a significant share of the pen-testing market, driven by the presence of numerous cybersecurity firms and a high adoption rate of advanced technologies. This region's market growth is also supported by the increasing number of cyber threats and stringent regulatory requirements. Meanwhile, Asia Pacific is projected to witness the highest growth rate during the forecast period. The rapid digitization and increasing reliance on online platforms in countries like China and India have heightened the need for effective cybersecurity solutions, including pen-testing. As such, these regions are likely to experience robust market growth, contributing to the overall expansion of the global pen-testing market.
The penetration testing market by offering is segmented into solutions and services, each playing a crucial role in the overall market landscape. Solutions in the pen-testing market primarily encompass software tools that aid in the identification and analysis of vulnerabilities within an organization's IT infrastructure. These solutions are increasingly being enhanced with AI and ML capabilities, offering sophisticated threat detection and predictive analysis. In addition, the growing trend of digital transformation across industries necessitates the adoption of comprehensive pen-testing solutions to safeguard digital assets. The solutions segment is expected to experience substantial growth, driven by technological advancements and increased awareness of cybersecurity threats.
Services offered in the pen-testing market include consulting, training, and managed services, among others. These services are essential for organizations lacking the in-house ex
Use the fraud triangle theory to measure intention to cheat in exams.