CARETS is a systematic test suite to measure consistency and robustness of modern VQA models through a series of six fine-grained capability tests.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Key Definitions
Dataset
A structured and organized collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Triage
The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
District Metered Area (DMA)
The role of a district metered area (DMA) is to divide the water distribution network into manageable areas or sectors into which the flow can be measured. These areas provide the water providers with guidance as to which DMAs (District Metered Areas) require leak detection work.
Leakage
The accidental admission or escape of a fluid or gas through a hole or crack.
Night Flow
This technique considers that in a DMA, leakages can be estimated when the flow into the DMA is at its minimum. Typically, this is measured at night between 2am and 4am when customer demand is low so that network leakage can be detected.
Centroid
The centre of a geometric object.
Data History
Data Origin
Companies have configured their networks to be able to continuously monitor night flows using district meters. Flow data is recorded on meters and normally transmitted daily to a data centre. Data is analysed to confirm its validity and used to derive continuous night flow in each monitored area.
Data Triage Considerations
Data Quality
Not all DMAs provide quality data for the purposes of trend analysis. It was decided that water companies should choose 10% of their DMAs to be represented in this data set to begin with. The advice to publishers is to choose those with reliable and consistent telemetry, indicative of genuine low demand during measurement times and not revealing of sensitive night usage patterns.
Data Consistency
There is a concern that companies measure flow allowance for legitimate night use and/or potential night use differently. To avoid any inconsistency, it was decided that we would share the net flow.
Critical National Infrastructure
The release of boundary data for district metered areas has been deemed to be revealing of critical national infrastructure. Because of this, it has been decided that the data set shall only contain point data from a centroid within the DMA.
Data Triage Review Frequency
Every 12 months, unless otherwise requested.
Data Limitations
Some of the flow recorded may be legitimate night-time usage of the network.
Some measuring systems automatically infill estimated measurements where none have been received via telemetry. These estimates are based on past flow.
The reason for a fluctuation in night flow may not be determined by this dataset, but potential causes can include seasonal variation in night-time water usage and mains bursts.
Data Publish Frequency
Monthly
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This repository contains the code and datasets of the DSPD proposed in the study for measuring trajectory similarity, which has been accepted by IJGIS. If you use the code in this study, please cite the study below. Ju Peng, Min Deng, Jianbo Tang et al. 2024, IJGIS. A movement-aware measure for trajectory similarity and its application for ride-sharing path extraction in a road network.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Key Definitions
Dataset
A structured and organized collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Triage
The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
District Metered Area (DMA)
The role of a district metered area (DMA) is to divide the water distribution network into manageable areas or sectors into which the flow can be measured. These areas provide the water providers with guidance as to which DMAs (District Metered Areas) require leak detection work.
Leakage
The accidental admission or escape of a fluid or gas through a hole or crack.
Night Flow
This technique considers that in a DMA, leakages can be estimated when the flow into the DMA is at its minimum. Typically, this is measured at night between 2am and 4am when customer demand is low so that network leakage can be detected.
Centroid
The centre of a geometric object.
Data History
Data Origin
Companies have configured their networks to be able to continuously monitor night flows using district meters. Flow data is recorded on meters and normally transmitted daily to a data centre. Data is analysed to confirm its validity and used to derive continuous night flow in each monitored area.
Data Triage Considerations
Data Quality
Not all DMAs provide quality data for the purposes of trend analysis. It was decided that water companies should choose 10% of their DMAs to be represented in this data set to begin with. The advice to publishers is to choose those with reliable and consistent telemetry, indicative of genuine low demand during measurement times and not revealing of sensitive night usage patterns.
Data Consistency
There is a concern that companies measure flow allowance for legitimate night use and/or potential night use differently. To avoid any inconsistency, it was decided that we would share the net flow.
Critical National Infrastructure
The release of boundary data for district metered areas has been deemed to be revealing of critical national infrastructure. Because of this, it has been decided that the data set shall only contain point data from a centroid within the DMA.
Data Triage Review Frequency
Every 12 months, unless otherwise requested.
Data Limitations
Some of the flow recorded may be legitimate night-time usage of the network.
Some measuring systems automatically infill estimated measurements where none have been received via telemetry. These estimates are based on past flow.
The reason for a fluctuation in night flow may not be determined by this dataset, but potential causes can include seasonal variation in night-time water usage and mains bursts.
Data Publish Frequency
Monthly
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Data History
Data Origin Companies have configured their networks to be able to continuously monitor night flows using district meters. Flow data is recorded on meters and normally transmitted daily to a data centre. Data is analysed to confirm its validity and used to derive continuous night flow in each monitored area.
Data Triage Considerations
Data Quality Not all DMAs provide quality data for the purposes of trend analysis. It was decided that water companies should choose 10% of their DMAs to be represented in this data set to begin with. The advice to publishers is to choose those with reliable and consistent telemetry, indicative of genuine low demand during measurement times and not revealing of sensitive night usage patterns.
Data Consistency There is a concern that companies measure flow allowance for legitimate night use and/or potential night use differently. To avoid any inconsistency, it was decided that we would share the net flow.
Critical National Infrastructure The release of boundary data for district metered areas has been deemed to be revealing of Critical National Infrastructure. Because of this, it has been decided that the data set shall only contain point data from a centroid within the DMA.
Data Triage Review Frequency Every 12 months, unless otherwise requested.
Data Limitations Some of the flow recorded may be legitimate night-time usage of the network.
Some measuring systems automatically infill estimated measurements where none have been received via telemetry. These estimates are based on past flow.
The reason for a fluctuation in night flow may not be determined by this dataset but potential causes can include seasonal variation in night-time water usage and mains bursts.
Data Publish Frequency Monthly.
Supplementary information Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.
Data Schema
DATA_SOURCE: Company that owns the DMA
DATE/TIME_STAMP: Date and time of measured net flow
DMA_ID: Identity of the district metered area
CENTROID_X: DMA centroid X coordinate
CENTROID_Y: DMA centroid Y coordinate
ACTUAL_MIN_NIGHT_FLOW: The lowest recorded average net flow between 12 and 6am
MIN_NIGHT_FLOW: The average flow within the 2-4am time window
UNITS: Measurement of flow
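A minimal sketch of how this schema could support leakage trend analysis is shown below. The file name is an assumption, the column names follow the schema above, and a rising monthly MIN_NIGHT_FLOW for a DMA is only a rough indicator, not a confirmed leak.

```python
# Minimal sketch, assuming a CSV export that uses the schema above.
import pandas as pd

df = pd.read_csv("net_night_flow.csv", parse_dates=["DATE/TIME_STAMP"])  # file name assumed

# Monthly average of the reported minimum night flow per DMA:
# an upward trend can flag DMAs that may warrant leak detection work.
monthly = (
    df.assign(month=df["DATE/TIME_STAMP"].dt.to_period("M"))
    .groupby(["DMA_ID", "month"])["MIN_NIGHT_FLOW"]
    .mean()
    .reset_index()
)
print(monthly.head())
```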
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Abstract The objective of this study was to analyze the scientific literature in public oral health regarding calculation, presentation, and discussion of the effect size in observational studies. The scientific literature (2015 to 2019) was analyzed regarding: a) general information (journal and guidelines to authors, number of variables and outcomes), b) objective and consistency with sample calculation presentation; c) effect size (presentation, measure used and consistency with data discussion and conclusion). A total of 123 articles from 66 journals were analyzed. Most articles analyzed presented a single outcome (74%) and did not mention sample size calculation (69.9%). Among those who did, 70.3% showed consistency between sample calculation used and the objective. Only 3.3% of articles mentioned the term effect size and 24.4% did not consider that in the discussion of results, despite showing effect size calculation. Logistic regression was the most commonly used statistical methodology (98.4%) and Odds Ratio was the most commonly used effect size measure (94.3%), although it was not cited and discussed as an effect size measure in most studies (96.7%). It could be concluded that most researchers restrict the discussion of their results only to the statistical significance found in associations under study.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Overview
This dataset offers valuable insights into yearly domestic water consumption across various Lower Super Output Areas (LSOAs) or Data Zones, accompanied by the count of water meters within each area. It is instrumental for analysing residential water use patterns, facilitating water conservation efforts, and guiding infrastructure development and policy making at a localised level.
Key Definitions
Aggregation
The process of summarising or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes.
AMR Meter
Automatic meter reading (AMR) is the technology of automatically collecting consumption, diagnostic, and status data from a water meter remotely and periodically.
Dataset
Structured and organised collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Zone
Data zones are the key geography for the dissemination of small area statistics in Scotland.
Dumb Meter
A dumb meter or analogue meter is read manually. It does not have any external connectivity.
Granularity
Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours.
ID
Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.
LSOA
Lower Layer Super Output Areas (LSOA) are a geographic hierarchy designed to improve the reporting of small area statistics in England and Wales.
Open Data Triage
The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
Schema
Structure for organising and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.
Smart Meter
A smart meter is an electronic device that records information and communicates it to the consumer and the supplier. It differs from automatic meter reading (AMR) in that it enables two-way communication between the meter and the supplier.
Units
Standard measurements used to quantify and compare different physical quantities.
Water Meter
Water metering is the practice of measuring water use. Water meters measure the volume of water used by residential and commercial building units that are supplied with water by a public water supply system.
Data History
Data Origin
Domestic consumption data is recorded using water meters. The consumption recorded is then sent back to water companies. This dataset is extracted from the water companies.
Data Triage Considerations
This section discusses the careful handling of data to maintain anonymity and addresses the challenges associated with data updates, such as identifying household changes or meter replacements.
Identification of Critical Infrastructure
This aspect is not applicable for the dataset, as the focus is on domestic water consumption and does not contain any information that reveals critical infrastructure details.
Commercial Risks and Anonymisation
Individual Identification Risks
There is a potential risk of identifying individuals or households if the consumption data is updated irregularly (e.g., every 6 months) and an out-of-cycle update occurs (e.g., after 2 months), which could signal a change in occupancy or ownership. Such patterns need careful handling to avoid accidental exposure of sensitive information.
Meter and Property Association
Challenges arise in maintaining historical data integrity when meters are replaced but the property remains the same. Ensuring continuity in the data without revealing personal information is crucial.
Interpretation of Null Consumption
Instances of null consumption could be misunderstood as a lack of water use, whereas they might simply indicate missing data. Distinguishing between these scenarios is vital to prevent misleading conclusions.
Meter Re-reads
The dataset must account for instances where meters are read multiple times for accuracy.
Joint Supplies & Multiple Meters per Household
Special consideration is required for households with multiple meters, as well as multiple households that share a meter, as this could complicate data aggregation.
Schema Consistency with the Energy Industry
In formulating the schema for the domestic water consumption dataset, careful consideration was given to the potential risks to individual privacy. This evaluation included examining the frequency of data updates, the handling of property and meter associations, interpretations of null consumption, meter re-reads, joint supplies, and the presence of multiple meters within a single household as described above. After a thorough assessment of these factors and their implications for individual privacy, it was decided to align the dataset's schema with the standards established within the energy industry. This decision was influenced by the energy sector's experience and established practices in managing similar risks associated with smart meters. This ensures a high level of data integrity and privacy protection.
Schema
The dataset schema is aligned with those used in the energy industry, which has encountered similar challenges with smart meters. However, it is important to note that the energy industry has a much higher density of meter distribution, especially smart meters.
Aggregation to Mitigate Risks
The dataset employs an elevated level of data aggregation to minimise the risk of individual identification. This approach is crucial in maintaining the utility of the dataset while ensuring individual privacy. The aggregation level is carefully chosen to remove identifiable risks without excluding valuable data, thus balancing data utility with privacy concerns.
Data Freshness
Users should be aware that this dataset reflects historical consumption patterns and does not represent real-time data.
Publish Frequency
Annually
Data Triage Review Frequency
An annual review is conducted to ensure the dataset's relevance and accuracy, with adjustments made based on specific requests or evolving data trends.
Data Specifications
For the domestic water consumption dataset, the data specifications are designed to ensure comprehensiveness and relevance, while maintaining clarity and focus. The specifications for this dataset include:
Each dataset encompasses recordings of domestic water consumption as measured and reported by the data publisher. It excludes commercial consumption.
Where it is necessary to estimate consumption, this is calculated based on actual meter readings.
Meters of all types (smart, dumb, AMR) are included in this dataset.
The dataset is updated and published annually.
Historical data may be made available to facilitate trend analysis and comparative studies, although it is not mandatory for each dataset release.
Context
Users are cautioned against using the dataset for immediate operational decisions regarding water supply management.
The data should be interpreted considering potential seasonal and weather-related influences on water consumption patterns.
The geographical data provided does not pinpoint locations of water meters within an LSOA.
The dataset aims to cover a broad spectrum of households, from single-meter homes to those with multiple meters, to accurately reflect the diversity of water use within an LSOA.
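To illustrate the aggregation approach described above, the sketch below rolls hypothetical meter-level readings up to LSOA-level annual totals with a meter count per area. The input file, column names and the suppression threshold are assumptions for illustration, not part of the published schema.

```python
# Illustrative aggregation sketch; file, columns and threshold are assumed.
import pandas as pd

readings = pd.read_csv("meter_readings.csv")  # assumed columns: meter_id, lsoa_code, year, consumption_m3

lsoa_annual = (
    readings.groupby(["lsoa_code", "year"])
    .agg(
        total_consumption_m3=("consumption_m3", "sum"),
        meter_count=("meter_id", "nunique"),
    )
    .reset_index()
)

# Suppress areas with very few meters before publication to reduce
# re-identification risk (the threshold of 10 is illustrative only).
lsoa_annual = lsoa_annual[lsoa_annual["meter_count"] >= 10]
print(lsoa_annual.head())
```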
License: CC0 1.0 Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
The AQS Data Mart is a database containing all of the information from AQS. It has every measured value the EPA has collected via the national ambient air monitoring program. It also includes the associated aggregate values calculated by EPA (8-hour, daily, annual, etc.). The AQS Data Mart is a copy of AQS made once per week and made accessible to the public through web-based applications. The intended users of the Data Mart are air quality data analysts in the regulatory, academic, and health research communities. It is intended for those who need to download large volumes of detailed technical data stored at EPA and does not provide any interactive analytical tools. It serves as the back-end database for several Agency interactive tools that could not fully function without it: AirData, AirCompare, The Remote Sensing Information Gateway, the Map Monitoring Sites KML page, etc.
AQS must maintain constant readiness to accept data and meet high data integrity requirements, and is thus limited in the number of users and queries to which it can respond. The Data Mart, as a read-only copy, allows wider access.
The most commonly requested aggregation levels of data (and key metrics in each) are:
Sample Values (2.4 billion values back as far as 1957; national consistency begins in 1980; data for 500 substances routinely collected)
- The sample value converted to standard units of measure (generally 1-hour averages as reported to EPA, sometimes 24-hour averages)
- Local Standard Time (LST) and GMT timestamps
- Measurement method
- Measurement uncertainty, where known
- Any exceptional events affecting the data
NAAQS Averages
- NAAQS average values (8-hour averages for ozone and CO, 24-hour averages for PM2.5)
Daily Summary Values (each monitor has the following calculated each day)
- Observation count
- Observation per cent (of expected observations)
- Arithmetic mean of observations
- Max observation and time of max
- AQI (air quality index) where applicable
- Number of observations > Standard where applicable
Annual Summary Values (each monitor has the following calculated each year)
- Observation count and per cent
- Valid days
- Required observation count
- Null observation count
- Exceptional values count
- Arithmetic Mean and Standard Deviation
- 1st - 4th maximum (highest) observations
- Percentiles (99, 98, 95, 90, 75, 50)
- Number of observations > Standard
Site and Monitor Information
- FIPS State Code (the first 5 items on this list make up the AQS Monitor Identifier)
- FIPS County Code
- Site Number (unique within the county)
- Parameter Code (what is measured)
- POC (Parameter Occurrence Code) to distinguish from different samplers at the same site
- Latitude
- Longitude
- Measurement method information
- Owner / operator / data-submitter information
- Monitoring Network to which the monitor belongs
- Exemptions from regulatory requirements
- Operational dates
- City and CBSA where the monitor is located
Quality Assurance Information
- Various data fields related to the 19 different QA assessments possible
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.epa_historical_air_quality.[TABLENAME]. Fork this kernel to get started.
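A minimal query sketch using the BigQuery Python client is shown below. The table and column names are assumptions chosen for illustration; substitute the [TABLENAME] and fields you actually need after checking the table schema.

```python
# Minimal sketch for querying the public dataset from a Kernel.
# The table and column names below are illustrative assumptions.
from google.cloud import bigquery

client = bigquery.Client()
query = """
    SELECT state_name, county_name, parameter_name, arithmetic_mean, year
    FROM `bigquery-public-data.epa_historical_air_quality.air_quality_annual_summary`
    WHERE year = 2017
    LIMIT 100
"""
df = client.query(query).to_dataframe()  # requires pandas (and db-dtypes)
print(df.head())
```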
Data provided by the US Environmental Protection Agency Air Quality System Data Mart.
License: CC0 1.0, https://spdx.org/licenses/CC0-1.0.html
Blood sampling from tarsal vein - each sample centrifuged and flash frozen in the field. Samples maintained at -80 °C prior to analysis.
Aliquots of plasma assayed for malondialdehyde, superoxide dismutase, total antioxidant and uric acid concentrations.
License: Database Contents License (DbCL) v1.0, http://opendatacommons.org/licenses/dbcl/1.0/
Objective:
Improve understanding of real estate performance.
Leverage data to support business decisions.
Scope:
Track property sales, visits, and performance metrics.
Step 1: Creating an Azure SQL Database
Action: Provisioned an Azure SQL Database to host real estate data.
Why Azure?: Scalability, security, and integration with Power BI.
Step 2: Importing Data
Action: Imported datasets (properties, visits, sales, agents, etc.) into the SQL database.
Tools Used: SQL Server Management Studio (SSMS) and Azure Data Studio.
Step 3: Data Transformation in SQL
Normalized Data: Ensured data consistency by normalizing the formats of dates and categorical fields.
Calculated Fields:
Time on Market: Used the DATEDIFF function to calculate the difference between listing and sale dates.
Conversion Rate: Aggregated sales and visits data using COUNT and SUM to calculate conversion rates per agent and property.
Buyer Segmentation: Identified first-time vs repeat buyers using JOINs and COUNT functions.
Data Cleaning: Removed duplicates, handled null values, and standardized city names and property types. (An illustrative pandas sketch of these transformations follows below.)
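As a non-authoritative illustration of Step 3, the pandas sketch below reproduces the three calculated fields (time on market, conversion rate per agent, buyer segmentation). The table and column names (sales, visits, listing_date, sale_date, agent_id, buyer_id) are assumptions, since the project's actual schema is not shown here.

```python
# Illustrative pandas equivalent of the SQL transformations in Step 3.
# Table and column names are assumed, not taken from the project's schema.
import pandas as pd

sales = pd.read_csv("sales.csv", parse_dates=["listing_date", "sale_date"])
visits = pd.read_csv("visits.csv")

# Time on Market: equivalent of DATEDIFF between listing and sale dates.
sales["time_on_market_days"] = (sales["sale_date"] - sales["listing_date"]).dt.days

# Conversion Rate per agent: sales count divided by visit count.
conversion = (
    sales.groupby("agent_id").size().rename("sales_count").to_frame()
    .join(visits.groupby("agent_id").size().rename("visit_count"))
)
conversion["conversion_rate"] = conversion["sales_count"] / conversion["visit_count"]

# Buyer Segmentation: first-time vs repeat buyers by purchase count.
purchases_per_buyer = sales.groupby("buyer_id").size()
sales["buyer_type"] = sales["buyer_id"].map(
    lambda b: "repeat" if purchases_per_buyer[b] > 1 else "first-time"
)
print(conversion.head())
```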
Step 4: Connecting Power BI to Azure SQL
Action: Established a live connection to Azure SQL Database in Power BI.
Benefit: Real-time data updates and efficient analysis.
Step 5: Data Modeling in Power BI
Relationships:
Defined relationships between tables (e.g., Sales, Visits, Properties, Agents) using primary and foreign keys.
Utilized active and inactive relationships for dynamic calculations like time-based comparisons.
Calculated Columns and Measures:
Time on Market: Created a calculated measure using DATEDIFF.
Conversion Rates: Used DIVIDE and CALCULATE for accurate per-agent and per-property analysis.
Step 6: Creating Visualizations
Key Visuals:
Sales Heatmap by City: Geographic visualization to highlight sales performance.
Conversion Rates: Bar charts and line graphs for trend analysis.
Time on Market: Boxplots and histograms for distribution insights.
Buyer Segmentation: Pie charts and bar graphs to show buyer profiles.
Step 7: Building Dashboards
Page 1: Overview (Key Metrics and Sales Heatmap).
Page 2: Performance Analysis (Conversion Rates, Time on Market).
Page 3: Buyer Insights (First-Time vs Repeat Buyers, Property Distribution).
Insight 1: Sales Performance by City
Cities with the highest sales volume.
City with low performance, requiring further investigation.
Insight 2: Conversion Rates
Agent with the highest conversion rate.
Certain properties (e.g., luxury villas) outperform others in conversion.
Insight 3: Time on Market
Average time on market.
Insight 4: Buyer Trends
Repeat Buyers make up 60% of purchases.
First-Time Buyers prefer apartments over villas.
Recommendation 1: Focus on High-Performing Cities
Recommendation 2: Support Low-Performing Areas
Investigate challenges to develop targeted marketing strategies.
Enhance Conversion Rates
Train agents based on techniques used by top performers.
Prioritize marketing for properties with high conversion rates.
Engage First-Time Buyers
Create specific campaigns for apartments to attract first-time buyers.
Offer financial guidance programs to boost their confidence.
Summary:
Built a robust data solution from Azure SQL to Power BI.
Derived actionable insights that can drive real estate growth.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Consistency for each dissimilarity measure.
During the first year of the RPG Program, HHS, with Office of Management and Budget approval, developed a web-based RPG Data Collection and Reporting System to compile the performance measure data across all 53 grantees. Grantees began submitting case-level child and adult data to the RPG Data System in December 2008 and then uploaded their latest cumulative data files in December and June of each program year. Grantees' final data upload was in December 2012. The RPG Data System links data for children and adults together as a family unit and follows clients served over the course of the grant project, making it the most extensive quantitative dataset currently available on outcomes for children, adults, and families affected by substance abuse and child maltreatment.

Grantees collected and reported on the performance measures that aligned with their program models, services and activities, goals, and intended outcomes. While grantee programs may have varied in terms of the interventions implemented, grantees reporting on the same performance measures submitted their data with specified data elements drawn from existing substance abuse and child welfare treatment reporting systems. Thus, grantees submitted data using standardized definitions and coding (grantees were provided a Data Dictionary) to ensure consistency across RPG grantees collecting the same performance measures. Each grantee was provided with individualized customized data plans for each of their RPG participant and control/comparison groups (some grantees had multiple treatment and control/comparison groups). Each customized data plan included child and adult demographic information and the distinct data elements required to calculate the selected standardized child and adult performance measures. The creation of individual data plans allowed for case-level data to be submitted in a standardized uniform file format, which further ensured consistent data collection and reporting across RPG grantees.

To further strengthen data quality and consistency, two immediate levels of automated quality assurance checks occurred when grantees submitted their data to the RPG Data System. The first level of checks validated the accuracy of individual data elements based on valid coding and date ranges (e.g., a date of 2015 is identified as invalid, as the year has not occurred). The second level of review involved approximately 150 data validation checks that addressed illogical coding (e.g., a male client is coded as pregnant), as well as potential relational inconsistencies or possible errors between data elements (e.g., a substance abuse assessment that occurs after substance abuse treatment entry instead of prior to entry). To complete their data uploads, grantees had to correct definite coding errors and confirm or correct warnings regarding potential data inconsistencies.

The dataset is a compilation of data from multiple administrative data sources, including child maltreatment data from the National Child Abuse and Neglect Data System (NCANDS), foster care data from the Adoption and Foster Care Analysis and Reporting System (AFCARS), and caregiver substance abuse treatment data from the Treatment Episode Data Set (TEDS). Data from the North Carolina Family Assessment Scale (NCFAS) are the only non-administrative data included in this collection. Researchers who order the RPG data from NDACAN should review the RPG user support information on the NDACAN website. Investigators: Young, N. K., DeCerchio, K., Rodi, C.
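The two levels of automated checks described above can be pictured with a small, purely illustrative validation sketch; the field names and rules are assumptions based on the examples in the text, not the RPG Data System's actual specification.

```python
# Illustrative sketch of the two levels of checks described above.
# Field names and rules are assumptions drawn from the examples in the text.
from datetime import date

def level_one_checks(record: dict) -> list[str]:
    """Validate individual data elements (valid coding and date ranges)."""
    errors = []
    if record.get("sex") not in {"male", "female"}:
        errors.append("invalid coding for sex")
    if record.get("assessment_date") and record["assessment_date"] > date.today():
        errors.append("date has not yet occurred")
    return errors

def level_two_checks(record: dict) -> list[str]:
    """Flag illogical coding and relational inconsistencies between elements."""
    warnings = []
    if record.get("sex") == "male" and record.get("pregnant"):
        warnings.append("male client coded as pregnant")
    if (record.get("assessment_date") and record.get("treatment_entry_date")
            and record["assessment_date"] > record["treatment_entry_date"]):
        warnings.append("substance abuse assessment occurs after treatment entry")
    return warnings

record = {"sex": "male", "pregnant": True,
          "assessment_date": date(2010, 5, 1), "treatment_entry_date": date(2010, 3, 1)}
print(level_one_checks(record), level_two_checks(record))
```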
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
The development of effective detection algorithms under a complex background for small infrared (IR) targets has always been difficult. The existing algorithms have poor resistance to complex backgrounds, easily leading to false alarms. Furthermore, each target and its background correspond to different component signals, and changes in the components in space cause observation uncertainty. Inspired by this phenomenon, we propose a method for detecting small targets in complex backgrounds using local uncertainty measurements based on the compositional consistency principle. First, a multilayer nested sliding window is constructed, and a local component uncertainty measure algorithm is used to suppress the complex background by evaluating the component comprising local area signals. Subsequently, an energy weighting factor is introduced to reinforce the energy information embedded in the target in the uncertainty distribution map, thereby enhancing the target signal. Validation results obtained on real IR images show that the energy-weighted local uncertainty measure performs better when detecting small targets hidden in complex backgrounds, with a high signal-to-clutter ratio (SCR) gain and background suppression factor (BSF). The effectiveness of our proposed method on several typical open-source datasets is provided in this dataset, and quantitatively compared with several other sets of state-of-the-art algorithms.
License: CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
This study re-examines the empirical support for one of the most influential explanations of leadership tenure, “selectorate theory,” by testing for consistency across key regime categories. The argument made herein is that if the measures are good, the consistency of their relationships should not be limited to particular nominal regime categories, and they should capture the implications of the theory differentiating it from competing theories. Current measures of selectorate theory concepts are wanting on both fronts. I find that the measure used for winning coalition size is correlated with the destabilization of leaders in democracies and the stabilization of leaders in nondemocracies. I also find that the measure of selectorate size exhibits two behaviors inconsistent with the theory: larger selectorates are only stabilizing after the leader has already been in office for an extended period of time; and the effect is only substantial for differentiating between types of military regimes. These findings have five implications: (1) they cast serious doubt on the utility of current measures of selectorate theory; (2) they raise conceptual questions about the treatment of political regimes as vectors or categories; (3) they define substantive, not just statistical, issues that future measures will need to address; (4) they give baselines for re-analysis of the effect of these measures on other implications of interest; and (5) they provide an interesting comment on the comparative politics literature on hybrid regimes and the effect of parliamentary institutions in nondemocratic regimes.
According to our latest research, the global Data Quality Scorecards market size in 2024 stands at USD 1.42 billion, reflecting robust demand across diverse sectors. The market is projected to expand at a CAGR of 14.8% from 2025 to 2033, reaching an estimated USD 4.45 billion by the end of the forecast period. Key growth drivers include the escalating need for reliable data-driven decision-making, stringent regulatory compliance requirements, and the proliferation of digital transformation initiatives across enterprises of all sizes. As per our latest research, organizations are increasingly recognizing the significance of maintaining high data quality standards to fuel analytics, artificial intelligence, and business intelligence capabilities.
One of the primary growth factors for the Data Quality Scorecards market is the exponential rise in data volumes generated by organizations worldwide. The digital economy has led to a surge in data collection from various sources, including customer interactions, IoT devices, and transactional systems. This data explosion has heightened the complexity of managing and ensuring data accuracy, completeness, and consistency. As a result, businesses are investing in comprehensive data quality management solutions, such as scorecards, to monitor, measure, and improve the quality of their data assets. These tools provide actionable insights, enabling organizations to proactively address data quality issues and maintain data integrity across their operations. The growing reliance on advanced analytics and artificial intelligence further amplifies the demand for high-quality data, making data quality scorecards an indispensable component of modern data management strategies.
Another significant growth driver is the increasing regulatory scrutiny and compliance requirements imposed on organizations, particularly in industries such as BFSI, healthcare, and government. Regulatory frameworks such as GDPR, HIPAA, and CCPA mandate stringent controls over data accuracy, privacy, and security. Non-compliance can result in severe financial penalties and reputational damage, compelling organizations to adopt robust data quality management practices. Data quality scorecards help organizations monitor compliance by providing real-time visibility into data quality metrics and highlighting areas that require remediation. This proactive approach to compliance not only mitigates regulatory risks but also enhances stakeholder trust and confidence in organizational data assets. The integration of data quality scorecards into enterprise data governance frameworks is becoming a best practice for organizations aiming to achieve continuous compliance and data excellence.
The rapid adoption of cloud computing and digital transformation initiatives across industries is also fueling the growth of the Data Quality Scorecards market. As organizations migrate their data infrastructure to the cloud and embrace hybrid IT environments, the complexity of managing data quality across disparate systems increases. Cloud-based data quality scorecards offer scalability, flexibility, and ease of deployment, making them an attractive option for organizations seeking to modernize their data management practices. Moreover, the proliferation of self-service analytics and business intelligence tools has democratized data access, necessitating robust data quality monitoring to ensure that decision-makers are working with accurate and reliable information. The convergence of cloud, AI, and data quality management is expected to create new opportunities for innovation and value creation in the market.
From a regional perspective, North America continues to dominate the Data Quality Scorecards market, driven by the presence of leading technology vendors, high adoption rates of advanced analytics, and stringent regulatory frameworks. However, the Asia Pacific region is expected to witness the fastest growth during the forecast period, fueled by rapid digitalization, increasing investments in IT infrastructure, and growing awareness of data quality management among enterprises. Europe also represents a significant market, characterized by strong regulatory compliance requirements and a mature data management ecosystem. Latin America and the Middle East & Africa are emerging markets, with increasing adoption of data quality solutions in sectors such as BFSI, healthcare, and government. The global market landscape is evolving rapidly, with regional
License: CC0 1.0 Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
Overview:
The Earthquake and Tsunami Dataset includes 782 records with 13 numerical features capturing key earthquake characteristics such as magnitude, depth, intensity, and geographic location. With no missing values, the dataset ensures high reliability and quality. The target variable tsunami (1 = occurred, 0 = not occurred) enables effective predictive modeling. Overall, this dataset is well-suited for tsunami prediction, risk assessment, and earthquake pattern analysis using machine learning techniques.
Dataset Name: Earthquake and Tsunami Dataset
Total Records: 782
Total Features: 13
Data Type: Numerical (all columns are numeric — float64 or int64)
Target Variable: tsunami (1 = tsunami occurred, 0 = no tsunami)
magnitude: The magnitude of the earthquake on the Richter scale.
cdi: Community Determined Intensity – represents reported earthquake effects.
mmi: Modified Mercalli Intensity – measures perceived shaking intensity.
sig: Significance level – overall impact measure combining magnitude and reports.
nst: Number of seismic stations used to determine the earthquake location.
dmin: Minimum distance from the earthquake epicenter to the nearest station (in degrees).
gap: Azimuthal gap between recording stations (degrees).
depth: Depth of the earthquake focus in kilometers.
latitude: Geographic latitude of the earthquake epicenter.
longitude: Geographic longitude of the earthquake epicenter.
Year: Year in which the earthquake occurred.
Month: Month of the earthquake occurrence.
tsunami: Binary indicator (1 = Tsunami occurred, 0 = No tsunami).
Data Quality Assessment
Missing Values: None (all 13 columns have 0 missing values).
Data Types: 5 columns of type int64 and 8 columns of type float64.
Outliers: Possible in magnitude, depth, or distance values (requires further exploration).
Data Consistency: Consistent numerical format with valid ranges for latitude, longitude, and depth.
This dataset is well-suited for both classification and regression problems, such as:
Tsunami Prediction (Classification): Predict whether an earthquake will trigger a tsunami (tsunami as target).
Earthquake Intensity Estimation (Regression): Predict magnitude or MMI using seismic parameters.
Risk Analysis: Identify high-risk regions or time periods based on magnitude, depth, and frequency.
Geospatial Modeling: Use latitude and longitude to map earthquake-prone or tsunami-prone regions (a minimal classification sketch follows this list).
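The sketch below shows the tsunami classification task. The file name earthquake_tsunami.csv is an assumption, the column names follow the schema listed above, and the model choice is illustrative rather than a recommendation.

```python
# Minimal sketch: tsunami classification on this dataset.
# File name is assumed; columns follow the schema described above.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

df = pd.read_csv("earthquake_tsunami.csv")  # 782 rows, 13 numeric columns

X = df.drop(columns=["tsunami"])  # seismic and location features
y = df["tsunami"]                 # 1 = tsunami occurred, 0 = not occurred

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```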
According to our latest research, the global smart tape measure market size reached USD 1.12 billion in 2024, driven by robust demand across professional and consumer segments. The market is anticipated to grow at a CAGR of 8.3% during the forecast period, with the market size projected to reach USD 2.18 billion by 2033. This strong growth is attributed to the increasing adoption of digital and connected measuring solutions in construction, home improvement, and industrial sectors. As per our latest research, the surge in smart home and IoT applications, coupled with the construction industry's digital transformation, are pivotal growth factors shaping the market trajectory.
One of the primary growth drivers for the smart tape measure market is the ongoing wave of digitalization sweeping across the construction and interior design sectors. Traditional measuring methods are being rapidly replaced by smart tape measures due to their enhanced precision, ease of use, and ability to seamlessly integrate with digital workflows. Contractors, architects, and interior designers increasingly rely on these devices to accelerate project timelines, reduce manual errors, and ensure data consistency. The integration of Bluetooth and wireless connectivity in smart tape measures allows for real-time data transfer to design software and project management tools, further streamlining operations and boosting productivity. The trend towards smart cities and connected infrastructure is also fueling demand, as accurate and efficient measurement tools become integral to modern building practices.
Another significant factor propelling the smart tape measure market is the rising popularity of DIY and home improvement activities, particularly in developed regions. Consumers are seeking user-friendly tools that offer advanced functionalities such as voice commands, digital displays, and mobile app integration. These features not only enhance measurement accuracy but also provide added convenience for non-professional users. The proliferation of e-commerce platforms has made smart tape measures more accessible to a broader audience, enabling manufacturers to reach customers directly and offer tailored solutions. Moreover, the growing emphasis on home automation and smart living environments is creating new opportunities for product innovation and differentiation, encouraging market players to introduce versatile and feature-rich smart tape measures.
The industrial sector's shift towards automation and Industry 4.0 principles is further accelerating the adoption of smart tape measures. Industrial applications demand high-precision measurement tools that can withstand harsh operating conditions and deliver reliable performance. Smart tape measures equipped with laser technology, rugged casings, and wireless connectivity are gaining traction in manufacturing plants, warehouses, and logistics centers. These devices facilitate efficient space planning, inventory management, and quality control processes. Additionally, the trend of integrating smart tape measures with enterprise resource planning (ERP) and building information modeling (BIM) systems is enabling organizations to optimize resource allocation and improve operational efficiency, thereby driving market growth.
From a regional perspective, North America currently dominates the smart tape measure market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The strong presence of leading construction companies, robust home improvement culture, and high consumer awareness in North America have contributed to this region's leadership. Europe is witnessing steady growth due to stringent building regulations and the rapid adoption of digital construction technologies. Meanwhile, Asia Pacific is emerging as a lucrative market, fueled by rapid urbanization, infrastructure development, and increasing disposable incomes. The Middle East & Africa and Latin America are also exhibiting promising growth potential, albeit from a smaller base, as awareness and adoption of smart measuring tools continue to rise.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Overview
Reporting of leakage from water networks is based on the concept of monitoring flows at a time when demand is at a minimum which is normally during the night. This dataset includes net night flow measurements for 10% of the publisher’s total district metered areas. This 10% has been chosen on the basis that the telemetry on site is reliable, that it is not revealing of sensitive usage patterns and that the night flow there is typical of low demand.
Key Definitions
Dataset
A structured and organized collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Triage
The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
District Metered Area (DMA)
The role of a district metered area (DMA) is to divide the water distribution network into manageable areas or sectors into which the flow can be measured. These areas provide the water providers with guidance as to which DMAs (District Metered Areas) require leak detection work.
Leakage
The accidental admission or escape of a fluid or gas through a hole or crack
Night Flow
This technique considers that in a DMA, leakages can be estimated when the flow into the DMA is at its minimum. Typically, this is measured at night between 3am and 4am when customer demand is low so that network leakage can be detected.
Centroid
The centre of a geometric object.
Data History
Data Origin
Companies have configured their networks to be able to continuously monitor night flows using district meters. Flow data is recorded on meters and normally transmitted daily to a data centre. Data is analysed to confirm its validity and used to derive continuous night flow in each monitored area.
Data Triage Considerations
Data Quality
Not all DMAs provide quality data for the purposes of trend analysis. It was decided that water companies should choose 10% of their DMAs to be represented in this data set to begin with. The advice to publishers is to choose those with reliable and consistent telemetry, indicative of genuine low demand during measurement times and not revealing of sensitive night usage patterns.
Data Consistency
There is a concern that companies measure flow allowance for legitimate night use and/or potential night use differently. To avoid any inconsistency, it was decided that we would share the net flow.
Critical National Infrastructure
The release of boundary data for district metered areas has been deemed to be revealing of critical national infrastructure. Because of this, it has been decided that the data set shall only contain point data from a centroid within the DMA.
Data Triage Review Frequency
Every 12 months, unless otherwise requested.
Data Limitations
Some of the flow recorded may be legitimate nighttime usage of the network
Some measuring systems automatically infill estimated measurements where none have been received via telemetry. These estimates are based on past flow.
The reason for a fluctuation in night flow may not be determined by this dataset but potential causes can include seasonal variation in nighttime water usage and mains bursts
Data Publish Frequency
Monthly
Supplementary information
Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.
Ofwat – Reporting Guidance https://www.ofwat.gov.uk/wp-content/uploads/2018/03/Reporting-guidance-leakage.pdf
Water UK – UK Leakage https://www.water.org.uk/wp-content/uploads/2022/03/Water-UK-A-leakage-Routemap-to-2050.pdf
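Relating back to the Night Flow definition in this entry, the sketch below derives a per-DMA minimum night flow from raw flow readings. The input file, column names and reading cadence are assumptions for illustration only; the published dataset already contains the derived net night flow.

```python
# Illustrative sketch: deriving minimum night flow per DMA from raw readings.
# File name, columns and the 3am-4am window handling are assumptions.
import pandas as pd

flows = pd.read_csv("dma_flow_readings.csv", parse_dates=["timestamp"])
# assumed columns: dma_id, timestamp, flow_m3_per_hr

night = flows[flows["timestamp"].dt.hour == 3]  # readings in the 3am-4am window

min_night_flow = (
    night.assign(date=night["timestamp"].dt.date)
    .groupby(["dma_id", "date"])["flow_m3_per_hr"]
    .mean()                       # average flow within the window for each night
    .groupby(level="dma_id")
    .min()                        # lowest nightly average, a rough leakage proxy
    .rename("min_night_flow_m3_per_hr")
)
print(min_night_flow.head())
```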
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Overview
Water companies in the UK are responsible for testing the quality of drinking water. This dataset contains the results of samples taken from the taps in domestic households to make sure they meet the standards set out by UK and European legislation. This data shows the location, date, and measured levels of determinands set out by the Drinking Water Inspectorate (DWI).
Key Definitions
Aggregation
Process involving summarizing or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes
Anonymisation
Anonymised data is a type of information sanitization in which data anonymisation tools encrypt or remove personally identifiable information from datasets for the purpose of preserving a data subject's privacy
Dataset
Structured and organized collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Determinand
A constituent or property of drinking water which can be determined or estimated.
DWI
Drinking Water Inspectorate, an organisation “providing independent reassurance that water supplies in England and Wales are safe and drinking water quality is acceptable to consumers.”
DWI Determinands
Constituents or properties that are tested for when evaluating a sample for its quality as per the guidance of the DWI. For this dataset, only determinands with “point of compliance” as “customer taps” are included.
Granularity
Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours
ID
Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.
LSOA
Lower Layer Super Output Area (LSOA) is made up of small geographic areas used for statistical and administrative purposes by the Office for National Statistics. It is designed to have homogeneous populations in terms of population size, making them suitable for statistical analysis and reporting. Each LSOA is built from groups of contiguous Output Areas with an average of about 1,500 residents or 650 households, allowing for granular data collection useful for analysis, planning and policy-making while ensuring privacy.
ONS
Office for National Statistics
Open Data Triage
The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
Sample
A sample is a representative segment or portion of water taken from a larger whole for the purpose of analysing or testing to ensure compliance with safety and quality standards.
Schema
Structure for organizing and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.
Units
Standard measurements used to quantify and compare different physical quantities.
Water Quality
The chemical, physical, biological, and radiological characteristics of water, typically in relation to its suitability for a specific purpose, such as drinking, swimming, or ecological health. It is determined by assessing a variety of parameters, including but not limited to pH, turbidity, microbial content, dissolved oxygen, presence of substances and temperature.
Data History
Data Origin
These samples were taken from customer taps. They were then analysed for water quality, and the results were uploaded to a database. This dataset is an extract from this database.
Data Triage Considerations
Granularity
Is it useful to share results as averages or as individual results?
We decided to share individual results as the lowest level of granularity.
Anonymisation
It is a requirement that this data cannot be used to identify a singular person or household. We discussed many options for aggregating the data to a specific geography to ensure this requirement is met. The following geographical aggregations were discussed (a small aggregation sketch follows this list):
Water Supply Zone (WSZ) – Limits interoperability with other datasets
Postcode – Some postcodes contain very few households and may not offer necessary anonymisation
Postal Sector – Deemed not granular enough in highly populated areas
Rounded Co-ordinates – Not a recognised standard and may cause overlapping areas
MSOA – Deemed not granular enough
LSOA – Agreed as a recognised standard appropriate for England and Wales
Data Zones – Agreed as a recognised standard appropriate for Scotland
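The aggregation sketch referenced above assumes a sample-level extract with a postcode column plus the ONS postcode-to-LSOA lookup linked under Supplementary information; file and column names are illustrative, not the published schema.

```python
# Illustrative sketch: aggregating sample results from postcode to LSOA.
# File and column names are assumptions, not the published schema.
import pandas as pd

samples = pd.read_csv("water_quality_samples.csv")   # assumed columns: postcode, determinand, result
lookup = pd.read_csv("postcode_to_lsoa_lookup.csv")  # assumed columns: postcode, lsoa_code

per_lsoa = (
    samples.merge(lookup, on="postcode", how="left")
    .groupby(["lsoa_code", "determinand"])
    .agg(sample_count=("result", "size"), mean_result=("result", "mean"))
    .reset_index()
)
print(per_lsoa.head())
```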
Data Specifications
Each dataset will cover a calendar year of samples
This dataset will be published annually
Historical datasets will be published as far back as 2016, from the introduction of The Water Supply (Water Quality) Regulations 2016
The Determinands included in the dataset are as per the list that is required to be reported to the Drinking Water Inspectorate.
Context
Many UK water companies provide a search tool on their websites where you can search for water quality in your area by postcode. The results of the search may identify the water supply zone that supplies the postcode searched. Water supply zones are not linked to LSOAs, which means the results may differ from this dataset.
Some sample results are influenced by internal plumbing and may not be representative of drinking water quality in the wider area.
Some samples are tested on site and others are sent to scientific laboratories.
Data Publish Frequency
Annually
Data Triage Review Frequency
Annually unless otherwise requested
Supplementary information
Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.
1. Drinking Water Inspectorate Standards and Regulations: https://www.dwi.gov.uk/drinking-water-standards-and-regulations/
2. LSOA (England and Wales) and Data Zone (Scotland): Description for LSOA boundaries by the ONS: Census 2021 geographies - Office for National Statistics (ons.gov.uk)
3. Postcode to LSOA lookup tables: Postcode to 2021 Census Output Area to Lower Layer Super Output Area to Middle Layer Super Output Area to Local Authority District (August 2023) Lookup in the UK (statistics.gov.uk)
4. Legislation history: Legislation - Drinking Water Inspectorate (dwi.gov.uk)
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Abstract The aim of this study was to gather evidence of construct and convergent validity and internal consistency of the Emotion Regulation Questionnaire (ERQ). A total of 441 students, mostly female (54.6%), with a mean age of 16 years (SD = 1.14), answered the ERQ and demographic questions. They were randomly distributed into two databases, which were submitted to exploratory (sample 1) and confirmatory factor analysis (sample 2). The exploratory results indicated a three-factor structure: Cognitive Reappraisal, Redirection of Attentional Focus and Emotional Suppression, which together explained 59.3% of the total variance (α = 0.67; α = 0.63; α = 0.64). For the confirmatory analyses, the following goodness-of-fit indices were found: χ² (24) = 67.02, p
CARETS is a systematic test suite to measure consistency and robustness of modern VQA models through a series of six fine-grained capability tests.