100+ datasets found
1. Data from: CARETS: A Consistency And Robustness Evaluative Test Suite for...

    • service.tib.eu
    Updated Jan 2, 2025
    Cite
    (2025). CARETS: A Consistency And Robustness Evaluative Test Suite for VQA [Dataset]. https://service.tib.eu/ldmservice/dataset/carets--a-consistency-and-robustness-evaluative-test-suite-for-vqa
    Explore at:
    Dataset updated
    Jan 2, 2025
    Description

    CARETS is a systematic test suite to measure consistency and robustness of modern VQA models through a series of six fine-grained capability tests.

2. Affinity Water Night flow Monitoring

    • streamwaterdata.co.uk
    • hub.arcgis.com
    Updated May 28, 2024
    Cite
    hvekeria_affinity (2024). Affinity Water Night flow Monitoring [Dataset]. https://www.streamwaterdata.co.uk/items/db5976c9e50e42f299346fa17ff17942
    Explore at:
    Dataset updated
    May 28, 2024
    Dataset authored and provided by
    hvekeria_affinity
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

Key Definitions

Dataset: A structured and organized collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Triage: The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
District Metered Area (DMA): The role of a district metered area (DMA) is to divide the water distribution network into manageable areas or sectors into which the flow can be measured. These areas provide the water providers with guidance as to which DMAs (District Metered Areas) require leak detection work.
Leakage: The accidental admission or escape of a fluid or gas through a hole or crack.
Night Flow: This technique considers that in a DMA, leakages can be estimated when the flow into the DMA is at its minimum. Typically, this is measured at night between 2am and 4am when customer demand is low so that network leakage can be detected.
Centroid: The centre of a geometric object.

Data History

Data Origin: Companies have configured their networks to be able to continuously monitor night flows using district meters. Flow data is recorded on meters and normally transmitted daily to a data centre. Data is analysed to confirm its validity and used to derive continuous night flow in each monitored area.

Data Triage Considerations

Data Quality: Not all DMAs provide quality data for the purposes of trend analysis. It was decided that water companies should choose 10% of their DMAs to be represented in this data set to begin with. The advice to publishers is to choose those with reliable and consistent telemetry, indicative of genuine low demand during measurement times and not revealing of sensitive night usage patterns.
Data Consistency: There is a concern that companies measure flow allowance for legitimate night use and/or potential night use differently. To avoid any inconsistency, it was decided that we would share the net flow.
Critical National Infrastructure: The release of boundary data for district metered areas has been deemed to be revealing of critical national infrastructure. Because of this, it has been decided that the data set shall only contain point data from a centroid within the DMA.
Data Triage Review Frequency: Every 12 months, unless otherwise requested.
Data Limitations: Some of the flow recorded may be legitimate night-time usage of the network. Some measuring systems automatically infill estimated measurements where none have been received via telemetry; these estimates are based on past flow. The reason for a fluctuation in night flow may not be determined by this dataset, but potential causes can include seasonal variation in night-time water usage and mains bursts.

Data Publish Frequency: Monthly

  3. Data from: A movement-aware measure for trajectory similarity and its...

    • figshare.com
    zip
    Updated May 7, 2024
    Cite
    Ju Peng; Min Deng; Jianbo Tang (2024). A movement-aware measure for trajectory similarity and its application for ride-sharing path extraction in a road network [Dataset]. http://doi.org/10.6084/m9.figshare.25712415.v2
    Explore at:
Available download formats: zip
    Dataset updated
    May 7, 2024
    Dataset provided by
    figshare
Figshare (http://figshare.com/)
    Authors
    Ju Peng; Min Deng; Jianbo Tang
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

This repository contains the code and datasets of the DSPD measure proposed in the study for measuring trajectory similarity, which has been accepted by IJGIS. If you use the code, please cite the study below: Ju Peng, Min Deng, Jianbo Tang et al. (2024), IJGIS. A movement-aware measure for trajectory similarity and its application for ride-sharing path extraction in a road network.

4. SES Water Night Flow Monitoring

    • streamwaterdata.co.uk
    Updated Apr 30, 2024
    Cite
    SESWater2 (2024). SES Water Night Flow Monitoring [Dataset]. https://www.streamwaterdata.co.uk/items/6ab069aa9fe54f979aa7ca7352e7311d
    Explore at:
    Dataset updated
    Apr 30, 2024
    Dataset authored and provided by
    SESWater2
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

Key Definitions

Dataset: A structured and organized collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Triage: The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
District Metered Area (DMA): The role of a district metered area (DMA) is to divide the water distribution network into manageable areas or sectors into which the flow can be measured. These areas provide the water providers with guidance as to which DMAs (District Metered Areas) require leak detection work.
Leakage: The accidental admission or escape of a fluid or gas through a hole or crack.
Night Flow: This technique considers that in a DMA, leakages can be estimated when the flow into the DMA is at its minimum. Typically, this is measured at night between 2am and 4am when customer demand is low so that network leakage can be detected.
Centroid: The centre of a geometric object.

Data History

Data Origin: Companies have configured their networks to be able to continuously monitor night flows using district meters. Flow data is recorded on meters and normally transmitted daily to a data centre. Data is analysed to confirm its validity and used to derive continuous night flow in each monitored area.

Data Triage Considerations

Data Quality: Not all DMAs provide quality data for the purposes of trend analysis. It was decided that water companies should choose 10% of their DMAs to be represented in this data set to begin with. The advice to publishers is to choose those with reliable and consistent telemetry, indicative of genuine low demand during measurement times and not revealing of sensitive night usage patterns.
Data Consistency: There is a concern that companies measure flow allowance for legitimate night use and/or potential night use differently. To avoid any inconsistency, it was decided that we would share the net flow.
Critical National Infrastructure: The release of boundary data for district metered areas has been deemed to be revealing of critical national infrastructure. Because of this, it has been decided that the data set shall only contain point data from a centroid within the DMA.
Data Triage Review Frequency: Every 12 months, unless otherwise requested.
Data Limitations: Some of the flow recorded may be legitimate night-time usage of the network. Some measuring systems automatically infill estimated measurements where none have been received via telemetry; these estimates are based on past flow. The reason for a fluctuation in night flow may not be determined by this dataset, but potential causes can include seasonal variation in night-time water usage and mains bursts.

Data Publish Frequency: Monthly

  5. YW Night-flow monitoring 2025

    • streamwaterdata.co.uk
    Updated Oct 9, 2025
    + more versions
    Cite
    Yorkshire Water Services (2025). YW Night-flow monitoring 2025 [Dataset]. https://www.streamwaterdata.co.uk/datasets/yorkshire-water::yw-night-flow-monitoring-2025
    Explore at:
    Dataset updated
    Oct 9, 2025
    Dataset provided by
Yorkshire Water (https://www.yorkshirewater.com/)
    Authors
    Yorkshire Water Services
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    Data History

    Data Origin Companies have configured their networks to be able to continuously monitor night flows using district meters. Flow data is recorded on meters and normally transmitted daily to a data centre. Data is analysed to confirm its validity and used to derive continuous night flow in each monitored area.

    Data Triage Considerations

    Data Quality Not all DMAs provide quality data for the purposes of trend analysis. It was decided that water companies should choose 10% of their DMAs to be represented in this data set to begin with. The advice to publishers is to choose those with reliable and consistent telemetry, indicative of genuine low demand during measurement times and not revealing of sensitive night usage patterns.

    Data Consistency There is a concern that companies measure flow allowance for legitimate night use and/or potential night use differently. To avoid any inconsistency, it was decided that we would share the net flow.

    Critical National Infrastructure The release of boundary data for district metered areas has been deemed to be revealing of Critical National Infrastructure. Because of this, it has been decided that the data set shall only contain point data from a centroid within the DMA.

    Data Triage Review Frequency Every 12 months, unless otherwise requested.

    Data Limitations Some of the flow recorded may be legitimate night-time usage of the network.

    Some measuring systems automatically infill estimated measurements where none have been received via telemetry. These estimates are based on past flow.

    The reason for a fluctuation in night flow may not be determined by this dataset but potential causes can include seasonal variation in night-time water usage and mains bursts.

    Data Publish Frequency Monthly.

    Supplementary information Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.

1. Ofwat – Reporting Guidance: https://www.ofwat.gov.uk/wp-content/uploads/2018/03/Reporting-guidance-leakage.pdf
2. Water UK – UK Leakage: https://www.water.org.uk/wp-content/uploads/2022/03/Water-UK-A-leakage-Routemap-to-2050.pdf

Data Schema

DATA_SOURCE: Company that owns the DMA
DATE/TIME_STAMP: Date and time of measured net flow
DMA_ID: Identity of the district metered area
CENTROID_X: DMA centroid X coordinate
CENTROID_Y: DMA centroid Y coordinate
ACTUAL_MIN_NIGHT_FLOW: The lowest recorded average net flow between 12am and 6am
MIN_NIGHT_FLOW: The average flow within the 2-4am time window
UNITS: Unit of measurement for the flow values
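
A minimal Python sketch of how this schema might be used, assuming the dataset is exported as a CSV with the column names above; the file name is an illustrative placeholder, not part of the dataset:

```python
import pandas as pd

# Hypothetical file name; the columns follow the Data Schema described above.
df = pd.read_csv("yw_night_flow_monitoring_2025.csv",
                 parse_dates=["DATE/TIME_STAMP"])

# Average minimum night flow per DMA per month: a sustained rise in a DMA's
# night flow is the signal this dataset is designed to expose (possible leakage).
monthly = (
    df.groupby(["DMA_ID", df["DATE/TIME_STAMP"].dt.to_period("M")])
      ["MIN_NIGHT_FLOW"]
      .mean()
      .reset_index(name="avg_min_night_flow")
)
print(monthly.head())
```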

  6. Data from: Size effect in observational studies in Public Oral Health:...

    • scielo.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Flávia Martão Flório; Luciane Zanin; Leônidas Marinho dos Santos Júnior; Marcelo de Castro Meneghim; Gláucia Maria Bovi Ambrosano (2023). Size effect in observational studies in Public Oral Health: importance, calculation and interpretation [Dataset]. http://doi.org/10.6084/m9.figshare.21907666.v1
    Explore at:
Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
SciELO (http://www.scielo.org/)
    Authors
    Flávia Martão Flório; Luciane Zanin; Leônidas Marinho dos Santos Júnior; Marcelo de Castro Meneghim; Gláucia Maria Bovi Ambrosano
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract The objective of this study was to analyze the scientific literature in public oral health regarding calculation, presentation, and discussion of the effect size in observational studies. The scientific literature (2015 to 2019) was analyzed regarding: a) general information (journal and guidelines to authors, number of variables and outcomes), b) objective and consistency with sample calculation presentation; c) effect size (presentation, measure used and consistency with data discussion and conclusion). A total of 123 articles from 66 journals were analyzed. Most articles analyzed presented a single outcome (74%) and did not mention sample size calculation (69.9%). Among those who did, 70.3% showed consistency between sample calculation used and the objective. Only 3.3% of articles mentioned the term effect size and 24.4% did not consider that in the discussion of results, despite showing effect size calculation. Logistic regression was the most commonly used statistical methodology (98.4%) and Odds Ratio was the most commonly used effect size measure (94.3%), although it was not cited and discussed as an effect size measure in most studies (96.7%). It could be concluded that most researchers restrict the discussion of their results only to the statistical significance found in associations under study.

7. SES Water Domestic Consumption

    • hub.arcgis.com
    Updated Apr 26, 2024
    + more versions
    Cite
    SESWater2 (2024). SES Water Domestic Consumption [Dataset]. https://hub.arcgis.com/maps/f2cdc1248fcf4fd289ac1d3f25e75b3b_0/about
    Explore at:
    Dataset updated
    Apr 26, 2024
    Dataset authored and provided by
    SESWater2
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

Overview

This dataset offers valuable insights into yearly domestic water consumption across various Lower Super Output Areas (LSOAs) or Data Zones, accompanied by the count of water meters within each area. It is instrumental for analysing residential water use patterns, facilitating water conservation efforts, and guiding infrastructure development and policy making at a localised level.

Key Definitions

Aggregation: The process of summarising or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes.
AMR Meter: Automatic meter reading (AMR) is the technology of automatically collecting consumption, diagnostic, and status data from a water meter remotely and periodically.
Dataset: Structured and organised collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Zone: Data zones are the key geography for the dissemination of small area statistics in Scotland.
Dumb Meter: A dumb meter or analogue meter is read manually. It does not have any external connectivity.
Granularity: Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours.
ID: Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.
LSOA: Lower Layer Super Output Areas (LSOA) are a geographic hierarchy designed to improve the reporting of small area statistics in England and Wales.
Open Data Triage: The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
Schema: Structure for organising and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.
Smart Meter: A smart meter is an electronic device that records information and communicates it to the consumer and the supplier. It differs from automatic meter reading (AMR) in that it enables two-way communication between the meter and the supplier.
Units: Standard measurements used to quantify and compare different physical quantities.
Water Meter: Water metering is the practice of measuring water use. Water meters measure the volume of water used by residential and commercial building units that are supplied with water by a public water supply system.

Data History

Data Origin: Domestic consumption data is recorded using water meters. The consumption recorded is then sent back to water companies. This dataset is extracted from the water companies.

Data Triage Considerations

This section discusses the careful handling of data to maintain anonymity and addresses the challenges associated with data updates, such as identifying household changes or meter replacements.

Identification of Critical Infrastructure: This aspect is not applicable for the dataset, as the focus is on domestic water consumption and does not contain any information that reveals critical infrastructure details.

Commercial Risks and Anonymisation

Individual Identification Risks: There is a potential risk of identifying individuals or households if the consumption data is updated irregularly (e.g., every 6 months) and an out-of-cycle update occurs (e.g., after 2 months), which could signal a change in occupancy or ownership. Such patterns need careful handling to avoid accidental exposure of sensitive information.
Meter and Property Association: Challenges arise in maintaining historical data integrity when meters are replaced but the property remains the same. Ensuring continuity in the data without revealing personal information is crucial.
Interpretation of Null Consumption: Instances of null consumption could be misunderstood as a lack of water use, whereas they might simply indicate missing data. Distinguishing between these scenarios is vital to prevent misleading conclusions.
Meter Re-reads: The dataset must account for instances where meters are read multiple times for accuracy.
Joint Supplies & Multiple Meters per Household: Special consideration is required for households with multiple meters, as well as multiple households that share a meter, as this could complicate data aggregation.

Schema Consistency with the Energy Industry: In formulating the schema for the domestic water consumption dataset, careful consideration was given to the potential risks to individual privacy. This evaluation included examining the frequency of data updates, the handling of property and meter associations, interpretations of null consumption, meter re-reads, joint supplies, and the presence of multiple meters within a single household, as described above. After a thorough assessment of these factors and their implications for individual privacy, it was decided to align the dataset's schema with the standards established within the energy industry. This decision was influenced by the energy sector's experience and established practices in managing similar risks associated with smart meters. This ensures a high level of data integrity and privacy protection.

Schema: The dataset schema is aligned with those used in the energy industry, which has encountered similar challenges with smart meters. However, it is important to note that the energy industry has a much higher density of meter distribution, especially smart meters.

Aggregation to Mitigate Risks: The dataset employs an elevated level of data aggregation to minimise the risk of individual identification. This approach is crucial in maintaining the utility of the dataset while ensuring individual privacy. The aggregation level is carefully chosen to remove identifiable risks without excluding valuable data, thus balancing data utility with privacy concerns.

Data Freshness: Users should be aware that this dataset reflects historical consumption patterns and does not represent real-time data.

Publish Frequency: Annually

Data Triage Review Frequency: An annual review is conducted to ensure the dataset's relevance and accuracy, with adjustments made based on specific requests or evolving data trends.

Data Specifications

For the domestic water consumption dataset, the data specifications are designed to ensure comprehensiveness and relevance, while maintaining clarity and focus. The specifications for this dataset include:

Each dataset encompasses recordings of domestic water consumption as measured and reported by the data publisher. It excludes commercial consumption.
Where it is necessary to estimate consumption, this is calculated based on actual meter readings.
Meters of all types (smart, dumb, AMR) are included in this dataset.
The dataset is updated and published annually.
Historical data may be made available to facilitate trend analysis and comparative studies, although it is not mandatory for each dataset release.

Context

Users are cautioned against using the dataset for immediate operational decisions regarding water supply management. The data should be interpreted considering potential seasonal and weather-related influences on water consumption patterns. The geographical data provided does not pinpoint locations of water meters within an LSOA. The dataset aims to cover a broad spectrum of households, from single-meter homes to those with multiple meters, to accurately reflect the diversity of water use within an LSOA.

  8. Historical Air Quality

    • kaggle.com
    zip
    Updated Feb 12, 2019
    Cite
    US Environmental Protection Agency (2019). Historical Air Quality [Dataset]. https://www.kaggle.com/datasets/epa/epa-historical-air-quality
    Explore at:
Available download formats: zip (0 bytes)
    Dataset updated
    Feb 12, 2019
    Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
    Authors
    US Environmental Protection Agency
    License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The AQS Data Mart is a database containing all of the information from AQS. It has every measured value the EPA has collected via the national ambient air monitoring program. It also includes the associated aggregate values calculated by EPA (8-hour, daily, annual, etc.). The AQS Data Mart is a copy of AQS made once per week and made accessible to the public through web-based applications. The intended users of the Data Mart are air quality data analysts in the regulatory, academic, and health research communities. It is intended for those who need to download large volumes of detailed technical data stored at EPA and does not provide any interactive analytical tools. It serves as the back-end database for several Agency interactive tools that could not fully function without it: AirData, AirCompare, The Remote Sensing Information Gateway, the Map Monitoring Sites KML page, etc.

AQS must maintain constant readiness to accept data and meet high data integrity requirements, and is thus limited in the number of users and queries to which it can respond. The Data Mart, as a read-only copy, can allow wider access.

    The most commonly requested aggregation levels of data (and key metrics in each) are:

Sample Values (2.4 billion values back as far as 1957; national consistency begins in 1980; data for 500 substances routinely collected)
The sample value converted to standard units of measure (generally 1-hour averages as reported to EPA, sometimes 24-hour averages)
Local Standard Time (LST) and GMT timestamps
Measurement method
Measurement uncertainty, where known
Any exceptional events affecting the data

NAAQS Averages
NAAQS average values (8-hour averages for ozone and CO, 24-hour averages for PM2.5)

Daily Summary Values (each monitor has the following calculated each day)
Observation count
Observation per cent (of expected observations)
Arithmetic mean of observations
Max observation and time of max
AQI (air quality index), where applicable
Number of observations > Standard, where applicable

Annual Summary Values (each monitor has the following calculated each year)
Observation count and per cent
Valid days
Required observation count
Null observation count
Exceptional values count
Arithmetic Mean and Standard Deviation
1st - 4th maximum (highest) observations
Percentiles (99, 98, 95, 90, 75, 50)
Number of observations > Standard

Site and Monitor Information
FIPS State Code (the first 5 items on this list make up the AQS Monitor Identifier)
FIPS County Code
Site Number (unique within the county)
Parameter Code (what is measured)
POC (Parameter Occurrence Code) to distinguish different samplers at the same site
Latitude
Longitude
Measurement method information
Owner / operator / data-submitter information
Monitoring Network to which the monitor belongs
Exemptions from regulatory requirements
Operational dates
City and CBSA where the monitor is located

Quality Assurance Information
Various data fields related to the 19 different QA assessments possible

    Querying BigQuery tables

    You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.epa_historical_air_quality.[TABLENAME]. Fork this kernel to get started.
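
As a minimal sketch (not official Kaggle or EPA sample code), the query pattern looks like this; replace [TABLENAME] with the table you want, and it assumes credentials are available (for example, inside a Kaggle Kernel):

```python
from google.cloud import bigquery

client = bigquery.Client()

# [TABLENAME] is a placeholder for one of the dataset's tables.
query = """
    SELECT *
    FROM `bigquery-public-data.epa_historical_air_quality.[TABLENAME]`
    LIMIT 10
"""
df = client.query(query).to_dataframe()  # returns a pandas DataFrame
print(df.head())
```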

    Acknowledgements

    Data provided by the US Environmental Protection Agency Air Quality System Data Mart.

9. Data from: Consistent measures of oxidative balance predict survival but not...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Apr 16, 2020
    Cite
    Thomas Bodey; Ian Cleasby; Jonathan Blount; Graham McElwaine; Freydis Vigfusdottir; Stuart Bearhop (2020). Consistent measures of oxidative balance predict survival but not reproduction in a long-distance migrant [Dataset]. http://doi.org/10.5061/dryad.70rxwdbtv
    Explore at:
Available download formats: zip
    Dataset updated
    Apr 16, 2020
    Dataset provided by
    University of Iceland
    University of Exeter
    Irish Brent Goose Research Group
    Authors
    Thomas Bodey; Ian Cleasby; Jonathan Blount; Graham McElwaine; Freydis Vigfusdottir; Stuart Bearhop
    License

CC0 1.0: https://spdx.org/licenses/CC0-1.0.html

    Description
    1. Physiological processes, including those that disrupt oxidative balance, have been proposed as key to understanding fundamental life history trade-offs. Yet examination of changes in oxidative balance within wild animals across time, space and major life history challenges remain uncommon. For example, migration presents substantial physiological challenges for individuals, and data on migratory individuals would provide crucial context for exposing the importance of relationships between oxidative balance and fitness outcomes.
    2. Here we examined the consistency of commonly used measures of oxidative balance in longitudinally sampled free-living individuals of a long-lived, long-distance migrant, the Brent goose Branta bernicla hrota over periods of months to years.
    3. Although inter-individual and temporal variation in measures of oxidative balance were substantial, we found high consistency in measures of lipid peroxidation and circulating non-enzymatic antioxidants in longitudinally sampled individuals. This suggests the potential for the existence of individual oxidative phenotypes.
    4. Given intra-individual consistency, we then examined how these physiological measures relate to survival and reproductive success across all sampled individuals. Surprisingly, lower survival was predicted for individuals with lower levels of damage, with no measured physiological metric associated with reproductive success.
5. Our results demonstrate that snapshot measurements of a consistent measure of oxidative balance can inform our understanding of differences in a key demographic trait. However, the positive relationship between oxidative damage and survival emphasises the need to investigate relationships between the oxidative system and fitness outcomes in other species undergoing similar physiologically challenging lifecycles. This would highlight the extent to which variation in such traits and resource allocation trade-offs is a result of adaptation to different life history strategies.

Methods (30 March 2020)

Live capture of wild brent geese using cannon nets on wintering and staging grounds.

Blood sampling from the tarsal vein - each sample was centrifuged and flash frozen in the field. Samples were maintained at -80 °C prior to analysis.

Aliquots of plasma were assayed for malondialdehyde, superoxide dismutase, total antioxidant and uric acid concentrations.

  10. Real-Estate Dashboard

    • kaggle.com
    zip
    Updated May 23, 2025
    Cite
    Ramy Elbouhy (2025). Real-Estate Dashboard [Dataset]. https://www.kaggle.com/datasets/ramyelbouhy/real-estate-dashboard/data
    Explore at:
Available download formats: zip (10,488,043 bytes)
    Dataset updated
    May 23, 2025
    Authors
    Ramy Elbouhy
    License

Database Contents License (DbCL) 1.0: http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Introduction

    Objective:

    Improve understanding of real estate performance.

    Leverage data to support business decisions.

    Scope:

    Track property sales, visits, and performance metrics.

    Technical Steps

    Step 1: Creating an Azure SQL Database

    Action: Provisioned an Azure SQL Database to host real estate data.

    Why Azure?: Scalability, security, and integration with Power BI.

    Step 2: Importing Data

    Action: Imported datasets (properties, visits, sales, agents, etc.) into the SQL database.

    Tools Used: SQL Server Management Studio (SSMS) and Azure Data Studio.

    Step 3: Data Transformation in SQL

    Normalized Data: Ensured data consistency by normalizing the formats of dates and categorical fields.

    Calculated Fields:

    Time on Market: DATEDIFF function to calculate the difference between listing and sale dates.

    Conversion Rate: Aggregated sales and visits data using COUNT and SUM to calculate conversion rates per agent and property.

    Buyer Segmentation: Identified first-time vs repeat buyers using JOINs and COUNT functions.

    Data Cleaning: Removed duplicates, handled null values, and standardized city names and property types.
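
As an illustration of the Step 3 calculations, here is a pandas sketch of the same logic; this is not the author's Azure SQL implementation, and the file and column names (sales.csv, visits.csv, listing_date, sale_date, agent_id, buyer_id) are assumptions:

```python
import pandas as pd

# Hypothetical inputs mirroring the Sales and Visits tables.
sales = pd.read_csv("sales.csv", parse_dates=["listing_date", "sale_date"])
visits = pd.read_csv("visits.csv")

# Time on Market: difference between listing and sale dates (DATEDIFF equivalent).
sales["time_on_market_days"] = (sales["sale_date"] - sales["listing_date"]).dt.days

# Conversion Rate per agent: sales count divided by visit count.
conversion = (
    sales.groupby("agent_id").size().rename("n_sales").to_frame()
         .join(visits.groupby("agent_id").size().rename("n_visits"))
)
conversion["conversion_rate"] = conversion["n_sales"] / conversion["n_visits"]

# Buyer Segmentation: first-time vs repeat buyers.
purchase_counts = sales.groupby("buyer_id").size()
sales["buyer_type"] = sales["buyer_id"].map(purchase_counts).gt(1).map(
    {True: "repeat", False: "first-time"}
)
```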

    Step 4: Connecting Power BI to Azure SQL

    Action: Established a live connection to Azure SQL Database in Power BI.

    Benefit: Real-time data updates and efficient analysis.

    Step 5: Data Modeling in Power BI

    Relationships:

    Defined relationships between tables (e.g., Sales, Visits, Properties, Agents) using primary and foreign keys.

    Utilized active and inactive relationships for dynamic calculations like time-based comparisons.

    Calculated Columns and Measures:

    Time on Market: Created a calculated measure using DATEDIFF.

    Conversion Rates: Used DIVIDE and CALCULATE for accurate per-agent and per-property analysis.

    Step 6: Creating Visualizations

    Key Visuals:

    Sales Heatmap by City: Geographic visualization to highlight sales performance.

    Conversion Rates: Bar charts and line graphs for trend analysis.

    Time on Market: Boxplots and histograms for distribution insights.

    Buyer Segmentation: Pie charts and bar graphs to show buyer profiles.

    Step 7: Building Dashboards

    Structure:

    Page 1: Overview (Key Metrics and Sales Heatmap).

    Page 2: Performance Analysis (Conversion Rates, Time on Market).

    Page 3: Buyer Insights (First-Time vs Repeat Buyers, Property Distribution).

    Insights Gained

    Insight 1: Sales Performance by City

Cities with the highest sales volume.

Cities with low performance, requiring further investigation.

    Insight 2: Conversion Rates

Agents with the highest conversion rates.

    Certain properties (e.g., luxury villas) outperform others in conversion.

    Insight 3: Time on Market

    Average time on market.

    Insight 4: Buyer Trends

    Repeat Buyers make up 60% of purchases.

    First-Time Buyers prefer apartments over villas.

Recommendations

Recommendation 1: Focus on High-Performing Cities

Recommendation 2: Support Low-Performing Areas

    Investigate challenges to develop targeted marketing strategies.

    Enhance Conversion Rates

    Train agents based on techniques used by top performers.

    Prioritize marketing for properties with high conversion rates.

    Engage First-Time Buyers

    Create specific campaigns for apartments to attract first-time buyers.

    Offer financial guidance programs to boost their confidence.

    Summary:

    Built a robust data solution from Azure SQL to Power BI.

    Derived actionable insights that can drive real estate growth.

11. Consistency for each dissimilarity measure.

    • figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Pablo D. Reeb; Sergio J. Bramardi; Juan P. Steibel (2023). Consistency for each dissimilarity measure. [Dataset]. http://doi.org/10.1371/journal.pone.0132310.t001
    Explore at:
Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Pablo D. Reeb; Sergio J. Bramardi; Juan P. Steibel
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Consistency for each dissimilarity measure.

12. Targeted Grants to Increase the Well-Being of, and to Improve the Permanency...

    • catalog.data.gov
    • data.virginia.gov
    Updated Sep 7, 2025
    Cite
    National Data Archive on Child Abuse and Neglect (2025). Targeted Grants to Increase the Well-Being of, and to Improve the Permanency Outcomes for, Children Affected by Methamphetamine or Other Substance Abuse: September 30, 2007, to September 30, 2012 (RPG-1) [Dataset]. https://catalog.data.gov/dataset/targeted-grants-to-increase-the-well-being-of-and-to-improve-the-permanency-outcomes-for-c-eb7d1
    Explore at:
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    National Data Archive on Child Abuse and Neglect
    Description

During the first year of the RPG Program, HHS, with Office of Management and Budget approval, developed a web-based RPG Data Collection and Reporting System to compile the performance measure data across all 53 grantees. Grantees began submitting case-level child and adult data to the RPG Data System in December 2008 and then uploaded their latest cumulative data files in December and June of each program year. Grantees' final data upload was in December 2012. The RPG Data System links data for children and adults together as a family unit and follows clients served over the course of the grant project, making it the most extensive quantitative dataset currently available on outcomes for children, adults, and families affected by substance abuse and child maltreatment.

Grantees collected and reported on the performance measures that aligned with their program models, services and activities, goals, and intended outcomes. While grantee programs may have varied in terms of the interventions implemented, grantees reporting on the same performance measures submitted their data with specified data elements drawn from existing substance abuse and child welfare treatment reporting systems. Thus, grantees submitted data using standardized definitions and coding (grantees were provided a Data Dictionary) to ensure consistency across RPG grantees collecting the same performance measures. Each grantee was provided with individualized customized data plans for each of their RPG participant and control/comparison groups (some grantees had multiple treatment and control/comparison groups). Each customized data plan included child and adult demographic information and the distinct data elements required to calculate the selected standardized child and adult performance measures. The creation of individual data plans allowed for case-level data to be submitted in a standardized uniform file format, which further ensured consistent data collection and reporting across RPG grantees.

To further strengthen data quality and consistency, two immediate levels of automated quality assurance checks occurred when grantees submitted their data to the RPG Data System. The first level of checks validated the accuracy of individual data elements based on valid coding and date ranges (e.g., a date of 2015 is identified as invalid, as the year has not occurred). The second level of review involved approximately 150 data validation checks that addressed illogical coding (e.g., a male client is coded as pregnant), as well as potential relational inconsistencies or possible errors between data elements (e.g., a substance abuse assessment that occurs after substance abuse treatment entry instead of prior to entry). To complete their data uploads, grantees had to correct definite coding errors and confirm or correct warnings regarding potential data inconsistencies.

The dataset is a compilation of data from multiple administrative data sources, including child maltreatment data from the National Child Abuse and Neglect Data System (NCANDS), foster care data from the Adoption and Foster Care Analysis and Reporting System (AFCARS), and caregiver substance abuse treatment data from the Treatment Episode Data Set (TEDS). Data from the North Carolina Family Assessment Scale (NCFAS) are the only non-administrative data included in this collection. Researchers who order the RPG data from NDACAN should review the RPG user support information on the NDACAN website. Investigators: Young, N. K., DeCerchio, K., Rodi, C.
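
To make the two levels of automated checks concrete, here is an illustrative Python sketch; the record structure and field names are assumptions for demonstration, not the actual RPG Data System implementation:

```python
from datetime import date

# Hypothetical case-level record with fields invented for illustration.
record = {
    "sex": "male",
    "pregnant": True,
    "assessment_date": date(2010, 5, 1),
    "treatment_entry_date": date(2010, 3, 1),
}

errors, warnings = [], []

# Level 1: validate individual data elements (valid coding and date ranges).
if record["assessment_date"] > date(2012, 12, 31):  # final upload was December 2012
    errors.append("assessment_date falls outside the valid reporting period")

# Level 2: illogical coding and relational inconsistencies between elements.
if record["sex"] == "male" and record["pregnant"]:
    errors.append("illogical coding: male client coded as pregnant")
if record["assessment_date"] > record["treatment_entry_date"]:
    warnings.append("substance abuse assessment occurs after treatment entry")

print("errors:", errors)
print("warnings:", warnings)
```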

13. Data of Infrared Small Target Detection Using Local Component Uncertainty...

    • scidb.cn
    Updated May 4, 2023
    Cite
    Erwei Zhao; Wei Zheng; Mingtao Li; Haibin Sun; Jianfeng Wang (2023). Data of Infrared Small Target Detection Using Local Component Uncertainty Measure With Consistency Assessment [Dataset]. http://doi.org/10.57760/sciencedb.space.00645
    Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 4, 2023
    Dataset provided by
    Science Data Bank
    Authors
    Erwei Zhao; Wei Zheng; Mingtao Li; Haibin Sun; Jianfeng Wang
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

Developing effective detection algorithms for small infrared (IR) targets against complex backgrounds has always been difficult. Existing algorithms have poor resistance to complex backgrounds, easily leading to false alarms. Furthermore, each target and its background correspond to different component signals, and changes in the components in space cause observation uncertainty. Inspired by this phenomenon, we propose a method for detecting small targets in complex backgrounds using local uncertainty measurements based on the compositional consistency principle. First, a multilayer nested sliding window is constructed, and a local component uncertainty measure algorithm is used to suppress the complex background by evaluating the components comprising local area signals. Subsequently, an energy weighting factor is introduced to reinforce the energy information embedded in the target in the uncertainty distribution map, thereby enhancing the target signal. Validation results obtained on real IR images show that the energy-weighted local uncertainty measure performs better when detecting small targets hidden in complex backgrounds, achieving a high signal-to-clutter ratio (SCR) gain and background suppression factor (BSF). This dataset provides the evaluation of the proposed method on several typical open-source datasets, with quantitative comparisons against several sets of state-of-the-art algorithms.

14. Data from: Survival and Accountability: An Analysis of the Empirical Support...

    • dataverse.harvard.edu
    Updated Jul 17, 2018
    Cite
    Ryan Kennedy (2018). Survival and Accountability: An Analysis of the Empirical Support for “Selectorate Theory" [Dataset]. http://doi.org/10.7910/DVN/MCU7GO
    Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 17, 2018
    Dataset provided by
    Harvard Dataverse
    Authors
    Ryan Kennedy
    License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This study re-examines the empirical support for one of the most influential explanations of leadership tenure, “selectorate theory,” by testing for consistency across key regime categories. The argument made herein is that if the measures are good, the consistency of their relationships should not be limited to particular nominal regime categories, and they should capture the implications of the theory differentiating it from competing theories. Current measures of selectorate theory concepts are wanting on both fronts. I find that the measure used for winning coalition size is correlated with the destabilization of leaders in democracies and the stabilization of leaders in nondemocracies. I also find that the measure of selectorate size exhibits two behaviors inconsistent with the theory: larger selectorates are only stabilizing after the leader has already been in office for an extended period of time; and the effect is only substantial for differentiating between types of military regimes. These findings have five implications: (1) they cast serious doubt on the utility of current measures of selectorate theory; (2) they raise conceptual questions about the treatment of political regimes as vectors or categories; (3) they define substantive, not just statistical, issues that future measures will need to address; (4) they give baselines for re-analysis of the effect of these measures on other implications of interest; and (5) they provide an interesting comment on the comparative politics literature on hybrid regimes and the effect of parliamentary institutions in nondemocratic regimes.

15. Data Quality Scorecards Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 4, 2025
    Cite
    Growth Market Reports (2025). Data Quality Scorecards Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-quality-scorecards-market
    Explore at:
Available download formats: csv, pptx, pdf
    Dataset updated
    Oct 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Quality Scorecards Market Outlook



    According to our latest research, the global Data Quality Scorecards market size in 2024 stands at USD 1.42 billion, reflecting robust demand across diverse sectors. The market is projected to expand at a CAGR of 14.8% from 2025 to 2033, reaching an estimated USD 4.45 billion by the end of the forecast period. Key growth drivers include the escalating need for reliable data-driven decision-making, stringent regulatory compliance requirements, and the proliferation of digital transformation initiatives across enterprises of all sizes. As per our latest research, organizations are increasingly recognizing the significance of maintaining high data quality standards to fuel analytics, artificial intelligence, and business intelligence capabilities.




    One of the primary growth factors for the Data Quality Scorecards market is the exponential rise in data volumes generated by organizations worldwide. The digital economy has led to a surge in data collection from various sources, including customer interactions, IoT devices, and transactional systems. This data explosion has heightened the complexity of managing and ensuring data accuracy, completeness, and consistency. As a result, businesses are investing in comprehensive data quality management solutions, such as scorecards, to monitor, measure, and improve the quality of their data assets. These tools provide actionable insights, enabling organizations to proactively address data quality issues and maintain data integrity across their operations. The growing reliance on advanced analytics and artificial intelligence further amplifies the demand for high-quality data, making data quality scorecards an indispensable component of modern data management strategies.




    Another significant growth driver is the increasing regulatory scrutiny and compliance requirements imposed on organizations, particularly in industries such as BFSI, healthcare, and government. Regulatory frameworks such as GDPR, HIPAA, and CCPA mandate stringent controls over data accuracy, privacy, and security. Non-compliance can result in severe financial penalties and reputational damage, compelling organizations to adopt robust data quality management practices. Data quality scorecards help organizations monitor compliance by providing real-time visibility into data quality metrics and highlighting areas that require remediation. This proactive approach to compliance not only mitigates regulatory risks but also enhances stakeholder trust and confidence in organizational data assets. The integration of data quality scorecards into enterprise data governance frameworks is becoming a best practice for organizations aiming to achieve continuous compliance and data excellence.




    The rapid adoption of cloud computing and digital transformation initiatives across industries is also fueling the growth of the Data Quality Scorecards market. As organizations migrate their data infrastructure to the cloud and embrace hybrid IT environments, the complexity of managing data quality across disparate systems increases. Cloud-based data quality scorecards offer scalability, flexibility, and ease of deployment, making them an attractive option for organizations seeking to modernize their data management practices. Moreover, the proliferation of self-service analytics and business intelligence tools has democratized data access, necessitating robust data quality monitoring to ensure that decision-makers are working with accurate and reliable information. The convergence of cloud, AI, and data quality management is expected to create new opportunities for innovation and value creation in the market.




    From a regional perspective, North America continues to dominate the Data Quality Scorecards market, driven by the presence of leading technology vendors, high adoption rates of advanced analytics, and stringent regulatory frameworks. However, the Asia Pacific region is expected to witness the fastest growth during the forecast period, fueled by rapid digitalization, increasing investments in IT infrastructure, and growing awareness of data quality management among enterprises. Europe also represents a significant market, characterized by strong regulatory compliance requirements and a mature data management ecosystem. Latin America and the Middle East & Africa are emerging markets, with increasing adoption of data quality solutions in sectors such as BFSI, healthcare, and government. The global market landscape is evolving rapidly, with regional

  16. Earth_Quake and (Tsunami prediction dataset)2025

    • kaggle.com
    zip
    Updated Nov 4, 2025
    Cite
    Tanzeela Shahzadi (2025). Earth_Quake and (Tsunami prediction dataset)2025 [Dataset]. https://www.kaggle.com/datasets/tan5577/earth-quake-and-tsunami-prediction-dataset2025
    Explore at:
Available download formats: zip (16,151 bytes)
    Dataset updated
    Nov 4, 2025
    Authors
    Tanzeela Shahzadi
    License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    About Dataset:

    Worldwide Earthquake and Tsunami Hazard Analysis Dataset

    Overview:

    The Earthquake and Tsunami Dataset includes 782 records with 13 numerical features capturing key earthquake characteristics such as magnitude, depth, intensity, and geographic location. With no missing values, the dataset ensures high reliability and quality. The target variable tsunami (1 = occurred, 0 = not occurred) enables effective predictive modeling. Overall, this dataset is well-suited for tsunami prediction, risk assessment, and earthquake pattern analysis using machine learning techniques.

    Dataset Information:

    Dataset Name: Earthquake and Tsunami Dataset

    Total Records: 782

    Total Features: 13

    Data Type: Numerical (all columns are numeric — float64 or int64)

    Target Variable: tsunami (1 = tsunami occurred, 0 = no tsunami)

    Key Features:

Feature names and descriptions:

magnitude: The magnitude of the earthquake on the Richter scale.
cdi: Community Determined Intensity – represents reported earthquake effects.
mmi: Modified Mercalli Intensity – measures perceived shaking intensity.
sig: Significance level – overall impact measure combining magnitude and reports.
nst: Number of seismic stations used to determine the earthquake location.
dmin: Minimum distance from the earthquake epicenter to the nearest station (in degrees).
gap: Azimuthal gap between recording stations (degrees).
depth: Depth of the earthquake focus in kilometers.
latitude: Geographic latitude of the earthquake epicenter.
longitude: Geographic longitude of the earthquake epicenter.
Year: Year in which the earthquake occurred.
Month: Month of the earthquake occurrence.
tsunami: Binary indicator (1 = tsunami occurred, 0 = no tsunami).

Data Quality Assessment

Missing Values: None (all 13 columns have 0 missing values).

    Data Types:

    5 columns of type int64

8 columns of type float64

Outliers: Possible in magnitude, depth, or distance values (requires further exploration).

    Data Consistency: Consistent numerical format with valid ranges for latitude, longitude, and depth.

    Machine Learning Applications:

    This dataset is well-suited for both classification and regression problems, such as:

    Tsunami Prediction (Classification): Predict whether an earthquake will trigger a tsunami (tsunami as target).

    Earthquake Intensity Estimation (Regression): Predict magnitude or MMI using seismic parameters.

    Risk Analysis: Identify high-risk regions or time periods based on magnitude, depth, and frequency.

Geospatial Modeling: Use latitude and longitude to map earthquake-prone or tsunami-prone regions.
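
Given the all-numeric features and the binary tsunami label described above, a minimal classification sketch might look like the following; the CSV file name is an assumption, and the columns are assumed to match the feature list:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Hypothetical file name; columns are assumed to match the 13 features above.
df = pd.read_csv("earthquake_tsunami_2025.csv")

X = df.drop(columns=["tsunami"])
y = df["tsunami"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test)))
```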

17. Smart Tape Measure Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 6, 2025
    Cite
    Growth Market Reports (2025). Smart Tape Measure Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/smart-tape-measure-market
    Explore at:
Available download formats: pptx, pdf, csv
    Dataset updated
    Oct 6, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Smart Tape Measure Market Outlook



    According to our latest research, the global smart tape measure market size reached USD 1.12 billion in 2024, driven by robust demand across professional and consumer segments. The market is anticipated to grow at a CAGR of 8.3% during the forecast period, with the market size projected to reach USD 2.18 billion by 2033. This strong growth is attributed to the increasing adoption of digital and connected measuring solutions in construction, home improvement, and industrial sectors. As per our latest research, the surge in smart home and IoT applications, coupled with the construction industry's digital transformation, are pivotal growth factors shaping the market trajectory.




    One of the primary growth drivers for the smart tape measure market is the ongoing wave of digitalization sweeping across the construction and interior design sectors. Traditional measuring methods are being rapidly replaced by smart tape measures due to their enhanced precision, ease of use, and ability to seamlessly integrate with digital workflows. Contractors, architects, and interior designers increasingly rely on these devices to accelerate project timelines, reduce manual errors, and ensure data consistency. The integration of Bluetooth and wireless connectivity in smart tape measures allows for real-time data transfer to design software and project management tools, further streamlining operations and boosting productivity. The trend towards smart cities and connected infrastructure is also fueling demand, as accurate and efficient measurement tools become integral to modern building practices.




    Another significant factor propelling the smart tape measure market is the rising popularity of DIY and home improvement activities, particularly in developed regions. Consumers are seeking user-friendly tools that offer advanced functionalities such as voice commands, digital displays, and mobile app integration. These features not only enhance measurement accuracy but also provide added convenience for non-professional users. The proliferation of e-commerce platforms has made smart tape measures more accessible to a broader audience, enabling manufacturers to reach customers directly and offer tailored solutions. Moreover, the growing emphasis on home automation and smart living environments is creating new opportunities for product innovation and differentiation, encouraging market players to introduce versatile and feature-rich smart tape measures.




    The industrial sector's shift towards automation and Industry 4.0 principles is further accelerating the adoption of smart tape measures. Industrial applications demand high-precision measurement tools that can withstand harsh operating conditions and deliver reliable performance. Smart tape measures equipped with laser technology, rugged casings, and wireless connectivity are gaining traction in manufacturing plants, warehouses, and logistics centers. These devices facilitate efficient space planning, inventory management, and quality control processes. Additionally, the trend of integrating smart tape measures with enterprise resource planning (ERP) and building information modeling (BIM) systems is enabling organizations to optimize resource allocation and improve operational efficiency, thereby driving market growth.




    From a regional perspective, North America currently dominates the smart tape measure market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The strong presence of leading construction companies, robust home improvement culture, and high consumer awareness in North America have contributed to this region's leadership. Europe is witnessing steady growth due to stringent building regulations and the rapid adoption of digital construction technologies. Meanwhile, Asia Pacific is emerging as a lucrative market, fueled by rapid urbanization, infrastructure development, and increasing disposable incomes. The Middle East & Africa and Latin America are also exhibiting promising growth potential, albeit from a smaller base, as awareness and adoption of smart measuring tools continue to rise.



  18. s

    Portsmouth Water Nightflow Data

    • streamwaterdata.co.uk
    Updated Apr 25, 2024
    Cite
    AHughes_Portsmouth (2024). Portsmouth Water Nightflow Data [Dataset]. https://www.streamwaterdata.co.uk/datasets/f1aeaa7ad2c947048eaf9fc06b6df0e5
    Explore at:
    Dataset updated
    Apr 25, 2024
    Dataset authored and provided by
    AHughes_Portsmouth
    License

    Attribution 4.0 (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview

    Reporting of leakage from water networks is based on the concept of monitoring flows at a time when demand is at a minimum which is normally during the night. This dataset includes net night flow measurements for 10% of the publisher’s total district metered areas. This 10% has been chosen on the basis that the telemetry on site is reliable, that it is not revealing of sensitive usage patterns and that the night flow there is typical of low demand.

    Key Definitions

    Dataset

    A structured and organized collection of related elements, often stored digitally, used for analysis and interpretation in various fields.

    Data Triage

    The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.

    District Metered Area (DMA)

    The role of a district metered area (DMA) is to divide the water distribution network into manageable areas or sectors into which the flow can be measured. These areas provide the water providers with guidance as to which DMAs (District Metered Areas) require leak detection work.

    Leakage

    The accidental admission or escape of a fluid or gas through a hole or crack

    Night Flow

    This technique considers that in a DMA, leakages can be estimated when the flow into the DMA is at its minimum. Typically, this is measured at night between 3am and 4am when customer demand is low so that network leakage can be detected.

    Centroid

    The centre of a geometric object.

    Data History

    Data Origin

    Companies have configured their networks to be able to continuously monitor night flows using district meters. Flow data is recorded on meters and normally transmitted daily to a data centre. Data is analysed to confirm its validity and used to derive continuous night flow in each monitored area.
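
    As an illustration of how a continuous night flow figure can be derived from transmitted meter readings, below is a minimal sketch. It assumes 15-minute flow readings held in a pandas DataFrame with `timestamp`, `dma_id` and `flow_m3_per_hr` columns and a 03:00–04:00 measurement window; these names and the window are assumptions for illustration, not the published schema.

```python
# Sketch: derive a nightly minimum ("night flow") per DMA from telemetry.
# Column names, units and the 03:00-04:00 window are illustrative assumptions.
import pandas as pd

def nightly_minimum_flow(readings: pd.DataFrame) -> pd.DataFrame:
    """Minimum flow per DMA per night within the 03:00-04:00 window."""
    night = readings[readings["timestamp"].dt.hour == 3].copy()
    night["night_of"] = night["timestamp"].dt.date
    return (
        night.groupby(["dma_id", "night_of"])["flow_m3_per_hr"]
        .min()
        .reset_index(name="night_flow_m3_per_hr")
    )

# Synthetic example: one DMA, one day of 15-minute readings.
readings = pd.DataFrame({
    "timestamp": pd.date_range("2024-04-01", periods=96, freq="15min"),
    "dma_id": "DMA-001",
    "flow_m3_per_hr": 10.0,
})
print(nightly_minimum_flow(readings))
```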

    Data Triage Considerations

    Data Quality

    Not all DMAs provide quality data for the purposes of trend analysis. It was decided that water companies should choose 10% of their DMAs to be represented in this data set to begin with. The advice to publishers is to choose those with reliable and consistent telemetry, indicative of genuine low demand during measurement times and not revealing of sensitive night usage patterns.

    Data Consistency

    There is a concern that companies measure flow allowance for legitimate night use and/or potential night use differently. To avoid any inconsistency, it was decided that we would share the net flow.

    Critical National Infrastructure

    The release of boundary data for district metered areas has been deemed to be revealing of critical national infrastructure. Because of this, it has been decided that the data set shall only contain point data from a centroid within the DMA.
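
    To illustrate the decision above, a minimal sketch of reducing an internal boundary polygon to a single published centroid point is shown below; the coordinates are invented and shapely is used purely for illustration.

```python
# Sketch: publish only the centroid of a DMA boundary polygon.
# The boundary coordinates below are invented for illustration.
from shapely.geometry import Polygon

dma_boundary = Polygon([
    (-1.05, 50.85), (-1.02, 50.85), (-1.02, 50.88), (-1.05, 50.88),
])

centroid = dma_boundary.centroid
# The polygon itself is withheld; only this point would be shared.
print(f"Published point: {centroid.x:.4f}, {centroid.y:.4f}")
```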

    Data Triage Review Frequency

    Every 12 months, unless otherwise requested.

    Data Limitations

    Some of the flow recorded may be legitimate nighttime usage of the network

    Some measuring systems automatically infill estimated measurements where none have been received via telemetry. These estimates are based on past flow (a minimal sketch of this kind of infill appears after these limitations).

    The reason for a fluctuation in night flow may not be determined by this dataset but potential causes can include seasonal variation in nighttime water usage and mains bursts
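
    Below is a minimal sketch of the kind of infill noted above, assuming missing nightly values are estimated from a rolling mean of the preceding nights; the seven-night window and the series layout are assumptions, and real measuring systems may estimate differently.

```python
# Sketch: infill missing nightly flow values from recent history.
# The 7-night window is an illustrative assumption.
import pandas as pd

def infill_from_past_flow(night_flow: pd.Series, window: int = 7) -> pd.Series:
    """Fill gaps with the rolling mean of the preceding `window` nights."""
    estimate = night_flow.shift(1).rolling(window, min_periods=1).mean()
    return night_flow.fillna(estimate)

observed = pd.Series(
    [4.1, 4.3, None, 4.0, None, 4.2, 4.4],
    index=pd.date_range("2024-04-01", periods=7, freq="D"),
)
print(infill_from_past_flow(observed))
```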

    Data Publish Frequency

    Monthly

    Supplementary information

    Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.

    Ofwat – Reporting Guidance https://www.ofwat.gov.uk/wp-content/uploads/2018/03/Reporting-guidance-leakage.pdf

    Water UK – UK Leakage https://www.water.org.uk/wp-content/uploads/2022/03/Water-UK-A-leakage-Routemap-to-2050.pdf

  19. a

    Portsmouth Water Drinking Water Quality Data 2022 2023 2024

    • hub.arcgis.com
    • streamwaterdata.co.uk
    • +1more
    Updated Oct 1, 2025
    Cite
    AHughes_Portsmouth (2025). Portsmouth Water Drinking Water Quality Data 2022 2023 2024 [Dataset]. https://hub.arcgis.com/datasets/d3165fd17d624b22a9900d47677dfa45
    Explore at:
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    AHughes_Portsmouth
    License

    Attribution 4.0 (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview

    Water companies in the UK are responsible for testing the quality of drinking water. This dataset contains the results of samples taken from the taps in domestic households to make sure they meet the standards set out by UK and European legislation. This data shows the location, date, and measured levels of determinands set out by the Drinking Water Inspectorate (DWI).

    Key Definitions

    Aggregation

    Process involving summarizing or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes

    Anonymisation

    Anonymised data is a type of information sanitization in which data anonymisation tools encrypt or remove personally identifiable information from datasets for the purpose of preserving a data subject's privacy

    Dataset

    Structured and organized collection of related elements, often stored digitally, used for analysis and interpretation in various fields.

    Determinand

    A constituent or property of drinking water which can be determined or estimated.

    DWI

    Drinking Water Inspectorate, an organisation “providing independent reassurance that water supplies in England and Wales are safe and drinking water quality is acceptable to consumers.”

    DWI Determinands

    Constituents or properties that are tested for when evaluating a sample for its quality as per the guidance of the DWI. For this dataset, only determinands with “point of compliance” as “customer taps” are included.

    Granularity

    Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours

    ID

    Abbreviation for Identification, referring to the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.

    LSOA

    Lower Layer Super Output Area: a small geographic area used for statistical and administrative purposes by the Office for National Statistics. LSOAs are designed to have populations of a similar size, making them suitable for statistical analysis and reporting. Each LSOA is built from groups of contiguous Output Areas with an average of about 1,500 residents or 650 households, allowing for granular data collection useful for analysis, planning and policy-making while ensuring privacy.

    ONS

    Office for National Statistics

    Open Data Triage

    The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.

    Sample

    A sample is a representative segment or portion of water taken from a larger whole for the purpose of analysing or testing to ensure compliance with safety and quality standards.

    Schema

    Structure for organizing and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.

    Units

    Standard measurements used to quantify and compare different physical quantities.

    Water Quality

    The chemical, physical, biological, and radiological characteristics of water, typically in relation to its suitability for a specific purpose, such as drinking, swimming, or ecological health. It is determined by assessing a variety of parameters, including but not limited to pH, turbidity, microbial content, dissolved oxygen, presence of substances and temperature.

    Data History

    Data Origin

    These samples were taken from customer taps. They were then analysed for water quality, and the results were uploaded to a database. This dataset is an extract from this database.

    Data Triage Considerations

    Granularity

    Is it useful to share results as averages or as individual results?

    We decided to share individual results as the lowest level of granularity.

    Anonymisation

    It is a requirement that this data cannot be used to identify a singular person or household. We discussed many options for aggregating the data to a specific geography to ensure this requirement is met. The following geographical aggregations were discussed:

    Water Supply Zone (WSZ) – Limits interoperability with other datasets

    Postcode – Some postcodes contain very few households and may not offer necessary anonymisation

    Postal Sector – Deemed not granular enough in highly populated areas

    Rounded Co-ordinates – Not a recognised standard and may cause overlapping areas

    MSOA – Deemed not granular enough

    LSOA – Agreed as a recognised standard appropriate for England and Wales (a minimal aggregation sketch follows this list)

    Data Zones – Agreed as a recognised standard appropriate for Scotland
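
    As an illustration of the LSOA aggregation agreed above, below is a minimal sketch that joins individual sample results to an LSOA code via a postcode lookup (such as the ONS lookup listed in the supplementary links) and then drops the household-level geography. All column names, postcodes and LSOA codes here are invented for illustration and are not the published schema.

```python
# Sketch: attach an LSOA code to individual sample results via a
# postcode-to-LSOA lookup so no single household is identifiable.
# Column names and codes are invented for illustration.
import pandas as pd

samples = pd.DataFrame({
    "sample_id": ["S1", "S2", "S3"],
    "postcode": ["PO1 1AA", "PO1 1AA", "PO2 7XX"],
    "determinand": ["Lead", "Lead", "Lead"],
    "result_ug_per_l": [1.2, 0.8, 2.1],
})

lookup = pd.DataFrame({
    "postcode": ["PO1 1AA", "PO2 7XX"],
    "lsoa_code": ["E01000001", "E01000002"],
})

published = (
    samples.merge(lookup, on="postcode", how="left")
    .drop(columns=["postcode"])  # drop household-level geography
)
print(published)
```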

    Data Specifications

    Each dataset will cover a calendar year of samples

    This dataset will be published annually

    Historical datasets will be published as far back as 2016, from the introduction of The Water Supply (Water Quality) Regulations 2016

    The Determinands included in the dataset are as per the list that is required to be reported to the Drinking Water Inspectorate.

    Context

    Many UK water companies provide a search tool on their websites where you can search for water quality in your area by postcode. The results of the search may identify the water supply zone that supplies the postcode searched. Water supply zones are not linked to LSOAs, which means the results may differ from this dataset.

    Some sample results are influenced by internal plumbing and may not be representative of drinking water quality in the wider area.

    Some samples are tested on site and others are sent to scientific laboratories.

    Data Publish Frequency

    Annually

    Data Triage Review Frequency

    Annually unless otherwise requested

    Supplementary information

    Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.

    1. Drinking Water Inspectorate Standards and Regulations: https://www.dwi.gov.uk/drinking-water-standards-and-regulations/

    2. LSOA (England and Wales) and Data Zone (Scotland): https://www.nrscotland.gov.uk/files/geography/2011-census/geography-bckground-info-comparison-of-thresholds.pdf

    3. Description for LSOA boundaries by the ONS: Census 2021 geographies - Office for National Statistics (ons.gov.uk)

    4. Postcode to LSOA lookup tables: Postcode to 2021 Census Output Area to Lower Layer Super Output Area to Middle Layer Super Output Area to Local Authority District (August 2023) Lookup in the UK (statistics.gov.uk)

    5. Legislation history: Legislation - Drinking Water Inspectorate (dwi.gov.uk)

  20. Data from: Emotional Regulation Questionnaire (ERQ): Evidence of Construct...

    • scielo.figshare.com
    jpeg
    Updated May 30, 2023
    Cite
    Valdiney Veloso Gouveia; Hysla Magalhães de Moura; Isabel Cristina Vasconcelos de Oliveira; Maria Gabriela Costa Ribeiro; Alessandro Teixeira Rezende; Tátila Rayane de Sampaio Brito (2023). Emotional Regulation Questionnaire (ERQ): Evidence of Construct Validity and Internal Consistency [Dataset]. http://doi.org/10.6084/m9.figshare.7131473.v1
    Explore at:
    Available download formats: jpeg
    Dataset updated
    May 30, 2023
    Dataset provided by
    SciELO – http://www.scielo.org/
    Authors
    Valdiney Veloso Gouveia; Hysla Magalhães de Moura; Isabel Cristina Vasconcelos de Oliveira; Maria Gabriela Costa Ribeiro; Alessandro Teixeira Rezende; Tátila Rayane de Sampaio Brito
    License

    Attribution 4.0 (CC BY 4.0) – https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract: To gather evidence of construct and convergent validity and internal consistency of the Emotion Regulation Questionnaire (ERQ). A total of 441 students, mostly female (54.6%), with a mean age of 16 years (SD = 1.14), answered the ERQ and demographic questions. They were randomly distributed into two databases, which were submitted to exploratory (sample 1) and confirmatory (sample 2) factor analysis. The exploratory results indicated a three-factor structure: Cognitive Reappraisal, Redirection of Attentional Focus and Emotional Suppression, which together explained 59.3% of the total variance (α = 0.67; α = 0.63; α = 0.64). For the confirmatory analyses, the following goodness-of-fit indices were found: χ² (24) = 67.02, p
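
    The α values quoted above are Cronbach's alpha, the internal-consistency statistic named in the title. Below is a minimal sketch of the computation using an invented response matrix rather than the study's data.

```python
# Sketch: Cronbach's alpha for a set of questionnaire items.
# The response matrix is invented; it is not the ERQ study data.
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: respondents x items matrix of scores."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

responses = np.array([
    [4, 5, 4, 3],
    [3, 4, 3, 3],
    [5, 5, 4, 4],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
])
print(round(cronbach_alpha(responses), 2))
```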
