Name: GoiEner smart meters data
Summary: The dataset contains hourly time series of electricity consumption (kWh) provided by the Spanish electricity retailer GoiEner. The time series are arranged in four compressed files, accompanied by a metadata file:
· raw.tzst contains the raw time series of all GoiEner clients (any date, any length, may have missing samples).
· imp-pre.tzst contains processed time series (imputation of missing samples), longer than one year, collected before March 1, 2020.
· imp-in.tzst contains processed time series (imputation of missing samples), longer than one year, collected between March 1, 2020 and May 30, 2021.
· imp-post.tzst contains processed time series (imputation of missing samples), longer than one year, collected after May 30, 2021.
· metadata.csv contains relevant information for each time series.
License: CC-BY-SA
Acknowledgement: These data have been collected in the framework of the WHY project. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 891943.
Disclaimer: The sole responsibility for the content of this publication lies with the authors. It does not necessarily reflect the opinion of the Executive Agency for Small and Medium-sized Enterprises (EASME) or the European Commission (EC). EASME or the EC are not responsible for any use that may be made of the information contained therein.
Collection Date: From November 2, 2014 to June 8, 2022.
Publication Date: December 1, 2022.
DOI: 10.5281/zenodo.7362094
Other repositories: None.
Author: GoiEner, University of Deusto.
Objective of collection: This dataset was originally used to establish a methodology for clustering households according to their electricity consumption.
Description: The meaning of each column is described next for each file.
raw.tzst: (no column names provided) timestamp; electricity consumption in kWh.
imp-pre.tzst, imp-in.tzst, imp-post.tzst:
· “timestamp”: timestamp
· “kWh”: electricity consumption in kWh
· “imputed”: binary value indicating whether the row has been obtained by imputation
metadata.csv:
· “user”: 64-character hash identifying a user
· “start_date”: initial timestamp of the time series
· “end_date”: final timestamp of the time series
· “length_days”: number of days elapsed between the initial and the final timestamps
· “length_years”: number of years elapsed between the initial and the final timestamps
· “potential_samples”: number of samples that should be between the initial and the final timestamps of the time series if there were no missing values
· “actual_samples”: number of actual samples of the time series
· “missing_samples_abs”: number of potential samples minus actual samples
· “missing_samples_pct”: potential samples minus actual samples as a percentage
· “contract_start_date”: contract start date
· “contract_end_date”: contract end date
· “contracted_tariff”: type of tariff contracted (2.X: households and SMEs, 3.X: SMEs with high consumption, 6.X: industries, large commercial areas, and farms)
· “self_consumption_type”: the type of self-consumption to which the users are subscribed
· “p1”, “p2”, “p3”, “p4”, “p5”, “p6”: contracted power (in kW) for each of the six time slots
· “province”: province where the user is located
· “municipality”: municipality where the user is located (municipalities below 50,000 inhabitants have been removed)
· “zip_code”: post code (post codes of municipalities below 50,000 inhabitants have been removed)
· “cnae”: CNAE (Clasificación Nacional de Actividades Económicas) code for economic activity classification
5 star: ⭐⭐⭐
Preprocessing steps: Data cleaning (imputation of missing values using the Last Observation Carried Forward algorithm using weekly seasons); data integration (combination of multiple SIMEL files, i.e. the data sources); data transformation (anonymization, unit conversion, metadata generation).
Reuse: This dataset is related to the following datasets:
· "A database of features extracted from different electricity load profiles datasets" (DOI 10.5281/zenodo.7382818), where time series feature extraction has been performed.
· "Measuring the flexibility achieved by a change of tariff" (DOI 10.5281/zenodo.7382924), where the metadata has been extended to include the results of a socio-economic characterization and the answers to a survey about barriers to adapting to a change of tariff.
Update policy: There might be a single update in mid-2023.
Ethics and legal aspects: The data provided by GoiEner contained values of the CUPS (Meter Point Administration Number), which are personal data. A pre-processing step has been carried out to replace the CUPS by random 64-character hashes.
Technical aspects: raw.tzst contains a 15.1 GB folder with 25,559 CSV files; imp-pre.tzst contains a 6.28 GB folder with 12,149 CSV files; imp-in.tzst contains a 4.36 GB folder with 15,562 CSV files; and imp-post.tzst contains a 4.01 GB folder with 17,519 CSV files.
Other: None.
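The relationship between the sample-count columns of metadata.csv can be illustrated with a minimal sketch. It assumes an hourly series in which both the first and the last timestamp carry a sample; the function name `sample_counts` and the example dates are hypothetical and are not taken from the dataset.

```python
from datetime import datetime


def sample_counts(start, end, actual_samples, step_hours=1):
    """Return (potential_samples, missing_samples_abs, missing_samples_pct).

    Assumption: the series is sampled every `step_hours` hours and both the
    first and the last timestamp are counted as samples.
    """
    span_hours = (end - start).total_seconds() / 3600
    potential = int(span_hours // step_hours) + 1
    missing_abs = potential - actual_samples
    missing_pct = 100.0 * missing_abs / potential
    return potential, missing_abs, missing_pct


# Hypothetical two-day series with 45 of the expected hourly samples present.
start = datetime(2021, 1, 1, 0)
end = datetime(2021, 1, 2, 23)
print(sample_counts(start, end, actual_samples=45))
```

Whether the published “potential_samples” column counts both end points in the same way is not stated in the description, so the inclusive count here is only one plausible reading.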
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Kaggle London Smart Meters dataset contains 5560 half-hourly time series that represent the energy consumption readings of London households in kilowatt-hours (kWh) from November 2011 to February 2014.
The original dataset contains missing values. They have been replaced by carrying forward the corresponding last observations (LOCF method).
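As a minimal sketch of the LOCF idea described above, the following stdlib-only function fills each missing reading with the most recent observed value. Representing missing values as `None` is an assumption made for illustration; the actual files may encode gaps differently.

```python
def locf(values):
    """Fill missing readings (None) with the last observed value.

    Leading gaps stay as None because there is nothing to carry forward.
    """
    filled = []
    last = None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Hypothetical half-hourly readings in kWh with three gaps.
readings = [0.12, None, None, 0.30, None, 0.25]
print(locf(readings))
```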
This high-frequency smart meter dataset, sampled at three-minute intervals, provides information on urban household electricity consumption patterns from nearly 100 smart meters installed in the Mathura and Bareilly districts of Uttar Pradesh, India, from May 2019 to October 2021. The data also provides information on the power supply situation (including hours of power supply, voltage, current withdrawn and other related variables). The data has various use cases for researchers and practitioners working in the power sector. For instance, power distribution companies (discoms) can utilise the smart meter data for effective service delivery and demand management.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains half hourly smart meter measurements of 4443 households, obtained during the Low Carbon London project, during 2013.
It is a refactored version of the data released by UK Power Networks under CC-BY license. The following filters have been applied:
Description of the data format:
Note: a cleaner version of the same data set, accompanied by survey data, is available under a more restrictive license at DOI: http://doi.org/10.5255/UKDA-SN-7857-2.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The PREMISES Smart Meter Measurement dataset contains electrical consumption data, as well as water and gas consumption when available, for 3 households in Belgium. The measurements were obtained using the local P1-Port interface and the open software tool: https://github.com/ejpalacios/p1-reader-premises. Electricity measurements are sampled at 1-second intervals, while gas and water cumulative values are given every 5 minutes. Metadata on household composition and energy efficiency (EPC and E levels) are provided. Likewise, we include details on whether the house has on-site PV generation, electric vehicle charger, and heat pump. This work was supported by the Research Foundation Flanders (FWO) Marie Skłodowska-Curie Actions - Seal of Excellence Postdoctoral Fellowships, under the project: "PREMISES: PRoviding Energy Metering Infrastructures with Secure Extended Services" Grant number: 12ZZV22N Data donated with the explicit permission of the subjects.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For processed data files more suitable for use, please refer to the GoiEner Smart Meters Data dataset: DOI 10.5281/zenodo.7362094
The GoiEner Raw Files dataset contains a complete set of raw data from the customer database of the Spanish renewable energy cooperative GoiEner, obtained from smart meters. Founded in the Basque Country in 2012, GoiEner has made available this extensive dataset, which includes the raw electricity consumption data (and self-generation) for all its customers. The supply points provided comprise a diverse range of customers, such as households, offices, small and medium-sized enterprises (SMEs), industrial buildings, and public facilities.
This dataset spans the entire period from the first records using smart meters in late 2014 until June 2022. It consists of 71,048 files containing information on consumption, generation, contracted power, pricing, and other related data for each supply point in the grid. These files serve as the primary source for managing electricity supply contracts between distributors and retailers, as well as for user billing. The formats and specifications of the files adhere to the guidelines set by the Spanish electricity market regulator, the National Commission of Markets and Competition (CNMC), which establishes the Electricity Metering Information System (SIMEL).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains detailed electricity measurements of appliances in a collective living (co-living) student apartment at KTH Live-in-Lab. The measurements included are RMS voltage, RMS current, real power, and power factor. The data was collected over a period of 277 days between 28th August 2020 and 31st May 2021 with 1 second time resolution. The compressed file "appliance_csv.zip" contains data in plain CSV file format, whereas "appliance_mat.zip" contains data in MATLAB file format.
For more detailed information and insights into the dataset and the data collection process, refer to the following article. Kindly cite this when using the dataset.
CAMSL is the first public dataset for a Time-Of-Use (TOU) tariff intervention study using smart-meter data that covers the periods before, during and after the TOU intervention. It includes 1423 households (1023 TOU users and 400 Non-TOU users) in Tokyo between 1st July 2017 and 31st December 2018 (18 months). The dataset also includes raw data of 3337 customers who did not participate in the TOU trial. Each day has 48 half-hourly data points for energy consumption from a smart meter, and each household has 579 days between 1 July 2017 and 31 December 2018, giving a total of 27792 electricity consumption data points per household. What makes this dataset unique is the online engagement data recorded via web-application usage, which enables further studies of gamification effects.
consumption_data.zip: half-hourly consumption data from 1st June 2017 to 31st December 2018
customer_info.csv: customer information (house_type, number_of_residents, tou), where TOU users == 1 and Control users == 0. As for the selection of Control users, please refer to the article.
web_info.csv: web activity information (for TOU customers) (sessions, average_session_duration, bounce_rate), already padded if the value is missing.
temperature_Tokyo.csv: hourly temperature data in Tokyo from 1st June 2017 to 31st December 2018
holidays.csv: Japanese national holidays
non_tou.csv.gz: raw consumption data of the total of 3337 customers who did not participate in the TOU trial
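The per-household figures quoted above can be cross-checked with simple date arithmetic. The sketch below assumes that both the first and the last day are counted; under that assumption, 579 days and 27792 half-hourly points correspond to the 1st June 2017 start given for consumption_data.zip. The helper name `inclusive_days` is hypothetical.

```python
from datetime import date

POINTS_PER_DAY = 48  # half-hourly readings


def inclusive_days(first, last):
    """Number of calendar days from `first` to `last`, counting both ends."""
    return (last - first).days + 1


end = date(2018, 12, 31)
days_from_june = inclusive_days(date(2017, 6, 1), end)
days_from_july = inclusive_days(date(2017, 7, 1), end)

print(days_from_june, days_from_june * POINTS_PER_DAY)  # 579 27792
print(days_from_july, days_from_july * POINTS_PER_DAY)  # 549 26352
```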
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
This dataset offers valuable insights into yearly domestic water consumption across various Lower Super Output Areas (LSOAs) or Data Zones, accompanied by the count of water meters within each area. It is instrumental for analysing residential water use patterns, facilitating water conservation efforts, and guiding infrastructure development and policy making at a localised level.
Key Definitions
Aggregation
The process of summarising or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes.
AMR Meter
Automatic meter reading (AMR) is the technology of automatically collecting consumption, diagnostic, and status data from a water meter remotely and periodically.
Dataset
Structured and organised collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Zone
Data zones are the key geography for the dissemination of small area statistics in Scotland.
Dumb Meter
A dumb meter or analogue meter is read manually. It does not have any external connectivity.
Granularity
Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours.
ID
Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.
LSOA
Lower Layer Super Output Areas (LSOA) are a geographic hierarchy designed to improve the reporting of small area statistics in England and Wales.
Open Data Triage
The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
Schema
Structure for organising and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.
Smart Meter
A smart meter is an electronic device that records information and communicates it to the consumer and the supplier. It differs from automatic meter reading (AMR) in that it enables two-way communication between the meter and the supplier.
Units
Standard measurements used to quantify and compare different physical quantities.
Water Meter
Water metering is the practice of measuring water use. Water meters measure the volume of water used by residential and commercial building units that are supplied with water by a public water supply system.
Data History
Data Origin
Domestic consumption data is recorded using water meters. The consumption recorded is then sent back to water companies. This dataset is extracted from the water companies.
Data Triage Considerations
This section discusses the careful handling of data to maintain anonymity and addresses the challenges associated with data updates, such as identifying household changes or meter replacements.
Identification of Critical Infrastructure
This aspect is not applicable for the dataset, as the focus is on domestic water consumption and does not contain any information that reveals critical infrastructure details.
Commercial Risks and Anonymisation
Individual Identification Risks
There is a potential risk of identifying individuals or households if the consumption data is updated irregularly (e.g., every 6 months) and an out-of-cycle update occurs (e.g., after 2 months), which could signal a change in occupancy or ownership. Such patterns need careful handling to avoid accidental exposure of sensitive information.
Meter and Property Association
Challenges arise in maintaining historical data integrity when meters are replaced but the property remains the same. Ensuring continuity in the data without revealing personal information is crucial.
Interpretation of Null Consumption
Instances of null consumption could be misunderstood as a lack of water use, whereas they might simply indicate missing data. Distinguishing between these scenarios is vital to prevent misleading conclusions.
Meter Re-reads
The dataset must account for instances where meters are read multiple times for accuracy.
Joint Supplies & Multiple Meters per Household
Special consideration is required for households with multiple meters as well as multiple households that share a meter as this could complicate data aggregation.
Schema Consistency with the Energy Industry:
In formulating the schema for the domestic water consumption dataset, careful consideration was given to the potential risks to individual privacy. This evaluation included examining the frequency of data updates, the handling of property and meter associations, interpretations of null consumption, meter re-reads, joint supplies, and the presence of multiple meters within a single household as described above.
After a thorough assessment of these factors and their implications for individual privacy, it was decided to align the dataset's schema with the standards established within the energy industry. This decision was influenced by the energy sector's experience and established practices in managing similar risks associated with smart meters. This ensures a high level of data integrity and privacy protection.
Schema
The dataset schema is aligned with those used in the energy industry, which has encountered similar challenges with smart meters. However, it is important to note that the energy industry has a much higher density of meter distribution, especially smart meters.
Aggregation to Mitigate Risks
The dataset employs an elevated level of data aggregation to minimise the risk of individual identification. This approach is crucial in maintaining the utility of the dataset while ensuring individual privacy. The aggregation level is carefully chosen to remove identifiable risks without excluding valuable data, thus balancing data utility with privacy concerns.
Data Freshness
Users should be aware that this dataset reflects historical consumption patterns and does not represent real-time data.
Publish Frequency
Annually
Data Triage Review Frequency
An annual review is conducted to ensure the dataset's relevance and accuracy, with adjustments made based on specific requests or evolving data trends.
Data Specifications
For the domestic water consumption dataset, the data specifications are designed to ensure comprehensiveness and relevance, while maintaining clarity and focus. The specifications for this dataset include:
· Each dataset encompasses recordings of domestic water consumption as measured and reported by the data publisher. It excludes commercial consumption.
· Where it is necessary to estimate consumption, this is calculated based on actual meter readings.
· Meters of all types (smart, dumb, AMR) are included in this dataset.
· The dataset is updated and published annually.
· Historical data may be made available to facilitate trend analysis and comparative studies, although it is not mandatory for each dataset release.
Context
Users are cautioned against using the dataset for immediate operational decisions regarding water supply management. The data should be interpreted considering potential seasonal and weather-related influences on water consumption patterns.
The geographical data provided does not pinpoint locations of water meters within an LSOA.
The dataset aims to cover a broad spectrum of households, from single-meter homes to those with multiple meters, to accurately reflect the diversity of water use within an LSOA.
Supplementary Information
Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.
Ofwat guidance on water meters
https://www.ofwat.gov.uk/wp-content/uploads/2015/11/prs_lft_101117meters.pdf
Energy consumption readings for a sample of 5,567 London Households that took part in the UK Power Networks led Low Carbon London project between November 2011 and February 2014.
Readings were taken at half hourly intervals. Households have been allocated to a CACI Acorn group (2010). The customers in the trial were recruited as a balanced sample representative of the Greater London population.
The dataset contains energy consumption in kWh (per half hour), a unique household identifier, date and time, and CACI Acorn group. The CSV file is around 10 GB when unzipped and contains around 167 million rows.
Within the data set are two groups of customers. The first is a sub-group of approximately 1100 customers who were subjected to Dynamic Time of Use (dToU) energy prices throughout the 2013 calendar year. The tariff prices were given a day ahead via the Smart Meter IHD (In Home Display) or by text message to a mobile phone. Customers were issued High (67.20p/kWh), Low (3.99p/kWh) or Normal (11.76p/kWh) price signals and the times of day these applied. The dates/times and the price signal schedule are available as part of this dataset. All non-Time of Use customers were on a flat rate tariff of 14.228p/kWh.
The signals given were designed to be representative of the types of signal that may be used in the future to manage high renewable generation (supply-following operation) and to test the potential of high price signals to reduce stress on local distribution grids.
The energy consumption readings of the remaining sample of approximately 4500 customers were not subject to the dToU tariff.
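To make the tariff comparison concrete, the sketch below prices the same set of half-hourly readings under the dToU price signals and under the flat rate quoted above. The list-of-tuples input format and the example readings are assumptions made for illustration; they do not reflect the layout of the published CSV or of the price signal schedule.

```python
# Prices in pence per kWh, as quoted in the dataset description.
DTOU_PRICE = {"High": 67.20, "Low": 3.99, "Normal": 11.76}
FLAT_PRICE = 14.228


def dtou_cost(readings):
    """Cost in pence of (kwh, signal) readings under the dToU tariff."""
    return sum(kwh * DTOU_PRICE[signal] for kwh, signal in readings)


def flat_cost(readings):
    """Cost in pence of the same readings under the flat-rate tariff."""
    return sum(kwh for kwh, _ in readings) * FLAT_PRICE


# Hypothetical half-hourly readings: (energy in kWh, price signal in force).
readings = [(0.5, "Normal"), (0.2, "High"), (1.0, "Low")]
print(round(dtou_cost(readings), 2), round(flat_cost(readings), 2))
```

In practice each reading would first have to be matched to the price signal in force at its timestamp, using the schedule distributed with the dataset.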
More information can be found on the Low Carbon London webpage
Some analysis of this data can be seen here.
https://spdx.org/licenses/CC0-1.0.html
This dataset contains electric power consumption data from the Los Alamos Public Utility Department (LADPU) in New Mexico, USA. The data was collected by Landis+Gyr smart meter devices in 1,757 households at North Mesa, Los Alamos, NM. The sampling rate is one observation every fifteen minutes (i.e., 96 observations per day). For most customers, the data spans about six years, from July 30, 2013 to December 30, 2019. However, for some customers, the period is shorter. The dataset contains missing values and duplicated measurements.
Methods This dataset is provided in its original format, without cleaning or pre-processing. The only procedure performed was anonymization. Thus, the data is not normalized, and it has missing values and duplicate entries (i.e., more than one measurement for the same time). However, these issues affect only a small portion of the data.
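Because the data is published uncleaned, a first pass typically has to deal with duplicate and missing timestamps. The following stdlib-only sketch shows one possible approach on a 15-minute grid. Keeping the first of several duplicate readings is an arbitrary policy chosen for illustration, and the `(timestamp, kwh)` row format is an assumption rather than the dataset's actual layout.

```python
from datetime import datetime, timedelta

STEP = timedelta(minutes=15)  # sampling interval stated in the description


def clean_series(rows):
    """Drop duplicate timestamps and list the missing 15-minute slots.

    rows: iterable of (timestamp, kwh) tuples, in any order.
    Returns (ordered, missing): the de-duplicated readings sorted by time,
    and the timestamps absent between the first and last reading.
    """
    series = {}
    for timestamp, kwh in rows:
        # Assumption: when a timestamp repeats, keep the first reading seen.
        series.setdefault(timestamp, kwh)
    ordered = sorted(series.items())
    missing = []
    if ordered:
        current, last = ordered[0][0], ordered[-1][0]
        while current <= last:
            if current not in series:
                missing.append(current)
            current += STEP
    return ordered, missing


t0 = datetime(2019, 1, 1, 0, 0)
rows = [(t0, 0.10), (t0, 0.11), (t0 + 3 * STEP, 0.20)]
ordered, missing = clean_series(rows)
print(len(ordered), [m.strftime("%H:%M") for m in missing])
```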
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is maintained by Steven Firth (s.k.firth@lboro.ac.uk), Building Energy Research Group (BERG), School of Civil and Building Engineering, Loughborough University. The REFIT project (www.refitsmarthomes.org) carried out a study from 2013 to 2015 in which 20 UK homes were upgraded to Smart Homes through the installation of devices including Smart Meters, programmable thermostats, programmable radiator valves, motion sensors, door sensors and window sensors. Data was collected using building surveys, sensor placements and household interviews. The REFIT Smart Home dataset is one of the datasets made publicly available by the project. This dataset includes:
- Building survey data for the 20 homes.
- Sensor measurements made before the Smart Home equipment was installed.
- Sensor measurements made after the Smart Home equipment was installed.
- Climate data recorded at a nearby weather station.
This work has been carried out as part of the REFIT project (‘Personalised Retrofit Decision Support Tools for UK Homes using Smart Home Technology’, Grant Reference EP/K002457/1). REFIT is a consortium of three universities - Loughborough, Strathclyde and East Anglia - and ten industry stakeholders funded by the Engineering and Physical Sciences Research Council (EPSRC) under the Transforming Energy Demand in Buildings through Digital Innovation (BuildTEDDI) funding programme. For more information see: www.epsrc.ac.uk and www.refitsmarthomes.org
The references below provide links to the REFIT project website, the TEDDINET website, a journal article which uses the dataset, and three additional datasets collected as part of the REFIT project by the University of Strathclyde and the University of East Anglia.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data Between September 2021 and September 2022, we collected the cumulative water consumption data of 17 households in Germany using the commercially available smart meters Hydrus 1.3, DN 20,00 from Diehl Metering. Further information on data collection and description of the dataset can be found in section 3 of the original article, which is available at the following DOI: https://doi.org/10.3390/jsan12030046
Structure The root folder of the Zenodo archive contains the documents readme.md, info.txt, and info.json, as well as the 17 folders for the individual households. The readme.md gives some basic information on the dataset and its use. The documents main.txt and main.json contain the metadata for the measurements of all households in both human- and machine-readable formats. For each test household, there is a separate folder with the documents info.txt and info.json, which contain household-specific metadata. The total water consumption measurements are stored in smartmeter.csv.
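Since the meters report cumulative consumption, most analyses first convert the readings into consumption per interval. The sketch below illustrates that differencing step only. The column layout of smartmeter.csv is not described here, so the function works on hypothetical in-memory (timestamp, reading) pairs, and treating a decreasing reading as unknown is an assumption rather than the authors' rule.

```python
def interval_consumption(cumulative):
    """Convert cumulative meter readings into per-interval consumption.

    cumulative: list of (timestamp, reading) pairs sorted by time
                (hypothetical layout, not the actual CSV schema).
    Returns (timestamp, delta) pairs; each delta is attributed to the later
    of the two readings. A negative delta (for example after a meter
    replacement or counter reset) is returned as None rather than guessed.
    """
    result = []
    for (_, prev), (ts, cur) in zip(cumulative, cumulative[1:]):
        delta = cur - prev
        result.append((ts, delta if delta >= 0 else None))
    return result


readings = [("t0", 100.0), ("t1", 100.5), ("t2", 101.25), ("t3", 3.0)]
deltas = interval_consumption(readings)
```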
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data History
Data Origin
Domestic consumption data is recorded using water meters. The consumption recorded is then sent back to water companies. This dataset is extracted from the water companies.
Data Triage Considerations
This section discusses the careful handling of data to maintain anonymity and addresses the challenges associated with data updates, such as identifying household changes or meter replacements.
Identification of Critical Infrastructure
This aspect is not applicable for the dataset, as the focus is on domestic water consumption and does not contain any information that reveals critical infrastructure details.
Individual Identification Risks
There is a potential risk of identifying individuals or households if the consumption data is updated irregularly (e.g., every 6 months) and an out-of-cycle update occurs (e.g., after 2 months), which could signal a change in occupancy or ownership. Such patterns need careful handling to avoid accidental exposure of sensitive information.
Meter and Property Association
Challenges arise in maintaining historical data integrity when meters are replaced but the property remains the same. Ensuring continuity in the data without revealing personal information is crucial.
Interpretation of Null Consumption
Instances of null consumption could be misunderstood as a lack of water use, whereas they might simply indicate missing data. Distinguishing between these scenarios is vital to prevent misleading conclusions.
Meter Re-reads
The dataset must account for instances where meters are read multiple times for accuracy.
Joint Supplies & Multiple Meters per Household
Special consideration is required for households with multiple meters as well as multiple households that share a meter, as this could complicate data aggregation.
Schema Consistency with the Energy Industry
In formulating the schema for the domestic water consumption dataset, careful consideration was given to the potential risks to individual privacy. This evaluation included examining the frequency of data updates, the handling of property and meter associations, interpretations of null consumption, meter re-reads, joint supplies, and the presence of multiple meters within a single household as described above.
After a thorough assessment of these factors and their implications for individual privacy, it was decided to align the dataset's schema with the standards established within the energy industry. This decision was influenced by the energy sector's experience and established practices in managing similar risks associated with smart meters. This ensures a high level of data integrity and privacy protection.
Schema
The dataset schema is aligned with those used in the energy industry, which has encountered similar challenges with smart meters. However, it is important to note that the energy industry has a much higher density of meter distribution, especially smart meters.
Aggregation to Mitigate Risks
The dataset employs an elevated level of data aggregation to minimise the risk of individual identification. This approach is crucial in maintaining the utility of the dataset while ensuring individual privacy. The aggregation level is carefully chosen to remove identifiable risks without excluding valuable data, thus balancing data utility with privacy concerns.
Data Freshness
Users should be aware that this dataset reflects historical consumption patterns and does not represent real-time data.
Publish Frequency
Annually
Data Triage Review Frequency
An annual review is conducted to ensure the dataset's relevance and accuracy, with adjustments made based on specific requests or evolving data trends.
Data Specifications
For the domestic water consumption dataset, the data specifications are designed to ensure comprehensiveness and relevance, while maintaining clarity and focus. The specifications for this dataset include:
• Each dataset encompasses recordings of domestic water consumption as measured and reported by the data publisher. It excludes commercial consumption.
• Consumption for calendar year dates is calculated based on actual meter readings.
• Meters of all types (smart, dumb, AMR) are included in this dataset.
• The dataset is updated and published annually.
• The dataset includes LSOAs with 10 or more meters. Any LSOAs with fewer than 10 meters have been excluded.
• The dataset includes only meters that are currently shown as active.
• The dataset excludes any meters where consumption is recorded as null.
Context
Users are cautioned against using the dataset for immediate operational decisions regarding water supply management. The data should be interpreted considering potential seasonal and weather-related influences on water consumption patterns.
The geographical data provided does not pinpoint locations of water meters within an LSOA.
The dataset aims to cover a broad spectrum of households, from single-meter homes to those with multiple meters, to accurately reflect the diversity of water use within an LSOA.
This dataset has been calculated using actual read data. To align reads to the calendar year, our approach uses a combination of previous and next usage readings, along with the number of days between these readings, to calculate the total consumption.
Supplementary Information
Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset:
Ofwat guidance on water meters
Data Schema
DATA_SOURCE: Company that provided the data
YEAR: The calendar year covered by the data
LSOA_CODE: LSOA or Data Zone converted code of the meter location
NUMBER_OF_METERS: Number of meters within an LSOA
TOTAL_CONSUMPTION: Total consumption within the LSOA
TOTAL_CONSUMPTION_UNITS: Units for total consumption
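The calendar-year alignment mentioned under Context is not specified in detail. One plausible reading, shown here purely as an illustration, is to interpolate the cumulative reading linearly at each 1 January from the readings taken before and after it, weighting by the number of days between them. The function names and the (date, reading) pair layout are hypothetical, and the publisher's actual calculation may differ.

```python
from datetime import date


def reading_at(target, before, after):
    """Linearly interpolate a cumulative meter reading at `target`.

    before / after: (date, cumulative_reading) pairs bracketing `target`.
    """
    (d0, r0), (d1, r1) = before, after
    span = (d1 - d0).days
    if span <= 0:
        raise ValueError("readings must be in chronological order")
    return r0 + (r1 - r0) * (target - d0).days / span


def calendar_year_consumption(year, start_pair, end_pair):
    """Estimate consumption for one calendar year.

    start_pair / end_pair: each a (before, after) pair of readings that
    bracket 1 January of `year` and 1 January of `year + 1` respectively.
    """
    start = reading_at(date(year, 1, 1), *start_pair)
    end = reading_at(date(year + 1, 1, 1), *end_pair)
    return end - start


# Illustrative readings only (not taken from the dataset).
start_pair = ((date(2022, 10, 3), 1000.0), (date(2023, 4, 3), 1182.0))
end_pair = ((date(2023, 10, 3), 1365.0), (date(2024, 4, 2), 1547.0))
total_2023 = calendar_year_consumption(2023, start_pair, end_pair)
```

Linear interpolation assumes constant daily use between two reads, which ignores the seasonal effects the Context section warns about.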
Heat pumps are essential for decarbonizing residential heating but consume substantial electrical energy, impacting operational costs and grid demand. Many systems run inefficiently due to planning flaws, operational faults, or misconfigurations. While optimizing performance requires skilled professionals, labor shortages hinder large-scale interventions. However, digital tools and improved data availability create new service opportunities for energy efficiency, predictive maintenance, and demand-side management. To support research and practical solutions, we present an open-source dataset of electricity consumption from 1,408 households with heat pumps and smart electricity meters in the canton of Zurich, Switzerland, recorded at 15-minute and daily resolutions between 2018-11-03 and 2024-03-21. The dataset includes household metadata, weather data from 8 stations, and ground truth data from 410 field visit protocols collected by energy consultants during system optimizations. Additionally, the dataset includes a Python-based data loader to facilitate seamless data processing and exploration.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Energy consumption readings for a sample of 5,567 London Households that took part in the UK Power Networks led Low Carbon London project between November 2011 and February 2014.
Readings were taken at half hourly intervals. The customers in the trial were recruited as a balanced sample representative of the Greater London population.
The dataset contains energy consumption in kWh (per half hour), a unique household identifier, and the date and time. The CSV file is around 10 GB when unzipped and contains around 167 million rows.
Within the dataset are two groups of customers. The first is a sub-group of approximately 1,100 customers who were subjected to Dynamic Time of Use (dToU) energy prices throughout the 2013 calendar year. The tariff prices were given a day ahead via the Smart Meter IHD (In Home Display) or by text message to a mobile phone. Customers were issued High (67.20p/kWh), Low (3.99p/kWh) or Normal (11.76p/kWh) price signals and the times of day these applied. The dates/times and the price signal schedule are available as part of this dataset. All non-Time of Use customers were on a flat rate tariff of 14.228p/kWh.
The signals were designed to be representative of the types of signal that may be used in the future to manage high renewable generation (supply-following operation), and also to test the potential of high price signals to reduce load on local distribution grids during periods of stress.
The remaining sample of approximately 4,500 customers was not subject to the dToU tariff.
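To make the price figures above concrete, the following sketch computes the cost of a set of half-hourly readings under the dToU signals and under the flat rate. The prices are the ones quoted in the description; the function name, the in-memory form of the signal schedule, and the fallback to the Normal price when no signal is listed are assumptions for illustration.

```python
# Prices in pence per kWh, as quoted in the dataset description.
DTOU_PRICES = {"High": 67.20, "Low": 3.99, "Normal": 11.76}
FLAT_PRICE = 14.228


def bill_pence(readings, signals=None):
    """Total cost in pence for half-hourly readings.

    readings: list of (timestamp, kwh) pairs.
    signals:  optional dict mapping timestamp -> "High" | "Low" | "Normal"
              (a hypothetical in-memory form of the price-signal schedule).
              Timestamps without an entry are assumed to be "Normal".
              When signals is None, the flat rate applies to every reading.
    """
    total = 0.0
    for ts, kwh in readings:
        if signals is None:
            price = FLAT_PRICE
        else:
            price = DTOU_PRICES[signals.get(ts, "Normal")]
        total += kwh * price
    return total


readings = [("a", 1.0), ("b", 2.0), ("c", 0.5)]
signals = {"a": "High", "b": "Low"}
dtou_cost = bill_pence(readings, signals)   # 67.20 + 7.98 + 5.88 pence
flat_cost = bill_pence(readings)            # 3.5 kWh at the flat rate
```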
More information can be found on the Low Carbon London webpage
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data History
Data Origin
Domestic consumption data is recorded using water meters. The consumption recorded is then sent back to water companies. This dataset is extracted from the water companies.
Data Triage Considerations
This section discusses the careful handling of data to maintain anonymity and addresses the challenges associated with data updates, such as identifying household changes or meter replacements.
Identification of Critical Infrastructure
This aspect is not applicable for the dataset, as the focus is on domestic water consumption and does not contain any information that reveals critical infrastructure details.
Commercial Risks and Anonymisation
Individual Identification Risks
There is a potential risk of identifying individuals or households if the consumption data is updated irregularly (e.g., every 6 months) and an out-of-cycle update occurs (e.g., after 2 months), which could signal a change in occupancy or ownership. Such patterns need careful handling to avoid accidental exposure of sensitive information.
Meter and Property Association
Challenges arise in maintaining historical data integrity when meters are replaced but the property remains the same. Ensuring continuity in the data without revealing personal information is crucial.
Interpretation of Null Consumption
Instances of null consumption could be misunderstood as a lack of water use, whereas they might simply indicate missing data. Distinguishing between these scenarios is vital to prevent misleading conclusions.
Meter Re-reads
The dataset must account for instances where meters are read multiple times for accuracy.
Joint Supplies & Multiple Meters per Household
Special consideration is required for households with multiple meters as well as multiple households that share a meter as this could complicate data aggregation.
Schema Consistency with the Energy Industry
In formulating the schema for the domestic water consumption dataset, careful consideration was given to the potential risks to individual privacy. This evaluation included examining the frequency of data updates, the handling of property and meter associations, interpretations of null consumption, meter re-reads, joint supplies, and the presence of multiple meters within a single household as described above.
After a thorough assessment of these factors and their implications for individual privacy, it was decided to align the dataset's schema with the standards established within the energy industry. This decision was influenced by the energy sector's experience and established practices in managing similar risks associated with smart meters. This ensures a high level of data integrity and privacy protection.
Schema
The dataset schema is aligned with those used in the energy industry, which has encountered similar challenges with smart meters. However, it is important to note that the energy industry has a much higher density of meter distribution, especially smart meters.
Aggregation to Mitigate Risks
The dataset employs an elevated level of data aggregation to minimise the risk of individual identification. This approach is crucial in maintaining the utility of the dataset while ensuring individual privacy. The aggregation level is carefully chosen to remove identifiable risks without excluding valuable data, thus balancing data utility with privacy concerns.
Data Freshness
Users should be aware that this dataset reflects historical consumption patterns and does not represent real-time data.
Publish Frequency
Weekly
Data Triage Review Frequency
An annual review is conducted to ensure the dataset's relevance and accuracy, with adjustments made based on specific requests or evolving data trends.
Data Specifications
For the domestic water consumption dataset, the data specifications are designed to ensure comprehensiveness and relevance, while maintaining clarity and focus. The specifications for this dataset include:
• Each dataset encompasses recordings of domestic water consumption as measured and reported by the data publisher. It excludes commercial consumption.
• Where it is necessary to estimate consumption, this is calculated based on actual meter readings.
• Meters of all types (smart, dumb, AMR) are included in this dataset.
• The dataset is updated and published weekly.
• Historical data may be made available to facilitate trend analysis and comparative studies, although it is not mandatory for each dataset release.
• The dataset includes LSOAs with 2 or more meters. Any LSOAs with fewer than 2 meters have been excluded.
• Consumption data is only included where we have the full consumption data for a year for a given meter.
Context Users are cautioned against using the dataset for immediate operational decisions regarding water supply management. The data should be interpreted considering potential seasonal and weather-related influences on water consumption patterns.
The geographical data provided does not pinpoint locations of water meters within an LSOA.
The dataset aims to cover a broad spectrum of households, from single-meter homes to those with multiple meters, to accurately reflect the diversity of water use within an LSOA.
Supplementary Information
Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.
1. Ofwat guidance on water meters: https://www.ofwat.gov.uk/wp-content/uploads/2015/11/prs_lft_101117meters.pdf
Data Schema
DATA_SOURCE: Company that provided the data
YEAR: The calendar year covered by the data
LSOA_CODE: LSOA or Data Zone converted code of the meter location
NUMBER_OF_METERS: Number of meters within an LSOA
TOTAL_CONSUMPTION: Average consumption within the LSOA
TOTAL_CONSUMPTION_UNITS: Units for average consumption
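The aggregation and minimum-meter rule described in the specifications can be sketched as follows. This is an illustration under stated assumptions, not the publisher's pipeline: the per-meter input layout and the function name are hypothetical, and skipping meters with no consumption value is an assumed reading of the full-year requirement. The sketch reports a sum in TOTAL_CONSUMPTION; because the schema above describes that field as an average, dividing by NUMBER_OF_METERS may be needed to match the published values.

```python
from collections import defaultdict


def aggregate_by_lsoa(meter_rows, data_source, year, units, min_meters=2):
    """Aggregate per-meter annual consumption into LSOA-level rows.

    meter_rows: iterable of (lsoa_code, consumption) pairs, one per meter
                (hypothetical layout). Meters whose consumption is None
                are skipped (an assumption).
    min_meters: LSOAs with fewer meters than this are suppressed.
    """
    totals = defaultdict(lambda: [0, 0.0])  # lsoa -> [meter count, sum]
    for lsoa, consumption in meter_rows:
        if consumption is None:
            continue
        totals[lsoa][0] += 1
        totals[lsoa][1] += consumption

    return [
        {
            "DATA_SOURCE": data_source,
            "YEAR": year,
            "LSOA_CODE": lsoa,
            "NUMBER_OF_METERS": count,
            "TOTAL_CONSUMPTION": total,
            "TOTAL_CONSUMPTION_UNITS": units,
        }
        for lsoa, (count, total) in sorted(totals.items())
        if count >= min_meters
    ]


# Illustrative values only; the codes, source name and units are made up.
rows = [
    ("E01", 100.0), ("E01", 50.0),
    ("E02", 70.0),                      # single meter: suppressed
    ("E03", None), ("E03", 10.0), ("E03", 20.0),
]
result = aggregate_by_lsoa(rows, "ExampleWaterCo", 2023, "m3")
```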
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
In recent years, utility companies in several provinces have started installing wireless smart meters in Canadian businesses and residences. Some people have expressed concern about the possibility of health effects from exposure to the radiofrequency fields that these devices emit.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
This dataset offers valuable insights into yearly domestic water consumption across various Lower Super Output Areas (LSOAs) or Data Zones, accompanied by the count of water meters within each area. It is instrumental for analysing residential water use patterns, facilitating water conservation efforts, and guiding infrastructure development and policy making at a localised level.
Key Definitions
Aggregation: The process of summarising or grouping data to obtain a single or reduced set of information, often for analysis or reporting purposes.
AMR Meter: Automatic meter reading (AMR) is the technology of automatically collecting consumption, diagnostic, and status data from a water meter remotely and periodically.
Dataset: Structured and organised collection of related elements, often stored digitally, used for analysis and interpretation in various fields.
Data Zone: Data zones are the key geography for the dissemination of small area statistics in Scotland.
Dumb Meter: A dumb meter or analogue meter is read manually. It does not have any external connectivity.
Granularity: Data granularity is a measure of the level of detail in a data structure. In time-series data, for example, the granularity of measurement might be based on intervals of years, months, weeks, days, or hours.
ID: Abbreviation for Identification that refers to any means of verifying the unique identifier assigned to each asset for the purposes of tracking, management, and maintenance.
LSOA: Lower Layer Super Output Areas (LSOA) are a geographic hierarchy designed to improve the reporting of small area statistics in England and Wales.
Open Data Triage: The process carried out by a Data Custodian to determine if there is any evidence of sensitivities associated with Data Assets, their associated Metadata and Software Scripts used to process Data Assets if they are used as Open Data.
Schema: Structure for organising and handling data within a dataset, defining the attributes, their data types, and the relationships between different entities. It acts as a framework that ensures data integrity and consistency by specifying permissible data types and constraints for each attribute.
Smart Meter: A smart meter is an electronic device that records information and communicates it to the consumer and the supplier. It differs from automatic meter reading (AMR) in that it enables two-way communication between the meter and the supplier.
Units: Standard measurements used to quantify and compare different physical quantities.
Water Meter: Water metering is the practice of measuring water use. Water meters measure the volume of water used by residential and commercial building units that are supplied with water by a public water supply system.
Data History
Data Origin
Domestic consumption data is recorded using water meters. The consumption recorded is then sent back to water companies. This dataset is extracted from the water companies.
Data Triage Considerations
This section discusses the careful handling of data to maintain anonymity and addresses the challenges associated with data updates, such as identifying household changes or meter replacements.
Identification of Critical Infrastructure
This aspect is not applicable for the dataset, as the focus is on domestic water consumption and does not contain any information that reveals critical infrastructure details.
Commercial Risks and Anonymisation
Individual Identification Risks
There is a potential risk of identifying individuals or households if the consumption data is updated irregularly (e.g., every 6 months) and an out-of-cycle update occurs (e.g., after 2 months), which could signal a change in occupancy or ownership. Such patterns need careful handling to avoid accidental exposure of sensitive information.
Meter and Property Association
Challenges arise in maintaining historical data integrity when meters are replaced but the property remains the same. Ensuring continuity in the data without revealing personal information is crucial.
Interpretation of Null Consumption
Instances of null consumption could be misunderstood as a lack of water use, whereas they might simply indicate missing data. Distinguishing between these scenarios is vital to prevent misleading conclusions.
Meter Re-reads
The dataset must account for instances where meters are read multiple times for accuracy.
Joint Supplies & Multiple Meters per Household
Special consideration is required for households with multiple meters as well as multiple households that share a meter, as this could complicate data aggregation.
Schema Consistency with the Energy Industry
In formulating the schema for the domestic water consumption dataset, careful consideration was given to the potential risks to individual privacy. This evaluation included examining the frequency of data updates, the handling of property and meter associations, interpretations of null consumption, meter re-reads, joint supplies, and the presence of multiple meters within a single household as described above.
After a thorough assessment of these factors and their implications for individual privacy, it was decided to align the dataset's schema with the standards established within the energy industry. This decision was influenced by the energy sector's experience and established practices in managing similar risks associated with smart meters. This ensures a high level of data integrity and privacy protection.
Schema
The dataset schema is aligned with those used in the energy industry, which has encountered similar challenges with smart meters. However, it is important to note that the energy industry has a much higher density of meter distribution, especially smart meters.
Aggregation to Mitigate Risks
The dataset employs an elevated level of data aggregation to minimise the risk of individual identification. This approach is crucial in maintaining the utility of the dataset while ensuring individual privacy. The aggregation level is carefully chosen to remove identifiable risks without excluding valuable data, thus balancing data utility with privacy concerns.
Data Triage Review Frequency
An annual review is conducted to ensure the dataset's relevance and accuracy, with adjustments made based on specific requests or evolving data trends.
Data Freshness
Users should be aware that this dataset reflects historical consumption patterns and does not represent real-time data.
Publish Frequency
Annually
Data Specifications
For the domestic water consumption dataset, the data specifications are designed to ensure comprehensiveness and relevance, while maintaining clarity and focus. The specifications for this dataset include:
• Each dataset encompasses recordings of domestic water consumption as measured and reported by the data publisher. It excludes commercial consumption.
• Where it is necessary to estimate consumption, this is calculated based on actual meter readings.
• Meters of all types (smart, dumb, AMR) are included in this dataset.
• The dataset is updated and published annually.
• Historical data may be made available to facilitate trend analysis and comparative studies, although it is not mandatory for each dataset release.
• The dataset includes LSOAs with 2 or more meters. Any LSOAs with fewer than 2 meters have been excluded.
Context
Users are cautioned against using the dataset for immediate operational decisions regarding water supply management. The data should be interpreted considering potential seasonal and weather-related influences on water consumption patterns.
The geographical data provided does not pinpoint locations of water meters within an LSOA.
The dataset aims to cover a broad spectrum of households, from single-meter homes to those with multiple meters, or a single meter for multiple domestic units, to accurately reflect the diversity of water use within an LSOA.
This dataset has been aggregated from actual read data and does not use estimated values to align reads to calendar years. Our approach subtracts the latest meter read from the preceding year from the latest meter read in the reported year; this is divided by the number of days between the two reads to obtain an average daily consumption. Data are removed from meters considered as void for seven or more months during the year. Void properties are those within the company’s supply area, which are connected for either a water service only, a wastewater service only or both services but do not receive a charge, as there are no occupants.
Supplementary Information
Below is a curated selection of links for additional reading, which provide a deeper understanding of this dataset.
1. Ofwat guidance on water meters: https://www.ofwat.gov.uk/wp-content/uploads/2015/11/prs_lft_101117meters.pdf
2. Wessex Water performance commitment data on void sites: https://marketplace.wessexwater.co.uk/dataset/void-sites-performance-commitment-data
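The average daily consumption calculation described under Context (latest read in the reported year minus latest read in the preceding year, divided by the days between them) can be written as a short function. The function name and the (date, reading) pair layout are assumptions for illustration, and the readings in the usage lines are made up.

```python
from datetime import date


def average_daily_consumption(prev_year_read, reported_year_read):
    """Average daily consumption between two cumulative meter reads.

    Each argument is a (date, cumulative_reading) pair: the latest read in
    the preceding year and the latest read in the reported year.
    """
    (d0, r0), (d1, r1) = prev_year_read, reported_year_read
    days = (d1 - d0).days
    if days <= 0:
        raise ValueError("reads must be in chronological order")
    return (r1 - r0) / days


# Illustrative reads only: 109.5 units consumed over 365 days.
daily = average_daily_consumption(
    (date(2022, 11, 15), 500.0),
    (date(2023, 11, 15), 609.5),
)
```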
Name: GoiEner smart meters data
Summary: The dataset contains hourly time series of electricity consumption (kWh) provided by the Spanish electricity retailer GoiEner. The time series are arranged in four compressed files, plus a metadata file:
raw.tzst: raw time series of all GoiEner clients (any date, any length, may have missing samples).
imp-pre.tzst: processed time series (imputation of missing samples), longer than one year, collected before March 1, 2020.
imp-in.tzst: processed time series (imputation of missing samples), longer than one year, collected between March 1, 2020 and May 30, 2021.
imp-post.tzst: processed time series (imputation of missing samples), longer than one year, collected after May 30, 2021.
metadata.csv: relevant information for each time series.
License: CC-BY-SA
Acknowledge: These data have been collected in the framework of the WHY project. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 891943.
Disclaimer: The sole responsibility for the content of this publication lies with the authors. It does not necessarily reflect the opinion of the Executive Agency for Small and Medium-sized Enterprises (EASME) or the European Commission (EC). EASME or the EC are not responsible for any use that may be made of the information contained therein.
Collection Date: From November 2, 2014 to June 8, 2022.
Publication Date: December 1, 2022.
DOI: 10.5281/zenodo.7362094
Other repositories: None.
Author: GoiEner, University of Deusto.
Objective of collection: This dataset was originally used to establish a methodology for clustering households according to their electricity consumption.
Description: The meaning of each column is described next for each file.
raw.tzst (no column names provided): timestamp; electricity consumption in kWh.
imp-pre.tzst, imp-in.tzst, imp-post.tzst:
“timestamp”: timestamp
“kWh”: electricity consumption in kWh
“imputed”: binary value indicating whether the row has been obtained by imputation
metadata.csv:
“user”: 64-character hash identifying a user
“start_date”: initial timestamp of the time series
“end_date”: final timestamp of the time series
“length_days”: number of days elapsed between the initial and the final timestamps
“length_years”: number of years elapsed between the initial and the final timestamps
“potential_samples”: number of samples that should be between the initial and the final timestamps of the time series if there were no missing values
“actual_samples”: number of actual samples of the time series
“missing_samples_abs”: number of potential samples minus actual samples
“missing_samples_pct”: potential samples minus actual samples as a percentage
“contract_start_date”: contract start date
“contract_end_date”: contract end date
“contracted_tariff”: type of tariff contracted (2.X: households and SMEs, 3.X: SMEs with high consumption, 6.X: industries, large commercial areas, and farms)
“self_consumption_type”: the type of self-consumption to which the users are subscribed
“p1”, “p2”, “p3”, “p4”, “p5”, “p6”: contracted power (in kW) for each of the six time slots
“province”: province where the user is located
“municipality”: municipality where the user is located (municipalities below 50,000 inhabitants have been removed)
“zip_code”: post code (post codes of municipalities below 50,000 inhabitants have been removed)
“cnae”: CNAE (Clasificación Nacional de Actividades Económicas) code for economic activity classification
5 star: ⭐⭐⭐
Preprocessing steps: Data cleaning (imputation of missing values using the Last Observation Carried Forward algorithm using weekly seasons); data integration (combination of multiple SIMEL files, i.e. the data sources); data transformation (anonymization, unit conversion, metadata generation).
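The imputation step is only named in the description, not specified. One common interpretation of Last Observation Carried Forward with weekly seasons, assumed here, is that a missing hourly sample takes the value observed at the same hour one week (168 samples) earlier. The sketch below illustrates that interpretation together with the “imputed” flag; it is not the authors' code, and the function name and list-based layout are hypothetical.

```python
WEEK_HOURS = 7 * 24  # one weekly season at hourly resolution


def impute_weekly_locf(values):
    """Fill missing hourly samples (None) from the same hour one week earlier.

    Returns (filled, imputed), where imputed[i] is 1 if filled[i] was
    obtained by imputation and 0 otherwise. Samples with no earlier
    same-hour value available stay None.
    """
    filled = list(values)
    imputed = [0] * len(filled)
    for i, v in enumerate(filled):
        if v is None and i >= WEEK_HOURS and filled[i - WEEK_HOURS] is not None:
            # The source slot may itself have been imputed, so the last
            # observed value for this weekly slot is carried forward.
            filled[i] = filled[i - WEEK_HOURS]
            imputed[i] = 1
    return filled, imputed


# Two weeks of hourly data plus one sample; hours 168 and 336 are missing.
series = [float(i) for i in range(168)] + [None] + [1.0] * 167 + [None]
filled, imputed = impute_weekly_locf(series)
```

Gaps in the first week cannot be filled this way, which may be one reason the processed files are restricted to longer series.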
Reuse: This dataset is related to the following datasets: "A database of features extracted from different electricity load profiles datasets" (DOI 10.5281/zenodo.7382818), where time series feature extraction has been performed; and "Measuring the flexibility achieved by a change of tariff" (DOI 10.5281/zenodo.7382924), where the metadata has been extended to include the results of a socio-economic characterization and the answers to a survey about barriers to adapting to a change of tariff.
Update policy: There might be a single update in mid-2023.
Ethics and legal aspects: The data provided by GoiEner contained values of the CUPS (Meter Point Administration Number), which are personal data. A pre-processing step has been carried out to replace the CUPS by random 64-character hashes.
Technical aspects: raw.tzst contains a 15.1 GB folder with 25,559 CSV files; imp-pre.tzst contains a 6.28 GB folder with 12,149 CSV files; imp-in.tzst contains a 4.36 GB folder with 15,562 CSV files; and imp-post.tzst contains a 4.01 GB folder with 17,519 CSV files.
Other: None.
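The replacement of CUPS values by random 64-character identifiers can be illustrated with a small mapping function. How the authors generated their identifiers is not stated; this sketch assumes random hexadecimal tokens, and the function name and sample identifiers are hypothetical.

```python
import secrets


def pseudonymize(identifiers):
    """Map each distinct identifier to a random 64-character hex string.

    The mapping is random rather than derived from the identifier, so it
    cannot be reversed without the mapping table itself.
    """
    mapping = {}
    for ident in identifiers:
        if ident not in mapping:
            mapping[ident] = secrets.token_hex(32)  # 32 bytes -> 64 hex chars
    return mapping


# Made-up identifiers standing in for real CUPS codes.
mapping = pseudonymize(["CUPS-A", "CUPS-B", "CUPS-A"])
```

Every file belonging to the same meter must use the same mapping table, otherwise the time series and metadata.csv rows could no longer be joined on “user”.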