Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset (.csv file) contains room-level measurements of indoor environmental quality (IEQ) variables together with occupancy ground truth for five residential apartments in Denmark over 7 days during January 2023, with a temporal resolution of 15 minutes. The occupancy ground truth was determined from activity logbooks filled in by the apartments' occupants. The measured IEQ variables are:
Indoor CO2 concentration
Indoor operative temperature
Indoor relative humidity
Additional features were generated from these three variables. For each IEQ variable, the following transforms represent the short-term dynamics of the indoor environment:
Difference between current variable value and average over the last hour
Difference between current variable value and previous recording (15 minutes prior)
Average of the variable over the last hour
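The three transforms listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `short_term_features`, the output key names, and the sample readings are hypothetical, and "the last hour" is assumed to mean the current reading plus the three preceding ones (four 15-minute samples). The dataset authors may have defined the window differently.

```python
from statistics import mean


def short_term_features(values, window=4):
    """Derive the three transforms for one room's series sampled every 15 minutes.

    Assumption: "the last hour" covers the current reading and the three
    before it (4 samples). At the start of the series, fewer samples are used.
    """
    rows = []
    for i, current in enumerate(values):
        # Readings inside the assumed one-hour window, current one included.
        recent = values[max(0, i - window + 1): i + 1]
        hourly_average = mean(recent)
        # The first reading has no predecessor, so its 15-minute difference is undefined.
        previous = values[i - 1] if i > 0 else None
        rows.append({
            "average_last_hour": hourly_average,
            "current_minus_average_last_hour": current - hourly_average,
            "current_minus_last_15_min": None if previous is None else current - previous,
        })
    return rows


co2_ppm = [400, 420, 460, 520, 600]  # made-up CO2 readings, not taken from the dataset
features = short_term_features(co2_ppm)
```

The same function applies unchanged to the temperature and relative humidity series, since each transform is computed per variable.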
Moreover, metadata parameters include the apartment number (1-5), room number (1-16), room type number, one-hot-encoded room type, floor area of the room, hour of the day, day number of the week, day number of the year, day label (unique identifier of a day for the different rooms) and date and time stamp. The occupancy ground truth variable is a binary 0 (no occupancy in the room) or 1 (occupancy in the room).
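As a small illustration of the one-hot-encoded room type, the sketch below maps a room type to five binary indicators. The indicator names follow the dataset's column headers (kitchen, livingroom, bedroom, office, kitchen_livingroom); the helper `one_hot_room_type` is hypothetical and is not part of the dataset's tooling.

```python
# Indicator names follow the dataset's one-hot column headers.
ROOM_TYPES = ["kitchen", "livingroom", "bedroom", "office", "kitchen_livingroom"]


def one_hot_room_type(room_type):
    """Return a dict with 1 for the matching room type and 0 for all others."""
    if room_type not in ROOM_TYPES:
        raise ValueError(f"unknown room type: {room_type!r}")
    return {name: int(name == room_type) for name in ROOM_TYPES}


encoded = one_hot_room_type("bedroom")
```

Exactly one indicator is set per room, so the five columns always sum to 1.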
The full list of data variables with header name and description is as follows:
datetime: Date and time stamp in the format DD-MM-YY HH:MM (Day-Month-Year Hour:Minute)
indoor_co2_concentration: Indoor CO2 concentration inside the corresponding room (in ppm: parts per million)
indoor_operative_temperature: Indoor operative temperature inside the corresponding room (in Celsius degrees)
indoor_relative_humidity: Indoor relative humidity inside the corresponding room (in percent)
current_value_minus_average_last_hour_co2: Difference between the current indoor CO2 concentration inside the corresponding room and the average indoor CO2 concentration over the last hour (in ppm: parts per million)
current_value_minus_average_last_hour_operative_temperature: Difference between the current indoor operative temperature inside the corresponding room and the average indoor operative temperature over the last hour (in Celsius degrees)
current_value_minus_average_last_hour_relative_humidity: Difference between the current indoor relative humidity inside the corresponding room and the average indoor relative humidity over the last hour (in percent)
average_co2_last_hour: Average indoor CO2 concentration inside the corresponding room over the last hour (in ppm: parts per million)
average_operative_temperature_last_hour: Average indoor operative temperature inside the corresponding room over the last hour (in Celsius degrees)
average_relative_humidity_last_hour: Average indoor relative humidity inside the corresponding room over the last hour (in percent)
current_value_minus_last_15_min_co2: Difference between the current indoor CO2 concentration inside the corresponding room and the indoor CO2 concentration 15 minutes before (in ppm: parts per million)
current_value_minus_last_15_min_operative_temperature: Difference between the current indoor operative temperature inside the corresponding room and the indoor operative temperature 15 minutes before (in Celsius degrees)
current_value_minus_last_15_min_relative_humidity: Difference between the current indoor relative humidity inside the corresponding room and the indoor relative humidity 15 minutes before (in percent)
room_number: Room number label (from 1 to 16)
kitchen: One-hot-encoded room type for kitchen rooms
livingroom: One-hot-encoded room type for living rooms
bedroom: One-hot-encoded room type for bedrooms
office: One-hot-encoded room type for office rooms
kitchen_livingroom: One-hot-encoded room type for kitchen/living rooms
hour_of_the_day: Hour of the day (from 0 to 23)
day_number_of_the_week: Day number label of the week (from 1 to 7)
day_of_year: Day number label of year (from 30 to 37)
day_label: Day number label as a unique identifier of a day for the different rooms (from 0 to 112)
occupancy_ground_truth: Ground truth value of occupancy in the corresponding room. Binary variable taking the value of 0 when there is no occupancy (no people) in the room or 1 when there is occupancy (at least one person) in the room
floor_area: Surface floor area of the corresponding room (in square meters)
apartment_number: Apartment number label (from 1 to 5)
room_type: Type of the corresponding room (kitchen, living room, bedroom, office or kitchen_livingroom)
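A minimal sketch of reading rows with the headers described above, using only the Python standard library. The two sample rows are made up for illustration, only three of the columns are shown, and a comma delimiter is assumed. The `datetime` column is parsed with the stated DD-MM-YY HH:MM format, from which the hour of the day and the day of the year can be recomputed as a consistency check.

```python
import csv
import io
from datetime import datetime

# Made-up sample using three of the documented headers; a comma delimiter is assumed.
SAMPLE_CSV = """datetime,indoor_co2_concentration,occupancy_ground_truth
30-01-23 08:15,612.0,1
30-01-23 08:30,598.5,0
"""


def read_rows(handle):
    """Parse rows, converting the time stamp and numeric fields to Python types."""
    rows = []
    for raw in csv.DictReader(handle):
        # DD-MM-YY HH:MM, as stated in the variable description.
        stamp = datetime.strptime(raw["datetime"], "%d-%m-%y %H:%M")
        rows.append({
            "datetime": stamp,
            "hour_of_the_day": stamp.hour,
            "day_of_year": stamp.timetuple().tm_yday,
            "indoor_co2_concentration": float(raw["indoor_co2_concentration"]),
            "occupancy_ground_truth": int(raw["occupancy_ground_truth"]),
        })
    return rows


rows = read_rows(io.StringIO(SAMPLE_CSV))
```

To read the actual file, the `io.StringIO` object would be replaced by an open file handle.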
A detailed description of the study case building can be found in a dedicated technical report: Kamilla Heimar Andersen, Anna Marszal-Pomianowska, Henrik N. Knudsen, Hicham Johra, Simon Pommerencke Melgaard, Marc Zein Dahl, Patrick Andersen Hundevad, Per Kvols Heiselberg (2023). Room-based Indoor Environment Measurements and Occupancy Ground Truth Datasets from Five Residential Apartments in a Nordic Climate. DCE Technical Reports No. 318. Aalborg University, Department of the Built Environment. https://doi.org/10.54337/aau550646548.
This dataset was used to develop and validate an occupancy detection XGBoost model using IEQ measurements as inputs for residential buildings during the winter period: K.H. Andersen, H. Johra, M. Schaffer, A. Marszal-Pomianowska, H.N. Knudsen, P.K. Heiselberg, W. O'Brien (2024). Exploring occupant detection model generalizability for residential buildings using supervised learning with IEQ sensors. Building and Environment, 111319. https://doi.org/10.1016/j.buildenv.2024.111319.
This data release contains the input-data files and R scripts associated with the analysis presented in [citation of manuscript]. The spatial extent of the data is the contiguous U.S. The input-data files include one comma-separated value (csv) file of county-level data and one csv file of city-level data.
The county-level csv ("county_data.csv") contains data for 3,109 counties. These data include two measures of water use, descriptive information about each county, three grouping variables (climate region, urban class, and economic dependency), and 18 explanatory variables: proportion of population growth from 2000-2010, fraction of withdrawals from surface water, average daily water yield, mean annual maximum temperature from 1970-2010, 2005-2010 maximum temperature departure from the 40-year maximum, mean annual precipitation from 1970-2010, 2005-2010 mean precipitation departure from the 40-year mean, Gini income disparity index, percent of county population with at least some college education, Cook Partisan Voting Index, housing density, median household income, average number of people per household, median age of structures, percent of renters, percent of single-family homes, percent of apartments, and a numeric version of urban class.
The city-level csv ("city_data.csv") contains data for 83 cities. These data include descriptive information for each city, water-use measures, one grouping variable (climate region), and 6 explanatory variables: type of water bill (increasing block rate, decreasing block rate, or uniform), average price of water bill, number of requirement-oriented water conservation policies, number of rebate-oriented water conservation policies, aridity index, and regional price parity.
The R scripts construct fixed-effects and Bayesian hierarchical regression models. The primary difference between these models lies in how they handle possible clustering in the observations that define unique water-use settings.
Fixed-effects models address possible clustering in one of two ways. In a "fully pooled" fixed-effects model, any clustering by group is ignored, and a single, fixed estimate of the coefficient for each covariate is developed using all of the observations. Conversely, in an unpooled fixed-effects model, separate coefficient estimates are developed for each group using only the observations in that group. A hierarchical model provides a compromise between these two extremes. Hierarchical models extend single-level regression to data with a nested structure, whereby the model parameters vary at different levels in the model, including a lower level that describes the actual data and an upper level that influences the values taken by parameters in the lower level. The county-level models were compared using the Watanabe-Akaike information criterion (WAIC), which is derived from the log pointwise predictive density of the models and can be shown to approximate out-of-sample predictive performance. All script files are intended to be used with R statistical software (R Core Team. 2017. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org) and Stan probabilistic modeling software (Stan Development Team. 2017. RStan: the R interface to Stan. R package version 2.16.2. http://mc-stan.org).
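The contrast between full pooling, no pooling, and the hierarchical compromise can be illustrated with a toy intercept-only example in Python. This is a simplified sketch, not the release's R/Stan code: the group names and values are made up, the within-group and between-group variances are assumed to be known, and the grand mean stands in for the upper-level mean that a real hierarchical model would estimate jointly with the other parameters.

```python
from statistics import mean


def pooled_estimate(groups):
    """Full pooling: one estimate from all observations, ignoring groups."""
    return mean(v for values in groups.values() for v in values)


def unpooled_estimates(groups):
    """No pooling: a separate estimate per group from that group's data only."""
    return {name: mean(values) for name, values in groups.items()}


def partially_pooled_estimates(groups, within_var, between_var):
    """Partial pooling: precision-weighted blend of group mean and grand mean.

    Assumption: both variances are known and the grand mean approximates the
    upper-level mean; a fitted hierarchical model estimates these from data.
    """
    grand = pooled_estimate(groups)
    estimates = {}
    for name, values in groups.items():
        n = len(values)
        # Groups with more observations put more weight on their own mean.
        weight = (n / within_var) / (n / within_var + 1 / between_var)
        estimates[name] = weight * mean(values) + (1 - weight) * grand
    return estimates


# Hypothetical groups and values, for illustration only.
groups = {"arid": [10.0, 12.0], "humid": [4.0, 6.0, 5.0, 5.0]}
pooled = pooled_estimate(groups)
unpooled = unpooled_estimates(groups)
partial = partially_pooled_estimates(groups, within_var=4.0, between_var=4.0)
```

Each partially pooled estimate falls between its group's own mean and the pooled mean, and the group with fewer observations is pulled more strongly toward the pooled value.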