The 5-year goal of the “Model America” concept was to generate a model of every building in the United States. This data repository delivers on that goal. Oak Ridge National Laboratory (ORNL) has developed the Automatic Building Energy Modeling (AutoBEM) software suite to process multiple types of data, extract building-specific descriptors, generate building energy models, and simulate them on High Performance Computing (HPC) resources. For more information, see AutoBEM-related publications (bit.ly/AutoBEM).
There were 125,714,640 buildings detected in the United States, and this dataset contains the 122,930,327 (97.8%) buildings that resulted in a successful simulation. Future annual updates have been proposed that may include additional buildings, data improvements, or other algorithmic enhancements.
This dataset of 122.9 million buildings includes:
Models (state_county.zip) – OpenStudio (v3.1.0) and EnergyPlus (v9.4) building energy models. Please note that the download requires the free Globus Connect Personal (https://www.globus.org/globus-connect-personal). Each model has approximately 3,000 building input descriptors that can be extracted. Please see the EnergyPlus (v9.4) 2,784-page Input/Output Reference Guide (https://energyplus.net/sites/all/modules/custom/nrel_custom/pdfs/pdfs_v9.4.0/InputOutputReference.pdf) for everything that can be retrieved or simulated from these models.
These models were derived from the following metadata, which is not included in this dataset:
1. ID - unique building ID
2. County - county name
3. State - state name
4. CZ - ASHRAE Climate Zone designation
5. Clim_Zone - text label of climate zone
6. est_year - estimated year of construction
7. est_commercial - estimated building type (0=residential, 1=commercial)
8. Centroid - building center location in latitude/longitude (from Footprint2D)
9. Footprint2D - building polygon of 2D footprint (lat1/lon1_lat2/lon2_...)
10. Height - building height (meters)
11. Area2D - footprint area (ft2)
12. BuildingType - DOE prototype building designation (IECC=residential) as implemented by OpenStudio-standards
13. WWR_surfaces - percent of each facade (pair of points from Footprint2D) covered by fenestration/windows (average 14.5% for residential, 40% for commercial buildings)
14. NumFloors - number of floors (above-grade)
15. Area - estimate of total conditioned floor area (ft2)
16. Standard - building vintage
These models are made free and openly available in hopes of stimulating any simulation-informed use case. Data is provided as-is with no warranties, express or implied, regarding fitness for a particular purpose.
We wish to thank our sponsors, which include Oak Ridge National Laboratory (ORNL) Laboratory Directed Research and Development (LDRD), U.S. Dept. of Energy’s (DOE) Building Technologies Office (BTO), Office of Electricity (OE), Biological and Environmental Research (BER), and National Nuclear Security Administration (NNSA). This research used resources of the Argonne Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC02-06CH11357.
Please cite as: New, Joshua R., Adams, Mark, Bass, Brett, Berres, Anne, and Clinton, Nicholas (2021). “Model America - data and models of every U.S. building. [Data set].” Constellation, doi.ccs.ornl.gov/ui/doi/339, April 14, 2021
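The Footprint2D descriptor is given as a single string of the form lat1/lon1_lat2/lon2_.... As a minimal sketch of how such a string could be unpacked, the Python below splits it into (latitude, longitude) pairs and computes a rough center point. The function names, the sample coordinates, and the use of a plain vertex average are assumptions made for illustration; the listing does not say how the Centroid field was actually derived.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude)


def parse_footprint(footprint: str) -> List[Point]:
    """Split a 'lat1/lon1_lat2/lon2_...' string into (lat, lon) tuples."""
    points: List[Point] = []
    for pair in footprint.strip().split("_"):
        if not pair:  # tolerate a trailing underscore
            continue
        lat_text, lon_text = pair.split("/")
        points.append((float(lat_text), float(lon_text)))
    return points


def vertex_mean(points: List[Point]) -> Point:
    """Rough center: arithmetic mean of the distinct vertices.

    Assumption for illustration only; this is not necessarily the
    method used to produce the Centroid field.
    """
    # Drop a repeated closing vertex, if the polygon is stored closed.
    if len(points) > 1 and points[0] == points[-1]:
        points = points[:-1]
    if not points:
        raise ValueError("footprint has no vertices")
    lat = sum(p[0] for p in points) / len(points)
    lon = sum(p[1] for p in points) / len(points)
    return (lat, lon)


# Hypothetical footprint string, not taken from the dataset.
sample = "35.0/-84.0_35.0/-84.1_35.1/-84.1_35.1/-84.0"
vertices = parse_footprint(sample)
center = vertex_mean(vertices)
```

A vertex average is only a quick approximation; an area-weighted centroid would be more appropriate for irregular polygons.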
Oak Ridge National Laboratory (ORNL) has developed the Automatic Building Energy Modeling (AutoBEM) software suite to process multiple types of data, extract building-specific descriptors, generate building energy models, and simulate them on High Performance Computing (HPC) resources. For more information, see AutoBEM-related publications (bit.ly/AutoBEM).
Data is provided for 2,555,152 buildings located within the boundary of Arizona in the United States:
Data (1.48 GB, *.csv) - building information data for 2,555,152 Arizona buildings, with simulation results, separated by county. (Simulation results are for June 1st-August 31st, 2020.)
Building Information Data Fields:
Energy Simulation Data Fields:
This data is made free and openly available in hopes of stimulating any simulation-informed use case. Data is provided as-is with no warranties, express or implied, regarding fitness for a particular purpose. We wish to thank our sponsors which include Oak Ridge National Laboratory (ORNL), U.S. Dept. of Energy’s (DOE) Building Technologies Office (BTO), Office of Electricity (OE), and Biological and Environmental Research (BER).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Oak Ridge National Laboratory (ORNL) has developed the Automatic Building Energy Modeling (AutoBEM) software suite to process multiple types of data, extract building-specific descriptors, generate building energy models, and simulate them on High Performance Computing (HPC) resources. For more information, see AutoBEM-related publications (bit.ly/AutoBEM).
This dataset includes the energy simulation output using NASA Power 2022 weather data (June 1st-August 31st) for each county in AZ. The output fields include basic energy simulation outputs and anthropogenic emissions.
Building Information Data Fields:
Energy Simulation Data Fields:
Dataset quality ***: High quality dataset that was quality-checked by the EIDC team
The Massive Data Institute is partnering with BlocPower, which has created an open building data set for 121 million buildings across America. This is the largest building-level dataset in the country. This data enables researchers, policymakers, and community leaders to harness information on building characteristics to make buildings greener, smarter, and healthier.
Buildings produce over 30% of US greenhouse gas (GHG) emissions. Correctly analyzing, sizing, engineering, and commissioning projects to reduce GHG emissions requires accurate and precise data. However, that data is currently highly fragmented, inaccessible, and unreliable.
In this initial data release, EIDC and BlocPower are making a subset of the data accessible at a building-level resolution. The data is accessible to registered users.
The following list provides a brief description of the variables in the current table, 'BlocPower Core'.
Geographic characteristics:
building_id: unique identifier for each building
state: includes all 50 U.S. states and Washington D.C.
county: includes 1,810 U.S. counties
city: includes 15,391 U.S. cities
zip: includes 25,837 U.S. zipcodes
address: includes 68 million building addresses
Building characteristics:
area_sq_ft: total area of building in square feet
year_built: year in which building was built
building_type: type of building, includes values such as single family residential, multi family residential, and small commercial.
Building system types:
cooling_system_type: type of cooling system, includes values such as central air, chilled water, evaporative cooler, wall unit, and window unit.
heating_system_type: type of heating system, includes values such as central air, electric, forced air, gas, heat pump, hot water, and solar.
heating_fuel_type: type of heating fuel, includes values such as electric, wood, oil, propane, coal, and gas.
Modeled variables:
The energy use variables were modeled by BlocPower based on ORNL’s Model America building energy profiles.
total_site_energy_GJ: Total energy - amount of heat and electricity - consumed by a building on site, in gigajoules.
total_source_energy_GJ: Total amount of raw fuel that is required to operate the building, in gigajoules. It incorporates all transmission, delivery, and production losses. Recommended by the EPA as the best unit of evaluation for comparing different buildings.
energy_use_intensity: Total energy use normalized by building area, in units of thousand British Thermal Units (kBTUs) per square foot.
energy_efficiency_potential: Predicted energy efficiency potential of a building, classified as low, medium or high.
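The modeled variables mix units: the energy totals are in gigajoules, while energy_use_intensity is in kBTU per square foot. The sketch below shows the unit conversion involved, assuming the intensity is based on total_site_energy_GJ divided by area_sq_ft. The listing does not state whether site or source energy is used, so that choice, the function name, and the sample numbers are all assumptions.

```python
# 1 BTU is approximately 1055.056 J, so 1 GJ is approximately 947.817 kBTU.
KBTU_PER_GJ = 1e9 / 1055.056 / 1000.0


def energy_use_intensity(total_site_energy_gj: float, area_sq_ft: float) -> float:
    """Return energy use intensity in kBTU per square foot.

    Assumes site energy is the numerator; the dataset description
    does not confirm this.
    """
    if area_sq_ft <= 0:
        raise ValueError("area_sq_ft must be positive")
    return total_site_energy_gj * KBTU_PER_GJ / area_sq_ft


# Hypothetical building: 100 GJ of site energy over 2,000 square feet.
eui = energy_use_intensity(100.0, 2000.0)  # roughly 47.4 kBTU per square foot
```

Recomputing the value this way and comparing it with the published energy_use_intensity column is one way to check which energy total the column is based on.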
There are missing values for several variables. Users can inspect the number of missing values for a variable by the following steps:
Click on ‘Tables’ at the top of this landing page.
Click on the table named ‘BlocPower.Core’.
Click on the variable of interest.
The distribution of possible variable values (including missing values) will appear in the lower right-hand corner of the popup window (scroll down as needed).
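For users working with a downloaded copy rather than the web interface, the same check can be sketched in Python with the standard library. This assumes the table has been exported to CSV and that missing values appear as empty cells or common placeholders such as NA. Both the export format and the placeholder list are assumptions, and the sample rows are invented for illustration.

```python
import csv
import io
from typing import Dict, Iterable

# Assumed encodings of a missing value; adjust to match the actual export.
MISSING_TOKENS = {"", "na", "n/a", "null", "none"}


def count_missing(lines: Iterable[str]) -> Dict[str, int]:
    """Count missing cells per column in CSV text with a header row."""
    reader = csv.DictReader(lines)
    counts = {name: 0 for name in (reader.fieldnames or [])}
    for row in reader:
        for name in counts:
            value = row.get(name)  # None when a row is shorter than the header
            if value is None or value.strip().lower() in MISSING_TOKENS:
                counts[name] += 1
    return counts


# Invented sample rows using variable names from the list above.
SAMPLE = """building_id,state,year_built,heating_fuel_type
b1,NY,1925,gas
b2,NY,,
b3,CO,1988,NA
"""

missing = count_missing(io.StringIO(SAMPLE))
```

For a real file, pass an open file handle (opened with newline="") in place of the io.StringIO object.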
Addresses are available for 68 million buildings. Modeled data is matched to addresses using nearest neighbor search methods. The modeling process is unable to link buildings to specific addresses for states that do not provide building coordinates. Because the modeled data is based on a limited set of publicly available addresses, some specific street addresses may be duplicated.
Consequently, data is completely missing for 8 states - DC, DE, FL, GA, MD, NC, SC, WV. Data is also sparse for VA and TX (>90% missing values). An upcoming priority is to identify ways to source addresses for these states. Addresses are mostly available for AK, CA, CO, CT, HI, IL, LA, MA, MI, NH, NJ, NY, NV, OR, PA, RI, TN, WA (less than 25% missing).
Building system data is most complete for AK, CO, CT, MA, NH, NY, NV, RI, WA (average of <30% missing). Missing data for building sys