The International Energy System (IES) from EIA.gov has production, reserves, consumption, capacity, storage, imports, exports, and emissions time series by country for electricity, petroleum, natural gas, coal, nuclear, and renewable energy.
EIA launched a new liquids pipeline projects database that tracks more than 200 pipeline projects for crude oil, hydrocarbon gas liquids (HGL), and other petroleum products. The database contains project information such as project type, start date, capacity, mileage, and geographic information for historical (completed since 2010) and future pipeline projects. The information in the database is based on the latest public information from company documents, government filings, and trade press, but it does not reflect any assumptions on the likelihood or timing of project completion. The liquids pipeline projects database complements EIA’s natural gas pipeline projects table.
Form EIA-930 data collection provides a centralized and comprehensive source for hourly operating data about the high-voltage bulk electric power grid in the Lower 48 states. We collect the data from the electricity balancing authorities (BAs) that operate the grid.
The EIA-906, EIA-920, EIA-923 and predecessor forms provide monthly and annual data on generation and fuel consumption at the power plant and prime mover level. A subset of plants, steam-electric plants 10 MW and above, also provides boiler level and generator level data. Data for utility plants are available from 1970, and for non-utility plants from 1999. Beginning with January 2004 data collection, the EIA-920 was used to collect data from the combined heat and power plant (cogeneration) segment of the non-utility sector; also as of 2004, non-utilities filed the annual data for non-utility source and disposition of electricity. Beginning in 2007, environmental data were collected on Schedules 8A through 8F of the Form EIA-923; these data include by-product disposition, financial information, NOx control operations, cooling system operations, and FGP and FGD unit operations. Beginning in 2008, the EIA-923 superseded the EIA-906, EIA-920, FERC 423, and the EIA-423. Schedule 2 of the EIA-923 collects the plant level fuel receipts and cost data previously collected on the FERC and EIA Forms 423. Data for fuel receipts and costs prior to 2010 are published at /cneaf/electricity/page/eia423.html.
Power plant data prior to 2001 are published as database (.DBF) files, with separate files for utility and non-utility plants. For 2001 data and subsequent years, the data are in Excel spreadsheet files that include data for all plants and make other changes to the presentation of the data.
Note that beginning with January 2001, the data for combined heat and power plants (i.e., the plants that provide data on the EIA-920 form) will only be posted in the combined Excel file.
The links will allow you to download the current Excel files, and will take you to the locations from which you can download the DBF-format utility and non-utility files for 2000 and earlier. The "Database Notes from EIA" link will take you to information on changes to the data and other points of interest to users.
Historical database (.dbf) files for utility (1970-2000) and non-utility (1999-2000)
Utility Database Legacy (.DBF) Format
Non-Utility Database Legacy (.DBF) Format
Database Notes from EIA
Updated 4/21/10
Comments or Questions? E-Mail EIA-923@eia.doe.gov
Additional Links:
Monthly Generation and Fuel Consumption by State
Electric Power Monthly
Form EIA-923, Power Plant Operations Report, form and instructions, PDF format
Form EIA-923, Power Plant Operations Report, form and instructions, MS Word format
Contact: Channele Wirman, Phone: 202-586-5356, Email: channele.wirman@eia.doe.gov
EIA Form 714 Annual Electric Balancing Authority Area and Planning Area Report Dataset
The U.S. Residential Energy Consumption Survey, administered by the U.S. Energy Information Administration (EIA), uses a nationally representative sample to collect information about home characteristics, household energy usage, and energy cost. The household-level microdata from 2020, 2015, 2009, 2005, 2001, 1997, 1993, 1990, and 1987, made available by the EIA for public use, were curated by Carnegie Mellon University Libraries to make them more accessible for data analysis.
Daily values are the sum of hourly values. If one or more hourly values is missing in a given day, we use NA for the whole day.
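The daily aggregation rule described above can be sketched in a few lines of Python. This is a minimal illustration, not EIA's processing code: the function name `daily_totals` and the use of `None` to stand in for NA are assumptions made for the example, and it only detects hourly values that are explicitly marked missing, not hours that are absent from the input altogether.

```python
from collections import defaultdict
from datetime import datetime


def daily_totals(hourly):
    """Sum (timestamp, value) pairs into one total per calendar day.

    A day's total is None (standing in for NA) as soon as any one of its
    hourly values is None, mirroring the rule described in the text.
    """
    totals = defaultdict(float)
    incomplete = set()
    for timestamp, value in hourly:
        day = timestamp.date()
        if value is None:
            incomplete.add(day)  # one missing hour invalidates the whole day
        else:
            totals[day] += value
    days = sorted(set(totals) | incomplete)
    return {day: (None if day in incomplete else totals[day]) for day in days}
```

Calling `daily_totals` with a list of hourly readings returns a dictionary keyed by date, so days with gaps stay visible as `None` rather than silently appearing as low totals.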
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States Electricity: Average Retail Price: EIA: Residential was reported at 13.300 cents/kWh (0.01 USD/kWh) in Aug 2018, an increase from 13.130 cents/kWh in Jul 2018. The series is updated monthly, with a median of 8.590 cents/kWh from Jul 1976 to Aug 2018 across 434 observations. It reached an all-time high of 13.300 cents/kWh in Aug 2018 and a record low of 3.600 cents/kWh in Jan 1977. The series remains active in CEIC and is reported by the Energy Information Administration. It is categorized under Global Database’s United States – Table US.P002: Energy Price.
This collection provides international data on natural gas, organized by country. Users of the EIA API are required to obtain an API key via this registration form: http://www.eia.gov/beta/api/register.cfm
The U.S. Energy Information Administration (EIA) collects water cooling data for the electric power industry in the United States. This submission includes annual data from 2014 to 2019. Each spreadsheet details the generator type, fuel consumption, water consumption, cooling type, and equipment status, location, and water source for each plant.
United States EIA Forecast: Electricity Consumption was reported at 10.520 billion kWh/day in Dec 2019, an increase from 9.870 billion kWh/day in Nov 2019. The series is updated monthly, with a median of 10.519 billion kWh/day from Mar 2016 to Dec 2019 across 46 observations. It reached an all-time high of 12.364 billion kWh/day in Aug 2018 and a record low of 9.369 billion kWh/day in Apr 2019. The series remains active in CEIC and is reported by the Energy Information Administration. It is categorized under Global Database’s United States – Table US.RB069: Electricity Supply and Consumption: Forecast: Energy Information Administration.
This is the first data release from the Public Utility Data Liberation (PUDL) project. It can be referenced & cited using https://doi.org/10.5281/zenodo.3653159
For more information about the free and open source software used to generate this data release, see Catalyst Cooperative's PUDL repository on Github, and the associated documentation on Read The Docs. This data release was generated using v0.3.1 of the catalystcoop.pudl Python package.
Included Data Packages
This release consists of three tabular data packages, conforming to the standards published by Frictionless Data and the Open Knowledge Foundation. The data are stored in CSV files (some of which are compressed using gzip), and the associated metadata is stored as JSON. These tabular data can be used to populate a relational database.
pudl-eia860-eia923:
pudl-eia860-eia923-epacems: the pudl-eia860-eia923 package above, as well as the Hourly Emissions data from the US Environmental Protection Agency's (EPA's) Continuous Emissions Monitoring System (CEMS) from 1995-2018. The EPA CEMS data covers thousands of power plants at hourly resolution for decades, and contains close to a billion records.
pudl-ferc1:
catalystcoop.pudl Python package and the original source data files archived as part of this data release.
Contact Us
If you're using PUDL, we would love to hear from you! Even if it's just a note to let us know that you exist, and how you're using the software or data. You can also:
Using the Data
The data packages are just CSVs (data) and JSON (metadata) files. They can be used with a variety of tools on many platforms. However, the data is organized primarily with the idea that it will be loaded into a relational database, and the PUDL Python package that was used to generate this data release can facilitate that process. Once the data is loaded into a database, you can access that DB however you like.
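Because each package is just CSV files plus a JSON descriptor, the descriptor can be inspected with nothing more than the Python standard library. The sketch below assumes a `datapackage.json` that follows the Frictionless Data Package layout, with a top-level `resources` list whose entries carry a `name` and a `path`; the helper name `list_resources` is hypothetical and is not part of the PUDL package.

```python
import json
from pathlib import Path


def list_resources(datapackage_json):
    """Map each resource name in a datapackage.json to its data file paths.

    Paths are resolved relative to the directory holding the descriptor.
    A resource's "path" may be a single string or a list of strings.
    """
    descriptor_path = Path(datapackage_json)
    descriptor = json.loads(descriptor_path.read_text(encoding="utf-8"))
    base_dir = descriptor_path.parent
    resources = {}
    for resource in descriptor.get("resources", []):
        paths = resource.get("path", [])
        if isinstance(paths, str):
            paths = [paths]  # normalize the single-file case to a list
        resources[resource["name"]] = [str(base_dir / p) for p in paths]
    return resources
```

Listing the resources this way is a quick check of what a package contains before committing to loading it into a database.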
Make sure conda is installed
None of these commands will work without the conda Python package manager installed, either via Anaconda or miniconda.
Download the data
First download the files from the Zenodo archive into a new empty directory. A couple of them are very large (5-10 GB), and depending on what you're trying to do you may not need them.
pudl-input-data.tgz
pudl-eia860-eia923-epacems.tgz
Load All of PUDL in a Single Line
Use cd to get into your new directory at the terminal (in Linux or Mac OS), or open up an Anaconda terminal in that directory if you're on Windows.
If you have downloaded all of the files from the archive, and you want it all to be accessible locally, you can run a single shell script, called load-pudl.sh:
bash load-pudl.sh
This will produce the following outputs:
sqlite/pudl.sqlite
parquet/epacems
sqlite/ferc1.sqlite
Selectively Load PUDL Data
If you don't want to download and load all of the PUDL data, you can load each of the above datasets separately.
Create the PUDL conda Environment
This installs the PUDL software locally, and a couple of other useful packages:
conda create --yes --name pudl --channel conda-forge \
--strict-channel-priority \
python=3.7 catalystcoop.pudl=0.3.1 dask jupyter jupyterlab seaborn pip
conda activate pudl
Create a PUDL data management workspace
Use the PUDL setup script to create a new data management environment inside this directory. After you run this command you'll see some other directories show up, like parquet, sqlite, data, etc.
pudl_setup ./
Extract and load the FERC Form 1 and EIA 860/923 data
If you just want the FERC Form 1 and EIA 860/923 data that has been integrated into PUDL, you only need to download pudl-ferc1.tgz and pudl-eia860-eia923.tgz. Then extract them in the same directory where you ran pudl_setup:
tar -xzf pudl-ferc1.tgz
tar -xzf pudl-eia860-eia923.tgz
To make use of the FERC Form 1 and EIA 860/923 data, you'll probably want to load them into a local database. The datapkg_to_sqlite script that comes with PUDL will do that for you:
datapkg_to_sqlite \
datapkg/pudl-data-release/pudl-ferc1/datapackage.json \
datapkg/pudl-data-release/pudl-eia860-eia923/datapackage.json \
-o datapkg/pudl-data-release/pudl-merged/
Now you should be able to connect to the database (~300 MB) which is stored in sqlite/pudl.sqlite.
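One simple way to confirm that the database loaded is to list its tables with Python's built-in sqlite3 module. This is only a sketch: the helper name `list_tables` is made up for the example, and the path you pass in is assumed to be the sqlite/pudl.sqlite file mentioned above, relative to your working directory.

```python
import os
import sqlite3


def list_tables(db_path):
    """Return the sorted names of all tables in an SQLite database file."""
    # sqlite3.connect() silently creates a missing file, so check first.
    if not os.path.exists(db_path):
        raise FileNotFoundError(db_path)
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]

# Example call, assuming the database exists at this relative path:
#     list_tables("sqlite/pudl.sqlite")
```

From there, any SQLite client or a library such as pandas or SQLAlchemy can query individual tables.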
Extract EPA CEMS and convert to Apache Parquet
If you want to work with the EPA CEMS data, which is much larger, we recommend converting it to an Apache Parquet dataset with the included epacems_to_parquet script. Then you can read those files into dataframes directly. In Python you can use the pandas.read_parquet() function. If you need to work with more data than can fit in memory at one time, we recommend using Dask dataframes. Converting the entire dataset from datapackages into Apache Parquet may take an hour or more:
tar -xzf pudl-eia860-eia923-epacems.tgz
epacems_to_parquet datapkg/pudl-data-release/pudl-eia860-eia923-epacems/datapackage.json
You should find the Parquet dataset (~5 GB) under parquet/epacems, partitioned by year and state for easier querying.
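Since the dataset is partitioned by year and state, it is usually worth reading only the partitions you need. The sketch below shows one way to do that with the `pandas.read_parquet` function and the pyarrow engine. The partition column names `year` and `state` are assumptions inferred from the description above, the helper names are hypothetical, and passing `filters` through to pyarrow assumes reasonably recent versions of pandas and pyarrow.

```python
def partition_filters(years=None, states=None):
    """Build pyarrow-style row filters for the year/state partitions.

    Returns None when no restriction is requested, which tells the reader
    to load every partition.
    """
    filters = []
    if years:
        filters.append(("year", "in", list(years)))    # assumed column name
    if states:
        filters.append(("state", "in", list(states)))  # assumed column name
    return filters or None


def read_epacems(path="parquet/epacems", years=None, states=None, columns=None):
    """Read a slice of the EPA CEMS Parquet dataset into a pandas DataFrame."""
    # Imported lazily so partition_filters() can be used without pandas.
    import pandas as pd

    return pd.read_parquet(
        path,
        engine="pyarrow",
        columns=columns,
        filters=partition_filters(years, states),
    )
```

For example, `read_epacems(years=[2018], states=["CO"])` would request a single year and state instead of the full dataset. For selections that still do not fit in memory, Dask's `dask.dataframe.read_parquet` accepts a similar path argument.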
Clone the raw FERC Form 1 Databases
If you want to access the entire set of original, raw FERC Form 1 data (of which only a small subset has been cleaned and integrated into PUDL) you can extract the original input data that's part of the Zenodo archive and run the ferc1_to_sqlite script using the same settings file that was used to generate the data release:
tar -xzf pudl-input-data.tgz
ferc1_to_sqlite data-release-settings.yml
You'll find the FERC Form 1 database (~820 MB) in sqlite/ferc1.sqlite.
Data Quality Control
We have performed basic sanity checks on much but not all of the data compiled in PUDL to ensure that we identify any major issues we might have introduced through our processing
Note: Sample data provided. These data identify operable electric generating plants in the United States by energy source, as of November 2023. The attribute data for this point dataset come from the U.S. Energy Information Administration: EIA-860, Annual Electric Generator Report; EIA-860M, Monthly Update to the Annual Electric Generator Report; and EIA-923, Power Plant Operations Report. It includes all operable plants by energy source with a combined nameplate capacity of 1 megawatt or more that are operating, on standby, or out of service for the short or long term.
United States Natural Gas Price: EIA: Wellhead was reported at 3.350 USD per thousand cubic feet in Dec 2012, unchanged from 3.350 USD per thousand cubic feet in Nov 2012. The series is updated monthly, with a median of 2.175 USD per thousand cubic feet from Jan 1976 to Dec 2012 across 444 observations. It reached an all-time high of 10.790 USD per thousand cubic feet in Jul 2008 and a record low of 0.540 USD per thousand cubic feet in Mar 1976. The series remains active in CEIC and is reported by the Energy Information Administration. It is categorized under Global Database’s United States – Table US.P002: Energy Price.
United States Natural Gas Price: EIA: Industrial was reported at 3.750 USD per thousand cubic feet in Sep 2018, an increase from 3.670 USD per thousand cubic feet in Aug 2018. The series is updated monthly, with a median of 3.800 USD per thousand cubic feet from Jan 1984 to Sep 2018 across 417 observations. It reached an all-time high of 13.060 USD per thousand cubic feet in Jul 2008 and a record low of 2.230 USD per thousand cubic feet in Jul 1991. The series remains active in CEIC and is reported by the Energy Information Administration. It is categorized under Global Database’s United States – Table US.P002: Energy Price.
United States Natural Gas Price: EIA: Residential was reported at 17.320 USD per thousand cubic feet in Sep 2018, a decrease from 18.630 USD per thousand cubic feet in Aug 2018. The series is updated monthly, with a median of 8.200 USD per thousand cubic feet from Jan 1981 to Sep 2018 across 453 observations. It reached an all-time high of 20.770 USD per thousand cubic feet in Jul 2008 and a record low of 3.940 USD per thousand cubic feet in Jan 1981. The series remains active in CEIC and is reported by the Energy Information Administration. It is categorized under Global Database’s United States – Table US.P002: Energy Price.
This dataset is the 2010 United States Energy Consumption by Sector and Source, part of the Annual Energy Outlook that highlights changes in the AEO Reference case projections for key energy topics.
All data made available in bulk through the EIA Open Data API, including:
Archived from https://www.eia.gov/opendata/bulkfiles.php. The Annual Energy Outlook data is also archived separately here.
This archive contains raw input data for the Public Utility Data Liberation (PUDL) software developed by Catalyst Cooperative. At present, PUDL integrates only a few specific data series related to fuel receipts and cost figures from the Bulk Electricity API. It is organized into Frictionless Data Packages. For additional information about this data and PUDL, see the following resources:
Note: Sample data provided. These data identify and provide detailed information on underground natural gas storage in the United States as of December 2022. The attribute data for this point dataset come from EIA’s U.S. field level storage data, which is sourced from U.S. Energy Information Administration, Form EIA-191, Monthly Underground Gas Storage Report. It includes both active and inactive natural gas storage fields. EIA-191 collects information on working and base gas in reservoirs, injections, withdrawals, and location of reservoirs from operators of all underground natural gas storage fields on a monthly basis. The facility location data represent the approximate location based on research of publicly available information from sources such as Federal agencies, company websites, and satellite images on public websites.
United States Electricity: Average Retail Price: EIA: Total was reported at 10.230 cents/kWh (0.01 USD/kWh) in Apr 2018, a decrease from 10.370 cents/kWh in Mar 2018. The series is updated monthly, with a median of 7.135 cents/kWh from Jul 1976 to Apr 2018 across 430 observations. It reached an all-time high of 11.030 cents/kWh in Jul 2014 and a record low of 3.000 cents/kWh in Aug 1976. The series remains active in CEIC and is reported by the Energy Information Administration. It is categorized under Global Database’s USA – Table US.P002: Energy Price.