This dataset was created by Afroz
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
UK PV dataset
Domestic solar photovoltaic (PV) power generation data from Great Britain. This dataset contains data from over 30,000 solar PV systems. The dataset spans 2010 to 2025. The nominal generation capacity per PV system ranges from 0.47 kilowatts to 250 kilowatts. The dataset is updated with new data every few months. All PV systems in this dataset report cumulative energy generation every 30 minutes. This data represents a true accumulation of the total energy generated… See the full description on the dataset page: https://huggingface.co/datasets/openclimatefix/uk_pv.
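Because each system reports cumulative energy rather than per-interval energy, a common first step is to difference consecutive readings. The following is a minimal sketch, not the dataset's own tooling: the function name, the tuple layout, the kWh unit, and the rule of skipping negative deltas (treated here as meter resets) are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def interval_energy(readings):
    """Convert cumulative energy readings into per-interval energy and mean power.

    readings: list of (timestamp, cumulative_kwh) tuples sorted by time
              (hypothetical layout; the real files may differ).
    Returns a list of (interval_end, energy_kwh, mean_power_kw).
    Intervals where the counter decreases are skipped (assumed meter reset).
    """
    out = []
    for (t0, e0), (t1, e1) in zip(readings, readings[1:]):
        hours = (t1 - t0).total_seconds() / 3600.0
        delta = e1 - e0
        if hours <= 0 or delta < 0:
            continue
        out.append((t1, delta, delta / hours))
    return out


# Four made-up half-hourly cumulative readings.
start = datetime(2024, 6, 1, 12, 0)
sample = [
    (start + timedelta(minutes=30 * i), kwh)
    for i, kwh in enumerate([100.0, 100.5, 101.25, 101.25])
]
result = interval_energy(sample)
for end, energy, power in result:
    print(end.isoformat(), energy, power)
```

Dividing each energy delta by the interval length in hours gives the mean power over that interval, which is usually the quantity forecasting models are trained on.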
In this dataset, the author's analysis is based on NREL data about solar and wind energy generation by operating area.
NASA Prediction of Worldwide Energy Resources
COA = central operating area.
EOA = eastern operating area.
SOA = southern operating area.
WOA = western operating area. Source: NREL
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset was created by MD ABU RAIHAN
Released under CC BY-NC-SA 4.0
This is a dataset created using my residential solar data and weather data from the National Center for Environmental Information.
The goal was to create a dataset that could be used for modeling or analysis of weather features and their effect on solar energy output. The data is hourly: the local airport collects data every 20 minutes starting at 00:15, but my solar system cannot generate reports at that frequency.
Here is some info on my solar system. It was installed in mid-2020 and contains 32 residential solar panels. My yard has little to no shade with the panels being on the west side of my roof.
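Aligning the 20-minute airport observations with hourly solar output means reducing each clock hour of weather readings to a single value. The sketch below averages by hour. It illustrates one possible approach and is not necessarily how this dataset was built; the function name, the tuple layout, and the sample temperatures are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_mean(observations):
    """Average sub-hourly observations into one value per clock hour.

    observations: iterable of (timestamp, value) tuples (assumed layout).
    Returns a dict mapping the start of each hour to the mean value.
    """
    buckets = defaultdict(list)
    for ts, value in observations:
        # Truncate the timestamp to the start of its hour.
        buckets[ts.replace(minute=0, second=0, microsecond=0)].append(value)
    return {hour: mean(vals) for hour, vals in sorted(buckets.items())}


# Made-up temperature readings at 00:15, 00:35, 00:55 and 01:15.
obs = [
    (datetime(2021, 7, 1, 0, 15), 20.0),
    (datetime(2021, 7, 1, 0, 35), 21.0),
    (datetime(2021, 7, 1, 0, 55), 22.0),
    (datetime(2021, 7, 1, 1, 15), 24.0),
]
hourly = hourly_mean(obs)
```

A mean suits quantities such as temperature; for accumulated quantities such as precipitation a sum would be the more natural reduction.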
This dataset contains over two years of 1-minute resolution data collected from four floating solar sites, as well as data from a land-based PV system co-located with one of the floating sites. The dataset includes highly granular module temperature measurements - five modules per floating site, with three sensors per module, totaling 15 module temperature sensors per floating site. In addition to the module temperature data, meteorological data collected at the floating sites is also included, along with traditional PV system-level parameters. The data is intended for analysis of solar energy production, efficiency, and performance degradation over time. For information about the data file usage see the "README" resource below. See "Metadata File" for information about individual files and other metadata information.
Solar Footprints in California
This GIS dataset consists of polygons that represent the footprints of solar powered electric generation facilities and related infrastructure in California called Solar Footprints. The location of solar footprints was identified using other existing solar footprint datasets from various sources along with imagery interpretation. CEC staff reviewed footprints identified with imagery and digitized polygons to match the visual extent of each facility. Previous datasets of existing solar footprints used to locate solar facilities include:
GIS Layers: (1) California Solar Footprints, (2) UC Berkeley Solar Points, (3) Kruitwagen et al. 2021, (4) BLM Renewable Project Facilities, (5) Quarterly Fuel and Energy Report (QFER)
Imagery Datasets: Esri World Imagery, USGS National Agriculture Imagery Program (NAIP), 2020 SENTINEL 2 Satellite Imagery, 2023
Solar facilities with large footprints such as parking lot solar, large rooftop solar, and ground solar were included in the solar footprint dataset. Small scale solar (approximately less than 0.5 acre) and residential footprints were not included. No other data was used in the production of these shapes. Definitions for the solar facilities identified via imagery are subjective and described as follows:
Rooftop Solar: Solar arrays located on rooftops of large buildings.
Parking lot Solar: Solar panels on parking lots roughly larger than 1 acre, or clusters of solar panels in adjacent parking lots.
Ground Solar: Solar panels located on ground roughly larger than 1 acre, or large clusters of smaller scale footprints.
Once all footprints identified by the above criteria were digitized for all California counties, the features were visually classified into ground, parking and rooftop categories. The features were also classified into rural and urban types using the 42 U.S. Code § 1490 definition for rural.
In addition, the distance to the closest substation and the percentile category of this distance (e.g. 0-25th percentile, 25th-50th percentile) were calculated. The coverage provided by this dataset should not be assumed to be a complete accounting of solar footprints in California. Rather, this dataset represents an attempt to improve upon existing solar feature datasets and to update the inventory of "large" solar footprints via imagery, especially in recent years since previous datasets were published. This procedure produced a total solar project footprint of 150,250 acres. Classifying these footprints and isolating the large utility-scale projects from the smaller rooftop solar projects identified in the dataset is difficult. The data was gathered based on imagery, and project information that could link multiple adjacent solar footprints under one larger project is not known. However, partitioning all solar footprints that are at least partly outside of the techno-economic exclusions and greater than 7 acres yields a total footprint size of 133,493 acres. These can be approximated as utility-scale footprints.
Metadata: (1) CBI Solar Footprints
Abstract: Conservation Biology Institute (CBI) created this dataset of solar footprints in California after it was found that no such dataset was publicly available at the time (Dec 2015-Jan 2016). This dataset is used to help identify where current ground based, mostly utility scale, solar facilities are being constructed and will be used in a larger landscape intactness model to help guide future development of renewable energy projects. The process of digitizing these footprints first began by utilizing an Excel file from the California Energy Commission with lat/long coordinates of some of the older and bigger locations. After projecting those points and locating the facilities utilizing NAIP 2014 imagery, the developed area around each facility was digitized.
While interpreting imagery, there were some instances where a fenced perimeter was clearly seen and was slightly larger than the actual footprint. For those cases the footprint followed the fenced perimeter since it limits wildlife movement through the area. In other instances, it was clear that the top soil had been scraped of any vegetation, even outside of the primary facility footprint. These footprints included the areas that were scraped within the fencing since, especially in desert systems, it has been near permanently altered. Other sources that guided the search for solar facilities included the Energy Justice Map, developed by the Energy Justice Network, which can be found here: https://www.energyjustice.net/map/searchobject.php?gsMapsize=large&giCurrentpageiFacilityid;=1&gsTable;=facility&gsSearchtype;=advanced
The Solar Energy Industries Association’s “Project Location Map”, which can be found here: https://www.seia.org/map/majorprojectsmap.php also assisted in locating newer facilities, along with the "Power Plants" shapefile, updated on December 16th, 2015, downloaded from the U.S. Energy Information Administration located here: https://www.eia.gov/maps/layer_info-m.cfm
There were some facilities that were stumbled upon while searching for others; most of these are smaller scale sites located near farm infrastructure. Other sites were located by contacting counties that had solar developments within the county. Still others were located by sleuthing around for proposals and company websites that had images of the completed facility. These helped to locate the most recently developed sites, and these sites were digitized based on landmarks such as ditches, trees, roads and other permanent structures.
Metadata: (2) UC Berkeley Solar Points
UC Berkeley report containing point location for energy facilities across the United States.
2022_utility-scale_solar_data_update.xlsm (live.com)
Metadata: (3) Kruitwagen et al. 2021
Abstract: Photovoltaic (PV) solar energy generating capacity has grown by 41 per cent per year since 2009. Energy system projections that mitigate climate change and aid universal energy access show a nearly ten-fold increase in PV solar energy generating capacity by 2040. Geospatial data describing the energy system are required to manage generation intermittency, mitigate climate change risks, and identify trade-offs with biodiversity, conservation and land protection priorities caused by the land-use and land-cover change necessary for PV deployment. Currently available inventories of solar generating capacity cannot fully address these needs. Here we provide a global inventory of commercial-, industrial- and utility-scale PV installations (that is, PV generating stations in excess of 10 kilowatts nameplate capacity) by using a longitudinal corpus of remote sensing imagery, machine learning and a large cloud computation infrastructure. We locate and verify 68,661 facilities, an increase of 432 per cent (in number of facilities) on previously available asset-level data. With the help of a hand-labelled test set, we estimate global installed generating capacity to be 423 gigawatts (−75/+77 gigawatts) at the end of 2018. Enrichment of our dataset with estimates of facility installation date, historic land-cover classification and proximity to vulnerable areas allows us to show that most of the PV solar energy facilities are sited on cropland, followed by arid lands and grassland. Our inventory could aid PV delivery aligned with the Sustainable Development Goals.
Energy Resource Land Use Planning - Kruitwagen_etal_Nature.pdf - All Documents (sharepoint.com)
Metadata: (4) BLM Renewable Project
To identify renewable energy approved and pending lease areas on BLM administered lands. To provide information about solar and wind energy applications and completed projects within the State of California for analysis and display internally and externally.
This feature class denotes "verified" renewable energy projects at the California State BLM Office, displayed in GIS. The term "Verified" refers to the GIS data being constructed at the California State Office, using the actual application/maps with legal descriptions obtained from the renewable energy company.
https://www.blm.gov/wo/st/en/prog/energy/renewable_energy
https://www.blm.gov/style/medialib/blm/wo/MINERALS_REALTY_AND_RESOURCE_PROTECTION_/energy/solar_and_wind.Par.70101.File.dat/Public%20Webinar%20Dec%203%202014%20-%20Solar%20and%20Wind%20Regulations.pdf
BLM CA Renewable Energy Projects | BLM GBP Hub (arcgis.com)
Metadata: (5) Quarterly Fuel and Energy Report (QFER)
California Power Plants - Overview (arcgis.com)
https://choosealicense.com/licenses/bsd-3-clause/
PV Generation Dataset
This dataset compiles and harmonizes multiple open PV datasets.
Curated by: Attila Balint
License: BSD 3-clause "New" or "Revised" licence
Uses
This PV dataset is intended primarily for solar generation forecasting.
Dataset Structure
The dataset contains three main files.
data/generation.parquet
data/metadata.parquet
data/weather.parquet
data/generation.parquet
This file contains the electricity generation values and has three… See the full description on the dataset page: https://huggingface.co/datasets/EDS-lab/pv-generation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a structured, high-precision compilation of environmental and solar irradiance data sourced from NASA POWER, covering India's major geographic zones over the span of one year (January to December 2022). It includes seven critical parameters—such as solar radiation, temperature, cloud cover, albedo, and precipitation—essential for evaluating solar power generation potential. The primary goal of the dataset is to aid in the identification of optimal locations for solar energy infrastructure by applying geospatial and machine learning techniques. Carefully preprocessed for consistency and organized for ease of use, this dataset is not only useful for current solar site suitability analysis but also offers long-term value to researchers, urban planners, and policymakers. It supports advanced analytics like clustering, classification, and visualizations, and can serve as a foundation for predictive modeling, transfer learning, and sustainability-oriented decision-making in the field of renewable energy.
The National Renewable Energy Laboratory's (NREL) Photovoltaic (PV) Rooftop Database (PVRDB) is a lidar-derived, geospatially-resolved dataset of suitable roof surfaces and their PV technical potential for 128 metropolitan regions in the United States. The PVRDB data are organized by city and year of lidar collection. Five geospatial layers are available for each city and year: 1) the raster extent of the lidar collection, 2) buildings identified from the lidar data, 3) suitable developable planes for each building, 4) aspect values of the developable planes, and 5) the technical potential estimates of the developable planes.
This dataset is based on solar interconnection data drawn from the publicly posted inventories of New York State’s electric utilities. This dataset represents the most comprehensive source of installed distributed solar projects, including projects that did not receive State funding, for all of New York State since 2000. This dataset does not include utility-scale projects that participate in the NYISO wholesale market. The interactive map at https://www.nyserda.ny.gov/All-Programs/Programs/NY-Sun/Solar-Data-Maps/Statewide-Projects provides information on Statewide Distributed Solar Projects since 2000 by county. The New York State Energy Research and Development Authority (NYSERDA) offers objective information and analysis, innovative programs, technical expertise, and support to help New Yorkers increase energy efficiency, save money, use renewable energy, and reduce reliance on fossil fuels. To learn more about NYSERDA’s programs, visit nyserda.ny.gov or follow us on X, Facebook, YouTube, or Instagram.
Dataset Card for Solar Irradiance Data in Sri Lanka
Dataset Summary
This dataset provides solar irradiance data for Sri Lanka, including historical and current solar radiation measurements. The data is vital for renewable energy applications, particularly in solar energy generation.
Supported Tasks and Leaderboards
This dataset can be used for tasks related to solar energy forecasting, climate modeling, and geographical information systems (GIS) analysis.… See the full description on the dataset page: https://huggingface.co/datasets/SasikaA073/LK_Solar_Dataset.
A partnership with the University of Nevada and U.S. Department of Energy's National Renewable Energy Laboratory (NREL) to collect solar data to support future solar power generation in the United States. The measurement station monitors global horizontal, direct normal, and diffuse horizontal irradiance to define the amount of solar energy that hits this particular location. The solar measurement instrumentation is also accompanied by meteorological monitoring equipment to provide scientists with a complete picture of the solar power possibilities.
Lawrence Berkeley National Laboratory (Berkeley Lab) estimates hourly project-level generation data for utility-scale solar projects and hourly county-level generation data for residential and non-residential distributed photovoltaic (PV) systems in the seven organized wholesale markets and 10 additional Balancing Areas. To encourage its broader use, Berkeley Lab has made this data file public here at OEDI. The public project-level dataset is updated annually with data from the previous calendar year. For more information about the research project, including a technical report, briefing material, visualizations, and additional data, please visit the project homepage linked in this submission. A newer version of the data exists and can be found linked in the resources of this submission under "Solar-to-Grid Public Data File Updated 2021".
This is the sun angle data for the cities near the solar plants mentioned in this dataset: https://www.kaggle.com/anikannal/solar-power-generation-data.
These 6 CSV files contain the time of sunrise, sunset, and solar noon for the two solar plants mentioned. They come from this NOAA solar calculator: https://www.esrl.noaa.gov/gmd/grad/solcalc/
Thanks to Kaggle, and to NOAA for the calculations specifically.
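Sunrise and sunset times of this kind are often reduced to a day-length feature before being joined to generation data. The sketch below assumes times are stored as "HH:MM:SS" strings in local time on the same calendar day. That format, the function name, and the sample values are assumptions for illustration and are not taken from the CSV files.

```python
from datetime import datetime


def day_length_hours(sunrise, sunset, fmt="%H:%M:%S"):
    """Return hours of daylight between two same-day clock times.

    sunrise, sunset: time strings in the assumed format `fmt`.
    """
    rise = datetime.strptime(sunrise, fmt)
    set_ = datetime.strptime(sunset, fmt)
    return (set_ - rise).total_seconds() / 3600.0


# Made-up example times, not values from the dataset.
hours = day_length_hours("05:54:00", "18:24:00")
print(hours)
```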
This dataset was used to support machine learning model development for wildfire impacts on utility-scale solar in the United States. The dataset includes information regarding energy generation, PM2.5, clearness index, temperature, wind speed, precipitation, and site size.
https://dataful.in/terms-and-conditions
High Frequency Indicator: The dataset contains year- and month-wise compiled all-India data from 2012 to date on solar, hydro and wind power generated in India.
Note:
1. Solar power generation data up to March 2017 is not available.
2. Data for April 2013 is not public.
3. Wind generation is as reported by SLDCs.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Solar electrical generation data is reported for commercial power plants: those with a nameplate capacity of 1 MW or more. Counties in gray reported no generation in 2020. San Bernardino and Riverside counties had solar thermal electric generation. Map and data from the California Energy Commission. Data is classified using the Jenks natural breaks method. Data is current as of November 22, 2021. Contact Rebecca Vail at (916) 651-0477 or John Hingtgen at (916) 510-9747 with questions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains measured time series of renewable energy production and electricity consumption, as well as exchange with neighboring countries/continents, at hourly time resolution. The time series data has been divided into two xml files, one for each of the Danish price regions: DK1 (Western Denmark) and DK2 (Eastern Denmark). The data comes from the Danish TSO Energinet and was used in a flexibility study by Karen Olsen in 2018-19, leading to a paper that is to appear in the proceedings of the ICAE19 conference, entitled "Data-driven flexibility requirements for current and future scenarios with high penetration of renewables". A journal paper has also been submitted using the same data.
The data has been extracted from a website run by Energinet at the following link, where time series data is publicly available: https://www.energidataservice.dk/dataset/electricitybalance
The present version was extracted in September 2019 and contains installation and production data from 2011 until and including the beginning of September 2019.
The data is in the originally downloaded xml files, ready to be parsed by the Python code written by Karen Olsen (see reference for Fanfare code).
Data used for analysis:
- offshore wind power generated (column: "Offshore Wind Power" in the xml file)
- onshore wind power generated (column: "Onshore Wind Power" in the xml file)
- solar power generated (column: "Solar Power Prod" in the xml file)
- gross consumption (column: "Gross Con" in the xml file)
Further information and code for analysis can be found under: https://kpolsen.github.io/FANFARE/
Contains data used pursuant to 'Conditions for use of Danish public-sector data' from the Energi Data Service portal (www.energidataservice.dk).
The National Solar Radiation Database (NSRDB) is a serially complete collection of meteorological and solar irradiance data sets for the United States and a growing list of international locations for 1998-2023. The NSRDB is updated annually and provides foundational information to support U.S. Department of Energy programs, research, industry and the general public. The NSRDB provides time-series data at 30-minute resolution of resource averaged over surface cells of 0.038 degrees in both latitude and longitude, or nominally 4 km in size. Additionally, time-series data at 5 minutes for the US and 10 minutes for North, Central and South America at 2 km resolution are produced from the next generation of GOES satellites and made available from 2019. The solar radiation values represent the resource available to solar energy systems. The data was created using cloud properties which are generated using the AVHRR Pathfinder Atmospheres-Extended (PATMOS-x) algorithms developed by the University of Wisconsin. The Fast All-sky Radiation Model for Solar applications (FARMS), in conjunction with the cloud properties, and aerosol optical depth (AOD) and precipitable water vapor (PWV) from ancillary sources, is used to estimate solar irradiance (GHI, DNI, and DHI). The Global Horizontal Irradiance (GHI) is computed for clear skies using the REST2 model. For cloud scenes identified by the cloud mask, FARMS is used to compute GHI and FARMS DNI is used to compute the Direct Normal Irradiance (DNI). The PATMOS-x model uses radiance images in visible and infrared channels from the Geostationary Operational Environmental Satellite (GOES) series of geostationary weather satellites. Ancillary variables needed to run REST2 and FARMS (e.g., aerosol optical depth, precipitable water vapor, and albedo) are derived from NASA's Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) dataset.
Temperature and wind speed data are also derived from MERRA-2 and provided for use in NREL's System Advisor Model (SAM) to compute PV generation.
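Two figures in the description above can be checked with short arithmetic: the "nominally 4 km" size of a 0.038 degree cell, and how GHI, DNI, and DHI relate to one another. The sketch below uses a spherical Earth of radius 6371 km (an assumption) and the standard closure relation GHI = DNI · cos(zenith) + DHI. That relation is a general textbook identity, not a description of how REST2 or FARMS compute GHI. Function names and sample values are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical-Earth assumption


def cell_size_km(cell_deg, latitude_deg):
    """Approximate north-south and east-west extent of a lat/lon grid cell."""
    north_south = math.radians(cell_deg) * EARTH_RADIUS_KM
    # Meridians converge toward the poles, so east-west extent shrinks with latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


def ghi_from_components(dni, dhi, zenith_deg):
    """Closure relation: GHI = DNI * cos(zenith) + DHI, all in W/m^2.

    The cosine is clamped at zero so a sun below the horizon adds no direct beam.
    """
    cos_zenith = max(math.cos(math.radians(zenith_deg)), 0.0)
    return dni * cos_zenith + dhi


ns_km, ew_km = cell_size_km(0.038, 40.0)
ghi = ghi_from_components(dni=800.0, dhi=100.0, zenith_deg=60.0)
print(round(ns_km, 2), round(ew_km, 2), round(ghi, 1))
```

The north-south extent comes out slightly above 4 km and the east-west extent is smaller at mid-latitudes, which is consistent with the description calling the 4 km figure nominal.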