These datasets are associated with the manuscript "Urban Heat Island Impacts on Heat-Related Cardiovascular Morbidity: A Time Series Analysis of Older Adults in US Metropolitan Areas." The datasets include (1) ZIP code-level daily average temperature for 2000-2017, (2) ZIP code-level daily counts of Medicare hospitalizations for cardiovascular disease for 2000-2017, and (3) ZIP code-level population-weighted urban heat island intensity (UHII). There are 9,917 ZIP codes included in the datasets, which are located in the urban cores of 120 metropolitan statistical areas across the contiguous United States. (1) The ZIP code-level daily temperature data is publicly available at: https://doi.org/10.15139/S3/ZL4UF9. A data dictionary is also available at this link. (2) The ZIP code-level daily counts of Medicare hospitalizations cannot be uploaded to ScienceHub because of privacy requirements in the data use agreement with Medicare. (3) The ZIP code-level UHII data is attached, along with a data dictionary describing the dataset. Portions of this dataset are inaccessible because: The ZIP code-level daily counts of Medicare cardiovascular disease hospitalizations cannot be uploaded to ScienceHub due to privacy requirements in data use agreements with Medicare. They can be accessed through the following means: The Medicare data can only be accessed internally at EPA with the correct permissions. Format: The Medicare data includes counts of the number of cardiovascular disease hospitalizations in each ZIP code on each day between 2000-2017. This dataset is associated with the following publication: Cleland, S., W. Steinhardt, L. Neas, J. West, and A. Rappold. Urban Heat Island Impacts on Heat-Related Cardiovascular Morbidity: A Time Series Analysis of Older Adults in US Metropolitan Areas. ENVIRONMENT INTERNATIONAL. Elsevier B.V., Amsterdam, NETHERLANDS, 178(108005): 1, (2023).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Files: ‘zip.temp.data_[year].rds’, where [year] is between 2000 and 2017. Each file contains a data frame with arithmetic (.Mean) and population-weighted (.Wght) averages of mean/max/min temperature, dew point, relative humidity, and apparent temperature for 9,917 ZIP codes located in the urban cores of 120 metropolitan areas in the contiguous United States, for 01/01/2000 to 12/31/2017. A data dictionary describing all variables included in the dataset can be found in: 'Data Dictionary.docx'
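For quick inspection outside R, one of these yearly files can be read with the third-party pyreadr package. A minimal sketch (the file name follows the pattern above; the printed column names should be checked against the data dictionary):

```python
import pyreadr  # third-party .rds reader: pip install pyreadr

result = pyreadr.read_r("zip.temp.data_2010.rds")  # one file per year, 2000-2017
df = next(iter(result.values()))  # an .rds file holds a single (unnamed) data frame
print(df.shape)
print(df.columns.tolist())  # expect .Mean and .Wght variants of temperature, dew point, etc.
```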
Hourly Precipitation Data (HPD) is digital data set DSI-3240, archived at the National Climatic Data Center (NCDC). The primary source of data for this file is approximately 5,500 US National Weather Service (NWS), Federal Aviation Administration (FAA), and cooperative observer stations in the United States of America, Puerto Rico, the US Virgin Islands, and various Pacific islands. The earliest data dates vary considerably by state and region: Maine, Pennsylvania, and Texas have data since 1900, while the western Pacific region that includes Guam, American Samoa, the Marshall Islands, Micronesia, and Palau has data since 1978; other states and regions have earliest dates between those extremes. The latest data in all states and regions are from the present day. The major parameter in DSI-3240 is precipitation amount, measured as hourly or daily precipitation accumulation; accumulations cover longer periods when the rain gauge was out of service or no observer was present. DSI 3240_01 contains data grouped by state; DSI 3240_02 contains data grouped by year.
OnPoint Weather is a global weather dataset for business, available for any lat/lon point and for geographic areas such as ZIP codes. OnPoint Weather provides a continuum of hourly and daily weather from the year 2000 to the current time, plus a forward forecast of 45 days. OnPoint Climatology provides hourly and daily weather statistics, such as means, standard deviations, and frequencies of occurrence, which can be used to determine departures from normal and to provide climatological guidance on expected weather for any location at any point in time. Weather has a significant impact on businesses and accounts for hundreds of billions in lost revenue annually. OnPoint Weather allows businesses to quantify weather impacts and develop strategies to optimize for weather to improve business performance. Examples of usage: quantify the impact of weather on sales across diverse locations and times of the year; understand how supply chains are impacted by weather; understand how employees' attendance and performance are impacted by weather; understand how weather influences foot traffic at malls, stores, and restaurants. OnPoint Weather is available through Google Cloud Platform's Commercial Dataset Program and can be easily integrated with other Google Cloud Platform services to quickly reveal and quantify weather impacts on business. Weather Source provides a full range of support services, from answering quick questions to consulting and building custom solutions. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1 TB/mo of free tier processing: each user receives 1 TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
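As a sketch of that workflow, the snippet below queries the dataset with the google-cloud-bigquery client; the table and column names here are placeholders, so check the actual OnPoint schema in the marketplace listing before running:

```python
from google.cloud import bigquery  # pip install google-cloud-bigquery

client = bigquery.Client()  # uses your GCP credentials; queries count against the free-tier quota

# Hypothetical table/column names -- replace with the real OnPoint schema.
sql = """
SELECT *
FROM `your-project.onpoint_weather.daily_history`
WHERE postal_code = '10001'
  AND date BETWEEN '2020-01-01' AND '2020-01-31'
LIMIT 100
"""
df = client.query(sql).to_dataframe()  # requires the pandas/db-dtypes extras
print(df.head())
```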
Nowcast is a representation of present conditions for any location at any point in time. OnPoint Weather is exactly what it sounds like: weather data for any location at any point in time. Unlike providers that rely on single inputs, which may be many miles from your location of interest, Weather Source ingests the best available weather-sensing inputs, including:
- Airport observation stations
- NOAA & NWS data
- Satellites
- Radar
- IoT devices and other sensor information
- Telematics
- Weather analyses and model outputs
Weather Source unifies and homogenizes the inputs on our high resolution global grid. The globally consistent OnPoint Grid covers every land mass in the world and up to 200 miles offshore. Each grid point - millions in total - represents a “virtual” weather station with a unique OnPoint ID from which weather data can be mapped to any lat/lon coordinates or specified geographically bounded areas such as:
- Census tract/block
- County/parish or state
- Designated Market Area (DMA)
- ZIP/Postal Code
All Weather Source data is available in hourly or daily format. Daily format includes minimum and maximum values as well as daily averages for each supported weather parameter. * Purchasing a subscription through Mobito will provide instant access to all Weather Source tiles and resources listed in the Mobito Marketplace including historical, forecast and climatology subject to the subscription tier purchased.
This item contains data and code used in experiments that produced the results for Sadler et al. (2022) (see below for full reference). We ran five experiments for the analysis: Experiment A, Experiment B, Experiment C, Experiment D, and Experiment AuxIn. Experiment A tested multi-task learning for predicting streamflow with 25 years of training data and a different model for each of 101 sites. Experiment B tested multi-task learning for predicting streamflow with 25 years of training data and a single model for all 101 sites. Experiment C tested multi-task learning for predicting streamflow with just 2 years of training data. Experiment D tested multi-task learning for predicting water temperature with over 25 years of training data. Experiment AuxIn used water temperature as an input variable for predicting streamflow. These experiments and their results are described in detail in the WRR paper. Data from a total of 101 sites across the US were used for the experiments. The model input data and streamflow data were from the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset (Newman et al. 2014; Addor et al. 2017). The water temperature data were gathered from the National Water Information System (NWIS) (U.S. Geological Survey, 2016). The contents of this item are broken into 13 files or groups of files aggregated into zip files:
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Data and coding scripts for Seddon et al. (2016) Nature (DOI 10.1038/nature16986). We derived monthly time series of four key terrestrial ecosystem variables at 0.05 degree (~5 km) resolution from observations by the MODIS sensor on Terra (AM) for the period February 2000-December 2013 inclusive (167 months), and developed a method to identify vegetation sensitivity to climate variability over this period (see Methods in the main paper).
This ORA item contains all data and files required to run the analysis described in the paper. Data required to run the script are provided in six zip files evi.zip, temp.zip, aetpet.zip, cld.zip, stdev.zip, numpxl.zip, each containing 167 text files, one per month of available data, in addition to a supporting files folder. Details are as follows.
supporting_files.zip : This directory includes computer code and additional supporting files. Please see the 'read me.txt' file within this directory for more information.
evi.zip: ENHANCED VEGETATION INDEX (EVI). We used the MOD13C2 product (Huete et al 2002) which comprises monthly, global EVI at 0.05 degree resolution. In some cases where no clear-sky observations are available, the MOD13C2 version 5 product replaces no-data values with climatological monthly means, so we removed these values where appropriate.
EVI format: ascii text file; projection = geographic; spatial resolution = 0.05 degrees; min x = -180; max x = 180; min y = -60; max y = 90; rows = 3000; cols = 7200; bit depth = 16-bit signed integer; nodata (sea) = -9999; missing data (on land) = -999; units = dimensionless; scale factor = 10000 (divide the value by 10000 to get EVI); filenames = yyyymmevi.txt
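A minimal reader for these grids, assuming plain whitespace-delimited text with no header rows (verify against an actual file):

```python
import numpy as np

raw = np.loadtxt("200002evi.txt")             # filenames = yyyymmevi.txt
assert raw.shape == (3000, 7200)              # rows x cols from the format spec

evi = raw / 10000.0                           # scale factor = 10000
evi[(raw == -9999) | (raw == -999)] = np.nan  # mask sea and missing-land values
```

The same pattern applies to the other grids below; only the scale factor and units change.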
numpxl.zip - COUNTS OF THE NUMBER OF PIXELS USED IN EVI CALCULATION. The MOD13C2 product is a spatially and temporally averaged mosaic of higher-resolution (1 km) pixels. Data in this directory represent the number of 1 km observations used to calculate the MODIS EVI product. See the online documentation for more details (Solano et al. 2010).
numpxl format: ascii text file; projection = geographic; spatial resolution = 0.05 degrees; min x = -180; max x = 180; min y = -60; max y = 90; rows = 3000; cols = 7200; bit depth = 16-bit signed integer; nodata (sea) = -9999; missing data (on land) = -999; units = counts; filenames = yyyy_mm_numpxl_pt05deg.txt
stdev.zip - STANDARD DEVIATION OF EVI VALUES. Standard deviation of the monthly EVI observations. See discussion in numpxl.zip item (above) and the online documentation for more details (Solano et al. 2010).
stdev format: ascii text file; projection = geographic; spatial resolution = 0.05 degrees; min x = -180; max x = 180; min y = -60; max y = 90; rows = 3000; cols = 7200; bit depth = 16-bit signed integer; nodata (sea) = -9999; missing data (on land) = -999; units = dimensionless; scale factor = 10000 (divide the value by 10000 to get the EVI standard deviation); filenames = yyyy_mm_stdev_pt05deg.txt
temp.zip: AIR TEMPERATURE. We used the MOD07_L2 Atmospheric Profile product (Seeman et al 2006) as a measure of air temperature. Five-minute swaths of Retrieved Temperature Profile were projected to geographic co-ordinates. Pixels from the highest available pressure level, corresponding to the temperature closest to the Earth's surface, were selected from each swath. Swaths were then mean-mosaicked into global daily grids, and the daily global grids were mean-composited to monthly grids of air temperature.
Air temperature format: ascii text file; projection = geographic; spatial resolution = 0.05 degrees; min x = -180; max x = 180; min y = -60; max y = 90; rows = 3000; cols = 7200; bit depth = 16-bit signed integer; nodata (sea) = -9999; missing data (on land) = -999; units = degrees C; scale factor = 1 (no scaling required); filenames = yyyymmtemp.txt
aetpet.zip: WATER AVAILABILITY. We used the MOD16 Global Evapotranspiration product (Mu et al 2011) to calculate the monthly 0.05 degree ratio of Actual to Potential Evapotranspiration (AET/PET).
AET/PET format: ascii text file; projection = geographic; spatial resolution = 0.05 degrees; min x = -180; max x = 180; min y = -60; max y = 90; rows = 3000; cols = 7200; bit depth = 16-bit signed integer; nodata (sea) = -9999; missing data (on land) = -999; units = dimensionless; scale factor = 10000 (divide the value by 10000 to get AET/PET); filenames = yyyymmaetpet.txt
cld.zip - CLOUDINESS. We used the MOD35_L2 Cloud Mask product (Ackerman et al 2010). This product provides daily records of the presence of cloudy vs cloudless skies, which we used to build an index of the proportion of cloudy to clear days in a given pixel. After conversion to geographic co-ordinates, five-minute swaths at 1-km resolution were reclassified as clear sky or cloudy; these daily swaths were mean-mosaicked to global coverages, mean-composited from daily to monthly, and mean-aggregated from 1 km to 0.05 degree.
cld format: ascii text file; projection = geographic; spatial resolution = 0.05 degrees; min x = -180; max x = 180; min y = -60; max y = 90; rows = 3000; cols = 7200; bit depth = 16-bit signed integer; nodata (sea) = -9999; missing data (on land) = -999; units = percentage of days in the month which were cloudy; scale factor = 100 (divide the value by 100 to get the percentage of cloudy days); filenames = yyyymmcld.txt
References
Ackerman, S. et al. (2010) Discriminating clear-sky from cloud with MODIS: Algorithm Theoretical Basis Document (MOD35), Version 6.1. (URL: http://modis-atmos.gsfc.nasa.gov/_docs/MOD35_ATBD_Collection6.pdf)
Huete, A. et al. (2002) Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sensing of Environment 83, 195–213.
Mu, Q., Zhao, M., Running, S.W. (2011) Improvements to a MODIS global terrestrial evapotranspiration algorithm. Remote Sensing of Environment 115, 1781-1800.
Seeman, S. W., Borbas, E. E., Li, J., Menzel, W. P. & Gumley, L. E. (2006) MODIS Atmospheric Profile Retrieval Algorithm Theoretical Basis Document, Version 6 (URL: http://modis-atmos.gsfc.nasa.gov/_docs/MOD07_atbd_v7_April2011.pdf)
Solano, R. et al. (2010) MODIS Vegetation Index User's Guide (MOD13 Series), Version 2.00, May 2010 (Collection 5) (URL: http://vip.arizona.edu/documents/MODIS/MODIS_VI_UsersGuide_01_2012.pdf)

Seddon et al. (2016) Nature (DOI 10.1038/nature16986). ABSTRACT: Identification of properties that contribute to the persistence and resilience of ecosystems despite climate change constitutes a research priority of global significance. Here, we present a novel, empirical approach to assess the relative sensitivity of ecosystems to climate variability, one property of resilience that builds on theoretical modelling work recognising that systems closer to critical thresholds respond more sensitively to external perturbations. We develop a new metric, the Vegetation Sensitivity Index (VSI), which identifies areas sensitive to climate variability over the past 14 years. The metric uses time-series data of the MODIS-derived Enhanced Vegetation Index (EVI) and three climatic variables that drive vegetation productivity (air temperature, water availability and cloudiness). Underlying the analysis is an autoregressive modelling approach used to identify regions with memory effects and reduced response rates to external forcing. We find ecologically sensitive regions with amplified responses to climate variability in the arctic tundra, parts of the boreal forest belt, the tropical rainforest, alpine regions worldwide, steppe and prairie regions of central Asia and North and South America, the Caatinga deciduous forest in eastern South America, and eastern areas of Australia. Our study provides a quantitative methodology for assessing the relative response rate of ecosystems – be they natural or with a strong anthropogenic signature – to environmental variability, which is the first step to address why some regions appear to be more sensitive than others and what impact this has upon the resilience of ecosystem service provision and human wellbeing.
Each annual file contains 35 metrics calculated by CANUE staff using base data provided by the Canadian Forest Service of Natural Resources Canada. The base data consist of interpolated daily maximum temperature, minimum temperature and total precipitation for all unique DMTI Spatial Inc. postal code locations in use at any time between 1983 and 2015. These were generated using thin-plate smoothing splines, as implemented in the ANUSPLIN climate modeling software. The earliest applications of thin-plate smoothing splines were described by Wahba and Wendelberger (1980) and Hutchinson and Bischof (1983), but the methodology has been further developed into an operational climate mapping tool at the ANU over the last 20 years. ANUSPLIN has become one of the leading technologies in the development of climate models and maps, and has been applied in North America and many regions around the world. ANUSPLIN is essentially a multidimensional “nonparametric” surface fitting method that has been found particularly well suited to the interpolation of various climate parameters, including daily maximum and minimum temperature, precipitation, and solar radiation. Equations for calculating the included metrics, based on daily minimum and maximum temperature, and total precipitation were developed by Pei-Ling Wang and Dr. Johannes Feddema at the University of Victoria, Geography Department, and implemented by CANUE staff Mahdi Shooshtari.
This interactive mapping application easily searches and displays global tropical cyclone data. Users are able to query storms by the storm name, geographic region, or latitude/longitude coordinates. Custom queries can track storms of interest and allow for data extraction and download.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
| File Name | File Type | Description |
| --- | --- | --- |
| ERA5_CEMS_Download_and_Resample_Notebooks.zip | ZIP file containing Python Jupyter notebooks | Code used to download and resample ERA5 and CEMS meteorological data from hourly into daily values |
| Geolocate_GlobalRx_Notebooks.zip | ZIP file containing Python Jupyter notebooks | Code used to determine values of meteorological and environmental variables at the date and location of each burn record |
| GlobalRx-Figures-Stats.ipynb | Jupyter notebook | Code used to calculate and generate all statistics and figures in the paper |
| GlobalRx_CSV_v2024.1.csv, GlobalRx_XLSX_v2024.1.xlsx, GlobalRx_SHP_v2024.1.zip | CSV, Excel, and ZIP file containing shapefile and accompanying feature files | GlobalRx dataset. Features of the dataset are described in more detail below.** |
**Description of GlobalRx Dataset:
198,890 records of prescribed burns in 16 countries. In the information below, the name of the variable's column within the dataset is given in parentheses in code font. For example, the column with the Drought Code data is titled `DC`.
For each record, the following general information (derived from the original burn record sources) is included, where available: Latitude, Longitude, Year, Month, Day, Time, DOY, Country, State/Province, Agency/Organisation, Burn Objective, Area Burned (Ha), Data Repository, and Citation.

* Not available for every record

For each record, the following meteorological information (derived from the ERA5 single levels reanalysis product) is also included: `PPT_tot`, `RH_min`, `RH_mean`*, `T_max`, `Wind_max`, `Wind_mean`, `BLH_min`, `CHI`*, and `VPD`**.

** Computed from other ERA5 meteorological variables.

For each record, the following fire weather indices and components (derived from the ERA5 fire weather reanalysis product) are also included: `FWI`, `FFMC`, `DMC`, `DC`, `FFDI`, `KBDI`, and `USBI`.

For each record, the following environmental information (derived from various sources; see the paper for more information) is also included: Ecoregion (Olson), Biome (Olson), Koppen Climate, Topography, Fuelbed Classification (GFD-FCCS), WDPA Name, WDPA Governance, WDPA Ownership, WDPA Designation, and WDPA IUCN Category.
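A short pandas sketch of loading the CSV release; `DC` is documented above, while the other column names used here are assumptions to be checked against the paper's data dictionary:

```python
import pandas as pd

rx = pd.read_csv("GlobalRx_CSV_v2024.1.csv")

print(len(rx))                       # expect 198,890 burn records
print(rx["Country"].value_counts())  # assumed column name; 16 countries expected
print(rx["DC"].describe())           # Drought Code, as documented above
```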
Metadata:
- Data Provider: Oklahoma Mesonet
- Data Link: Daily Summaries
- Last Update Time: October 10, 2023
- Start Date: January 1, 2021
- End Date: December 31, 2021
- Update Frequency: Daily
- Admin Unit: zipcode

Measurement Criteria:
- Total Daily Solar Radiation: calculated if 99% of the observations for the site and day are available.
- Maximum Wind Gust, Maximum Heat Index, Minimum Wind Chill: calculated if at least 1 observation is available for the day.
- Wind Direction: available when wind speed is greater than 2.5 mph.
- Other Variables: require at least 90% of the observations to be computed.
- Unavailable Data: values in the range +/-990 to +/-999 indicate data are not available.

Variables in Data Table:

| CSV Column Name | Description |
| --- | --- |
| StateName | State name |
| StateFIPS | State FIPS code |
| ZipCode | ZIP code |
| Centroid_Lat | Centroid latitude |
| Centroid_Lon | Centroid longitude |
| Date | Date |
| MEAN_TR05 | Calibrated change in soil temperature over time after a heat pulse is introduced, at 5 cm. Used to calculate soil water potential, fractional water index, or volumetric water. |
| MEAN_TR25 | As MEAN_TR05, at 25 cm. |
| MEAN_TR60 | As MEAN_TR05, at 60 cm. |
| MEAN_R05BD | Number of errant 30-minute calibrated delta-t observations at 5 cm. |
| MEAN_R25BD | Number of errant 30-minute calibrated delta-t observations at 25 cm. |
| MEAN_R60BD | Number of errant 30-minute calibrated delta-t observations at 60 cm. |
| MEAN_TAVG | Average of all 5-minute averaged temperature observations each day. |
| MEAN_HAVG | Average of all 5-minute averaged humidity observations each day. |
| MEAN_DAVG | Average of all 5-minute averaged dewpoint temperatures each day; dewpoint is derived from 1.5 m air temperature and the corresponding humidity value. |
| MEAN_VDEF | Average of all 5-minute averaged vapor deficit estimates each day. |
| MEAN_PAVG | Average of all 5-minute averaged station air pressure observations each day. |
| MEAN_WSPD | Average of all 5-minute wind speed observations each day. |
| MEAN_ATOT | Daily accumulation of solar radiation. |
| MEAN_BAVG | Average of all 15-minute averaged soil temperature observations each day; only available prior to December 1, 2013. |
| MEAN_S5AV | Average of all 15-minute averaged soil temperature observations each day. |

Description: The Oklahoma Mesonet dataset provides detailed daily environmental measurements across Oklahoma, including temperature, humidity, dewpoint, vapor deficit, air pressure, wind speed, solar radiation, and soil temperature. Each parameter is averaged over a specific interval (e.g., 5-minute, 15-minute), giving a comprehensive overview of daily weather and environmental conditions. Stations require a minimum share of observations for a day to be included in the daily calculations, as listed under Measurement Criteria above. This dataset is useful to researchers, policymakers, and agricultural professionals for analyzing environmental trends, assessing climate patterns, and making informed decisions related to agriculture and environmental management.
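Given the sentinel convention above (values in the ±990-999 range mean "not available"), a small pandas sketch that masks them; the input file name is hypothetical:

```python
import pandas as pd

df = pd.read_csv("mesonet_daily_zipcode_2021.csv")  # hypothetical file name

num_cols = df.select_dtypes("number").columns
vals = df[num_cols]
df[num_cols] = vals.mask((vals.abs() >= 990) & (vals.abs() <= 999))  # sentinel -> NaN
```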
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Dataset Description
This dataset contains aggregated meteorological variables for U.S. counties and ZIP Code Tabulation Areas (ZCTAs) derived from the gridMET dataset. The gridMET product combines high-resolution spatial climate data (e.g., temperature, precipitation, humidity) from the PRISM Climate Group with daily temporal attributes and additional meteorological variables from the NLDAS-2 regional reanalysis dataset. The final product includes daily meteorological data at approximately 4km x 4km spatial resolution across the contiguous United States. This dataset has been processed to provide spatial (ZCTA, County) and temporal (daily, yearly) aggregations for broader climate analysis. This dataset was created to support climate and public health research by providing ready-to-use, high-resolution meteorological data aggregated at county and ZCTA levels. This allows for efficient linking with health and socio-demographic data to explore the impacts of climate on public health. Contributors: Harvard T.H. Chan School of Public Health, NSAPH (National Studies on Air Pollution and Health) The data is organized by geographic unit (County and ZCTA) and temporal scale (daily, yearly). The dataset is structured to facilitate the computation of climate exposure variables for health impact studies. A data processing pipeline was used to generate this dataset.
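As an illustration of the temporal aggregation, a sketch that rolls daily county values up to yearly means; the file and column names are assumptions, not the release's documented schema:

```python
import pandas as pd

daily = pd.read_csv("gridmet_county_daily.csv", parse_dates=["date"])  # hypothetical

yearly = (daily.assign(year=daily["date"].dt.year)
               .groupby(["county_fips", "year"], as_index=False)
               .mean(numeric_only=True))  # daily -> yearly aggregation per county
```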
Each annual file contains 21 metrics developed by the CANUE Weather and Climate Team, and calculated by CANUE staff using base data provided by the Canadian Forest Service of Natural Resources Canada. The base data consist of interpolated daily maximum temperature, minimum temperature and total precipitation for all unique DMTI Spatial Inc. postal code locations in use at any time between 1983 and 2015. These were generated using thin-plate smoothing splines, as implemented in the ANUSPLIN climate modeling software. The earliest applications of thin-plate smoothing splines were described by Wahba and Wendelberger (1980) and Hutchinson and Bischof (1983), but the methodology has been further developed into an operational climate mapping tool at the ANU over the last 20 years. ANUSPLIN has become one of the leading technologies in the development of climate models and maps, and has been applied in North America and many regions around the world. ANUSPLIN is essentially a multidimensional “nonparametric” surface fitting method that has been found particularly well suited to the interpolation of various climate parameters, including daily maximum and minimum temperature, precipitation, and solar radiation. The water balance model was developed by Pei-Ling Wang and Dr. Johannes Feddema at the University of Victoria, Geography Department, and implemented by CANUE staff Mahdi Shooshtari. (THESE DATA ARE ALSO AVAILABLE AS MONTHLY METRICS).
This data release collates stream water temperature observations from across the United States from four data sources: the U.S. Geological Survey's National Water Information System (NWIS), the Water Quality Portal (WQP), the Spatial Hydro-Ecological Decision Systems temperature database (EcoSHEDS), and the U.S. Forest Service's NorWeST stream temperature database. These data were compiled for use in broad-scale water temperature models. Observations are included from the contiguous continental US, as well as Alaska, Hawaii, and territories. Temperature monitoring sites were paired to stream segments from the Geospatial Fabric for the National Hydrologic Model. Continuous and discrete data were reduced to daily mean, minimum, and maximum temperatures when more than one measurement was made per site-day. Various quality-control checks were conducted, including inspecting and converting units, eliminating some duplicate entries, interpreting flags and removing low-quality observations, fixing date issues from the WQP, and filtering to expected water temperature ranges. However, we expect some data quality issues persist, and users should conduct further data quality checks matched to their intended use of the data. This data release contains four core files:
- site_metadata.csv contains information about each site at which water temperature observations are reported in this dataset.
- national_stream_temp_code.zip contains the R code used to derive the data in this data release.
- daily_stream_temperature.zip is a compressed comma-separated file of observed water temperatures.
- spatial.zip contains the geographic information about each site at which water temperature observations are reported in this dataset.
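The site-day reduction described above can be sketched in pandas as follows (column names are placeholders for the release's actual schema):

```python
import pandas as pd

obs = pd.read_csv("raw_temperature_obs.csv", parse_dates=["datetime"])  # hypothetical input

daily = (obs.assign(date=obs["datetime"].dt.date)
            .groupby(["site_id", "date"])["temp_c"]
            .agg(["mean", "min", "max"])  # daily mean/min/max per site-day
            .reset_index())
```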
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A complete description of the dataset is given by Jones et al. (2023). Key information is provided below.
Background
A dataset describing the global warming response to national emissions of CO2, CH4 and N2O from fossil and land-use sources during 1851-2021.
National CO2 emissions data are collated from the Global Carbon Project (Andrew and Peters, 2024; Friedlingstein et al., 2024).
National CH4 and N2O emissions data are collated from PRIMAP-hist (HISTTP) (Gütschow et al., 2024).
We construct a time series of cumulative CO2-equivalent emissions for each country, gas, and emissions source (fossil or land use). Emissions of CH4 and N2O are related to cumulative CO2-equivalent emissions using the Global Warming Potential (GWP*) approach, with best estimates of the coefficients taken from the IPCC AR6 (Forster et al., 2021).
Warming in response to cumulative CO2-equivalent emissions is estimated using the transient climate response to cumulative carbon emissions (TCRE) approach, with the best-estimate value of TCRE taken from the IPCC AR6 (Forster et al., 2021; Canadell et al., 2021). 'Warming' is specifically the change in global mean surface temperature (GMST).
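A minimal numerical sketch of the two steps, using the common two-term GWP* form (r = 0.75, s = 0.25, H = 100) and an AR6-style TCRE of roughly 0.45 °C per 1000 Gt CO2; these coefficients are illustrative stand-ins, not values copied from this dataset:

```python
import numpy as np

GWP100_CH4 = 27.0            # assumed AR6-style GWP-100 for CH4
R, S, H = 0.75, 0.25, 100.0  # assumed two-term GWP* weights
TCRE = 0.45 / 1000.0         # deg C per Gt CO2-e (assumed best estimate)

def ch4_to_co2e_gwpstar(e_tg_per_yr):
    """Annual CH4 emissions (Tg/yr, consecutive years) -> Gt CO2-e/yr via GWP*."""
    e = np.asarray(e_tg_per_yr, dtype=float)
    e_lag20 = np.concatenate([np.full(20, np.nan), e[:-20]])  # emissions 20 yr earlier
    co2e_tg = GWP100_CH4 * (R * H * (e - e_lag20) / 20.0 + S * e)
    return co2e_tg / 1000.0

co2e = ch4_to_co2e_gwpstar(np.full(60, 300.0))  # toy series: constant 300 Tg CH4/yr
gmst = TCRE * np.nancumsum(co2e)                # GMST response relative to series start
```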
The data files provide emissions, cumulative emissions and the GMST response by country, gas (CO2, CH4, N2O or 3-GHG total) and source (fossil emissions, land use emissions or the total).
Data records: overview
The data records include three comma-separated values (.csv) files, as described below.
All files are in ‘long’ format with one value provided in the Data column for each combination of the categorical variables Year, Country Name, Country ISO3 code, Gas, and Component columns.
Component specifies fossil emissions, LULUCF emissions or total emissions of the gas.
Gas specifies CO2, CH4, N2O or the three-gas total (labelled 3-GHG).
Country ISO3 codes are specifically the unique ISO 3166-1 alpha-3 codes of each country.
Data records: specifics
Data are provided relative to 2 reference years (denoted ref_year below): 1850 and 1991. 1850 is the common first year of data spanning all input datasets. 1991 is relevant because the United Nations Framework Convention on Climate Change was operationalised in 1992.
EMISSIONS_ANNUAL_{ref_year-20}-2023.csv: Data includes annual emissions of CO2 (Pg CO2 year-1), CH4 (Tg CH4 year-1) and N2O (Tg N2O year-1) during the period ref_year-20 to 2023. The Data column provides values for every combination of the categorical variables. Data are provided from ref_year-20 because these data are required to calculate GWP* for CH4.
EMISSIONS_CUMULATIVE_CO2e100_{ref_year+1}-2023.csv: Data includes the cumulative CO2 equivalent emissions in units Pg CO2-e100 during the period ref_year+1 to 2023 (i.e. since the reference year). The Data column provides values for every combination of the categorical variables.
GMST_response_{ref_year+1}-2023.csv: Data includes the change in global mean surface temperature (GMST) due to emissions of the three gases in units °C during the period ref_year+1 to 2023 (i.e. since the reference year). The Data column provides values for every combination of the categorical variables.
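Reading the long-format files then reduces to simple filtering; a sketch for the 1850-reference annual emissions file (the exact labels in the Gas and Component columns should be checked against the data dictionary):

```python
import pandas as pd

ann = pd.read_csv("EMISSIONS_ANNUAL_1830-2023.csv")  # ref_year = 1850, so data start at 1830

# One value per combination of the categorical columns described above.
usa_fossil_co2 = ann[(ann["Country ISO3 code"] == "USA")
                     & (ann["Gas"] == "CO2")
                     & (ann["Component"] == "Fossil")]  # 'Fossil' label is an assumption
print(usa_fossil_co2[["Year", "Data"]].tail())
```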
Accompanying Code
Code is available at: https://github.com/jonesmattw/National_Warming_Contributions .
The code requires Input.zip to run (see README at the GitHub link).
Further info: Country Groupings
We also provide estimates of the contributions of various country groupings, both those defined by the UNFCCC and other groupings. See COUNTRY_GROUPINGS.xlsx for the lists of countries in each group.
Metadata:
- Data Provider: Oklahoma Mesonet
- Data Link: Daily Summaries
- Last Update Time: October 10, 2023
- Start Date: January 1, 2010
- End Date: December 31, 2010
- Update Frequency: Daily
- Admin Unit: zipcode

Measurement Criteria:
- Total Daily Solar Radiation: calculated if 99% of the observations for the site and day are available.
- Maximum Wind Gust, Maximum Heat Index, Minimum Wind Chill: calculated if at least 1 observation is available for the day.
- Wind Direction: available when wind speed is greater than 2.5 mph.
- Other Variables: require at least 90% of the observations to be computed.
- Unavailable Data: values in the range +/-990 to +/-999 indicate data are not available.

Variables in Data Table:

| CSV Column Name | Description |
| --- | --- |
| StateName | State name |
| StateFIPS | State FIPS code |
| ZipCode | ZIP code |
| Centroid_Lat | Centroid latitude |
| Centroid_Lon | Centroid longitude |
| Date | Date |
| MEAN_TR05 | Calibrated change in soil temperature over time after a heat pulse is introduced, at 5 cm. Used to calculate soil water potential, fractional water index, or volumetric water. |
| MEAN_TR25 | As MEAN_TR05, at 25 cm. |
| MEAN_TR60 | As MEAN_TR05, at 60 cm. |
| MEAN_R05BD | Number of errant 30-minute calibrated delta-t observations at 5 cm. |
| MEAN_R25BD | Number of errant 30-minute calibrated delta-t observations at 25 cm. |
| MEAN_R60BD | Number of errant 30-minute calibrated delta-t observations at 60 cm. |
| MEAN_TAVG | Average of all 5-minute averaged temperature observations each day. |
| MEAN_HAVG | Average of all 5-minute averaged humidity observations each day. |
| MEAN_DAVG | Average of all 5-minute averaged dewpoint temperatures each day; dewpoint is derived from 1.5 m air temperature and the corresponding humidity value. |
| MEAN_VDEF | Average of all 5-minute averaged vapor deficit estimates each day. |
| MEAN_PAVG | Average of all 5-minute averaged station air pressure observations each day. |
| MEAN_WSPD | Average of all 5-minute wind speed observations each day. |
| MEAN_ATOT | Daily accumulation of solar radiation. |
| MEAN_BAVG | Average of all 15-minute averaged soil temperature observations each day; only available prior to December 1, 2013. |
| MEAN_S5AV | Average of all 15-minute averaged soil temperature observations each day. |

Description: The Oklahoma Mesonet dataset provides detailed daily environmental measurements across Oklahoma, including temperature, humidity, dewpoint, vapor deficit, air pressure, wind speed, solar radiation, and soil temperature. Each parameter is averaged over a specific interval (e.g., 5-minute, 15-minute), giving a comprehensive overview of daily weather and environmental conditions. Stations require a minimum share of observations for a day to be included in the daily calculations, as listed under Measurement Criteria above. This dataset is useful to researchers, policymakers, and agricultural professionals for analyzing environmental trends, assessing climate patterns, and making informed decisions related to agriculture and environmental management.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Title: Dataset for "Uncertainty-Aware Methods for Enhancing Rainfall Prediction with Deep-Learning Based Post-Processing Segmentation"
Authors: Simone Monaco, Luca Monaco, Daniele Apiletti
Description:
This dataset supports the study presented in the paper "Uncertainty-Aware Methods for Enhancing Rainfall Prediction with Deep-Learning Based Post-Processing Segmentation". The work focuses on improving daily quantitative precipitation forecasts and uncertainty estimates over the Piedmont and Aosta Valley regions of Italy by blending outputs from four Numerical Weather Prediction (NWP) models using uncertainty-aware deep learning methods and NWIOI observational data (Turco et al., 2013). NWP forecasts can be obtained on request; the observational data are provided in this repository. The NWPs include:
Observational data from NWIOI serve as the ground truth for model training. The dataset contains 420 gridded precipitation events from 2018 to 2024.
Dataset contents:
obs.zip: NWIOI observed precipitation data (.csv format, one file per event)
domain_mask.csv: Binary mask (1 for grid points in the study area, 0 otherwise)
allevents_dates.csv: Classification of all events by type and intensity, used for n-fold cross-validation and dataset splits

Citations:
Related Repository: https://github.com/simone7monaco/probabilistic-rainprediction
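A sketch of combining an event file with the domain mask (grid layout and event file naming are assumptions; check the repository README):

```python
import numpy as np
import pandas as pd

mask = pd.read_csv("domain_mask.csv", header=None).to_numpy()     # 1 = study area
event = pd.read_csv("obs/event_001.csv", header=None).to_numpy()  # hypothetical name

event_in_domain = np.where(mask == 1, event, np.nan)  # drop grid points outside the domain
```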
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This example dataset is designed for testing the gaia R package, an empirical econometric model that estimates annual crop yield responses to climate variables. The dataset includes monthly precipitation and temperature data covering both historical (1951–2001) and projected (2015–2100) periods.
gaia_example_data.zip: Contains the future climate data in NetCDF format, with global coverage.
pr_monthly_canesm5_w5e5_gcam-ref_2015_2100.nc4: Monthly global precipitation data (kg m-2 s-1) at a 0.5-degree resolution.
tas_monthly_canesm5_w5e5_gcam-ref_2015_2100.nc4: Monthly global temperature data (Kelvin) at a 0.5-degree resolution.
weighted_climate.zip: Contains aggregated climate data at the country level, weighted by cropland area using MIRCA2000 data, for both historical and future periods.
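The NetCDF files can be opened with xarray; the variable names are assumed to follow the convention used in the file names (pr, tas):

```python
import xarray as xr

pr = xr.open_dataset("pr_monthly_canesm5_w5e5_gcam-ref_2015_2100.nc4")
tas = xr.open_dataset("tas_monthly_canesm5_w5e5_gcam-ref_2015_2100.nc4")

pr_mm_day = pr["pr"] * 86400.0  # kg m-2 s-1 -> mm/day (1 kg/m2 of water = 1 mm)
tas_degC = tas["tas"] - 273.15  # Kelvin -> degrees C
```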
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time to Update the Split-Sample Approach in Hydrological Model Calibration
Hongren Shen1, Bryan A. Tolson1, Juliane Mai1
1Department of Civil and Environmental Engineering, University of Waterloo, Waterloo, Ontario, Canada
Corresponding author: Hongren Shen (hongren.shen@uwaterloo.ca)
Abstract
Model calibration and validation are critical in hydrological model robustness assessment. Unfortunately, the commonly used split-sample test (SST) framework for data splitting requires modelers to make subjective decisions without clear guidelines. This large-sample SST study empirically assesses how different data splitting methods influence post-validation performance in the model testing period, thereby identifying optimal data splitting methods under different conditions. The study investigates the performance of two lumped conceptual hydrological models calibrated and tested in 463 catchments across the United States using 50 different data splitting schemes. These schemes are defined by the data availability, length, and recentness of the continuous calibration sub-periods (CSPs). A full-period CSP is also included in the experiment, which skips model validation. The assessment approach is novel in multiple ways, including framing model building decisions as a decision tree problem and viewing the model building process as a formal testing-period classification problem that aims to accurately predict model success/failure in the testing period. Results span different climate and catchment conditions across a 35-year period with available data, making conclusions quite generalizable. Calibrating to older data and then validating models on newer data produces inferior model testing period performance in every single analysis conducted and should be avoided. Calibrating to the full available data and skipping model validation entirely is the most robust split-sample decision. Experimental findings remain consistent no matter how model building factors (i.e., catchments, model types, data availability, and testing periods) are varied. Results strongly support revising the traditional split-sample approach in hydrological modeling.
Version updates
v1.1 Updated on May 19, 2022. We added hydrographs for each catchment.
The v1.1 attachment is split into eight zipped parts; download all eight parts and unzip them together.
In this update, we added two zipped files in each gauge subfolder:
(1) GR4J_Hydrographs.zip and
(2) HMETS_Hydrographs.zip
Each of the zip files contains 50 CSV files, named using the model name, gauge ID, and calibration sub-period (CSP) identifier.
Each hydrograph CSV file contains four key columns:
(1) Date time (note that the hour column is less significant since this is daily data);
(2) Precipitation in mm, which is the aggregated basin-mean precipitation;
(3) Simulated streamflow in m3/s and the column is named as "subXXX", where XXX is the ID of the catchment, specified in the CAMELS_463_gauge_info.txt file; and
(4) Observed streamflow in m3/s and the column is named as "subXXX(observed)".
Note that these hydrograph CSV files report period-ending, time-averaged flows. They were directly produced by the Raven hydrological modeling framework. More information about the format of the hydrograph CSV files can be found on the Raven webpage.
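A sketch of reading one hydrograph file and comparing simulated with observed flow; the exact file name here is illustrative, but the subXXX column convention follows the description above:

```python
import pandas as pd

gauge = "01013500"  # hypothetical gauge ID from CAMELS_463_gauge_info.txt
hyd = pd.read_csv(f"GR4J_{gauge}_CSP-3A_1990_Hydrographs.csv")  # assumed naming

sim = hyd[f"sub{gauge}"]            # simulated streamflow, m3/s
obs = hyd[f"sub{gauge}(observed)"]  # observed streamflow, m3/s
print((sim - obs).abs().mean())     # quick mean-absolute-error check
```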
v1.0 First version published on Jan 29, 2022.
Data description
This data was used in the paper entitled "Time to Update the Split-Sample Approach in Hydrological Model Calibration" by Shen et al. (2022).
Catchment, meteorological forcing, and streamflow data are provided for hydrological modeling use. Specifically, the forcing and streamflow data are archived in the format required by the Raven hydrological modeling framework. The GR4J and HMETS model building results in the paper, i.e., reference KGE and KGE metrics in the calibration, validation, and testing periods, are provided for replication of the split-sample assessment performed in the paper.
Data content
The data folder contains a gauge info file (CAMELS_463_gauge_info.txt), which reports basic information about each catchment, and 463 subfolders, each containing four files for a catchment, including:
(1) Raven_Daymet_forcing.rvt, which contains Daymet meteorological forcing (i.e., daily precipitation in mm/d, minimum and maximum air temperature in deg_C, shortwave in MJ/m2/day, and day length in days) from Jan 1, 1980 to Dec 31, 2014, in the format required by Raven.
(2) Raven_USGS_streamflow.rvt, which contains daily discharge data (in m3/s) from Jan 1, 1980 to Dec 31, 2014, in the format required by Raven.
(3) GR4J_metrics.txt, which contains reference KGE and GR4J-based KGE metrics in calibration, validation and testing periods.
(4) HMETS_metrics.txt, which contains reference KGE and HMETS-based KGE metrics in calibration, validation and testing periods.
Data collection and processing methods
Data source
Catchment information and the Daymet meteorological forcing are retrieved from the CAMELS data set, which can be found here.
The USGS streamflow data are collected from the U.S. Geological Survey's (USGS) National Water Information System (NWIS), which can be found here.
The GR4J and HMETS performance metrics (i.e., reference KGE and KGE) are produced in the study by Shen et al. (2022).
Forcing data processing
A quality assessment procedure was performed. For example, daily maximum air temperature should be larger than daily minimum air temperature; where this check failed, the two values were swapped.
Units are converted to Raven-required ones. Precipitation: mm/day, unchanged; daily minimum/maximum air temperature: deg_C, unchanged; shortwave: W/m2 to MJ/m2/day; day length: seconds to days.
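The checks and conversions above amount to a few lines; a sketch with placeholder names (note that if the source field is Daymet's daylight-average srad, the conversion should use day length rather than a flat 86,400 s):

```python
def clean_forcing(tmin_c, tmax_c, srad_w_m2, dayl_s):
    if tmax_c < tmin_c:                  # QA check: swap inverted min/max
        tmin_c, tmax_c = tmax_c, tmin_c
    srad_mj = srad_w_m2 * 86400.0 / 1e6  # W/m2 -> MJ/m2/day (24-h average assumed)
    dayl_days = dayl_s / 86400.0         # seconds -> days
    return tmin_c, tmax_c, srad_mj, dayl_days
```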
Data for a catchment are archived in an RVT (ASCII-based) file, in which the second line specifies the start time of the forcing series, the time step (= 1 day), and the total time steps in the series (= 12784), respectively; the third and fourth lines specify the forcing variables and their corresponding units, respectively.
More details on Raven-formatted forcing files can be found in the Raven manual (here).
Streamflow data processing
Units are converted to Raven-required ones. Daily discharge originally in cfs is converted to m3/s.
Missing data are replaced with -1.2345 as Raven requires. Those missing time steps will not be counted in performance metrics calculation.
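Equivalent streamflow handling, as a sketch (the missing-value test is an assumption about how gaps are flagged upstream):

```python
CFS_TO_CMS = 0.0283168  # 1 cubic foot per second = 0.0283168 m3/s

def convert_flow(q_cfs):
    """cfs -> m3/s, using Raven's -1.2345 sentinel for missing values."""
    if q_cfs is None:  # assumed representation of a missing record
        return -1.2345
    return q_cfs * CFS_TO_CMS
```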
The streamflow series is archived in an RVT (ASCII-based) file, which opens with eight comment lines specifying relevant gauge and streamflow information, such as gauge name, gauge ID, USGS-reported catchment area, calculated catchment area (based on the catchment shapefiles in the CAMELS dataset), streamflow data range, data time step, and missing data periods. The first line after the comment lines specifies the data type (default HYDROGRAPH), subbasin ID (i.e., SubID), and discharge unit (m3/s), respectively. The next line specifies the start of the streamflow data, the time step (= 1 day), and the total time steps in the series (= 12784), respectively.
GR4J and HMETS metrics
The GR4J and HMETS metrics files consist of reference KGE and KGE in the model calibration, validation, and testing periods, which are derived in the massive split-sample test experiment performed in the paper.
Columns in these metrics files are gauge ID, calibration sub-period (CSP) identifier, KGE in calibration, validation, testing1, testing2, and testing3, respectively.
We proposed 50 different CSPs in the experiment. "CSP_identifier" is a unique name for each CSP. E.g., the CSP identifier "CSP-3A_1990" means the model is built on Jan 1, 1990, calibrated on the first 3-year sample (1981-1983), and validated on the remaining years of the 1980-1989 period. Note that 1980 is always used for spin-up.
We defined three testing periods (independent of the calibration and validation periods) for each CSP: the first 3 years from the model build year inclusive, the first 5 years from the model build year inclusive, and all years from the model build year inclusive. E.g., "testing1", "testing2", and "testing3" for CSP-3A_1990 are 1990-1992, 1990-1994, and 1990-2014, respectively.
Reference flow is the interannual mean daily flow over a specific period; it is derived as a one-year series and then repeated for each year of the calculation period.
For calibration, its reference flow is based on spin-up + calibration periods.
For validation, its reference flow is based on spin-up + calibration periods.
For testing, its reference flow is based on spin-up + calibration + validation periods.
Reference KGE is calculated from the reference flow and the observed streamflow in a specific calculation period (e.g., calibration); it is computed with the KGE equation by substituting the reference flow for the simulated flow. Note that the reference KGEs for the three testing periods are derived from the same historical record but differ from one another, because each testing period spans a different time period and covers a different series of observed flow.
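For concreteness, a sketch of the reference KGE computation using the standard Kling-Gupta efficiency (Gupta et al., 2009); here the reference flow is a day-of-year climatology built from a single series, a simplification of the spin-up/calibration-based derivation described above:

```python
import numpy as np
import pandas as pd

def kge(sim, obs):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    sim, obs = np.asarray(sim, float), np.asarray(obs, float)
    r = np.corrcoef(sim, obs)[0, 1]
    alpha = sim.std() / obs.std()
    beta = sim.mean() / obs.mean()
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)

def reference_kge(obs_flow, day_of_year):
    # Reference flow: interannual mean flow for each day of year, repeated across years.
    ref = pd.Series(obs_flow).groupby(np.asarray(day_of_year)).transform("mean")
    return kge(ref.to_numpy(), obs_flow)  # reference flow plays the 'simulated' role
```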
More details of the split-sample test experiment and the modeling results analysis can be found in the paper by Shen et al. (2022).
Citation
Journal Publication
This study:
Shen, H., Tolson, B. A., & Mai, J.(2022). Time to update the split-sample approach in hydrological model calibration. Water Resources Research, 58, e2021WR031523. https://doi.org/10.1029/2021WR031523
Original CAMELS dataset:
A. J. Newman, M. P. Clark, K. Sampson, A. Wood, L. E. Hay, A. Bock, R. J. Viger, D. Blodgett, L. Brekke, J. R. Arnold, T. Hopson, and Q. Duan (2015). Development of a large-sample watershed-scale hydrometeorological data set for the contiguous USA: data set characteristics and assessment of regional variability in hydrologic model performance. Hydrology and Earth System Sciences, 19, 209-223. https://doi.org/10.5194/hess-19-209-2015
These data have been prepared as a supplement to Pourmokhtarian et al. (2016; full citation below), where complete details on methods can be found. We evaluated three downscaling methods: the delta method (or the change factor method); monthly quantile mapping (Bias Correction-Spatial Disaggregation, or BCSD); and daily quantile regression (Asynchronous Regional Regression Model, or ARRM). Additionally, we trained outputs from four atmosphere-ocean general circulation models (AOGCMs) (CCSM3, HadCM3, PCM, and GFDL-CM2.1) driven by higher (A1fi) and lower (B1) future emissions scenarios on two sets of observations (1/8th degree resolution grid vs. individual weather station) to generate the high-resolution climate input for the forest biogeochemical model PnET-BGC (8 ensembles of 6 runs).
This dataset consists of three files: 1) a zip archive of all raw daily downscaled AOGCM outputs (csv format; years 1960-2099; delta method 2012-2099 only), which were used as input for the PnET-BGC model; 2) a zip archive of all PnET-BGC output files for each model run (csv format; years 1000-2100); and 3) a pdf document that describes the content of the input and output files.
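Of the three downscaling methods named above, the delta (change factor) method is simple enough to sketch directly; this toy version adds monthly GCM change factors to daily observations and is not the paper's exact implementation:

```python
import numpy as np

def delta_method(obs_daily, obs_month, gcm_hist_mon_mean, gcm_fut_mon_mean):
    """Add monthly GCM change factors (future minus historical) to observations.

    obs_daily: daily observed temperature; obs_month: month (1-12) of each day;
    gcm_*_mon_mean: 12-element arrays of monthly GCM means.
    """
    delta = np.asarray(gcm_fut_mon_mean) - np.asarray(gcm_hist_mon_mean)
    return np.asarray(obs_daily) + delta[np.asarray(obs_month) - 1]
```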
Data were also used from the following Hubbard Brook long-term datasets:
Daily Streamflow Watershed 6 (http://dx.doi.org/10.6073/pasta/23d7b5feb24156908fe552f492f774e9 )
Chemistry of Streamwater at the Hubbard Brook Experimental Forest, Watershed 6 (http://dx.doi.org/10.6073/pasta/2ec152b0ab1d4e64aa40f4aa9bc492ac )
Daily Precipitation Watershed 6 (http://dx.doi.org/10.6073/pasta/54133475a47d98472eb6389035753c33)
Daily Maximum/Minimum Temperature Data (http://dx.doi.org/10.6073/pasta/343016c156eaac9bb7cb6c5d6fc04d2f)
Daily Solar Radiation Data (http://dx.doi.org/10.6073/pasta/2b74a8bda3eaa49e2caf1d19dafb23af)