Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
These four NetCDF databases constitute the bulk of the spatial and spatiotemporal environmental covariates used in a latent health factor index (LHFI) model for assessment and prediction of ecosystem health across the Murray-Darling Basin (MDB). The data formatting and hierarchical statistical modelling were conducted under a CSIRO appropriation project funded by the Water for a Healthy Country Flagship from July 2012 to June 2014. Each database was created by collating and aligning raw data downloaded from the respective state government websites (QLD, NSW, VIC, and SA). (ACT data were unavailable.) There are two primary components in each state-specific database: (1) a temporally static data matrix with axes "Site ID" and "Variable," and (2) a 3D data cube with axes "Site ID," "Variable," and "Date." Temporally static variables in (1) include geospatial metadata (all states), drainage area (VIC and SA only), and stream distance (SA only). Temporal variables in (2) include discharge, water temperature, etc. Missing data (empty cells) are highly abundant in the data cubes. The attached state-specific README.pdf files contain additional details on the contents of these databases, and any computer code that was used for semi-automation of raw data downloads. Lineage: (1) For NSW I created the NetCDF database by (a) downloading CSV raw data from the NSW Office of Water real-time data website (http://realtimedata.water.nsw.gov.au/water.stm) during February-April 2013, then (b) writing computer programs to preprocess such raw data into the current format. (2) The same was done for QLD, except through the Queensland Water Monitoring Data Portal (http://watermonitoring.derm.qld.gov.au/host.htm). (3) The same was also done for SA, except through the SA WaterConnect => Data Systems => Surface Water Data website (https://www.waterconnect.sa.gov.au/Systems/SWD/SitePages/Home.aspx) during April 2013 and May 2014. (4) For Victoria I created the NetCDF database by (a) manually downloading XLS raw data during November and December 2013 from the Victoria DEPI Water Measurement Information System => Download Rivers and Streams sites website (http://data.water.vic.gov.au/monitoring.htm), then (b) writing computer programs to preprocess such raw data into CSV format (intermediate), then into the current final format.
Additional details on lineage are available from the attached README.pdf files.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Errata: Due to a coding error, monthly files with "dma8epax" statistics were wrongly aggregated. This concerns all gridded files of this metric as well as the monthly aggregated CSV files. All erroneous files were replaced with corrected versions on January 16, 2018. Each updated file contains a version label "1.1" and a brief description of the error. If you have made use of previous TOAR data files with the "dma8epax" metric, please replace them with the corrected files.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Zipped NetCDF file (unzipped ~444 GB)
The NetCDF file contains a data matrix with the number of particles per bin (bin size 0.015° lon x 0.01° lat) in the geographic area from 12°W to 10°E and from 47°N to 63°N.
Particle numbers are labelled according to the three simulated scenarios ("period": 0-2, 14-28, or 0-28 days after release), by station ("station", as named in stations.csv), by year ("year": 2019-2022), and by the day on which the simulation was started ("offset": 000-122 days, counting from 1 May to 31 August).
data variable:
particle_number [111,007,858,176 values, float32]
dimensions:
lon_bin [length 1468, float32]
lat_bin [length 1601, float32]
period [length 3, object: '0-2', '14-28', '0-28']
station [length 32, object: 'DK_044', 'FR_0206', ... (all station names)]
year [length 4, object: '2019' ... '2022']
offset [length 123, object: '000' ... '122']
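As a quick illustration of this layout, the sketch below opens the file with xarray and selects one scenario. The file name is an assumption; the dimension and variable names follow the listing above.

    import xarray as xr

    # Open the particle-count dataset (file name is assumed; use the actual download).
    ds = xr.open_dataset("particle_numbers.nc")

    # One scenario: station DK_044, year 2019, particles 0-2 days after release,
    # summed over all simulation start days ("offset").
    subset = ds["particle_number"].sel(period="0-2", station="DK_044", year="2019")
    density = subset.sum(dim="offset")

    # density is a 2D field on the lon_bin x lat_bin grid (1468 x 1601 cells).
    print(density.shape)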
The ckanext-thredds extension enhances CKAN's ability to manage and provide subsets of NetCDF (Network Common Data Form) files. This extension provides an API function for creating these subsets, allowing users to extract specific portions of NetCDF resources based on spatial, temporal, and variable selections. This enables users to access only the data they need, reducing download sizes and processing time.
Key features:
NetCDF subset creation: enables the creation of subsets of NetCDF resources based on specified parameters.
Format selection: supports outputting subsets in various formats, including NetCDF, XML, and CSV.
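As a hedged sketch of how such an extension is typically invoked: CKAN exposes extension actions through its standard action API, so a subset request might look like the following. The action name ("subset_create") and all parameter names here are illustrative assumptions, not the extension's documented interface.

    import requests

    CKAN_URL = "https://ckan.example.org"  # placeholder CKAN instance

    # Hypothetical action name and parameters -- consult the ckanext-thredds
    # documentation for the real interface.
    response = requests.post(
        f"{CKAN_URL}/api/3/action/subset_create",  # CKAN's standard action-API route
        json={
            "id": "my-netcdf-resource-id",   # resource to subset (assumed parameter)
            "format": "csv",                 # NetCDF, XML, or CSV per the extension
            "variables": ["tas"],            # variable selection (assumed)
            "time_start": "2020-01-01",      # temporal bounds (assumed)
            "time_end": "2020-12-31",
            "bbox": [4.0, 50.0, 7.0, 54.0],  # spatial bounds, lon/lat (assumed)
        },
        headers={"Authorization": "API-TOKEN"},
    )
    response.raise_for_status()
    print(response.json()["result"])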
Code for estimating global termite CH4 emissions, written in C, together with output files in CSV and NetCDF formats. CSV file: time series of global emissions from 1901 to 2021. NetCDF file: grid maps of annual emissions from 1901 to 2021.
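A minimal sketch for loading the two output files in Python; the file names are assumptions, since the archive's exact naming is not listed here.

    import pandas as pd
    import xarray as xr

    # File names below are assumptions; substitute the names in the archive.
    ts = pd.read_csv("global_termite_ch4_1901-2021.csv")  # global emission time series
    grid = xr.open_dataset("termite_ch4_annual.nc")       # annual emission grid maps

    print(ts.head())
    print(grid)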
This item contains data and code used in experiments that produced the results for Sadler et al. (2022) (see below for full reference). We ran five experiments for the analysis: Experiment A, Experiment B, Experiment C, Experiment D, and Experiment AuxIn. Experiment A tested multi-task learning for predicting streamflow with 25 years of training data, using a different model for each of 101 sites. Experiment B tested multi-task learning for predicting streamflow with 25 years of training data, using a single model for all 101 sites. Experiment C tested multi-task learning for predicting streamflow with just 2 years of training data. Experiment D tested multi-task learning for predicting water temperature with over 25 years of training data. Experiment AuxIn used water temperature as an input variable for predicting streamflow. These experiments and their results are described in detail in the WRR paper. Data from a total of 101 sites across the US were used for the experiments. The model input data and streamflow data were from the Catchment Attributes and Meteorology for Large-sample Studies (CAMELS) dataset (Newman et al. 2014; Addor et al. 2017). The water temperature data were gathered from the National Water Information System (NWIS) (U.S. Geological Survey, 2016). The contents of this item are broken into 13 files or groups of files aggregated into zip files:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
From http://northweb.hpl.umces.edu/LTRANS.htm. CHECK FOR UPDATES. NEWER VERSION MAY BE AVAILABLE.
PDF of original LTRANS_v.2b website [2016-09-14]
LTRANS v.2b Model Description
The Larval TRANSport Lagrangian model (LTRANS v.2b) is an off-line particle-tracking model that runs with the stored predictions of a 3D hydrodynamic model, specifically the Regional Ocean Modeling System (ROMS). Although LTRANS was built to simulate oyster larvae, it can easily be adapted to simulate passive particles and other planktonic organisms. LTRANS v.2 is written in Fortran 90 and is designed to track the trajectories of particles in three dimensions. It includes a 4th order Runge-Kutta scheme for particle advection and a random displacement model for vertical turbulent particle motion. Reflective boundary conditions, larval behavior, and settlement routines are also included. A brief description of the LTRANS particle-tracking model can be found here (68 KB .pdf). For more information on LTRANS and the application of LTRANS to oyster larvae transport, see a summary web page with animations, the publications North et al. (2008, 2011), and the LTRANS v.2 User's Guide. Please cite North et al. (2011) when referring to LTRANS v.2b. The updates that were made for LTRANS v.2b are listed here.
The Lagrangian TRANSport (LTRANS v.2b) model is based upon LTRANS v.1 (formerly the Larval TRANSport Lagrangian model). Ian Mitchell made the bug fixes in LTRANS v.2b. Zachary Schlag completed significant updates to the code in LTRANS v.2 with input from Elizabeth North, Chris Sherwood, and Scott Peckham. LTRANS v.1 was built by Elizabeth North and Zachary Schlag of the University of Maryland Center for Environmental Science Horn Point Laboratory. Funding was provided by the National Science Foundation Biological and Physical Oceanography Programs, Maryland Department of Natural Resources, NOAA Chesapeake Bay Studies, NOAA Maryland Sea Grant College Program, and the NOAA-funded UMCP Advanced Study Institute for the Environment.
A beta version of LTRANS v.2b that uses predictions from the circulation model ADCIRC is available here.
LTRANS Code
LTRANS v.2b Open Source Code. We would appreciate knowing who is using LTRANS. If you would like to share this information with us, please send us your name, contact information, and a brief description of how you plan to use LTRANS to enorth@umces.edu. To refer to LTRANS in a peer-reviewed publication, please cite the publication(s) listed in the Description section above.
License file. This license was based on the ROMS license. Please note that this license applies to all sections of LTRANS v.2b except those listed in the 'External Dependencies and Programs' section below.
LTRANS v.2b Code. This zip file contains the LTRANS code, license, and User's Guide. Section II of the LTRANS v.2 User's Guide contains instructions for setting up and running LTRANS v.2b in Linux and Windows environments. Before using LTRANS v.2b, please read the External Dependencies and Programs section below. This version of LTRANS is parameterized to run with the input files that are available in the LTRANS v.2b Example Input Files section below. This section also contains a tar ball with this code and the example input files.
External Dependencies and Programs. LTRANS v.2b requires NetCDF libraries and uses the following programs to calculate random numbers (Mersenne Twister) and fit tension splines (TSPACK). Because LTRANS v.2 reads in ROMS-generated NetCDF (.nc) files, it requires that the appropriate NetCDF libraries be installed on your computer (see files and links below). Also, please note that although the Mersenne Twister and TSPACK programs are included in LTRANS v.2b in Random_module.f90 and Tension_module.f90, respectively, they do not share the same license file as LTRANS v.2b. Please review and respect their permissions (links and instructions provided below).
Windows Visual Fortran NetCDF libraries. These NetCDF files, which are compatible with Visual Fortran, were downloaded from the Unidata NetCDF Binaries Website for LTRANS v.1. The NetCDF 90 files were downloaded from the "Building the F90 API for Windows for the Intel ifort compiler" website. The VF-NetCDF.zip folder contains a README.txt that describes where to place the enclosed files. If these files do not work, you may have to download updated versions or build your own by following the instructions at the UCAR Unidata NetCDF website.
Linux NetCDF libraries. Linux users will likely have to build their own Fortran 90 libraries using the source code/binaries that are available on the UCAR Unidata NetCDF website.
Mersenne Twister random number generator. This program was recoded into F90 and included in Random_module.f90 in LTRANS. See the Mersenne Twister Home Page for more information about this open source program. If you plan to use this program in LTRANS, please send an email to: m-mat @ math.sci.hiroshima-u.ac.jp (remove space) to inform the developers as a courtesy.
TSPACK: tension spline curve-fitting package. This program (ACM TOMS Algorithm 716) was created by Robert J. Renka and is used in LTRANS as part of the water column profile interpolation technique. The original TSPACK code can be found at the link to the left and is copyrighted by the Association for Computing Machinery (ACM). With the permission of Dr. Renka and ACM, TSPACK was modified for use in LTRANS by removing unused code and call variables and updating it to Fortran 90. The modified version of TSPACK is included in the LTRANS source code in the Tension Spline Module (tension_module.f90). If you would like to use LTRANS with the modified TSPACK software, please read and respect the ACM Software Copyright and License Agreement. For noncommercial use, ACM grants "a royalty-free, nonexclusive right to execute, copy, modify and distribute both the binary and source code solely for academic, research and other similar noncommercial uses" subject to the conditions noted in the license agreement. Note that if you plan commercial use of LTRANS with the modified TSPACK software, you must contact ACM at permissions@acm.org to arrange an appropriate license; this may require payment of a license fee for commercial use.
LTRANS v.2b Example Input Files. These files can be used to test LTRANS v.2b. They include examples of particle location and habitat polygon input files (.csv) and ROMS grid and history files (.nc) that are needed to run LTRANS v.2b. Many thanks to Wen Long for sharing the ROMS .nc files. The LTRANS v.2b code above is configured to run with these input files. Note: because of their large size, please download the tar ball (LTRANSv2.tgz) and the history files (clippped_macroms_his_*.nc) between the hours of 5 pm and 6 am Eastern Standard Time.
Deployment of satellite-transmitting tag on a blue shark in the Gulf Stream, Atlantic Ocean, courtesy of Camrin Braun

_NCProperties=version=2,netcdf=4.7.1,hdf5=1.10.5
acknowledgement=Funding provided by ONR under grant N00014-19-1-2573 awarded to Dr Ana M. M. Sequeira, Senior Research Fellow at the University of Western Australia, ana.sequeira@uwa.edu.au.
cdm_data_type=Trajectory
cdm_trajectory_variables=trajectory
Conventions=CF-1.7, ACDD 1.3, COARDS
date_metadata_modified=2020-07-02T13:53:21Z
Easternmost_Easting=-19.868
featureType=Trajectory
geospatial_lat_max=54.012
geospatial_lat_min=29.499
geospatial_lat_resolution=0.1 degree
geospatial_lat_units=degrees_north
geospatial_lon_max=-19.868
geospatial_lon_min=-132.379
geospatial_lon_resolution=0.1 degree
geospatial_lon_units=degrees_east
history=atn.noaa.gov
id=10.1073/pnas.1903067116
infoUrl=https://github.com/ocean-tracking-network/biologging_standardization/blob/master/examples/braun-blueshark/braun_blues_deployment-metadata.csv
institution=Standardization of Biologging Data Working Group
instrument=Wildlife Computers SPOT258G
keywords_vocabulary=CF Standard Names, GCMD Science Keywords
metadata_link=https://github.com/ocean-tracking-network/biologging_standardization/blob/master/examples/braun-blueshark/braun_blues_deployment-metadata.csv
naming_authority=gov.noaa.gov.atn
nodc_template_version=NODC_NetCDF_Trajectory_Template_v2.0
Northernmost_Northing=54.012
platform=Prionace glauca
processing_level=Level 1
program=Standardization of Biologging Data Working Group
project=Publication: A standardisation framework for bio-logging data to advance ecological research and conservation
references=Camrin D. Braun, Peter Gaube, Tane H. Sinclair-Taylor, Gregory B. Skomal, Simon R. Thorrold, 2019. Mesoscale eddies release pelagic sharks from thermal constraints to foraging in the ocean twilight zone. Proceedings of the National Academy of Sciences Aug 2019, 116 (35) 17187-17192.
sea_name=Atlantic
source=atn.noaa.gov
sourceUrl=(local files)
Southernmost_Northing=29.499
standard_name_vocabulary=CF Standard Name Table v27
subsetVariables=trajectory
time_coverage_end=2017-02-13T22:40:41Z
time_coverage_resolution=PT1S
time_coverage_start=2016-08-27T23:29:04Z
Westernmost_Easting=-132.379
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Energy Climate dataset consistent with ENTSO-E Pan-European Climatic Database (PECD 2021.3) in CSV and netCDF format
TL;DR: this is a tidy and friendly version of a recreation of ENTSO-E's PECD 2021.3 data using ERA5: hourly capacity factors for onshore wind, offshore wind, and solar PV, together with hourly electricity demand, are provided. All data are provided for 28-71 climatic years (1950-2020 for wind and solar, 1982-2010 for demand).
Description: Country averages of energy-climate variables generated using the Python scripts, based on ENTSO-E's TYNDP 2020 study. Data is available for the following scenarios:
National trends 2025 (NT 2025)
National trends 2030 (NT 2030)
National trends 2040 (NT 2040)
Distributed Energy 2030 (DE 2030)
Distributed Energy 2040 (DE 2040)
Global Ambitions 2030 (GA 2030)
Global Ambitions 2040 (GA 2040)
The time series are at hourly resolution, and the included variables are:
Generation wind offshore (aggregated for all years per scenario in a .zip)
Generation wind onshore (aggregated for all years per scenario in a .zip)
Generation solar photovoltaic (aggregated for all years per scenario in a .zip)
Total energy demand (all zones combined in single file per scenario)
The files are provided in CSV (.csv) and NetCDF (.nc) formats. The data is given per ENTSO-E bidding zone as used within the TYNDP 2020.
DISCLAIMER: the content of this dataset has been created with the greatest possible care. However, we invite you to use the original data for critical applications and studies.
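For orientation, a minimal sketch of reading one of the NetCDF files with xarray follows; the file name, variable name ("capacity_factor"), and coordinate names ("zone", "year") are assumptions and should be checked against the actual files.

    import xarray as xr

    # All names here are assumptions; inspect the file (print(cf)) to see
    # the actual variable and coordinate names.
    cf = xr.open_dataset("PECD_wind_onshore_NT2025.nc")

    # Hourly onshore-wind capacity factors for one bidding zone and climatic year.
    zone_cf = cf["capacity_factor"].sel(zone="DE00", year=1995)
    print(zone_cf.mean().item())  # mean capacity factor over the year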
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Compilation of public ground gravity data provided by Geoscience Australia. Based on the compilation by Wynne (2018), which is distributed under a CC-BY 4.0 license (see the source). This version has filtered surveys by data quality, referenced positions and heights to the WGS84 ellipsoid, and packaged the data into a single netCDF file with compression and CF-compliant metadata. Also includes a plain-text CSV version (which does not contain the associated metadata). If using this dataset, please cite the original compilation by Wynne (2018) as well as this compilation. Python source code used to download the original survey data and generate this compilation is open-source under the MIT license and can be found at https://github.com/compgeolab/australia-gravity-data
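A minimal loading sketch, assuming a local copy of the netCDF file; the file name is a placeholder, and the CF-compliant metadata can be inspected directly from the opened dataset.

    import xarray as xr

    # File name is a placeholder; variable names should be read off the
    # CF-compliant metadata rather than assumed.
    ds = xr.open_dataset("australia-ground-gravity.nc")
    print(ds)  # lists variables, coordinates, and the embedded metadata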
Fig1-needleleaf forest.txt contains all the observation data, with each reference given, for Figure 1. The deposition velocity vd and diameter dp are given as ordered arrays; vd_err and dp_err define the deposition velocity and diameter error bars.
Fig 2-needleleaf.txt contains the same observation data as Fig1-needleleaf forest.txt.
Fig3-Broadleaf forest.txt contains all the observation data, with each reference given, for broadleaf forests in Figure 3. Data format same as Fig1.
Fig4-Grasst.txt contains all the observation data, with each reference given, for grass in Figure 4. Data format same as Fig1.
Fig5.txt contains data from Zhang et al. 2014 for three different U* values.
Fig6-Watert.txt contains all the observation data, with each reference given, for water in Figure 6. Data format same as Fig1.
DataFig7,TXT is a tab-delimited text file containing the data in tabular form for Figure 7.
DataFig8,TXT is a tab-delimited text file containing the data in tabular form for Figure 8.
Fig14a-133_P6p3_add_newadd_PM25_TOT_126719_boxplot_hourly_data.csv is a CSV file containing data for the hourly average median and 1st and 3rd quartiles of observations and two 1.33 km model runs, represented by boxes in Figure 14a.
Fig14b-12US1_P6p3_add_PM25_TOT_211556_boxplot_hourly_data.csv is a CSV file containing data for the hourly average median and 1st and 3rd quartiles of observations and two 12 km model runs, represented by boxes in Figure 14b.
Fig15-133_P6p3_add_newadd_PM25_TOT_728997_spatialplot_diff.csv is a CSV file containing all the data for the bias and error for the NEW and BASE 1.33 km model runs, and the differences in bias and error between the models, at AQS sites.
Fig16-12US1_P6p3_add_PM25_TOT_971641_spatialplot_diff.csv is a CSV file containing all the data for the bias and error for the NEW and BASE 12 km model runs, and the differences in bias and error between the models, at AQS sites.
Fig17-12US1_P6p3_add_PM25_TOT_104554_spatialplot_diff.csv is a CSV file containing all the data for the bias and error for the NEW and BASE 12 km model runs, and the differences in bias and error between the models, at IMPROVE sites.
Portions of this dataset are inaccessible because Figures 9-13 are all plotted directly from CMAQ output files, which are far too large to include. They can be accessed by contacting the primary author, Jon Pleim. Format: CMAQ netCDF output files.
On the continental scale, climate is an important determinant of the distributions of plant taxa and ecoregions. To quantify and depict the relations between specific climate variables and these distributions, we placed modern climate and plant taxa distribution data on an approximately 25-kilometer (km) equal-area grid with 27,984 points that cover Canada and the continental United States (Thompson and others, 2015). The gridded climatic data include annual and monthly temperature and precipitation, as well as bioclimatic variables (growing degree days, mean temperatures of the coldest and warmest months, and a moisture index) based on 1961-1990 30-year mean values from the University of East Anglia (UK) Climatic Research Unit (CRU) CL 2.0 dataset (New and others, 2002), and absolute minimum and maximum temperatures for 1951-1980 interpolated from climate-station data (WeatherDisc Associates, 1989). As described below, these data were used to produce portions of the "Atlas of relations between climatic parameters and distributions of important trees and shrubs in North America" (hereafter referred to as "the Atlas"; Thompson and others, 1999a, 1999b, 2000, 2006, 2007, 2012a, 2015). Evolution of the Atlas Over the 16 Years Between Volumes A & B and G: The Atlas evolved through time as technology improved and our knowledge expanded. The climate data employed in the first five Atlas volumes were replaced by more standard and better documented data in the last two volumes (Volumes F and G; Thompson and others, 2012a, 2015). Similarly, the plant distribution data used in Volumes A through D (Thompson and others, 1999a, 1999b, 2000, 2006) were improved for the latter volumes. However, the digitized ecoregion boundaries used in Volume E (Thompson and others, 2007) remain unchanged. Also, as we and others used the data in Atlas Volumes A through E, we came to realize that the plant distribution and climate data for areas south of the US-Mexico border were not of sufficient quality or resolution for our needs and these data are not included in this data release. The data in this data release are provided in comma-separated values (.csv) files. We also provide netCDF (.nc) files containing the climate and bioclimatic data, grouped taxa and species presence-absence data, and ecoregion assignment data for each grid point (but not the country, state, province, and county assignment data for each grid point, which are available in the .csv files). The netCDF files contain updated Albers conical equal-area projection details and more precise grid-point locations. When the original approximately 25-km equal-area grid was created (ca. 1990), it was designed to be registered with existing data sets, and only 3 decimal places were recorded for the grid-point latitude and longitude values (these original 3-decimal place latitude and longitude values are in the .csv files). In addition, the Albers conical equal-area projection used for the grid was modified to match projection irregularities of the U.S. Forest Service atlases (e.g., Little, 1971, 1976, 1977) from which plant taxa distribution data were digitized. For the netCDF files, we have updated the Albers conical equal-area projection parameters and recalculated the grid-point latitudes and longitudes to 6 decimal places. 
The additional precision in the location data produces maximum differences between the 6-decimal place and the original 3-decimal place values of up to 0.00266 degrees longitude (approximately 143.8 m along the projection x-axis of the grid) and up to 0.00123 degrees latitude (approximately 84.2 m along the projection y-axis of the grid). The maximum straight-line distance between a three-decimal-point and six-decimal-point grid-point location is 144.2 m. Note that we have not regridded the elevation, climate, grouped taxa and species presence-absence data, or ecoregion data to the locations defined by the new 6-decimal place latitude and longitude data. For example, the climate data described in the Atlas publications were interpolated to the grid-point locations defined by the original 3-decimal place latitude and longitude values. Interpolating the data to the 6-decimal place latitude and longitude values would in many cases not result in changes to the reported values and for other grid points the changes would be small and insignificant. Similarly, if the digitized Little (1971, 1976, 1977) taxa distribution maps were regridded using the 6-decimal place latitude and longitude values, the changes to the gridded distributions would be minor, with a small number of grid points along the edge of a taxa's digitized distribution potentially changing value from taxa "present" to taxa "absent" (or vice versa). These changes should be considered within the spatial margin of error for the taxa distributions, which are based on hand-drawn maps with the distributions evidently generalized, or represented by a small, filled circle, and these distributions were subsequently hand digitized. Users wanting to use data that exactly match the data in the Atlas volumes should use the 3-decimal place latitude and longitude data provided in the .csv files in this data release to represent the center point of each grid cell. Users for whom an offset of up to 144.2 m from the original grid-point location is acceptable (e.g., users investigating continental-scale questions) or who want to easily visualize the data may want to use the data associated with the 6-decimal place latitude and longitude values in the netCDF files. The variable names in the netCDF files generally match those in the data release .csv files, except where the .csv file variable name contains a forward slash, colon, period, or comma (i.e., "/", ":", ".", or ","). In the netCDF file variable short names, the forward slashes are replaced with an underscore symbol (i.e., "_") and the colons, periods, and commas are deleted. In the netCDF file variable long names, the punctuation in the name matches that in the .csv file variable names. The "country", "state, province, or territory", and "county" data in the .csv files are not included in the netCDF files. Data included in this release: - Geographic scope. The gridded data cover an area that we labelled as "CANUSA", which includes Canada and the USA (excluding Hawaii, Puerto Rico, and other oceanic islands). Note that the maps displayed in the Atlas volumes are cropped at their northern edge and do not display the full northern extent of the data included in this data release. - Elevation. The elevation data were regridded from the ETOPO5 data set (National Geophysical Data Center, 1993). There were 35 coastal grid points in our CANUSA study area grid for which the regridded elevations were below sea level and these grid points were assigned missing elevation values (i.e., elevation = 9999). 
The grid points with missing elevation values occur in five coastal areas: (1) near San Diego (California, USA; 1 grid point), (2) Vancouver Island (British Columbia, Canada) and the Olympic Peninsula (Washington, USA; 2 grid points), (3) the Haida Gwaii (formerly Queen Charlotte Islands, British Columbia, Canada) and southeast Alaska (USA, 9 grid points), (4) the Canadian Arctic Archipelago (22 grid points), and (5) Newfoundland (Canada; 1 grid point). - Climate. The gridded climatic data provided here are based on the 1961-1990 30-year mean values from the University of East Anglia (UK) Climatic Research Unit (CRU) CL 2.0 dataset (New and others, 2002), and include annual and monthly temperature and precipitation. The CRU CL 2.0 data were interpolated onto the approximately 25-km grid using geographically-weighted regression, incorporating local lapse-rate estimation and correction. Additional bioclimatic variables (growing degree days on a 5 degrees Celsius base, mean temperatures of the coldest and warmest months, and a moisture index calculated as actual evapotranspiration divided by potential evapotranspiration) were calculated using the interpolated CRU CL 2.0 data. Also included are absolute minimum and maximum temperatures for 1951-1980 interpolated in a similar fashion from climate-station data (WeatherDisc Associates, 1989). These climate and bioclimate data were used in Atlas volumes F and G (see Thompson and others, 2015, for a description of the methods used to create the gridded climate data). Note that for grid points with missing elevation values (i.e., elevation values equal to 9999), climate data were created using an elevation value of -120 meters. Users may want to exclude these climate data from their analyses (see the Usage Notes section in the data release readme file). - Plant distributions. The gridded plant distribution data align with Atlas volume G (Thompson and others, 2015). Plant distribution data on the grid include 690 species, as well as 67 groups of related species and genera, and are based on U.S. Forest Service atlases (e.g., Little, 1971, 1976, 1977), regional atlases (e.g., Benson and Darrow, 1981), and new maps based on information available from herbaria and other online and published sources (for a list of sources, see Tables 3 and 4 in Thompson and others, 2015). See the "Notes" column in Table 1 (https://pubs.usgs.gov/pp/p1650-g/table1.html) and Table 2 (https://pubs.usgs.gov/pp/p1650-g/table2.html) in Thompson and others (2015) for important details regarding the species and grouped taxa distributions. - Ecoregions. The ecoregion gridded data are the same as in Atlas volumes D and E (Thompson and others, 2006, 2007), and include three different systems, Bailey's ecoregions (Bailey, 1997, 1998), WWF's ecoregions (Ricketts and others, 1999), and Kuchler's potential natural vegetation regions (Kuchler, 1985), that are each based on distinctive approaches to categorizing ecoregions. For the Bailey and WWF ecoregions for North America and the Kuchler potential natural vegetation regions for the contiguous United States (i.e.,
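The .csv-to-netCDF variable name mapping described above is mechanical, so a small sketch may help; the example input name is hypothetical.

    import re

    def csv_name_to_netcdf_short_name(name: str) -> str:
        """Apply the stated rules for netCDF short names: forward slashes
        become underscores; colons, periods, and commas are deleted."""
        return re.sub(r"[:.,]", "", name.replace("/", "_"))

    # Hypothetical .csv variable name, for illustration only:
    print(csv_name_to_netcdf_short_name("temp: Jan/Feb, mean"))  # -> "temp Jan_Feb mean"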
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the accompanying dataset to the following paper: https://www.nature.com/articles/s41597-023-01975-w
Caravan is an open community dataset of meteorological forcing data, catchment attributes, and discharge data for catchments around the world. Additionally, Caravan provides code to derive meteorological forcing data and catchment attributes from the same data sources in the cloud, making it easy for anyone to extend Caravan to new catchments. The vision of Caravan is to provide the foundation for a truly global open source community resource that will grow over time.
If you use Caravan in your research, please cite not only Caravan itself but also the source datasets, to pay respect to the amount of work that was put into the creation of these datasets and that made Caravan possible in the first place.
All current development and additional community extensions can be found at https://github.com/kratzert/Caravan
IMPORTANT: Due to size limitations for individual repositories, the netCDF version and the CSV version of Caravan (since Version 1.6) are split into two different repositories. You can find the netCDF version at https://zenodo.org/records/14673536
Change Log:
This dataset contains aerosol sub- and super-micron nss sulfate, MSA, ammonium, and other major ion measurements, made using 7-stage multi-jet cascade impactors, taken aboard the Ron Brown ship during the ACE-Asia field project. This dataset contains the netCDF data files. Data can also be downloaded in a comma-delimited (.csv) format. A README in PDF format accompanies the dataset when ordered.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data Files for Creel et al. 2022: "Postglacial Relative Sea Level Change in Norway" and Balascio et al. 2023: "Refining Holocene sea-level dynamics for the Lofoten and Vesterålen archipelagos, northern Norway: Implications for prehistoric human-environment interactions"
This repository contains the following files:
1. Netcdf and csv files for the mean (stehme_mean.nc, stehme_mean_ts.csv) and standard deviation (stehme_std.nc, stehme_std_ts.csv) of the spatiotemporal empirical hierarchical model ensemble (STEHME) produced for Creel et al. 2022. The 'ts' suffix denotes time series for each unique lat/lon site. The netcdf files contain spatial maps at 100 yr resolution.
2. mmc1.xlsx, the HOLSEA format Norway data compilation produced for Creel et al. 2022.
3. Netcdf and csv files for the mean (stehme_mean_230721.nc, stehme_mean_ts_230721.csv) and standard deviation (stehme_std_230721.nc, stehme_std_ts_230721.csv) of the spatiotemporal empirical hierarchical model ensemble (STEHME) produced for Balascio et al. 2023. The 'ts' suffix denotes time series for each unique lat/lon site. The netcdf files contain spatial maps at 100 yr resolution.
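A short sketch of reading these files in Python follows; the file names are taken from the list above, but the variable and column names inside them are not specified here and would need to be inspected.

    import pandas as pd
    import xarray as xr

    # Spatial maps at 100 yr resolution (ensemble mean and standard deviation).
    mean_maps = xr.open_dataset("stehme_mean.nc")
    std_maps = xr.open_dataset("stehme_std.nc")

    # Per-site time series ('ts' files), one series per unique lat/lon site.
    mean_ts = pd.read_csv("stehme_mean_ts.csv")

    print(mean_maps)       # inspect variable names and coordinates
    print(mean_ts.head())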
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
KNMI collects observations from the automatic weather stations situated in the Netherlands and BES islands on locations such as aerodromes, North Sea platforms and wind poles. This dataset provides metadata on these weather stations, such as location, name and type. The data in this dataset is formatted as NetCDF. It is also available as a CSV file in this dataset: https://dataplatform.knmi.nl/dataset/waarneemstations-csv-1-0.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Published empirical algorithms for oceanic total alkalinity (TA) and dissolved inorganic carbon (DIC) are used with monthly sea surface salinity (SSS) and temperature (SST) derived from satellite (SMOS, Aquarius, SST CCI) and interpolated in situ (CORA) measurements and climatological (WOA) ancillary data to produce monthly maps of TA and DIC at one-degree spatial resolution. Earth system model TA and DIC (HADGEM2-ES) are also included. Results are compared with in situ (GLODAPv2) TA and DIC and analysed in five regions (global, Greater Caribbean, Amazon plume, Amazon plume with in situ SSS < 35, and Bay of Bengal). Results are presented in three versions, denoted by 'X' in the lists below: using all available data (X = ''); excluding data with bathymetry < 500 m (X = 'Depth500'); excluding data with both bathymetry < 500 m and distance from nearest coast < 300 km (X = 'Depth500Dist300').

Datasets S1 to S5 are .csv lists of matchups in each region: date and location, in situ TA and DIC measurements and estimated uncertainties, all input datasets, estimates of TA and DIC from all outputs, and the best available output estimates of TA and DIC for each matchup.
S1_GlobalAlgorithmMatchupsX.csv
S2_GreaterCaribbeanAlgorithmMatchupsX.csv
S3_AmazonPlumeAlgorithmMatchupsX.csv
S4_AmazonPlumeLowSAlgorithmMatchupsX.csv
S5_BayOfBengalAlgorithmMatchupsX.csv

Datasets S6 to S10 are .csv statistical analyses of the performance of each combination of algorithm and input data: carbonate system variable; algorithm; input datasets used; (MAD, RMSD using all available data, output score, RMSD estimated from output score, output and in situ mean and standard deviation, correlation coefficient), all items in brackets presented both unweighted and weighted; number of matchups; number of potential matchups; matchup coverage; RMSD after subtraction of linear regression; percentage reduction in RMSD due to subtraction of linear regression; and weighted score divided by number of matchups.
S6_GlobalAlgorithmScoresX.csv
S7_GreaterCaribbeanAlgorithmScoresX.csv
S8_AmazonPlumeAlgorithmScoresX.csv
S9_AmazonPlumeLowSAlgorithmScoresX.csv
S10_BayOfBengalAlgorithmScoresX.csv

Datasets S11 to S15 are zipped netCDF files containing error analyses of all outputs in each region, including the squared error of each output at each matchup, the weight of each squared error (1/squared uncertainty), weight x squared error, number of matchups available to each output, number of matchups available to each combination of two outputs, and (score of each output in a given comparison of two outputs, overall output score, and RMSD estimated from output score), all items in the last brackets presented both unweighted and weighted.
S11_GlobalSquaredErrorsX.nc
S12_GreaterCaribbeanSquaredErrorsX.nc
S13_AmazonPlumeSquaredErrorsX.nc
S14_AmazonPlumeLowSSquaredErrorsX.nc
S15_BayOfBengalSquaredErrorsX.nc

Datasets S16 to S20 are zipped netCDF files containing global maps of the mean and standard deviation of each of: in situ data; output data; output data minus in situ data; and number of matchups. Regional files show the same maps, but only include data within the region.
S16_GlobalmapsX.nc
S17_GreaterCaribbeanmapsX.nc
S18_AmazonPlumemapsX.nc
S19_AmazonPlumeLowSmapsX.nc
S20_BayOfBengalmapsX.nc

Datasets S21 and S22 are .csv files containing the effect on estimated RMSD of excluding various combinations of algorithms and/or inputs for TA and DIC in each region.
For a given variable and region, the first line shows the algorithm, input data sources, estimated RMSD and bias of the output with lowest estimated RMSD. Subsequent lines show the effect of excluding combinations of algorithms and/or inputs, ordered first by the number of algorithms/inputs excluded (fewest first), then by effect on lowest estimated RMSD. So the first line(s) consist of the effects of excluding the best algorithm and each of the input sources to that algorithm, most important first. Each line consists of the item excluded, ratio of resulting estimated RMSD to original estimated RMSD, resulting bias and number of items excluded. Some exclusions are equivalent, for instance exclusion of WOA nitrate (the only nitrate source) is equivalent to excluding all algorithms using nitrate. Dataset S21 contains a comprehensive list of all possible exclusions, and so is rather hard to read and interpret. To mitigate this, Dataset S22 contains only those exclusion sets with effect greater than 1% and at least 0.1% greater than any subset of its exclusions. S21_importancesX.csv S22_importances2X.csv Dataset S23 is a .csv file containing like-for-like comparisons of RMSD between TA and DIC in each region. Bear in mind that the RMSD shown here is not the same as the estimated RMSD (RMSDe) shown elsewhere. S23_TA_DICcomparisonX.csv
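To make the weighting scheme concrete (weights are 1/squared uncertainty, as in Datasets S11 to S15), here is a sketch of a weighted RMSD computed from a matchup file; the column names are assumptions and must be checked against the actual CSV headers.

    import numpy as np
    import pandas as pd

    # X = '' version of the global matchups; column names are assumptions.
    df = pd.read_csv("S1_GlobalAlgorithmMatchups.csv")

    sq_err = (df["TA_est"] - df["TA_insitu"]) ** 2   # squared error per matchup
    weights = 1.0 / df["TA_uncertainty"] ** 2        # weight = 1 / squared uncertainty
    weighted_rmsd = np.sqrt((weights * sq_err).sum() / weights.sum())
    print(weighted_rmsd)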
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Zenodo repository contains all migration flow estimates associated with the paper "Deep learning four decades of human migration." Evaluation code, training data, trained neural networks, and smaller flow datasets are available in the main GitHub repository, which also provides detailed instructions on data sourcing. Due to file size limits, the larger datasets are archived here.
Data is available in both NetCDF (.nc) and CSV (.csv) formats. The NetCDF format is more compact and pre-indexed, making it suitable for large files. In Python, datasets can be opened as xarray.Dataset objects, enabling coordinate-based data selection.
Each dataset uses the following coordinate conventions:
The following data files are provided:
T
summed over Birth ISO). Dimensions: Year, Origin ISO, Destination ISO

Additionally, two CSV files are provided for convenience:

imm: Total immigration flows
emi: Total emigration flows
net: Net migration
imm_pop: Total immigrant population (non-native-born)
emi_pop: Total emigrant population (living abroad)
mig_prev: Total origin-destination flows
mig_brth: Total birth-destination flows, where Origin ISO reflects place of birth

Each dataset includes a mean variable (mean estimate) and a std variable (standard deviation of the estimate).

An ISO3 conversion table is also provided.
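As a sketch of the coordinate-based selection mentioned above, assuming the mig_prev data is in a file named mig_prev.nc and that its coordinate labels match the dimension names listed for it:

    import xarray as xr

    # File name and coordinate labels are assumptions based on the listing above.
    ds = xr.open_dataset("mig_prev.nc")

    # Mean estimate and standard deviation for one origin-destination pair.
    pair = {"Origin ISO": "MEX", "Destination ISO": "USA"}
    flow = ds["mean"].sel(pair)
    uncertainty = ds["std"].sel(pair)
    print(flow.values, uncertainty.values)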
Surface meteorological data at five minute temporal resolution from the weather stations that comprise the New York State Mesonet. Data are available in either NetCDF or CSV (comma-delimited ASCII) format for the Investigation of Microphysics and Precipitation for Atlantic Coast-Threatening Snowstorms 2020 (IMPACTS 2020) campaign.
Data and code for "Spatial and temporal expansion of global wildland fire activity in response to climate change" by Martin Senande-Rivera, Damian Insua-Costa and Gonzalo Miguez-Macho. Data formats: NetCDF and CSV. Code: Python v3.8.