This file contains the data set used to develop a random forest model to predict background specific conductivity (SC) for stream segments in the contiguous United States. This Excel-readable file contains 56 columns of parameters evaluated during development. The data dictionary provides the definitions of the abbreviations and the measurement units. Each row is a unique sample with an identifier of the form R**, which combines the NHD Hydrologic Unit, an underscore, a COMID of up to 7 digits, an underscore, and the sequential sample month.
To develop models that make stream-specific predictions across the contiguous United States, we used the StreamCat data set and process (Hill et al. 2016; https://github.com/USEPA/StreamCat). The StreamCat data set is based on a network of stream segments from NHD+ (McKay et al. 2012). These stream segments drain an average area of 3.1 km² and thus define the spatial grain size of this data set. The data set consists of minimally disturbed sites representing the natural variation in environmental conditions that occur in the contiguous 48 United States. More than 2.4 million SC observations were obtained from STORET (USEPA 2016b), state natural resource agencies, the U.S. Geological Survey (USGS) National Water Information System (NWIS) (USGS 2016), and data used in Olson and Hawkins (2012) (Table S1). Data include observations made between 1 January 2001 and 31 December 2015 and are thus coincident with Moderate Resolution Imaging Spectroradiometer (MODIS) satellite data (https://modis.gsfc.nasa.gov/data/). Each observation was related to the nearest stream segment in the NHD+. Data were limited to one observation per stream segment per month. SC observations with ambiguous locations and repeat measurements along a stream segment in the same month were discarded.
Using estimates of anthropogenic stress derived from the StreamCat database (Hill et al. 2016), segments were selected with minimal amounts of human activity (Stoddard et al. 2006) using criteria developed for each Level II Ecoregion (Omernik and Griffith 2014). Segments were considered potentially minimally stressed where watersheds had 0 to 0.5% impervious surface, 0 to 5% urban land, 0 to 10% agriculture, and population densities from 0.8 to 30 people/km² (Table S3). Watersheds whose observations had large residuals in initial models were identified and inspected for evidence of other human activities not represented in StreamCat (e.g., mining, logging, grazing, or oil/gas extraction). Observations were removed from watersheds that were disturbed, had a tidal influence, or had unusual geologic conditions such as hot springs. About 5% of SC observations in each National Rivers and Streams Assessment (NRSA) region were then randomly selected as independent validation data. The remaining observations became the large training data set for model calibration.
This dataset is associated with the following publication: Olson, J., and S. Cormier. Modeling spatial and temporal variation in natural background specific conductivity. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 53(8): 4316-4325, (2019).
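The screening and validation split described in this record can be sketched as follows. This is only an illustration under stated assumptions: the ecoregion names, the limit values in `LIMITS`, and the dictionary keys are hypothetical (the actual per-ecoregion criteria are in Table S3 of the publication), and the random draw is not the authors' procedure.

```python
import random
from collections import defaultdict

# Hypothetical upper limits per Level II Ecoregion. The real values are in
# Table S3; these only fall inside the ranges quoted in the description.
LIMITS = {
    "ecoregion_A": {"impervious": 0.5, "urban": 5.0, "agriculture": 10.0, "pop_density": 30.0},
    "ecoregion_B": {"impervious": 0.1, "urban": 1.0, "agriculture": 2.0, "pop_density": 0.8},
}


def is_candidate(watershed):
    """Return True if every stressor is at or below its ecoregion limit."""
    limits = LIMITS[watershed["ecoregion"]]
    return all(watershed[name] <= limit for name, limit in limits.items())


def split_by_region(observations, fraction=0.05, seed=0):
    """Randomly hold out about `fraction` of the observations in each region."""
    rng = random.Random(seed)
    by_region = defaultdict(list)
    for obs in observations:
        by_region[obs["region"]].append(obs)

    training, validation = [], []
    for region in sorted(by_region):
        group = list(by_region[region])
        rng.shuffle(group)
        n_validation = round(len(group) * fraction)
        validation.extend(group[:n_validation])
        training.extend(group[n_validation:])
    return training, validation
```

Splitting within each region, rather than across the pooled data, keeps every region represented in the validation set in proportion to its sample size.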
https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/licence-to-use-E-OBS-products/licence-to-use-E-OBS-products_22c02baab8ecc1c91abb598affb74f18bc69724559cfbe20b4e9155774c12d78.pdf
E-OBS is a daily gridded land-only observational dataset over Europe. The blended time series from the station network of the European Climate Assessment & Dataset (ECA&D) project form the basis for the E-OBS gridded dataset. All station data are sourced directly from the European National Meteorological and Hydrological Services (NMHSs) or other data-holding institutions. For a considerable number of countries, the stations used make up the complete national network, which is much denser than the station network routinely shared among NMHSs (the basis of other gridded datasets). The density of stations gradually increases through collaborations with NMHSs within European research contracts. Initially, in 2008, this gridded dataset was developed to provide validation for the suite of Europe-wide climate model simulations produced as part of the European Union ENSEMBLES project. While E-OBS remains an important dataset for model validation, it is also used more generally for monitoring the climate across Europe, particularly with regard to the assessment of the magnitude and frequency of daily extremes. The position of E-OBS is unique in Europe because of its relatively fine horizontal grid spacing, the daily resolution of the dataset, the provision of multiple variables and the length of the dataset. Finally, the station data on which E-OBS is based are available through the ECA&D webpages (where the owner of the data has given permission to do so). In these respects it contrasts with other datasets. The dataset is daily, meaning the observations cover 24 hours per time step. The exact 24-hour period can differ per region, because some data providers measure from midnight to midnight while others measure from morning to morning. Since E-OBS is an observational dataset, no attempts have been made to adjust time series for this 24-hour offset. Where known, care is taken that the largest part of the measured 24-hour period corresponds to the day attached to the time step in E-OBS (and ECA&D).
This dataset includes spatially aggregated records of measurements and observations from public and private organizations across the Upper Missouri River Basin. For this dataset the Upper Missouri River Basin is defined as Hydrologic Unit Codes 1002-1013, and includes portions of the states of Montana, Wyoming, North Dakota, and South Dakota. Streamflow observations, defined in this dataset as the identification of flowing, dry, or pooled streamflow conditions, are an essential part of understanding the relationship between streamflow permanence and climatic and physical factors. For the purpose of this investigation, all streamflow observations were identified as perennial, non-perennial, or pooled to be used in the PROSPER (PRObability of Streamflow PERmanence) model.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This fileset provides supporting data and corpora for the empirical study described in: Laura Miron, Rafael S. Goncalves and Mark A. Musen. Obstacles to the Reuse of Metadata in ClinicalTrials.gov
Description of files
Original data files:
- AllPublicXml.zip contains the set of all public XML records in ClinicalTrials.gov (protocols and summary results information), on which all remaining analyses are based. The set contains 302,091 records downloaded on April 3, 2019.
- public.xsd is the XML schema downloaded from ClinicalTrials.gov on April 3, 2019, used to validate records in AllPublicXML.
BioPortal API Query Results:
- condition_matches.csv contains the results of querying the BioPortal API for all ontology terms that are an 'exact match' to each condition string scraped from the ClinicalTrials.gov XML. Columns={filename, condition, url, bioportal term, cuis, tuis}.
- intervention_matches.csv contains BioPortal API query results for all interventions scraped from the ClinicalTrials.gov XML. Columns={filename, intervention, url, bioportal term, cuis, tuis}.
Data Element Definitions:
- supplementary_table_1.xlsx maps element names, element types, and whether elements are required across the ClinicalTrials.gov data dictionaries, the ClinicalTrials.gov XML schema declaration for records (public.XSD), the Protocol Registration System (PRS), FDAAA801, and the WHO required data elements for clinical trial registrations.
Column and value definitions:
  - CT.gov Data Dictionary Section: Section heading for a group of data elements in the ClinicalTrials.gov data dictionary (https://prsinfo.clinicaltrials.gov/definitions.html)
  - CT.gov Data Dictionary Element Name: Name of an element/field according to the ClinicalTrials.gov data dictionaries (https://prsinfo.clinicaltrials.gov/definitions.html) and (https://prsinfo.clinicaltrials.gov/expanded_access_definitions.html)
  - CT.gov Data Dictionary Element Type: "Data" if the element is a field for which the user provides a value, "Group Heading" if the element is a group heading for several sub-fields, but is not in itself associated with a user-provided value.
  - Required for CT.gov for Interventional Records: "Required" if the element is required for interventional records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to interventional records (only observational or expanded access)
  - Required for CT.gov for Observational Records: "Required" if the element is required for observational records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to observational records (only interventional or expanded access)
  - Required in CT.gov for Expanded Access Records?: "Required" if the element is required for expanded access records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to expanded access records (only interventional or observational)
  - CT.gov XSD Element Definition: abbreviated xpath to the corresponding element in the ClinicalTrials.gov XSD (public.XSD). The full xpath includes 'clinical_study/' as a prefix to every element. (There is a single top-level element called "clinical_study" for all other elements.)
  - Required in XSD?: "Yes" if the element is required according to public.XSD, "No" if the element is optional, "-" if the element is not made public or included in the XSD
  - Type in XSD: "text" if the XSD type was "xs:string" or "textblock", name of enum given if type was enum, "integer" if type was "xs:integer" or "xs:integer" extended with the "type" attribute, "struct" if the type was a struct defined in the XSD
  - PRS Element Name: Name of the corresponding entry field in the PRS system
  - PRS Entry Type: Entry type in the PRS system. This column contains some free text explanations/observations
  - FDAAA801 Final Rule Field Name: Name of the corresponding required field in the FDAAA801 Final Rule (https://www.federalregister.gov/documents/2016/09/21/2016-22129/clinical-trials-registration-and-results-information-submission). This column contains many empty values where elements in ClinicalTrials.gov do not correspond to a field required by the FDA
  - WHO Field Name: Name of the corresponding field required by the WHO Trial Registration Data Set (v 1.3.1) (https://prsinfo.clinicaltrials.gov/trainTrainer/WHO-ICMJE-ClinTrialsgov-Cross-Ref.pdf)
Analytical Results:
- EC_human_review.csv contains the results of a manual review of a random sample of eligibility criteria from 400 CT.gov records. The table gives filename, criteria, and whether manual review determined the criteria to contain criteria for "multiple subgroups" of participants.
- completeness.xlsx contains counts and percentages of interventional records missing fields required by FDAAA801 and its Final Rule.
- industry_completeness.xlsx contains percentages of interventional records missing required fields, broken up by agency class of the trial's lead sponsor ("NIH", "US Fed", "Industry", or "Other"), and before and after the effective date of the Final Rule.
- location_completeness.xlsx contains percentages of interventional records missing required fields, broken up by whether the record listed at least one location in the United States or only international locations (excluding trials with no listed location), and before and after the effective date of the Final Rule.
Intermediate Results:
- cache.zip contains pickle and csv files of pandas dataframes with values scraped from the XML records in AllPublicXML. Downloading these files greatly speeds up running analysis steps from the Jupyter notebooks in our GitHub repository.
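The relationship between the abbreviated xpaths in the "CT.gov XSD Element Definition" column and the full xpaths can be sketched with Python's standard `xml.etree.ElementTree`. The sample record, its values, and the helper names `full_xpath` and `values_for` are made up for illustration and are not taken from AllPublicXML or from the authors' notebooks.

```python
import xml.etree.ElementTree as ET

# A made-up, heavily trimmed record; real records contain many more elements.
SAMPLE_RECORD = """<clinical_study>
  <id_info><nct_id>NCT00000000</nct_id></id_info>
  <condition>Asthma</condition>
  <condition>Allergy</condition>
</clinical_study>"""


def full_xpath(abbreviated):
    """Prefix an abbreviated xpath with the single top-level element."""
    return "clinical_study/" + abbreviated


def values_for(root, abbreviated):
    """Collect the text of every element matching an abbreviated xpath.

    `root` is the clinical_study element itself, so the abbreviated path
    can be used directly as a path relative to it.
    """
    return [element.text for element in root.findall(abbreviated)]


root = ET.fromstring(SAMPLE_RECORD)
conditions = values_for(root, "condition")
nct_ids = values_for(root, "id_info/nct_id")
```

Because ElementTree evaluates paths relative to the element they are called on, the abbreviated form is what a parser rooted at `clinical_study` needs, while the full form is what a tool working from the document node would use.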
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
November 2022 Version
This dataset represents the "Observed Distribution" for coho salmon in California by using observations made only between 1990 and the present. It was developed for the express purpose of assisting with species recovery planning efforts. The process for developing this dataset was to collect as many observations of the species as possible and derive the stream-based geographic distribution for the species based solely on these positive observations.
For the purpose of this dataset an observation is defined as a report of a sighting or other evidence of the presence of the species at a given place and time. As such, observations are modeled by year observed as point locations in the GIS. All such observations were collected with information regarding who reported the observation, their agency/organization/affiliation, the date that they observed the species, who compiled the information, etc. This information is maintained in the developer's file geodatabase (©Environmental Systems Research Institute (ESRI) 2016).
To develop this distribution dataset, the species observations were applied to California Streams, a CDFW derivative of USGS National Hydrography Dataset (NHD) High Resolution hydrography. For each observation, a path was traced down the hydrography from the point of observation to the ocean, thereby deriving the shortest migration route from the point of observation to the sea. By appending all of these migration paths together, the "Observed Distribution" for the species is developed.
It is important to note that this layer does not attempt to model the entire possible distribution of the species. Rather, it only represents the known distribution based on where the species has been observed and reported. While some observations indeed represent the upstream extent of the species (e.g., an observation made at a hard barrier), the majority of observations only indicate where the species was sampled for or otherwise observed. Because of this, this dataset likely underestimates the absolute geographic distribution of the species.
It is also important to note that the species may not be found on an annual basis in all indicated reaches due to natural variations in run size, water conditions, and other environmental factors. As such, the information in this dataset should not be used to verify that the species is currently present in a given stream. Conversely, the absence of distribution linework for a given stream does not necessarily indicate that the species does not occur in that stream.
The observation data were compiled from a variety of disparate sources including but not limited to CDFW, USFS, NMFS, timber companies, and the public. Forms of documentation include CDFW administrative reports, personal communications with biologists, observation reports, and literature reviews. The source of each feature (to the best available knowledge) is included in the data attributes for the observations in the geodatabase, but not for the resulting linework. The spatial data has been referenced to California Streams, a CDFW derivative of USGS National Hydrography Dataset (NHD) High Resolution hydrography.
Usage of this dataset:
Examples of appropriate uses include:
- Species recovery planning
- Evaluation of future survey sites for the species
- Validating species distribution models
Examples of inappropriate uses include:
- Assuming absence of a line feature means that the species is not present in that stream.
- Using this data to make parcel or ground level land use management decisions.
- Using this dataset to prove or support non-existence of the species at any spatial scale.
- Assuming that the line feature represents the maximum possible extent of species distribution.
All users of this data should seek the assistance of qualified professionals such as surveyors, hydrologists, or fishery biologists as needed to ensure that such users possess complete, precise, and up to date information on species distribution and water body location.
Any copy of this dataset is considered to be a snapshot of the species distribution at the time of release. It is incumbent upon the user to ensure that they have the most recent version prior to making management or planning decisions.
Please refer to "Use Constraints" section below.
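The derivation described in this record, tracing each observation downstream to the ocean and appending the paths, can be sketched as below. This is a simplified illustration rather than the CDFW workflow: it assumes every stream segment has exactly one downstream neighbour (so the route to the sea is unique), and the segment names and the `downstream` mapping are hypothetical.

```python
def trace_to_sea(segment, downstream):
    """Follow a segment downstream until the network ends at the ocean.

    `downstream` maps each segment id to the next segment downstream,
    or to None where the segment drains to the ocean.
    """
    path = []
    visited = set()
    while segment is not None and segment not in visited:
        path.append(segment)
        visited.add(segment)  # guards against accidental loops in the data
        segment = downstream.get(segment)
    return path


def observed_distribution(observed_segments, downstream):
    """Union of the downstream paths from every observed segment."""
    distribution = set()
    for segment in observed_segments:
        distribution.update(trace_to_sea(segment, downstream))
    return distribution


# Hypothetical network: three tributaries feeding a two-segment main stem.
network = {
    "trib_a": "main_2",
    "trib_c": "main_2",
    "main_2": "main_1",
    "trib_b": "main_1",
    "main_1": None,
}
reaches = observed_distribution(["trib_a"], network)
```

In this toy network `trib_b` and `trib_c` are left out because nothing was observed on them, which mirrors the caveat that missing linework does not indicate absence of the species.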
https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/cc-by/cc-by_f24dc630aa52ab8c52a0ac85c03bc35e0abc850b4d7453bdc083535b41d5a5c3.pdf
ERA5 is the fifth generation ECMWF reanalysis for the global climate and weather for the past 8 decades. Data is available from 1940 onwards. ERA5 replaces the ERA-Interim reanalysis. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. This principle, called data assimilation, is based on the method used by numerical weather prediction centres, where every so many hours (12 hours at ECMWF) a previous forecast is combined with newly available observations in an optimal way to produce a new best estimate of the state of the atmosphere, called analysis, from which an updated, improved forecast is issued. Reanalysis works in the same way, but at reduced resolution to allow for the provision of a dataset spanning back several decades. Reanalysis does not have the constraint of issuing timely forecasts, so there is more time to collect observations, and when going further back in time, to allow for the ingestion of improved versions of the original observations, which all benefit the quality of the reanalysis product. This catalogue entry provides post-processed ERA5 hourly single-level data aggregated to daily time steps. In addition to the data selection options found on the hourly page, the following options can be selected for the daily statistic calculation:
- The daily aggregation statistic (daily mean, daily max, daily min, daily sum*)
- The sub-daily frequency sampling of the original data (1 hour, 3 hours, 6 hours)
- The option to shift to any local time zone, expressed as an offset from UTC (no shift means the statistic is computed from UTC+00:00)
*The daily sum is only available for the accumulated variables (see ERA5 documentation for more details). Users should be aware that the daily aggregation is calculated during the retrieval process and is not part of a permanently archived dataset. For more details on how the daily statistics are calculated, including demonstrative code, please see the documentation. For more details on the hourly data used to calculate the daily statistics, please refer to the ERA5 hourly single-level data catalogue entry and the documentation found therein.
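The three options above can be illustrated with a small standard-library sketch. It is not the code used by the retrieval service (the catalogue's own documentation covers that); the function name `daily_statistic`, its parameters, and the choice to sub-sample on the UTC hour are assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def daily_statistic(hourly, statistic="mean", frequency_hours=1, utc_offset_hours=0):
    """Aggregate (UTC datetime, value) pairs into one value per day.

    statistic        -- "mean", "max", "min" or "sum"
    frequency_hours  -- keep only every Nth UTC hour (1, 3 or 6)
    utc_offset_hours -- shift applied before assigning values to a day
    """
    reducers = {
        "mean": lambda values: sum(values) / len(values),
        "max": max,
        "min": min,
        "sum": sum,
    }
    reducer = reducers[statistic]
    shift = timedelta(hours=utc_offset_hours)

    values_by_day = defaultdict(list)
    for timestamp, value in hourly:
        if timestamp.hour % frequency_hours != 0:
            continue  # sub-daily sampling: skip hours outside the chosen frequency
        values_by_day[(timestamp + shift).date()].append(value)

    return {day: reducer(values) for day, values in sorted(values_by_day.items())}


# Two days of synthetic hourly values 0..47, starting at 2020-01-01 00:00 UTC.
start = datetime(2020, 1, 1)
series = [(start + timedelta(hours=h), float(h)) for h in range(48)]
daily_mean_utc = daily_statistic(series)
daily_mean_shifted = daily_statistic(series, utc_offset_hours=1)
```

With a one-hour shift, the value stamped 23:00 UTC moves into the following local day, so the first and last days of a shifted request are built from fewer than 24 values.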
http://publications.europa.eu/resource/authority/licence/CC_BY_4_0
This document contains a selection of standard terms and definitions relevant to the quality assurance of Essential Climate Variable (ECV) data records. It reproduces appropriate terms and definitions published by standardization bodies, mainly by BIPM/JCGM/ISO in their International Vocabulary of Metrology (VIM) and Guide to the Expression of Uncertainty in Measurement (GUM). It also reproduces selected terms and definitions related to the quality assurance and validation of Earth Observation (EO) data, available publicly on the ISO website and on the Cal/Val portal of the Committee on Earth Observation Satellites (CEOS).
Several of those terms have been recommended by CEOS in the GEO-CEOS Quality Assurance framework for Earth Observation (QA4EO) and, as such, are applicable to virtually all Copernicus data sets of EO origin. Terms and definitions are expected to evolve as standardization organisations regularly update their standards.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Number of observations used after 2 minutes (expressed in number of observations per second).
This table records high-level information for each Swift observation and provides access to the data archive. Each record is associated with a single observation that contains data from all instruments on board Swift. The BAT is the large field-of-view instrument and operates in the 10-300 keV energy band. The narrow field instruments, XRT and UVOT, operate in the X-ray and UV/optical regimes, respectively. An observation is defined as a collection of snapshots, where a snapshot is defined as the time spent observing the same position continuously. Because of observing constraints, the length of a snapshot can be shorter than a single orbit, and it can be interrupted because the satellite will point in a different direction of the sky or because the time allocated to that observation ends. The typical Swift observing strategy for a Gamma Ray Burst (GRB) and/or afterglow consists of a series of observations aimed at following the GRB and its afterglow evolution. This strategy is achieved with two different types of observations, named Automatic Targets and Pre-Planned Targets. The Automatic Target is initiated on board soon after an event is triggered by the BAT. The Figure of Merit (FOM) algorithm, part of the observatory's autonomy, decides if it is worth requesting a slew maneuver to point the narrow field instruments (NFI) on Swift, XRT and UVOT, in the direction of the trigger. If the conditions to slew to the new position are satisfied, the Automatic Target observation takes place; all the instruments have a pre-set standard configuration of operating modes and filters, and about 20000 seconds on source will be collected. The Pre-Planned Target observations are instead initiated from the ground once the trigger is known. These observations are planned on the ground and uploaded onto the spacecraft. This database table is generated at the Swift processing site. During operation, it is updated on a daily basis. This is a service provided by NASA HEASARC.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Coho Distribution [ds326]’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/b1cc7bc9-0960-4008-a7e6-ffbae224a88e on 27 January 2022.
--- Dataset description provided by original source is as follows ---
June 2016 Version
This dataset represents the "Observed Distribution" for coho salmon in California by using observations made only between 1990 and the present. It was developed for the express purpose of assisting with species recovery planning efforts. The process for developing this dataset was to collect as many observations of the species as possible and derive the stream-based geographic distribution for the species based solely on these positive observations.
For the purpose of this dataset an observation is defined as a report of a sighting or other evidence of the presence of the species at a given place and time. As such, observations are modeled by year observed as point locations in the GIS. All such observations were collected with information regarding who reported the observation, their agency/organization/affiliation, the date that they observed the species, who compiled the information, etc. This information is maintained in the developer's file geodatabase (©Environmental Systems Research Institute (ESRI) 2016).
To develop this distribution dataset, the species observations were applied to California Streams, a CDFW derivative of USGS National Hydrography Dataset (NHD) High Resolution hydrography. For each observation, a path was traced down the hydrography from the point of observation to the ocean, thereby deriving the shortest migration route from the point of observation to the sea. By appending all of these migration paths together, the "Observed Distribution" for the species is developed.
It is important to note that this layer does not attempt to model the entire possible distribution of the species. Rather, it only represents the known distribution based on where the species has been observed and reported. While some observations indeed represent the upstream extent of the species (e.g., an observation made at a hard barrier), the majority of observations only indicate where the species was sampled for or otherwise observed. Because of this, this dataset likely underestimates the absolute geographic distribution of the species.
It is also important to note that the species may not be found on an annual basis in all indicated reaches due to natural variations in run size, water conditions, and other environmental factors. As such, the information in this dataset should not be used to verify that the species is currently present in a given stream. Conversely, the absence of distribution linework for a given stream does not necessarily indicate that the species does not occur in that stream.
The observation data were compiled from a variety of disparate sources including but not limited to CDFW, USFS, NMFS, timber companies, and the public. Forms of documentation include CDFW administrative reports, personal communications with biologists, observation reports, and literature reviews. The source of each feature (to the best available knowledge) is included in the data attributes for the observations in the geodatabase, but not for the resulting linework. The spatial data has been referenced to California Streams, a CDFW derivative of USGS National Hydrography Dataset (NHD) High Resolution hydrography.
Usage of this dataset:
Examples of appropriate uses include:
- Species recovery planning
- Evaluation of future survey sites for the species
- Validating species distribution models
Examples of inappropriate uses include:
- Assuming absence of a line feature means that the species is not present in that stream.
- Using this data to make parcel or ground level land use management decisions.
- Using this dataset to prove or support non-existence of the species at any spatial scale.
- Assuming that the line feature represents the maximum possible extent of species distribution.
All users of this data should seek the assistance of qualified professionals such as surveyors, hydrologists, or fishery biologists as needed to ensure that such users possess complete, precise, and up to date information on species distribution and water body location.
Any copy of this dataset is considered to be a snapshot of the species distribution at the time of release. It is incumbent upon the user to ensure that they have the most recent version prior to making management or planning decisions.
Please refer to "Use Constraints" section below.
--- Original source retains full ownership of the source dataset ---
Visibility viewsheds incorporate influences of distance from observer, object size and limits of human visual acuity to define the degree of visibility as a probability between 0 and 1. Average visibility viewsheds represent the average visibility value across all visibility viewsheds, thus representing a middle scenario relative to maximum and minimum visibility viewsheds. Average visibility viewsheds can be used as a potential resource conflict screening tool as it relates to the Great Plains Wind Energy Programmatic Environmental Impact Statement. Data include binary and composite viewsheds, and average, maximum, minimum, and composite visibility viewsheds for the NPS unit. Viewsheds have been derived using a 30 m National Elevation Dataset (NED) digital elevation model. Additional viewshed parameters: Observer height (offset A) was set at 2 meters. A vertical development object height (offset B) was set at 110 meters, representing an average wind tower and associated blade height.
A binary viewshed (1 visible, 0 not visible) was created for the defined NPS unit-specific Key Observation Points (KOP). A composite viewshed is the visibility of multiple viewsheds combined into one. A visible value in a composite viewshed implies that, across all the combined binary viewsheds (one per key observation point across the NPS unit in this case), at least one of the sample points is visible. On a cell-by-cell basis throughout the study area of interest, the number of visible sample points is recorded in the composite viewshed. Composite viewsheds are a quick way to synthesize multiple viewsheds into one layer, thus giving an efficient and cursory overview of potential visual resource effects.
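The cell-by-cell counting described above can be sketched as follows. This is a minimal illustration only: the tiny 2 x 2 grids and the names `composite_viewshed`, `kop_a`, and `kop_b` are hypothetical and are not part of the dataset or of the GIS workflow actually used.

```python
def composite_viewshed(binary_viewsheds):
    """Sum binary viewsheds (1 visible, 0 not visible) cell by cell.

    Each viewshed is a list of rows; all viewsheds must share one shape.
    The result holds, per cell, the number of observation points from
    which that cell is visible.
    """
    rows = len(binary_viewsheds[0])
    cols = len(binary_viewsheds[0][0])
    return [
        [sum(v[r][c] for v in binary_viewsheds) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical binary viewsheds for two key observation points.
kop_a = [[1, 0],
         [0, 0]]
kop_b = [[1, 1],
         [0, 0]]

composite = composite_viewshed([kop_a, kop_b])

# A cell counts as "visible" in the composite if at least one point sees it.
visible_anywhere = [[int(count > 0) for count in row] for row in composite]
```

Here `composite` records the count of visible points per cell, while `visible_anywhere` reduces it to the "at least one point is visible" reading.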
To summarize visibility viewsheds across numerous viewsheds (e.g., multiple viewsheds per high priority segment), three visibility scenario summary viewsheds have been derived: 1) A maximum visibility scenario is evaluated using a "Products" visibility viewshed, which represents the probability that all sample points are visible. Maximum visibility viewsheds are derived by multiplying the probability values of the individual visibility viewsheds. 2) A minimum visibility scenario is assessed using a "Fuzzy sum" visibility viewshed. Minimum visibility viewsheds represent the probability that at least one sample point is visible, and are derived by calculating the fuzzy sum across the probability values of the individual visibility viewsheds. 3) Lastly, an average visibility scenario is created from an "Average" visibility calculation. Average visibility viewsheds represent the average visibility value across all visibility viewsheds, thus representing a middle scenario relative to the aforementioned maximum and minimum visibility viewsheds. Equations for the maximum, average and minimum visibility viewsheds, where p1, p2, ..., pn are the probability values of the n visibility viewsheds, are defined below: Maximum Visibility: Products Visibility = p1 * p2 * ... * pn; Average Visibility: Average Visibility = (p1 + p2 + ... + pn) / n; and Minimum Visibility: Fuzzy Sum Visibility = 1 - ((1 - p1) * (1 - p2) * ... * (1 - pn)).
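As a worked illustration of the three summary calculations for a single cell, the sketch below takes the average to be the arithmetic mean of the probabilities (an assumption on our part). The function names and the sample probabilities are hypothetical and do not come from the dataset.

```python
from math import prod


def products_visibility(probabilities):
    """Probability that all sample points are visible (product of p_i)."""
    return prod(probabilities)


def average_visibility(probabilities):
    """Arithmetic mean of the visibility probabilities."""
    return sum(probabilities) / len(probabilities)


def fuzzy_sum_visibility(probabilities):
    """Probability that at least one sample point is visible."""
    return 1 - prod(1 - p for p in probabilities)


# Hypothetical visibility probabilities for one cell, from three viewsheds.
cell_probabilities = [0.9, 0.5, 0.2]

product_value = products_visibility(cell_probabilities)     # about 0.09
average_value = average_visibility(cell_probabilities)      # about 0.533
fuzzy_sum_value = fuzzy_sum_visibility(cell_probabilities)  # about 0.96
```

For any set of probabilities between 0 and 1, the product is the smallest of the three values and the fuzzy sum the largest, with the average in between.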
Moving beyond a simplistic binary viewshed approach, visibility viewsheds define the degree of visibility as a probability between 0 and 1. Visibility viewsheds incorporate the influences of distance from observer, object size (solar energy towers, troughs, panels, etc.) and limits of human visual acuity to derive a fuzzy membership value. A fuzzy membership value is a probability of visibility ranging between 0 and 1, where a value of one implies that the object would be easily visible under most conditions and for most viewers, while a lower value represents reduced visibility. Visibility viewshed calculation is performed using the modified fuzzy viewshed equations (Ogburn D.E. 2006). Visibility viewsheds have been defined using: a foreground distance (b1) of 1 km, a visual arc threshold value of 1 minute (limit of 20/20 vision), which is used in the object width multiplier calculation, and an object width value of 10 meters.
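To give a feel for the scale implied by these parameters, the sketch below computes the distance at which an object 10 meters wide subtends a visual arc of 1 minute. This is plain geometry offered as an illustration, not the modified fuzzy viewshed equations themselves; the function name is hypothetical, and how the threshold enters the object width multiplier is not reproduced here.

```python
import math


def distance_at_visual_arc(object_width_m, arc_minutes):
    """Distance (m) at which an object of the given width subtends the arc.

    Uses the exact relation: angle = 2 * atan(width / (2 * distance)).
    """
    arc_radians = math.radians(arc_minutes / 60.0)
    return object_width_m / (2.0 * math.tan(arc_radians / 2.0))


# Parameters quoted in the text: 10 m object width, 1 arc-minute threshold.
limit_distance_m = distance_at_visual_arc(10.0, 1.0)
limit_distance_km = limit_distance_m / 1000.0  # roughly 34 km
```

Under these assumptions a 10 m wide object falls to the 1 arc-minute limit at roughly 34 km, well beyond the 1 km foreground distance.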
https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/insitu-gridded-observations-global-and-regional/insitu-gridded-observations-global-and-regional_15437b363f02bf5e6f41fc2995e3d19a590eb4daff5a7ce67d1ef6c269d81d68.pdf
This dataset provides high-resolution gridded temperature and precipitation observations from a selection of sources. Additionally, the dataset contains daily global average near-surface temperature anomalies. All fields are defined on either daily or monthly frequency. The datasets are regularly updated to incorporate recent observations. The included data sources are commonly known as GISTEMP, Berkeley Earth, CPC and CPC-CONUS, CHIRPS, IMERG, CMORPH, GPCC and CRU, where the abbreviations are explained below. These data have been constructed from high-quality analyses of meteorological station series and rain gauges around the world, and as such provide a reliable source for the analysis of weather extremes and climate trends. The regular update cycle makes these data suitable for a rapid study of recent phenomena or events. The NASA Goddard Institute for Space Studies temperature analysis dataset (GISTEMP-v4) combines station data of the Global Historical Climatology Network (GHCN) with the Extended Reconstructed Sea Surface Temperature (ERSST) to construct a global temperature change estimate. The Berkeley Earth Foundation dataset (BERKEARTH) merges temperature records from 16 archives into a single coherent dataset. The NOAA Climate Prediction Center datasets (CPC and CPC-CONUS) define a suite of unified precipitation products with consistent quantity and improved quality by combining all information sources available at CPC and by taking advantage of the optimal interpolation (OI) objective analysis technique. The Climate Hazards Group InfraRed Precipitation with Station dataset (CHIRPS-v2) incorporates 0.05° resolution satellite imagery and in-situ station data to create gridded rainfall time series over the African continent, suitable for trend analysis and seasonal drought monitoring.
The Integrated Multi-satellitE Retrievals dataset (IMERG) by NASA uses an algorithm to intercalibrate, merge, and interpolate "all" satellite microwave precipitation estimates, together with microwave-calibrated infrared (IR) satellite estimates, precipitation gauge analyses, and potentially other precipitation estimators over the entire globe at fine time and space scales for the Tropical Rainfall Measuring Mission (TRMM) and its successor, Global Precipitation Measurement (GPM) satellite-based precipitation products. The Climate Prediction Center morphing technique dataset (CMORPH) by NOAA has been created using precipitation estimates that have been derived from low orbiter satellite microwave observations exclusively. Geostationary IR data are then used as a means to transport the microwave-derived precipitation features during periods when microwave data are not available at a location. The Global Precipitation Climatology Centre dataset (GPCC) is a centennial product of monthly global land-surface precipitation based on the ~80,000 stations world-wide that feature record durations of 10 years or longer. The data coverage per month varies from ~6,000 (before 1900) to more than 50,000 stations. The Climatic Research Unit dataset (CRU v4) features an improved interpolation process, which delivers full traceability back to station measurements. The station measurements of temperature and precipitation are public, as well as the gridded dataset and national averages for each country. Cross-validation was performed at a station level, and the results have been published as a guide to the accuracy of the interpolation. This catalogue entry complements the E-OBS record in many aspects, as it intends to provide high-resolution gridded meteorological observations at a global rather than continental scale. These data may be suitable as a baseline for model comparisons or extreme event analysis in the CMIP5 and CMIP6 datasets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains selected variables as monthly means, extracted from the original Community Earth System Model 2 (CESM2) Large Ensemble Community Project (LENS2) dataset. Additionally, it contains observational reference datasets for the respective variables. The data in this bucket is lossy compressed with zfp. We provide full ensembles with 100 members for the variables listed in the table below. The 3D winds and temperature (ua, va, ta) are provided at pressure levels 200 and 850 hPa. Specific humidity is provided at levels 300 and 850 hPa. The full set comprises 4 repositories:
The LENS2 project provides open access to multi-decadal climate simulation data at 1-degree horizontal resolution, conducted with a large ensemble comprising 100 members (Rodgers et al., 2021). The simulation covers a historical period (1850-2014) and a future projection (2015-2100), following the Shared Socioeconomic Pathways (SSP) scenario SSP3-7.0.
For a full description of the dataset, we refer to the webpage of the LENS2 project: https://www.cesm.ucar.edu/community-projects/lens2, and to the article from Rodgers et al., 2021 (DOI: 10.5194/esd-12-1393-2021).
Variable | Description | Units | Observations period 1850-1880 | Observations period 1960-1990 | Observations period 1990-2014 |
tas | 2m air temperature | K | NOAA 20th Century Reanalysis (V3) | ERA5 | ERA5 |
psl | Sea level pressure | Pa | NOAA 20th Century Reanalysis (V3) | ERA5 | ERA5 |
ta | Air temperature | K | NOAA 20th Century Reanalysis (V3) | ERA5 | ERA5 |
ua | Zonal wind | m/s | NOAA 20th Century Reanalysis (V3) | ERA5 | ERA5 |
va | Meridional wind | m/s | NOAA 20th Century Reanalysis (V3) | ERA5 | ERA5 |
tauu | Zonal surface stress | Pa | - | ERA5 | ERA5 |
tauv | Meridional surface stress | Pa | - | ERA5 | ERA5 |
hus | Specific humidity | kg/kg | NOAA 20th Century Reanalysis (V3) | ERA5 | ERA5 |
tos | Sea surface temperature | K | - | ERA5 | ERA5 |
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
HadUK-Grid is a collection of gridded climate variables derived from the network of UK land surface observations. The data have been interpolated from meteorological station data onto a uniform grid to provide complete and consistent coverage across the UK. These data at 1 km resolution have been averaged across a set of discrete geographies defining UK river basins consistent with data from UKCP18 climate projections. The dataset spans the period from 1836 to 2021, but the start time is dependent on climate variable and temporal resolution.
The gridded data are produced for daily, monthly, seasonal and annual timescales, as well as long term averages for a set of climatological reference periods. Variables include air temperature (maximum, minimum and mean), precipitation, sunshine, mean sea level pressure, wind speed, relative humidity, vapour pressure, days of snow lying, and days of ground frost.
This data set supersedes the previous versions of this dataset which also superseded UKCP09 gridded observations. Subsequent versions may be released in due course and will follow the version numbering as outlined by Hollis et al. (2018, see linked documentation).
The changes for v1.1.0.0 HadUK-Grid datasets are as follows:
The addition of data for calendar year 2021
The addition of 30 year averages for the new reference period 1991-2020
An update to 30 year averages for 1961-1990 and 1981-2010. This is an order of operation change. In this version, 30 year averages have been calculated from the underlying monthly/seasonal/annual grids (grid-then-average); in previous versions they were grids of interpolated station averages (average-then-grid). This order of operation change results in small differences to the values, but provides improved consistency with the monthly/seasonal/annual series grids. However, this order of operation change means that 1961-1990 averages are not included for sfcWind or snowlying variables, due to the start dates for these variables being 1969 and 1971 respectively.
A substantial new collection of monthly rainfall data has been added for the period before 1960. These data originate from the rainfall rescue project (Hawkins et al. 2022). This source now accounts for 84% of pre-1960 monthly rainfall data, and the monthly rainfall series has been extended back to 1836.
Net changes to the input station data used to generate this dataset:
-Total of 122664065 observations
-118464870 (96.5%) unchanged
-4821 (0.004%) modified for this version
-4194374 (3.4%) added in this version
-5887 (0.005%) deleted from this version
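The order of operation change described above (grid-then-average versus average-then-grid) can be illustrated with a toy example. This is a sketch under strong simplifying assumptions, not the HadUK-Grid method: the "interpolation" here is simply the mean of whichever stations reported, and the station values and names are hypothetical. It shows that the two orders can give different results when stations have gaps in their records.

```python
from statistics import mean

# Hypothetical yearly values at two stations; None marks a missing year.
station_a = [10.0, 12.0, 14.0]
station_b = [20.0, None, None]


def mean_of_available(values):
    """Toy stand-in for gridding or averaging: mean of non-missing values."""
    present = [v for v in values if v is not None]
    return mean(present)


# grid-then-average: grid each year first, then average the yearly grids.
yearly_grids = [mean_of_available(pair) for pair in zip(station_a, station_b)]
grid_then_average = mean(yearly_grids)

# average-then-grid: average each station record first, then grid the averages.
station_averages = [mean_of_available(station_a), mean_of_available(station_b)]
average_then_grid = mean_of_available(station_averages)
```

In this toy case the yearly grids are 15, 12 and 14, so grid-then-average gives about 13.67, whereas average-then-grid gives 16; with complete station records and this linear toy interpolation the two orders would agree.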
The primary purpose of these data is to facilitate monitoring of UK climate and research into climate change, impacts and adaptation. The datasets have been created by the Met Office with financial support from the Department for Business, Energy and Industrial Strategy (BEIS) and Department for Environment, Food and Rural Affairs (DEFRA) in order to support the Public Weather Service Customer Group (PWSCG), the Hadley Centre Climate Programme, and the UK Climate Projections (UKCP18) project. The output from a number of data recovery activities relating to 19th and early 20th Century data has been used in the creation of this dataset. These activities were supported by: the Met Office Hadley Centre Climate Programme; the Natural Environment Research Council project "Analysis of historic drought and water scarcity in the UK"; the UK Research & Innovation (UKRI) Strategic Priorities Fund UK Climate Resilience programme; The UK Natural Environment Research Council (NERC) Public Engagement programme; the National Centre for Atmospheric Science; National Centre for Atmospheric Science and the NERC GloSAT project; and the contribution of many thousands of public volunteers. The dataset is provided under Open Government Licence.
At UCL, data are defined as facts, observations or experiences on which an argument or theory is constructed or tested. Data may be numerical, descriptive, aural or visual. Data may be raw, abstracted or analysed, experimental or observational. Research Data Management (RDM) covers the decisions made and actions taken across the research data lifecycle to handle the outputs of research projects. The research data lifecycle has four phases: 1) planning and preparing; 2) active research; 3) archiving, preserving, and curating; and 4) discovery, access, and sharing. Harnessing the advantages of an open working environment serves to disseminate research findings more quickly and facilitates even greater collaboration. RDM is an essential enabler of Open Science and Scholarship - the practice of making research outputs and the research process available to as wide an audience as possible across the research data lifecycle. The purpose of this policy is to provide a framework defining the responsibilities of UCL staff and research students in managing their data. This in turn will facilitate the maintenance and preservation of research data, making them available to the widest possible audience for the highest possible impact. This policy is intended to ensure that research data created as part of the research process are FAIR - Findable, Accessible, Interoperable and Reusable. Further, managing research outputs in-line with best practice gives rise to opportunities relating to enhanced research integrity with a view to having greater transparency of the research process and potential for reproducible research.
This data set contains hourly means for meteorological data (wind speed, wind direction, barometric pressure, air temperature, relative humidity and rainfall) collected at the Kaashidhoo Climate Observatory (KCO) during the INDOEX IFP January to March 1999. Data coverage is (inclusive) 01 January 1999 to 31 March 1999. The instruments were mounted about 14 m above sea level (mean elevation of the site is about 0.5-1.0 m) on the top of the observation tower, except for the rain gauge, which was mounted on top of the roof of the observatory, about 6 m above sea level.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The following meta-data is defined for the values seen in the raw dataset.
Rep: replication, defined as the unique group of animal(s) observed, based on differences in time and location of observation.
Date: the calendar date (given as DD/MM/YYYY) of the observation of the bird species for that replication.
Researcher: the initials of the observer.
Time: the time at which observation of the replication was first initiated, given in 24-hour format.
Duration: the duration of the observation for the replication.
Region: the general geographic location where observations were collected. All observations were made on the Gardenbrook Trail in Brampton, ON.
Location: the specific area within the general region where observations were made. The two locations where observations were collected were Castleoaks Lake and Bellchase Lake.
Species: the common name of the taxonomic bird species seen during observation. Bird species were identified based on distinct morphological characteristics. The Merlin Bird ID app as well as an open source website titled "Toronto Wildlife" were used to reference the species of birds.
Frequency: the number of individuals of a specific species of bird seen during the observation session for that replication.
Behavior: the general term describing the actions of the individuals of a species of bird as seen during observation (which includes but is not limited to feeding, swimming, flying, etc.).
This data set contains 10-minute means for meteorological data (wind speed, wind direction, barometric pressure, air temperature, relative humidity and rainfall) collected at the Kaashidhoo Climate Observatory (KCO) during the INDOEX IFP January to March 1999. Data coverage is (inclusive) 01 January 1999 to 31 March 1999. The instruments were mounted about 14 m above sea level (mean elevation of the site is about 0.5-1.0 m) on the top of the observation tower, except for the rain gauge, which was mounted on top of the roof of the observatory, about 6 m above sea level.