These data depict the western United States Map Unit areas as defined by the USDA NRCS. Each Map Unit area contains information on a variety of soil properties and interpretations. The raster is to be joined to the .csv file by the field "mukey." We keep the raster and csv separate to preserve the full attribute names in the csv, which would be truncated if attached to the raster. Once joined, the raster can be classified or analyzed by the columns that depict the properties and interpretations. It is important to note that each property has a corresponding component percent column indicating how much of the map unit has the dominant property provided. For example, if the property "AASHTO Group Classification (Surface) 0 to 1cm" is recorded as "A-1" for a map unit, a user should also refer to the component percent field for this property (in this case 75). This means that an estimated 75% of the map unit has an "A-1" AASHTO group classification and that "A-1" is the dominant group. The property in the column is the dominant component, so the other 25% of this map unit is composed of other AASHTO Group Classifications. This raster attribute table was generated with the "Map Soil Properties and Interpretations" tool in the gSSURGO Mapping Toolset of the Soil Data Management Toolbox for ArcGIS™ (User Guide Version 4.0, https://www.nrcs.usda.gov/wps/PA_NRCSConsumption/download?cid=nrcseprd362255&ext=pdf), using the gSSURGO Map Unit Raster as the input feature (https://gdg.sc.egov.usda.gov/). The FY2018 Gridded SSURGO Map Unit Raster was created for use in national, regional, and state-wide resource planning and analysis of soils data. These data were created with guidance from the USDA NRCS. The fields named "*COMPPCT_R" can exceed 100% for some map units; NRCS personnel are aware of this issue and working to fix it. Take caution when interpreting these areas, as they are the result of some data duplication in the master gSSURGO database.
The data are considered valuable and required for timely science needs, and thus are released with this known error. The USDA NRCS are developing a data release which will replace this item when it is available. For the most up-to-date SSURGO releases (which do not include the custom fields in this release) see https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/home/?cid=nrcs142p2_053628#tools and for additional definitions see https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/geo/?cid=nrcs142p2_053627.
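The join described above can be sketched with pandas; the miniature tables and column names below are illustrative stand-ins for the real raster attribute table and companion .csv (the actual files carry many more columns):

```python
import pandas as pd

# Hypothetical miniature tables; in practice the left table comes from the
# raster attribute table and the right from the companion .csv.
raster_attrs = pd.DataFrame({"mukey": [101, 102, 103]})
soil_props = pd.DataFrame({
    "mukey": [101, 102, 103],
    "AASHTO_Group_Surface": ["A-1", "A-2", "A-1"],
    "AASHTO_Group_Surface_COMPPCT_R": [75, 60, 110],  # >100 is the known issue
})

# Join on the shared "mukey" field, keeping every raster map unit.
joined = raster_attrs.merge(soil_props, on="mukey", how="left")

# Flag map units affected by the known component-percent duplication issue.
suspect = joined[joined["AASHTO_Group_Surface_COMPPCT_R"] > 100]
```

After the join, any property column (together with its component percent column) can drive a classification of the raster.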
The dataset measures the long-term seasonal means of chlorophyll a concentrations in ocean surface waters. They are derived from MODIS (Aqua) images using NASA's SeaDAS image processing software. The monthly chlorophyll a images between July 2002 and December 2017 are used to calculate the means of the four austral seasons: winter (June, July, and August), spring (September, October, and November), summer (December, January, and February), and autumn (March, April, and May). The extent of the dataset covers the entire Australian EEZ and surrounding waters (including the Southern Ocean). The unit of the dataset is mg/m3. This research is supported by the National Environmental Science Program (NESP) Marine Biodiversity Hub through Project D1.
The dataset measures the long-term seasonal means of the sea surface temperature (SST) of ocean surface waters. They are derived from MODIS (Aqua) images using NASA's SeaDAS image processing software. The monthly SST images between July 2002 and December 2017 are used to calculate the means of the four austral seasons: winter (June, July, and August), spring (September, October, and November), summer (December, January, and February), and autumn (March, April, and May). The extent of the dataset covers the entire Australian EEZ and surrounding waters (including the Southern Ocean). The unit of the dataset is degrees Celsius. This research is supported by the National Environmental Science Program (NESP) Marine Biodiversity Hub through Project D1.
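The seasonal averaging described in these two datasets can be sketched with NumPy; the random array below is a stand-in for the real monthly MODIS image stack, and the grid size is arbitrary:

```python
import numpy as np

# Stand-in for a monthly image stack: (month, lat, lon).
# One calendar year is shown for brevity; the real stack spans 2002-2017.
rng = np.random.default_rng(0)
months = np.arange(1, 13)          # calendar month of each image
stack = rng.random((12, 4, 5))

# Austral season definitions from the dataset description.
austral_seasons = {
    "winter": (6, 7, 8),
    "spring": (9, 10, 11),
    "summer": (12, 1, 2),
    "autumn": (3, 4, 5),
}

# Average all images whose calendar month falls in each season.
seasonal_means = {
    name: stack[np.isin(months, mset)].mean(axis=0)
    for name, mset in austral_seasons.items()
}
```

Note that summer straddles the calendar-year boundary (December plus the following January and February), which is why month membership rather than slicing is used.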
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
The High Resolution Digital Elevation Model (HRDEM) product is derived from airborne LiDAR data (mainly in the south) and satellite images in the north. The complete coverage of the Canadian territory is gradually being established. It includes a Digital Terrain Model (DTM), a Digital Surface Model (DSM) and other derived data. For DTM datasets, derived data available are slope, aspect, shaded relief, color relief and color shaded relief maps and for DSM datasets, derived data available are shaded relief, color relief and color shaded relief maps. The productive forest line is used to separate the northern and the southern parts of the country. This line is approximate and may change based on requirements. In the southern part of the country (south of the productive forest line), DTM and DSM datasets are generated from airborne LiDAR data. They are offered at a 1 m or 2 m resolution and projected to the UTM NAD83 (CSRS) coordinate system and the corresponding zones. The datasets at a 1 m resolution cover an area of 10 km x 10 km while datasets at a 2 m resolution cover an area of 20 km by 20 km. In the northern part of the country (north of the productive forest line), due to the low density of vegetation and infrastructure, only DSM datasets are generally generated. Most of these datasets have optical digital images as their source data. They are generated at a 2 m resolution using the Polar Stereographic North coordinate system referenced to WGS84 horizontal datum or UTM NAD83 (CSRS) coordinate system. Each dataset covers an area of 50 km by 50 km. For some locations in the north, DSM and DTM datasets can also be generated from airborne LiDAR data. In this case, these products will be generated with the same specifications as those generated from airborne LiDAR in the southern part of the country. The HRDEM product is referenced to the Canadian Geodetic Vertical Datum of 2013 (CGVD2013), which is now the reference standard for heights across Canada. 
Source data for HRDEM datasets are acquired through multiple projects with different partners. Since data are acquired by project, there is no integration or edgematching done between projects; the tiles are aligned within each project. The High Resolution Digital Elevation Model (HRDEM) product is part of the CanElevation Series created in support of the National Elevation Data Strategy implemented by NRCan. Collaboration is a key factor in the success of the National Elevation Data Strategy. Refer to the "Supporting Document" section to access the list of the different partners, including links to their respective data.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Hydraulic properties such as porosity, water, and clay content can be inferred from electrical parameters like permittivity, conductivity, and resistivity. Spectral data enhance this analysis by revealing features such as pore size and clay type in wet particulate media. In liquid samples, electrode polarization is clearly observed, as orientational polarization occurs only at higher frequencies (MHz to sub-GHz). In contrast, particulate media exhibit electrode polarization artifacts that obscure spatial polarization peaks within the Hz–MHz range, especially in highly conductive materials like wet clayey soils, making the Cole–Cole model insufficient for distinguishing these effects. Therefore, a general circuit model using a parallel form of a resistor and a constant phase element configuration more effectively separates inherent material polarization from electrode polarization. The electrode polarization limiting frequency (fEP) correlates with both material conductivity and electrode properties, even with low-polarization electrodes like Ag/AgCl. A novel method is introduced to estimate the effective constant phase element exponent (η̃) using the slope of log permittivity vs log frequency. Finally, the chargeability of kaolinite (m = 0.83–0.86), derived from the ratio of critical frequencies between the Cole–Cole and Pelton models, aligns with its fundamental definition: m = (σ∞ – σ0)/σ∞, where σ0 is the DC conductivity and σ∞ is the high-frequency conductivity.
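The closing definition of chargeability can be turned into a one-line computation; the conductivity values below are illustrative, chosen only so that the result falls within the range the abstract reports for kaolinite:

```python
def chargeability(sigma_0: float, sigma_inf: float) -> float:
    """Chargeability m = (sigma_inf - sigma_0) / sigma_inf,
    from the DC conductivity sigma_0 and the high-frequency
    conductivity sigma_inf (both in S/m)."""
    return (sigma_inf - sigma_0) / sigma_inf

# Illustrative (not measured) conductivities giving m = 0.85,
# within the 0.83-0.86 range reported for kaolinite above.
m = chargeability(sigma_0=0.0015, sigma_inf=0.01)
```

Because sigma_inf >= sigma_0 for a polarizable medium, m lies between 0 (no polarization) and 1 (fully blocked DC conduction).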
This data set represents the extent, approximate location and type of wetlands and deepwater habitats in the United States and its Territories. These data delineate the areal extent of wetlands and surface waters as defined by Cowardin et al. (1979). The National Wetlands Inventory - Version 2, Surface Waters and Wetlands Inventory was derived by retaining the wetland and deepwater polygons that compose the NWI digital wetlands spatial data layer and reintroducing any linear wetland or surface water features that were orphaned from the original NWI hard copy maps by converting them to narrow polygonal features. Additionally, the data are supplemented with hydrography data, buffered to become polygonal features, as a secondary source for any single-line stream features not mapped by the NWI and to complete segmented connections. Wetland mapping conducted in WA, OR, CA, NV and ID after 2012 and most other projects mapped after 2015 were mapped to include all surface water features and are not derived data. The linear hydrography dataset used to derive Version 2 was the U.S. Geological Survey's National Hydrography Dataset (NHD). Specific information on the NHD version used to derive Version 2 and where Version 2 was mapped can be found in the 'comments' field of the Wetlands_Project_Metadata feature class. Certain wetland habitats are excluded from the National mapping program because of the limitations of aerial imagery as the primary data source used to detect wetlands. These habitats include seagrasses or submerged aquatic vegetation that are found in the intertidal and subtidal zones of estuaries and near shore coastal waters. Some deepwater reef communities (coral or tubificid worm reefs) have also been excluded from the inventory. These habitats, because of their depth, go undetected by aerial imagery.
By policy, the Service also excludes certain types of "farmed wetlands" as may be defined by the Food Security Act or that do not coincide with the Cowardin et al. definition. Contact the Service's Regional Wetland Coordinator for additional information on what types of farmed wetlands are included on wetland maps. This dataset should be used in conjunction with the Wetlands_Project_Metadata layer, which contains project specific wetlands mapping procedures and information on dates, scales and emulsion of imagery used to map the wetlands within specific project boundaries.
Overview
Purpose and Benefits: A polygon feature service intended to serve as a repository to store daily wildfire perimeters for fires occurring within National Park Service parks. This service can be used to generate fire progressions.
Layer
Daily Wildfire Perimeter Attributes and their definitions can be found below.
Attributes:
Fire Occurrence ID
The Fire Occurrence ID field is a unique identifier linking the daily wildfire perimeter to the wildland fire location feature class.
Perimeter Date
The Perimeter Date field is intended for users to capture the date the perimeter was collected.
Feature Category
The Feature Category field is intended for users to identify the type of event that occurred.
GIS Acres
The GIS Acres field is intended for users to capture the acres for the fire history or fuel treatment perimeter, calculated using GIS.
Public Display
The Public Display field is intended for users to determine whether the data can be used for public display – i.e., any data representing sensitive information such as cultural resources should not be displayed on a public map.
Data Access
The Data Access field is intended for users to capture the accessibility of the data – i.e., most fire data are considered unrestricted; however, if cultural resources are included, the data would be restricted from sharing or use with others.
Unit Code
The UnitCode field is intended to allow users to identify the National Park that a particular resource may lie within. Some data collected and maintained by the National Park Service may inventory resources outside NPS property or responsibility. To make data entry easier, the UnitCode field lets users select park unit names from a domain that contains all of the park unit 4-letter acronyms. All park units, associated monuments, memorials, seashores, etc., are represented in the domain values.
Map Method
The Map Method field is intended for users to define how the geospatial feature was derived.
Data Source
The Data Source field is intended for users to define the source of the data.
Date of Source
The Source Date field is intended for users to define the date of the source data.
XY Accuracy
The XY Accuracy field is intended to allow users to document the accuracy of the data.
Notes
The Notes field is intended for users to add any additional information describing the feature.
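As a sketch, the attribute schema above can be mirrored as a simple record type; the field names, types, and sample values here are illustrative, not the authoritative service schema:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical record mirroring the attributes listed above.
@dataclass
class DailyWildfirePerimeter:
    fire_occurrence_id: str        # links to the wildland fire location feature class
    perimeter_date: date           # date the perimeter was collected
    feature_category: str          # type of event that occurred
    gis_acres: float               # acres calculated in GIS
    public_display: bool           # safe for public-facing maps?
    data_access: str               # e.g. "Unrestricted" or "Restricted"
    unit_code: str                 # 4-letter park unit acronym
    map_method: str                # how the geospatial feature was derived
    notes: Optional[str] = None    # any additional descriptive information

# Illustrative instance (all values hypothetical).
perim = DailyWildfirePerimeter(
    fire_occurrence_id="2022-ABCD-000123",
    perimeter_date=date(2022, 7, 14),
    feature_category="Wildfire Daily Fire Perimeter",
    gis_acres=152.7,
    public_display=True,
    data_access="Unrestricted",
    unit_code="YELL",
    map_method="GPS-Flight",
)
```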
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
There are a growing number of land cover data available for the conterminous United States, supporting various applications ranging from biofuel regulatory decisions to habitat conservation assessments. These datasets vary in their source information, frequency of data collection and reporting, land class definitions, categorical detail, and spatial scale and time intervals of representation. These differences limit direct comparison, contribute to disagreements among studies, confuse stakeholders, and hamper our ability to confidently report key land cover trends in the U.S. Here we assess changes in cropland derived from the Land Change Monitoring, Assessment, and Projection (LCMAP) dataset from the U.S. Geological Survey and compare them with analyses of three established land cover datasets across the conterminous U.S. from 2008-2017: (1) the National Resources Inventory (NRI), (2) a dataset derived by Lark et al. (2020) from the Cropland Data Layer (CDL), and (3) a dataset from Potapov et al. (2022). LCMAP reports more stable cropland and less stable noncropland in all comparisons, likely due to its more expansive definition of cropland, which includes managed grasslands (pasture and hay). Despite these differences, net cropland expansion from all four datasets was comparable (5.18-6.33 million acres), although the geographic extent and type of conversion differed. LCMAP projected the largest cropland expansion in the southern Great Plains, whereas other datasets projected the largest expansion in the northwestern and central Midwest. Most of the pixel-level disagreements (86%) between LCMAP and Lark et al. (2020) were due to definitional differences among datasets, whereas the remainder (14%) were from a variety of causes. Cropland expansion in the LCMAP likely reflects conversions of more natural areas, whereas cropland expansion in other data sources also captures conversion of managed pasture to cropland.
The particular research question considered (e.g., habitat versus soil carbon) should influence which data source is more appropriate.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset was derived by the Bioregional Assessment Programme. This dataset was derived from multiple datasets. You can find a link to the parent datasets in the Lineage Field in this metadata statement. The History Field in this metadata statement describes how this dataset was derived.
A line shapefile of the Hunter subregion boundary with line segments attributed with the biophysical feature/dataset that defines that section of the boundary. This dataset is derived from the Bioregional Assessment areas and links to the source datasets are in the lineage field of this metadata statement.
The purpose is to identify the underlying source used to define the boundary, for report map purposes. Mostly the bioregion boundary was used, but some sections are defined by geology and CMA boundaries.
A polygon shapefile of the Hunter subregion was converted to a line shapefile. The subregion boundary was then compared with the datasets that the subregion metadata listed as boundary sources (see lineage).
The subregion boundary line was split (ArcGIS Editor Split tool) into sections that coincided with the source boundary layers and attributed accordingly.
Bioregional Assessment Programme (2014) Hunter bioregion boundary definition sources. Bioregional Assessment Derived Dataset. Viewed 07 February 2017, http://data.bioregionalassessments.gov.au/dataset/3052c699-3b0d-4504-95e3-18598147c5ae.
Derived From Bioregional Assessment areas v02
Derived From Australian Coal Basins
Derived From Natural Resource Management (NRM) Regions 2010
Derived From Bioregional Assessment areas v03
Derived From Bioregional Assessment areas v01
Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)
Derived From GEODATA TOPO 250K Series 3
Derived From NSW Catchment Management Authority Boundaries 20130917
Derived From Geological Provinces - Full Extent
Jurisdictional Unit, 2022-05-21. For use with WFDSS, IFTDSS, IRWIN, and InFORM. This is a feature service which provides Identify and Copy Feature capabilities. If fast drawing at coarse zoom levels is a requirement, consider using the tile (map) service layer located at https://nifc.maps.arcgis.com/home/item.html?id=3b2c5daad00742cd9f9b676c09d03d13.
Overview
The Jurisdictional Agencies dataset is developed as a national land management geospatial layer, focused on representing wildland fire jurisdictional responsibility, for interagency wildland fire applications, including WFDSS (Wildland Fire Decision Support System), IFTDSS (Interagency Fuels Treatment Decision Support System), IRWIN (Interagency Reporting of Wildland Fire Information), and InFORM (Interagency Fire Occurrence Reporting Modules). It is intended to provide federal wildland fire jurisdictional boundaries on a national scale. The agency and unit names are an indication of the primary manager name and unit name, respectively, recognizing that:
There may be multiple owner names.
Jurisdiction may be held jointly by agencies at different levels of government (i.e. State and Local), especially on private lands.
Some owner names may be blocked for security reasons.
Some jurisdictions may not allow the distribution of owner names.
Private ownerships are shown in this layer with JurisdictionalUnitIdentifier=null, JurisdictionalUnitAgency=null, JurisdictionalUnitKind=null, and LandownerKind="Private", LandownerCategory="Private". All land inside the US country boundary is covered by a polygon. Jurisdiction for privately owned land varies widely depending on state, county, or local laws and ordinances, fire workload, and other factors, and is not available in a national dataset in most cases. For publicly held lands the agency name is the surface managing agency, such as Bureau of Land Management, United States Forest Service, etc. The unit name refers to the descriptive name of the polygon (i.e. Northern California District, Boise National Forest, etc.).
These data are used to automatically populate fields on the WFDSS Incident Information page. This data layer implements the NWCG Jurisdictional Unit Polygon Geospatial Data Layer Standard.
Relevant NWCG Definitions and Standards
Unit
2. A generic term that represents an organizational entity that only has meaning when it is contextualized by a descriptor, e.g. jurisdictional.
Definition Extension: When referring to an organizational entity, a unit refers to the smallest area or lowest level. Higher levels of an organization (region, agency, department, etc.) can be derived from a unit based on organization hierarchy.
Unit, Jurisdictional
The governmental entity having overall land and resource management responsibility for a specific geographical area as provided by law.
Definition Extension: 1) Ultimately responsible for the fire report to account for statistical fire occurrence; 2) Responsible for setting fire management objectives; 3) Jurisdiction cannot be re-assigned by agreement; 4) The nature and extent of the incident determines jurisdiction (for example, Wildfire vs. All Hazard); 5) Responsible for signing a Delegation of Authority to the Incident Commander.
See also: Unit, Protecting; Landowner
Unit Identifier
This data standard specifies the standard format and rules for Unit Identifier, a code used within the wildland fire community to uniquely identify a particular government organizational unit.
Landowner Kind & Category
This data standard provides a two-tier classification (kind and category) of landownership.
Attribute Fields
JurisdictionalAgencyKind: Describes the type of unit jurisdiction using the NWCG Landowner Kind data standard. There are two valid values: Federal, and Other. A value may not be populated for all polygons.
JurisdictionalAgencyCategory: Describes the type of unit jurisdiction using the NWCG Landowner Category data standard.
Valid values include: ANCSA, BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, OtherLoc (other local, not in the standard), State. A value may not be populated for all polygons.
JurisdictionalUnitName: The name of the Jurisdictional Unit. Where an NWCG Unit ID exists for a polygon, this is the name used in the Name field from the NWCG Unit ID database. Where no NWCG Unit ID exists, this is the "Unit Name" or other specific, descriptive unit name field from the source dataset. A value is populated for all polygons.
JurisdictionalUnitID: Where it could be determined, this is the NWCG Standard Unit Identifier (Unit ID). Where it is unknown, the value is 'Null'. Null Unit IDs can occur because a unit may not have a Unit ID, or because one could not be reliably determined from the source data. Not every land ownership has an NWCG Unit ID. Unit ID assignment rules are available from the Unit ID standard, linked above.
LandownerKind: The landowner kind value associated with the polygon. May be inferred from jurisdictional agency, or by lack of a jurisdictional agency. A value is populated for all polygons. There are three valid values: Federal, Private, or Other.
LandownerCategory: The landowner category value associated with the polygon. May be inferred from jurisdictional agency, or by lack of a jurisdictional agency. A value is populated for all polygons. Valid values include: ANCSA, BIA, BLM, BOR, DOD, DOE, NPS, USFS, USFWS, Foreign, Tribal, City, County, OtherLoc (other local, not in the standard), State, Private.
DataSource: The database from which the polygon originated. Be as specific as possible; identify the geodatabase name and feature class in which the polygon originated.
SecondaryDataSource: If the Data Source is an aggregation from other sources, use this field to specify the source that supplied data to the aggregation. For example, if Data Source is "PAD-US 2.1", then for a USDA Forest Service polygon, the Secondary Data Source would be "USDA FS Automated Lands Program (ALP)". For a BLM polygon in the same dataset, Secondary Data Source would be "Surface Management Agency (SMA)".
SourceUniqueID: Identifier (GUID or ObjectID) in the data source. Used to trace the polygon back to its authoritative source.
MapMethod: Controlled vocabulary to define how the geospatial feature was derived. Map method may help define data quality. MapMethod will be Mixed Methods by default for this layer as the data are from mixed sources. Valid values include: GPS-Driven; GPS-Flight; GPS-Walked; GPS-Walked/Driven; GPS-Unknown Travel Method; Hand Sketch; Digitized-Image; Digitized-Topo; Digitized-Other; Image Interpretation; Infrared Image; Modeled; Mixed Methods; Remote Sensing Derived; Survey/GCDB/Cadastral; Vector; Phone/Tablet; Other
DateCurrent: The last edit or update of this GIS record. Date should follow the assigned NWCG Date Time data standard, using 24 hour clock, YYYY-MM-DDhh.mm.ssZ, ISO 8601 standard.
Comments: Additional information describing the feature.
GeometryID: Primary key for linking geospatial objects with other database systems. Required for every feature. This field may be renamed for each standard to fit the feature.
JurisdictionalUnitID_sansUS: NWCG Unit ID with the "US" characters removed from the beginning. Provided for backwards compatibility.
JoinMethod: Additional information on how the polygon was matched to information in the NWCG Unit ID database.
LocalName: Local name for the polygon provided from PAD-US or other source.
LegendJurisdictionalAgency: Jurisdictional Agency, but smaller landholding agencies, or agencies of indeterminate status, are grouped for more intuitive use in a map legend or summary table.
LegendLandownerAgency: Landowner Agency, but smaller landholding agencies, or agencies of indeterminate status, are grouped for more intuitive use in a map legend or summary table.
DataSourceYear: Year that the source data for the polygon were acquired.
Data Input
This dataset is based on an aggregation of 4 spatial data sources: Protected Areas Database US (PAD-US 2.1), data from Bureau of Indian Affairs regional offices, the BLM Alaska Fire Service/State of Alaska, and Census Block-Group Geometry. NWCG Unit ID and Agency Kind/Category data are tabular and sourced from UnitIDActive.txt, in the WFMI Unit ID application (https://wfmi.nifc.gov/unit_id/Publish.html). Areas with unknown Landowner Kind/Category and Jurisdictional Agency Kind/Category are assigned LandownerKind and LandownerCategory values of "Private" by use of the non-water polygons from the Census Block-Group geometry.
PAD-US 2.1:
This dataset is based in large part on the USGS Protected Areas Database of the United States - PAD-US 2.1. PAD-US is a compilation of authoritative protected areas data between agencies and organizations that ultimately results in a comprehensive and accurate inventory of protected areas for the United States to meet a variety of needs (e.g. conservation, recreation, public health, transportation, energy siting, ecological, or watershed assessments and planning). Extensive documentation on PAD-US processes and data sources is available.
How these data were aggregated:
Boundaries, and their descriptors, available in spatial databases (i.e. shapefiles or geodatabase feature classes) from land management agencies are the desired and primary data sources in PAD-US. If these authoritative sources are unavailable, or the agency recommends another source, data may be incorporated by other aggregators such as non-governmental organizations. Data sources are tracked for each record in the PAD-US geodatabase (see below).
BIA and Tribal Data:
BIA and Tribal land management data are not available in PAD-US. As such, data were aggregated from BIA regional offices. These data date from 2012 and were substantially updated in 2022. Indian Trust Land affiliated with Tribes, Reservations, or BIA Agencies: These data are not considered the system of record and are not intended to be used as such. The Bureau of Indian Affairs (BIA), Branch of Wildland Fire Management (BWFM) is not the originator of these data. The
The dataset was derived by the Bioregional Assessment Programme. This dataset was derived from multiple datasets. You can find a link to the parent datasets in the Lineage Field in this metadata statement. The History Field in this metadata statement describes how this dataset was derived.
The Preliminary Assessment Extent (PAE) is a spatial layer that defines the land surface area contained within a bioregion over which coal resource development may have potential impact on water-dependent assets and receptors associated with those assets (Barrett et al 2013).
The role of the PAE is to optimise research agency effort by focussing on those locations where a material causal link may occur between coal resource development and impacts on water dependent assets. The lists of assets collated by the Program are filtered for "proximity" such that only those assets that intersect with the PAE are considered further in the assessment process. Changes to the PAE such as through the identification of a different development pathway or an improved hydrological understanding may require the proximity of assets to be considered again. Should the assessment process identify a material connection between a water dependent asset outside the PAE and coal resource development impacts, the PAE would need to be amended.
The PAE is derived from the intersection of surface hydrology features; groundwater management units; mining development leases and/or CSG tenements; and, directional flows of surface and groundwater.
The following 5 inputs were used by the Specialists to define the Preliminary Assessment Extent:
Bioregion boundary
Geology and the coal resource
Surface water hydrology
Groundwater hydrology
Flow paths (Known available information on gradients of pressure, water table height, stream direction, surface-ground water interactions and any other available data)
Bioregional Assessment Programme (2014) CLM Preliminary Assessment Extent Definition & Report( CLM PAE). Bioregional Assessment Derived Dataset. Viewed 28 September 2017, http://data.bioregionalassessments.gov.au/dataset/2cdd0e81-026e-4a41-87b0-ec003eddc5c1.
Derived From Bioregional Assessment areas v02
Derived From Natural Resource Management (NRM) Regions 2010
Derived From QLD Petroleum Leases, 28/11/2013
Derived From Bioregional Assessment areas v01
Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)
Derived From QLD Current Exploration Permits for Minerals (EPM) in Queensland 6/3/2013
Derived From GEODATA TOPO 250K Series 3
Derived From NSW Catchment Management Authority Boundaries 20130917
Derived From Geological Provinces - Full Extent
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Scientists often try to reproduce observations with a model, helping them explain the observations by adjusting known and controllable features within the model. They then use a large variety of metrics for assessing the ability of a model to reproduce the observations. One such metric is called the relative operating characteristic (ROC) curve, a tool that assesses a model’s ability to predict events within the data. The ROC curve is made by sliding the event-definition threshold in the model output, calculating certain metrics and making a graph of the results. Here, a new model assessment tool is introduced, called the sliding threshold of observation for numeric evaluation (STONE) curve. The STONE curve is created by sliding the event definition threshold not only for the model output but also simultaneously for the data values. This is applicable when the model output is trying to reproduce the exact values of a particular data set. While the ROC curve is still a highly valuable tool for optimizing the prediction of known and pre-classified events, it is argued here that the STONE curve is better for assessing model prediction of a continuous-valued data set. ;Data and code were created using IDL, but can also be accessed with the open-source Gnu Data Language (GDL; see https://github.com/gnudatalanguage/gdl)
It's been two years since the news that Canada has legalized weed hit us, so I thought: why not grab a dataset from Kaggle to practice a bit of data analysis? To my surprise, I couldn't find a weed dataset that reflects the economics behind legalized weed and how it has changed over time. So I went to the Canadian government's data site, and voilà, they have CSV files on exactly what I wanted floating around on their website. All I did was download them straight up, and here I am to share them with the community.
We have a series of CSV files each having data about things like supply, use case, production, etc but before we go into the individual files there are a few data columns which are common to all csv files
Understanding metadata files:
Cube Title: The title of the table. The output files are unilingual and thus will contain either the English or French title.
Product Id (PID): The unique 8-digit product identifier for the table.
CANSIM Id: The ID number which formally identified the table in CANSIM, where applicable.
URL: The URL for the representative (default) view of a given data table.
Cube Notes: Each note is assigned a unique number. This field indicates which notes, if any, are applied to the entire table.
Archive Status: Describes the status of a table as either 'Current' or 'Archived'. Archived tables are those that are no longer updated.
Frequency: Frequency of the table (e.g., annual).
Start Reference Period: The starting reference period for the table.
End Reference Period: The end reference period for the table.
Total Number of Dimensions: The total number of dimensions contained in the table.
Dimension Name: The name of a dimension in a table. There can be up to 10 dimensions in a table (e.g., Geography).
Dimension ID: The reference code assigned to a dimension in a table. A unique reference Dimension ID code is assigned to each dimension in a table.
Dimension Notes: Each note is assigned a unique number. This field indicates which notes are applied to a particular dimension.
Dimension Definitions: Reserved for future development.
Member Name: The textual description of the members in a dimension (e.g., Nova Scotia and Ontario are members of the Geography dimension).
Member ID: The code assigned to a member of a dimension. There is a unique ID for each member within a dimension. These IDs are used to create the coordinate field in the data file. (see the 'coordinate' field in the data record layout).
Classification (where applicable): Classification code for a member. Definitions, data sources and methods
Parent Member ID: The code used to display the hierarchical relationship between members in a dimension (e.g., the member Ontario (5) is a child of the member Canada (1) in the dimension 'Geography').
Terminated: Indicates whether a member has been terminated or not. Terminated members are those that are no longer updated.
Member Notes: Each note is assigned a unique number. This field indicates which notes are applied to each member.
Member definitions: Reserved for future development.
Symbol Legend: The symbol legend provides descriptions of the various symbols which can appear in a table. This field describes a comprehensive list of all possible symbols, regardless of whether a selected symbol appears in a particular table.
Survey Code: The unique code associated with a survey or program from which the data in the table is derived. Data displayed in one table may be derived ...
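To make the metadata layout concrete, here is a minimal pandas sketch that parses a tiny, made-up extract with a few of the columns described above (the titles, IDs, and values are hypothetical; real Statistics Canada metadata files carry these column names):

```python
import io

import pandas as pd

# Hypothetical two-row extract of a metadata file, using columns described above.
metadata_csv = io.StringIO(
    '"Cube Title","Product Id","CANSIM Id","Archive Status","Frequency"\n'
    '"Supply of cannabis","20100001","365-0001","Current","Annual"\n'
    '"Old cannabis table","20100002","365-0002","Archived","Annual"\n'
)
# Keep Product Id as a string so leading zeros in the 8-digit code survive.
meta = pd.read_csv(metadata_csv, dtype={"Product Id": str})

# Archived tables are no longer updated, so keep only the current ones.
current = meta[meta["Archive Status"] == "Current"]
print(current["Cube Title"].tolist())  # ['Supply of cannabis']
```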
Licence: https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/eumetsat-cm-saf/eumetsat-cm-saf_7b12bbcf51145abbb79a82e4d2abe6aac6e84db8918a0214e8a80e783ff1ec9f.pdf
Upper Tropospheric Humidity (UTH) is of key importance to the Earth’s greenhouse effect and the understanding of climate change. It is considered an Essential Climate Variable (ECV) because it controls key atmospheric processes, including those involved in water vapour and cloud feedbacks, that can amplify the climate system’s response to increases in other greenhouse gases. The Upper Tropospheric Humidity is defined as the integrated amount of water vapour in the atmospheric layer between ~500 hPa and ~200 hPa. The Upper Tropospheric Humidity product consists of two editions: on the one hand, the UTH Thematic Climate Data Record (TCDR) Edition 1.0 and its extension in time, the Interim Climate Data Record (ICDR); on the other hand, the UTH Thematic Climate Data Record Edition 2.0. The UTH TCDR and ICDR Edition 1.0 are derived from observations from the AMSU-B and MHS microwave humidity sounder instruments on board the NOAA- and MetOp- satellite series. Instantaneous satellite observations are used to derive a spatio-temporally averaged data record. The data are available as twice-daily (one for the ascending and the other for the descending passes) averages on a regular latitude/longitude grid. Additionally, the daily mean UTH is provided as a weighted average of ascending and descending orbits for all the grid points with valid ascending and descending observations. The UTH TCDR Edition 2.0 is a daily mean product computed from all available hourly observations for each grid cell. It relies on substantially improved retrievals and uncertainty estimates due to the inclusion of the additional sensors SSM/T-2 and ATMS on board the DMSP- satellite series and the S-NPP satellite respectively. UTH is part of the Water Vapour ECV inventory in the Climate Data Store (CDS) together with Total Column Water Vapour and Tropospheric Humidity Profiles.
Both TCDR components are brokered to the CDS and were originally produced on behalf of EUMETSAT Satellite Application Facility on Climate Monitoring (CMSAF) by the UK Met Office. The ICDR component has been delivered directly to the CDS and was originally produced on behalf of the Copernicus Climate Change Service (C3S) by the UK Met Office.
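The Edition 1.0 daily-mean rule — average ascending and descending passes only where both have valid observations — can be sketched with NumPy. The arrays and the choice of per-pass observation counts as weights are illustrative assumptions, not the CM SAF algorithm itself:

```python
import numpy as np

# Toy 2x2 gridded fields for one day (values are hypothetical UTH percentages).
asc = np.array([[60.0, np.nan], [55.0, 40.0]])   # ascending-pass UTH
dsc = np.array([[50.0, 45.0], [np.nan, 42.0]])   # descending-pass UTH
n_asc = np.array([[3, 0], [4, 2]])               # obs counts per grid cell (assumed weights)
n_dsc = np.array([[2, 5], [0, 2]])

# Daily mean only where BOTH passes have valid observations; NaN elsewhere.
valid = ~np.isnan(asc) & ~np.isnan(dsc)
daily = np.full(asc.shape, np.nan)
w = n_asc + n_dsc
daily[valid] = (asc[valid] * n_asc[valid] + dsc[valid] * n_dsc[valid]) / w[valid]
# e.g. cell (0, 0): (60*3 + 50*2) / 5 = 56.0; cells with a missing pass stay NaN.
```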
Licence: https://creativecommons.org/publicdomain/zero/1.0/
The "Wikipedia SQLite Portable DB" is a compact and efficient database derived from the Kensho Derived Wikimedia Dataset (KDWD). This dataset provides a condensed subset of raw Wikimedia data in a format optimized for natural language processing (NLP) research and applications.
I am not affiliated or partnered with Kensho in any way; I simply like this dataset because it is easy for my agents to query.
Key Features:
- Contains over 5 million rows of data from English Wikipedia and Wikidata
- Stored in a portable SQLite database format for easy integration and querying
- Includes a link-annotated corpus of English Wikipedia pages and a compact sample of the Wikidata knowledge base
- Ideal for NLP tasks, machine learning, data analysis, and research projects
The database consists of four main tables:
This dataset is derived from the Kensho Derived Wikimedia Dataset (KDWD), which is built from the English Wikipedia snapshot from December 1, 2019, and the Wikidata snapshot from December 2, 2019. The KDWD is a condensed subset of the raw Wikimedia data in a form that is helpful for NLP work, and it is released under the CC BY-SA 3.0 license.
Credits: The "Wikipedia SQLite Portable DB" is derived from the Kensho Derived Wikimedia Dataset (KDWD), created by the Kensho R&D group. The KDWD is based on data from Wikipedia and Wikidata, which are crowd-sourced projects supported by the Wikimedia Foundation. We would like to acknowledge and thank the Kensho R&D group for their efforts in creating the KDWD and making it available for research and development purposes.
By providing this portable SQLite database, we aim to make Wikipedia data more accessible and easier to use for researchers, data scientists, and developers working on NLP tasks, machine learning projects, and other data-driven applications. We hope that this dataset will contribute to the advancement of NLP research and the development of innovative applications utilizing Wikipedia data.
https://www.kaggle.com/datasets/kenshoresearch/kensho-derived-wikimedia-data/data
Tags: encyclopedia, wikipedia, sqlite, database, reference, knowledge-base, articles, information-retrieval, natural-language-processing, nlp, text-data, large-dataset, multi-table, data-science, machine-learning, research, data-analysis, data-mining, content-analysis, information-extraction, text-mining, text-classification, topic-modeling, language-modeling, question-answering, fact-checking, entity-recognition, named-entity-recognition, link-prediction, graph-analysis, network-analysis, knowledge-graph, ontology, semantic-web, structured-data, unstructured-data, data-integration, data-processing, data-cleaning, data-wrangling, data-visualization, exploratory-data-analysis, eda, corpus, document-collection, open-source, crowdsourced, collaborative, online-encyclopedia, web-data, hyperlinks, categories, page-views, page-links, embeddings
Usage with LIKE queries:

```python
import aiosqlite


class KenshoDatasetQuery:
    def __init__(self, db_file):
        self.db_file = db_file

    async def __aenter__(self):
        self.conn = await aiosqlite.connect(self.db_file)
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        await self.conn.close()

    async def search_pages_by_title(self, title):
        query = """
            SELECT pages.page_id, pages.item_id, pages.title, pages.views,
                   items.labels AS item_labels, items.description AS item_description,
                   link_annotated_text.sections
            FROM pages
            JOIN items ON pages.item_id = items.id
            JOIN link_annotated_text ON pages.page_id = link_annotated_text.page_id
            WHERE pages.title LIKE ?
        """
        async with self.conn.execute(query, (f"%{title}%",)) as cursor:
            return await cursor.fetchall()

    async def search_items_by_label_or_description(self, keyword):
        query = """
            SELECT id, labels, description
            FROM items
            WHERE labels LIKE ? OR description LIKE ?
        """
        async with self.conn.execute(query, (f"%{keyword}%", f"%{keyword}%")) as cursor:
            return await cursor.fetchall()

    async def search_items_by_label(self, label):
        query = """
            SELECT id, labels, description
            FROM items
            WHERE labels LIKE ?
        """
        async with self.conn.execute(query, (f"%{label}%",)) as cursor:
            return await cursor.fetchall()

    async def search_properties_by_label_or_desc...
```
Licence: https://creativecommons.org/publicdomain/zero/1.0/
Pitch-by-pitch data from Statcast (https://www.mlb.com/statcast) for the 2024 MLB (Major League Baseball) Postseason, updated as the 2025 Postseason progresses.
Statcast is Major League Baseball’s (MLB) advanced tracking system that collects detailed, high-resolution data on every pitch, swing, and defensive play across all MLB ballparks. Using high-speed cameras and radar sensors, Statcast measures metrics such as pitch velocity, spin rate, release angle, exit velocity, launch angle, player sprint speed, and much more.
This dataset contains pitch-by-pitch data extracted from the Statcast system through public APIs. Each record represents a single pitch event and includes contextual information such as game ID, pitcher, batter, pitch type, pitch velocity, spin rate, pitch location, and outcome.
Researchers can use this dataset to explore topics such as pitch classification, pitcher performance analytics, batter decision-making, expected outcomes and player comparison models for MLB's 2025 Postseason.
Reference: https://baseballsavant.mlb.com/csv-docs
'game_year', = Year game took place
'game_date', = Date of the Game.
'pitch_type', = The type of pitch derived from Statcast.
'events', = Event of the resulting Plate Appearance.
'description', = Description of the resulting pitch.
'launch_speed' = Exit velocity of the batted ball as tracked by Statcast. For the limited subset of batted balls not tracked directly, estimates are included based on the process described here.
'pitch_name', = The name of the pitch derived from the Statcast Data.
'home_score', = Pre-pitch home score
'away_score', = Pre-pitch away score
'bat_score', = Pre-pitch bat team score
'fld_score', = Pre-pitch field team score
'if_fielding_alignment', = Infield fielding alignment at the time of the pitch.
'of_fielding_alignment', = Outfield fielding alignment at the time of the pitch.
'spin_axis', = The spin axis in the 2D X-Z plane in degrees from 0 to 360, such that 180 represents a pure backspin fastball and 0 degrees represents a pure topspin (12-6) curveball.
'effective_speed', = Derived speed based on the extension of the pitcher's release.
'release_spin_rate', = TBD (not defined on site)
'release_extension', = Release extension of pitch in feet as tracked by Statcast.
'release_pos_y', = Release position of pitch measured in feet from the catcher's perspective.
'at_bat_number', = Plate appearance number of the game.
'player_name', = Player's name tied to the event of the search.
'batter', = MLB Player Id tied to the play event.
'pitcher', = MLB Player Id tied to the play event.
'pfx_x', = Horizontal movement in feet from the catcher's perspective.
'release_speed', = Pitch velocities from 2008-16 are via Pitch F/X, and adjusted to roughly out-of-hand release point. All velocities from 2017 and beyond are Statcast, which are reported out-of-hand.
'release_pos_x', = Horizontal Release Position of the ball measured in feet from the catcher's perspective.
'release_pos_z', = Vertical Release Position of the ball measured in feet from the catcher's perspective.
'spin_dir', = Deprecated field from the old tracking system.
'zone', = Zone location of the ball when it crosses the plate from the catcher's perspective.
'p_throws', = Hand pitcher throws with.
'stand', = Side of the plate batter is standing.
'balls', = Pre-pitch number of balls in count.
'strikes', = Pre-pitch number of strikes in count.
'pfx_z', = Vertical movement in feet from the catcher's perspective.
'plate_x', = Horizontal position of the ball when it crosses home plate from the catcher's perspective.
'plate_z', = Vertical position of the ball when it crosses home plate from the catcher's perspective.
'vx0', = The velocity of the pitch, in feet per second, in x-dimension, determined at y=50 feet.
'vy0', = The velocity of the pitch, in feet per second, in y-dimension, determined at y=50 feet.
'vz0', = The velocity of the pitch, in feet per second, in z-dimension, determined at y=50 feet.
'ax', = The acceleration of the pitch, in feet per second per second, in x-dimension, determined at y=50 feet.
'ay', = The acceleration of the pitch, in feet per second per second, in y-dimension, determined at y=50 feet.
'az' = The acceleration of the pitch, in feet per second per second, in z-dimension, determined at y=50 feet.
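The velocity and acceleration fields above (vx0/vy0/vz0 and ax/ay/az, all measured at y = 50 feet) describe a constant-acceleration trajectory, so the flight time to the plate follows from elementary kinematics. A sketch, with hypothetical numbers (note the positions at y = 50 feet that this needs are not columns in this dataset; only release_pos_x/z at release are provided):

```python
import math

def plate_crossing_time(vy0, ay, y0=50.0, y_plate=17.0 / 12.0):
    """Time for the pitch to travel from y0 (50 ft) to the front of home
    plate (~17 inches), using y(t) = y0 + vy0*t + 0.5*ay*t**2.
    vy0 is negative (the ball moves toward the catcher), so the final
    y-velocity takes the negative root."""
    vy_plate = -math.sqrt(vy0 ** 2 - 2.0 * ay * (y0 - y_plate))
    return (vy_plate - vy0) / ay

def position_at_plate(p0, v0, a, t):
    """Constant-acceleration position update for the x or z dimension."""
    return p0 + v0 * t + 0.5 * a * t ** 2

# Plausible fastball numbers (hypothetical, not taken from the dataset):
# vy0 ≈ -136 ft/s (~93 mph), ay ≈ 28 ft/s² (drag slows the pitch).
t = plate_crossing_time(vy0=-136.0, ay=28.0)
# x0/z0 at y = 50 ft are assumed inputs here.
px = position_at_plate(p0=-1.0, v0=2.0, a=-8.0, t=t)    # analogous to plate_x
pz = position_at_plate(p0=5.8, v0=-5.0, a=-16.0, t=t)   # analogous to plate_z
```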
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
How to cite
Please cite the original dictionary and the dataset and code in this repository as follows:
Kähler, Hans. (1987). Enggano-Deutsches Wörterbuch (Veröffentlichungen Des Seminars Für Indonesische Und Südseesprachen Der Universität Hamburg 14). Berlin; Hamburg: Dietrich Reimer Verlag. https://search.worldcat.org/title/18191699.
Rajeg, Gede Primahadi Wijaya; Pramartha, Cokorda Rai Adi; Sarasvananda, Ida Bagus Gede; Widiatmika, Putu Wahyu; Segara, Ida Bagus Made Ari; Pita, Yul Fulgensia Rusman; et al. (2024). Retro-digitised Enggano-German dictionary derived from Kähler’s (1987) “Enggano-Deutsches Wörterbuch”. University of Oxford. Dataset. https://doi.org/10.25446/oxford.28057742
Overview
This is a hand-digitised Enggano-German dictionary derived from Hans Kähler's (1987) “Enggano-Deutsches Wörterbuch”. We crowdsourced the digitisation process by transcribing the dictionary's content into an online database system; the system was set up by Cokorda Pramartha and I. B. G. Sarasvananda in collaboration with the first author. The database is exported into a .csv file to be further processed computationally and manually, such as fixing typos and incorrect mapping of entry elements, providing the English and Indonesian translations, and standardising the orthography.
A pre-release can be accessed here. The minor update in the current version adds a description of the column names for the tabular data of the digitised dictionary. The dictionary is stored as a table in three file types: .rds (the R data format), .csv, and .tsv.
Aspects to be worked out for the future development of the dataset can be accessed here.
Integrated Global Radiosonde Archive (IGRA) Version 2 consists of quality-controlled radiosonde observations of temperature, humidity, and wind at stations across all continents. Data are drawn from more than 30 different sources. The earliest year of data is 1905, and the data are updated on a daily basis. Record length, vertical extent and resolution, and availability of variables vary among stations and over time. In addition to the merged and quality-controlled set of soundings, several supplementary products are included: sounding-derived moisture and stability parameters for each suitable sounding; monthly means at mandatory pressure levels; the Radiosonde Atmospheric Temperature Products for Assessing Climate (RATPAC), in which post-1997 data are based on IGRA 2; and station history information derived from documented changes in instruments and observing practice as well as from instrument codes received along with the sounding data. The change to Version 2.2 includes two additional data streams which permit further updating of the IGRA data records that use the new BUFR format. Version 2.2 began in 2023.
TwitterThe dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.
The dataset contains 10,000 replicates of AWRA model pre-processing outputs (streamflow Qtot and baseflow Qb), used for calculating additional coal resources development impacts on hydrological response variables in 30 simulation nodes (Zhang et al., 2016).
References
Zhang Y Q, Viney N R, Peeters L J M, Wang B, Yang A, Li L T, McVicar T R, Marvanek S P, Rachakonda P K, Shi X G, Pagendam D E and Singh R M (2016) Surface water numerical modelling for the Gloucester subregion. Product 2.6.1 for the Gloucester subregion from the Northern Sydney Basin Bioregional Assessment. Department of the Environment, Bureau of Meteorology, CSIRO and Geoscience Australia, Australia., Department of the Environment, Bureau of Meteorology, CSIRO and Geoscience Australia, Australia., http://data.bioregionalassessments.gov.au/product/NSB/GLO/2.6.1.
These pre-processing data are used for estimating AWRA post-processing streamflow outputs under CRDP and baseline conditions, respectively.
The dataset has all files and scripts necessary to execute the 10,000 runs on the linux platform of the CSIRO High Performance Cluster computers.
The AWRA-L model version 4.5 has been used for all BA surface water simulations. The application is developed in C#. All executable and class (dll) files can be found at \\OSM-07-CDC.it.csiro.au\OSM_CBR_LW_BA_working\Disciplines\SurfaceWater\Modelling\AWRA-LG\Bin. The executable file "BACalibrationAndSimulationApp.exe" generates global definition files which define the input and output data and input time series locations. The executable file "SimulateModel.exe" runs simulations based on the global definition files and outputs the required variables (Qtot, Qb, Dd) in NetCDF format. All simulation runs were implemented on local Windows 7 workstations.
The AWRA preprocessing data are the inputs for estimating AWRA post-processing model outputs (GUID: http://data.bioregionalassessments.gov.au/dataset/15ca8f9d-84b4-4395-87db-ab4ff15b9f07).
The dataset was uploaded to
\\lw-osm-01-cdc.it.csiro.au\OSM_CBR_LW_BAModelRuns_app\GLO\AWRA_ScalingChange_rerun on 03 September 2016
This dataset was further used to compute daily streamflow post-processing outputs under CRDP and baseline conditions, respectively.
Bioregional Assessment Programme (XXXX) GLO AWRA Model Pre-Processing Data v01. Bioregional Assessment Derived Dataset. Viewed 18 July 2018, http://data.bioregionalassessments.gov.au/dataset/51079bcc-96a8-409d-a951-3671fbbad6a2.
Derived From Standard Instrument Local Environmental Plan (LEP) - Heritage (HER) (NSW)
Derived From NSW Office of Water GW licence extract linked to spatial locations - GLO v5 UID elements 27032014
Derived From GLO SW Receptors 20150828 withRivers&CatchmentAreas
Derived From Groundwater Economic Assets GLO 20150326
Derived From Gloucester digitised coal mine boundaries
Derived From Groundwater Dependent Ecosystems supplied by the NSW Office of Water on 13/05/2014
Derived From NSW Office of Water GW licence extract linked to spatial locations GLOv4 UID 14032014
Derived From Communities of National Environmental Significance Database - RESTRICTED - Metadata only
Derived From GLO SW receptor total catchment areas V01
Derived From National Groundwater Dependent Ecosystems (GDE) Atlas
Derived From Asset database for the Gloucester subregion on 12 September 2014
Derived From GEODATA 9 second DEM and D8: Digital Elevation Model Version 3 and Flow Direction Grid 2008
Derived From National Groundwater Information System (NGIS) v1.1
Derived From GLO Receptors 20150518
Derived From Groundwater Entitlement Data GLO NSW Office of Water 20150320 PersRemoved
Derived From Natural Resource Management (NRM) Regions 2010
Derived From Groundwater Entitlement Data Gloucester - NSW Office of Water 20150320
Derived From New South Wales NSW Regional CMA Water Asset Information WAIT tool databases, RESTRICTED Includes ALL Reports
Derived From National Groundwater Dependent Ecosystems (GDE) Atlas (including WA)
Derived From EIS Gloucester Coal 2010
Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)
Derived From GEODATA TOPO 250K Series 3
Derived From Asset database for the Gloucester subregion on 28 May 2015
Derived From NSW Catchment Management Authority Boundaries 20130917
Derived From Geological Provinces - Full Extent
Derived From Geofabric Surface Cartography - V2.1
Derived From NSW Office of Water GW licence extract linked to spatial locations GLOv3 12032014
Derived From EIS for Rocky Hill Coal Project 2013
Derived From Bioregional Assessment areas v03
Derived From BILO Gridded Climate Data: Daily Climate Data for each year from 1900 to 2012
Derived From National Heritage List Spatial Database (NHL) (v2.1)
Derived From Asset database for the Gloucester subregion on 8 April 2015
Derived From Gloucester - Additional assets from local councils
Derived From NSW Office of Water combined geodatabase of regulated rivers and water sharing plan regions
Derived From Asset database for the Gloucester subregion on 29 August 2014
Derived From Collaborative Australian Protected Areas Database (CAPAD) 2010 - External Restricted
Derived From Groundwater Modelling Report for Stratford Coal Mine
Derived From Directory of Important Wetlands in Australia (DIWA) Spatial Database (Public)
Derived From NSW Office of Water Groundwater Licence Extract Gloucester - Oct 2013
Derived From New South Wales NSW - Regional - CMA - Water Asset Information Tool - WAIT - databases
Derived From Freshwater Fish Biodiversity Hotspots
Derived From NSW Office of Water Groundwater licence extract linked to spatial locations GLOv2 19022014
Derived From GLO climate data stats summary
Derived From Australia - Species of National Environmental Significance Database
**Derived
Licence: https://object-store.os-api.cci2.ecmwf.int:443/cci2-prod-catalogue/licences/insitu-gridded-observations-global-and-regional/insitu-gridded-observations-global-and-regional_15437b363f02bf5e6f41fc2995e3d19a590eb4daff5a7ce67d1ef6c269d81d68.pdf
This dataset provides high-resolution gridded temperature and precipitation observations from a selection of sources. Additionally the dataset contains daily global average near-surface temperature anomalies. All fields are defined on either daily or monthly frequency. The datasets are regularly updated to incorporate recent observations. The included data sources are commonly known as GISTEMP, Berkeley Earth, CPC and CPC-CONUS, CHIRPS, IMERG, CMORPH, GPCC and CRU, where the abbreviations are explained below. These data have been constructed from high-quality analyses of meteorological station series and rain gauges around the world, and as such provide a reliable source for the analysis of weather extremes and climate trends. The regular update cycle makes these data suitable for a rapid study of recently occurred phenomena or events. The NASA Goddard Institute for Space Studies temperature analysis dataset (GISTEMP-v4) combines station data of the Global Historical Climatology Network (GHCN) with the Extended Reconstructed Sea Surface Temperature (ERSST) to construct a global temperature change estimate. The Berkeley Earth Foundation dataset (BERKEARTH) merges temperature records from 16 archives into a single coherent dataset. The NOAA Climate Prediction Center datasets (CPC and CPC-CONUS) define a suite of unified precipitation products with consistent quantity and improved quality by combining all information sources available at CPC and by taking advantage of the optimal interpolation (OI) objective analysis technique. The Climate Hazards Group InfraRed Precipitation with Station dataset (CHIRPS-v2) incorporates 0.05° resolution satellite imagery and in-situ station data to create gridded rainfall time series over the African continent, suitable for trend analysis and seasonal drought monitoring. 
The Integrated Multi-satellitE Retrievals dataset (IMERG) by NASA uses an algorithm to intercalibrate, merge, and interpolate "all" satellite microwave precipitation estimates, together with microwave-calibrated infrared (IR) satellite estimates, precipitation gauge analyses, and potentially other precipitation estimators over the entire globe at fine time and space scales for the Tropical Rainfall Measuring Mission (TRMM) and its successor, Global Precipitation Measurement (GPM) satellite-based precipitation products. The Climate Prediction Center morphing technique dataset (CMORPH) by NOAA has been created using precipitation estimates derived exclusively from low-orbiter satellite microwave observations; geostationary IR data are then used to transport the microwave-derived precipitation features during periods when microwave data are not available at a location. The Global Precipitation Climatology Centre dataset (GPCC) is a centennial product of monthly global land-surface precipitation based on the ~80,000 stations worldwide that feature record durations of 10 years or longer. The data coverage per month varies from ~6,000 stations (before 1900) to more than 50,000. The Climatic Research Unit dataset (CRU v4) features an improved interpolation process, which delivers full traceability back to station measurements. The station measurements of temperature and precipitation are public, as are the gridded dataset and national averages for each country. Cross-validation was performed at a station level, and the results have been published as a guide to the accuracy of the interpolation. This catalogue entry complements the E-OBS record in many aspects, as it intends to provide high-resolution gridded meteorological observations at a global rather than continental scale. These data may be suitable as a baseline for model comparisons or extreme event analysis in the CMIP5 and CMIP6 datasets.
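As a toy illustration of the kind of trend analysis these gridded temperature products support, here is a least-squares trend fit on a synthetic anomaly series (the numbers are generated, not drawn from any of the datasets above):

```python
import numpy as np

years = np.arange(1990, 2021)
# Synthetic annual-mean temperature anomalies (°C) with a built-in
# 0.02 °C/yr trend plus noise — purely illustrative data.
rng = np.random.default_rng(0)
anoms = 0.02 * (years - years[0]) + rng.normal(0.0, 0.05, years.size)

# Ordinary least-squares linear trend.
slope, intercept = np.polyfit(years, anoms, 1)
print(f"trend: {slope * 10:.2f} °C per decade")
```

The same pattern applies per grid cell: fit each cell's time series independently to map regional warming rates.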
These data depict the western United States Map Unit areas as defined by the USDA NRCS. Each Map Unit area contains information on a variety of soil properties and interpretations. The raster is to be joined to the .csv file by the field "mukey." We keep the raster and csv separate to preserve the full attribute names in the csv, which would be truncated if attached to the raster. Once joined, the raster can be classified or analyzed by the columns which depict the properties and interpretations. It is important to note that each property has a corresponding component percent column to indicate how much of the map unit has the dominant property provided. For example, if the property "AASHTO Group Classification (Surface) 0 to 1cm" is recorded as "A-1" for a map unit, a user should also refer to the component percent field for this property (in this case 75). This means that an estimated 75% of the map unit has an "A-1" AASHTO group classification and that "A-1" is the dominant group. The property in the column is the dominant component, so the other 25% of this map unit is comprised of other AASHTO Group Classifications. This raster attribute table was generated with the "Map Soil Properties and Interpretations" tool within the gSSURGO Mapping Toolset in the Soil Data Management Toolbox for ArcGIS™ User Guide Version 4.0 (https://www.nrcs.usda.gov/wps/PA_NRCSConsumption/download?cid=nrcseprd362255&ext=pdf) from gSSURGO, using their Map Unit Raster as the input feature (https://gdg.sc.egov.usda.gov/). The FY2018 Gridded SSURGO Map Unit Raster was created for use in national, regional, and state-wide resource planning and analysis of soils data. These data were created with guidance from the USDA NRCS. The fields named "*COMPPCT_R" can exceed 100% for some map units. The NRCS personnel are aware of and working on fixing this issue. Take caution when interpreting these areas, as they are the result of some data duplication in the master gSSURGO database.
The data are considered valuable and required for timely science needs, and thus are released with this known error. The USDA NRCS is developing a data release which will replace this item when it is available. For the most up-to-date SSURGO releases, which do not include the custom fields that this release does, see https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/home/?cid=nrcs142p2_053628#tools. For additional definitions, see https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/geo/?cid=nrcs142p2_053627.
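The mukey join and the COMPPCT_R sanity check described above can be sketched with pandas on the exported attribute table. The mukey values and property column names below are hypothetical stand-ins; only "mukey" and the ">100% component percent" issue come from the description:

```python
import pandas as pd

# Toy stand-in for the raster attribute table (one row per raster value)
# and the properties CSV; names besides "mukey" are hypothetical.
rat = pd.DataFrame({"Value": [1, 2, 3],
                    "mukey": ["465123", "465124", "465125"]})
props = pd.DataFrame({
    "mukey": ["465123", "465124", "465125"],
    "AASHTO_Group_Surface": ["A-1", "A-2", "A-1"],
    "AASHTO_Group_Surface_COMPPCT_R": [75, 60, 110],  # note: 110 exceeds 100%
})

# Join the properties onto the raster attribute table by mukey.
joined = rat.merge(props, on="mukey", how="left")

# Flag map units affected by the known duplication issue (> 100 percent).
suspect = joined[joined["AASHTO_Group_Surface_COMPPCT_R"] > 100]
print(suspect["mukey"].tolist())  # ['465125']
```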