Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For more information, see the Aquatic Biodiversity Index Factsheet at https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=150856.
The California Department of Fish and Wildlife’s (CDFW) Areas of Conservation Emphasis (ACE) is a compilation and analysis of the best-available statewide spatial information in California on biodiversity, rarity and endemism, harvested species, significant habitats, connectivity and wildlife movement, climate vulnerability, climate refugia, and other relevant data (e.g., other conservation priorities such as those identified in the State Wildlife Action Plan (SWAP), stressors, land ownership). ACE addresses both terrestrial and aquatic data. The ACE model combines and analyzes terrestrial information in a 2.5 square mile hexagon grid and aquatic information at the HUC12 watershed level across the state to produce a series of maps for use in non-regulatory evaluation of conservation priorities in California. The model addresses as many of CDFW’s statewide conservation and recreational mandates as feasible using high-quality data sources. High-value areas statewide and in each USDA Ecoregion were identified. The ACE maps and data can be viewed in the ACE online map viewer, or downloaded for use in ArcGIS. For more detailed information see https://www.wildlife.ca.gov/Data/Analysis/ACE and https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=24326.
Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
Catholic Carbon Footprint Story Map

Map data: Burhans, Molly A., Cheney, David M., Gerlt, R. “PerCapita_CO2_Footprint_InDioceses_FULL”. Scale not given. Version 1.0. MO and CT, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2019. Map development: Molly Burhans.

Methodology
This is the first global carbon footprint of the Catholic population. We will continue to improve and develop these data with our research partners over the coming years. While it is helpful, it should also be viewed and used as a "beta" prototype that we and our research partners will build from and improve. The years of carbon data are 2010 and 2015 (shown). The year of Catholic data is 2018. The year of population data is 2016. Care should be taken during future developments to harmonize the years used for Catholic, population, and CO2 data.

1. Zonal Statistics: Esri Population Data and Dioceses --> Population per diocese (non-Vatican-based numbers)
2. Zonal Statistics: FFDAS, Dioceses, and Population dataset --> Mean CO2 per diocese
3. Field Calculation: Population per diocese and mean CO2 per diocese --> CO2 per capita
4. Field Calculation: CO2 per capita * Catholic population --> Catholic carbon footprint

Assumptions
Per-capita CO2: Deriving per-capita CO2 from mean CO2 in a geography assumes that people's footprint accounts for their personal lifestyle and involvement in local businesses and industries that contribute CO2.
Catholic CO2: Assumes that Catholics and non-Catholics have similar CO2 footprints from their lifestyles.

Derived from: "A multiyear, global gridded fossil fuel CO2 emission data product: Evaluation and analysis of results" (http://ffdas.rc.nau.edu/About.html).
Rayner et al., JGR, 2010 - The first FFDAS paper, describing the version 1.0 methods and results, published in the Journal of Geophysical Research.
Asefi et al., 2014 - The paper describing the methods and results of FFDAS version 2.0, published in the Journal of Geophysical Research.
Readme version 2.2 - A simple readme file to assist in using the 10 km x 10 km, hourly gridded Vulcan version 2.2 results.
Liu et al., 2017 - A paper exploring the carbon cycle response to the 2015-2016 El Nino through the use of carbon cycle data assimilation with FFDAS as the boundary condition for FFCO2.

S. Asefi-Najafabady, P. J. Rayner, K. R. Gurney, A. McRobert, Y. Song, K. Coltin, J. Huang, C. Elvidge, K. Baugh. First published: 10 September 2014. https://doi.org/10.1002/2013JD021296. Link to FFDAS data retrieval and visualization: http://hpcg.purdue.edu/FFDAS/index.php

Abstract: High-resolution, global quantification of fossil fuel CO2 emissions is emerging as a critical need in carbon cycle science and climate policy. We build upon a previously developed fossil fuel data assimilation system (FFDAS) for estimating global high-resolution fossil fuel CO2 emissions. We have improved the underlying observationally based data sources, expanded the approach through treatment of separate emitting sectors, including a new pointwise database of global power plants, and extended the results to cover a 1997 to 2010 time series at a spatial resolution of 0.1°. Long-term trend analysis of the resulting global emissions shows subnational spatial structure in large active economies such as the United States, China, and India. These three countries, in particular, show different long-term trends, and exploration of the trends in nighttime lights and population reveals a decoupling of population and emissions at the subnational level.
Analysis of shorter-term variations reveals the impact of the 2008-2009 global financial crisis with widespread negative emission anomalies across the U.S. and Europe. We have used a center of mass (CM) calculation as a compact metric to express the time evolution of spatial patterns in fossil fuel CO2 emissions. The global emission CM has moved toward the east and somewhat south between 1997 and 2010, driven by the increase in emissions in China and South Asia over this time period. Analysis at the level of individual countries reveals per capita CO2 emission migration in both Russia and India. The per capita emission CM holds potential as a way to succinctly analyze subnational shifts in carbon intensity over time. Uncertainties are generally lower than the previous version of FFDAS due mainly to an improved nightlight data set.

Global Diocesan Boundaries:
Burhans, M., Bell, J., Burhans, D., Carmichael, R., Cheney, D., Deaton, M., Emge, T., Gerlt, B., Grayson, J., Herries, J., Keegan, H., Skinner, A., Smith, M., Sousa, C., Trubetskoy, S. “Diocesean Boundaries of the Catholic Church” [Feature Layer]. Scale not given. Version 1.2. Redlands, CA, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2016.
Using: ArcGIS 10.4. Version 10.0. Redlands, CA: Environmental Systems Research Institute, Inc., 2016.

Boundary Provenance, Statistics and Leadership Data
Cheney, D.M. “Catholic Hierarchy of the World” [Database]. Date Updated: August 2019. Catholic Hierarchy. Using: Paradox. Retrieved from Original Source.
Annuario Pontificio per l’Anno .. Città del Vaticano: Tipografia Poliglotta Vaticana, Multiple Years.
The data for these maps was extracted from the gold standard of Church data, the Annuario Pontificio, published yearly by the Vatican. The collection and data development of the Vatican Statistics Office are unknown. GoodLands is not responsible for errors within this data. We encourage people to document and report errant information to us at data@good-lands.org or directly to the Vatican. Additional information about regular changes in bishops and sees comes from a variety of public diocesan and news announcements.

GoodLands’ polygon data layers, version 2.0, for global ecclesiastical boundaries of the Roman Catholic Church:
Although care has been taken to ensure the accuracy, completeness and reliability of the information provided, due to this being the first developed dataset of global ecclesiastical boundaries curated from many sources it may have a higher margin of error than established geopolitical administrative boundary maps. Boundaries need to be verified with appropriate Ecclesiastical Leadership. The current information is subject to change without notice. No parties involved with the creation of this data are liable for indirect, special or incidental damage resulting from, arising out of or in connection with the use of the information. We referenced 1960 sources to build our global datasets of ecclesiastical jurisdictions. Often, they were isolated images of dioceses, historical documents and information about parishes that were cross-checked. These sources can be viewed here: https://docs.google.com/spreadsheets/d/11ANlH1S_aYJOyz4TtG0HHgz0OLxnOvXLHMt4FVOS85Q/edit#gid=0
To learn more or contact us please visit: https://good-lands.org/

Esri Gridded Population Data 2016
Description
This layer is a global estimate of human population for 2016.
Esri created this estimate by modeling a footprint of where people live as a dasymetric settlement likelihood surface, and then assigned 2016 population estimates stored on polygons of the finest level of geography available onto the settlement surface. Where people live means where their homes are, as in where people sleep most of the time, and this is opposed to where they work. Another way to think of this estimate is a night-time estimate, as opposed to a day-time estimate.

Knowledge of population distribution helps us understand how humans affect the natural world and how natural events such as storms and earthquakes, and other phenomena affect humans. This layer represents the footprint of where people live, and how many people live there.

Dataset Summary
Each cell in this layer has an integer value with the estimated number of people likely to live in the geographic region represented by that cell. Esri additionally produced several additional layers:
World Population Estimate Confidence 2016: the confidence level (1-5) per cell for the probability of people being located and estimated correctly.
World Population Density Estimate 2016: this layer is represented as population density in units of persons per square kilometer.
World Settlement Score 2016: the dasymetric likelihood surface used to create this layer by apportioning population from census polygons to the settlement score raster.

To use this layer in analysis, there are several properties or geoprocessing environment settings that should be used:
Coordinate system: WGS_1984. This service and its underlying data are WGS_1984. We do this because projecting population count data actually will change the populations due to resampling and either collapsing or splitting cells to fit into another coordinate system.
Cell Size: 0.0013474728 degrees (approximately 150 meters) at the equator.
No Data: -1
Bit Depth: 32-bit signed

This layer has query, identify, pixel, and export image functions enabled, and is restricted to a maximum analysis size of 30,000 x 30,000 pixels - an area about the size of Africa.

Frye, C. et al. (2018). Using Classified and Unclassified Land Cover Data to Estimate the Footprint of Human Settlement. Data Science Journal, 17, p.20. DOI: http://doi.org/10.5334/dsj-2018-020.

What can you do with this layer?
This layer is unsuitable for mapping or cartographic use, and thus it does not include a convenient legend. Instead, this layer is useful for analysis, particularly for estimating counts of people living within watersheds, coastal areas, and other areas that do not have standard boundaries. Esri recommends using the Zonal Statistics tool or the Zonal Statistics to Table tool, where you provide input zones as either polygons or raster data, and the tool will summarize the count of population within those zones. https://www.esri.com/arcgis-blog/products/arcgis-living-atlas/data-management/2016-world-population-estimate-services-are-now-available/
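The Esri layer description above recommends zonal statistics for summarizing population within arbitrary boundaries, which is how steps 1 and 2 of the four-step footprint methodology were carried out. As a rough, open-source illustration of those steps (not the ArcGIS workflow actually used for the published map), the sketch below uses geopandas and rasterstats; the file names and the catholic_population column are hypothetical placeholders, and the arithmetic simply mirrors the listed field calculations.

```python
import geopandas as gpd
from rasterstats import zonal_stats

# Hypothetical inputs: diocese polygons with a 'catholic_population' column,
# a gridded population raster, and an FFDAS fossil-fuel CO2 raster.
dioceses = gpd.read_file("dioceses.geojson")

# Step 1: total population per diocese (zonal sum over the population raster).
pop = zonal_stats(dioceses, "population.tif", stats=["sum"])
dioceses["population"] = [z["sum"] for z in pop]

# Step 2: mean CO2 per diocese (zonal mean over the FFDAS raster).
co2 = zonal_stats(dioceses, "ffdas_co2.tif", stats=["mean"])
dioceses["mean_co2"] = [z["mean"] for z in co2]

# Steps 3-4: per-capita CO2, then the Catholic footprint, following the
# listed steps and the stated assumption that Catholics and non-Catholics
# have similar per-capita footprints.
dioceses["co2_per_capita"] = dioceses["mean_co2"] / dioceses["population"]
dioceses["catholic_footprint"] = (
    dioceses["co2_per_capita"] * dioceses["catholic_population"]
)
```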
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For more information, see the Terrestrial Biodiversity Summary Factsheet at https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=150831.
The user can view a list of species potentially present in each hexagon in the ACE online map viewer at https://map.dfg.ca.gov/ace/. Note that the names of some rare or endemic species, such as those at risk of over-collection, have been suppressed from the list of species names per hexagon, but are still included in the species counts.
The California Department of Fish and Wildlife’s (CDFW) Areas of Conservation Emphasis (ACE) is a compilation and analysis of the best-available statewide spatial information in California on biodiversity, rarity and endemism, harvested species, significant habitats, connectivity and wildlife movement, climate vulnerability, climate refugia, and other relevant data (e.g., other conservation priorities such as those identified in the State Wildlife Action Plan (SWAP), stressors, land ownership). ACE addresses both terrestrial and aquatic data. The ACE model combines and analyzes terrestrial information in a 2.5 square mile hexagon grid and aquatic information at the HUC12 watershed level across the state to produce a series of maps for use in non-regulatory evaluation of conservation priorities in California. The model addresses as many of CDFW’s statewide conservation and recreational mandates as feasible using high-quality data sources. High-value areas statewide and in each USDA Ecoregion were identified. The ACE maps and data can be viewed in the ACE online map viewer, or downloaded for use in ArcGIS. For more detailed information see https://www.wildlife.ca.gov/Data/Analysis/ACE and https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=24326.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains 82 annotated map samples from diverse historical city maps of Jerusalem and Paris, suitable for map text detection, recognition, and sequencing.
The data in maptext_format.json is organized in the same way as in the General Data from the David Rumsey Collection from ICDAR 2024 Competition on Historical Map Text Detection, Recognition, and Linking [1].
The data is structured by image, with a list of sequences (groups) for each image. The boolean attributes illegible and truncated provide additional insight into data quality.
Our interpretation of the truncated and illegible tags is the following:
truncated refers to the case where part of a word is located outside the image crop and is thus missing. In that case, the transcription stops at the image border, covering only the visible part of the word.
illegible is a subjective indication of (un)certainty in the transcription provided. Whenever possible, a best-guess transcription is provided. Otherwise, the illegible letters are filled with blank spaces.
The text corresponds to the diplomatic transcription, i.e. as it appears on the document. Text is transcribed with all Latin characters, with case, diacritics (e.g. ö, ḡ), and digraphs (e.g. Œ).
Each word polygon consists of an even number of vertices arranged in clockwise order starting from the initial point to the top left. The first n/2 vertices represent the upper boundary line following the reading direction, while the second half represents the lower boundary line in the reverse direction. Here is an illustration:
[
  {
    "image": "map_image_1.jpg",
    "groups": [            # Here, groups are what we call sequences.
      {
        "vertices": [[x1, y1], [x2, y2], ...],
        "text": "Champs",
        "illegible": "false",
        "truncated": "false"
      },
      {
        "vertices": [[x1, y1], [x2, y2], ...],
        "text": "Elysées",
        "illegible": "false",
        "truncated": "false"
      }
    ]
  }
]
The file pandas_format.pkl contains the same data. It is only provided for convenience.
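A minimal loading sketch, assuming both files sit in the working directory, is shown below. It follows the structure of the example above literally (one "groups" list of word annotations per image); the pickled pandas object is read with pd.read_pickle.

```python
import json
import pandas as pd

# Load the JSON annotations (path assumed to be relative to the working directory).
with open("maptext_format.json", "r", encoding="utf-8") as f:
    annotations = json.load(f)

for entry in annotations:
    print(entry["image"])
    for word in entry["groups"]:
        # Each word annotation carries the polygon, the diplomatic transcription,
        # and the illegible/truncated quality flags.
        print(word["text"], word["illegible"], word["truncated"])

# The same data is also provided as a pickled pandas object for convenience.
df = pd.read_pickle("pandas_format.pkl")
```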
The maps of Paris were taken from the Historical City Maps Semantic Segmentation Dataset [2]. The original documents were digitized by the Bibliothèque nationale de France (BnF), and the Bibliothèque Historique de la Ville de Paris (BHVP).
The maps of Jerusalem were curated from the collections of the National Library of Israel (NLI), and Wikimedia Commons.
Number of words: 7528
Number of single-word sequences: 1757
Number of multi-word sequences: 1969
Statistics of multi-word sequences length:
mean: 2.93 words
std: 1.25 words
min: 2.00 words
med: 3.00 words
max: 15.00 words
The transcribed text corresponds to the diplomatic transcription, suitable for text recognition tasks. In future updates, we hope to complement it with an additional normalization attribute, which could expand abbreviations (e.g. "bvd." => "boulevard") and normalize transcriptions (e.g. "QVARTER" => "QUARTER").
For any mention of this dataset, please cite:
@misc{paris_jerusalem_dataset_2025,
  author    = {Dai, Tianhao and Johnson, Kaede and Petitpierre, R{\'{e}}mi and Vaienti, Beatrice and di Lenardo, Isabella},
  title     = {{Paris and Jerusalem City Maps Text Dataset}},
  year      = {2025},
  publisher = {Zenodo},
  url       = {https://doi.org/10.5281/zenodo.14982662}
}

@article{recognizing_sequencing_2025,
  author = {Zou, Mengjie and Dai, Tianhao and Petitpierre, R{\'{e}}mi and Vaienti, Beatrice and di Lenardo, Isabella},
  title  = {{Recognizing and Sequencing Multi-word Texts in Maps Using an Attentive Pointer}},
  year   = {2025}
}
Rémi PETITPIERRE - remi.petitpierre@epfl.ch
The data were annotated by two master's students from EPFL, Switzerland. The students were paid for their work using public funding, and were offered the possibility to be associated with the publication of the data.
This project is licensed under the CC BY 4.0 License.
We do not assume any liability for the use of this dataset.
The National Insect and Disease Risk Map identifies areas with risk of significant tree mortality due to insects and plant diseases. The layer identifies lands in three classes: areas with risk of tree mortality from insects and disease between 2013 and 2027, areas with lower tree mortality risk, and areas that were formerly at risk but are no longer at risk due to disturbance (human or natural) between 2012 and 2018. Areas with risk of tree mortality are defined as places where at least 25% of standing live basal area greater than one inch in diameter will die over a 15-year time frame (2013 to 2027) due to insects and diseases.

The National Insect and Disease Risk Map, produced by the US Forest Service FHAAST, is part of a nationwide strategic assessment of potential hazard for tree mortality due to major forest insects and diseases.

Dataset Summary
Phenomenon Mapped: Risk of tree mortality due to insects and disease
Units: Meters
Cell Size: 30 meters in Hawaii and 240 meters in Alaska and the Contiguous US
Source Type: Discrete
Pixel Type: 2-bit unsigned integer
Data Coordinate System: NAD 1983 Albers (Contiguous US), WGS 1984 Albers (Alaska), Hawaii Albers (Hawaii)
Mosaic Projection: North America Albers Equal Area Conic
Extent: Alaska, Hawaii, and the Contiguous United States
Source: National Insect Disease Risk Map
Publication Date: 2018
ArcGIS Server URL: https://landscape11.arcgis.com/arcgis/
This layer was created from the 2018 version of the National Insect Disease Risk Map.

What can you do with this layer?
This layer is suitable for both visualization and analysis across the ArcGIS system. This layer can be combined with your data and other layers from the ArcGIS Living Atlas of the World in ArcGIS Online and ArcGIS Pro to create powerful web maps that can be used alone or in a story map or other application.

Because this layer is part of the ArcGIS Living Atlas of the World, it is easy to add to your map:
In ArcGIS Online you can add this layer to a map by selecting Add then Browse Living Atlas Layers. A window will open. Type "insects and disease" in the search box and browse to the layer. Select the layer then click Add to Map.
In ArcGIS Pro, open a map and select Add Data from the Map Tab. Select Data at the top of the drop-down menu. The Add Data dialog box will open; on the left side of the box, expand Portal if necessary, then select Living Atlas. Type "insects and disease" in the search box, browse to the layer, then click OK.
In ArcGIS Pro you can use raster functions to create your own custom extracts of the data. Imagery layers provide fast, powerful inputs to geoprocessing tools, models, or Python scripts in Pro. For example, the Zonal Statistics as Table tool can be used to summarize risk of tree mortality across several watersheds, counties, or other areas that you may be interested in, such as areas near homes.
In ArcGIS Online you can change the layer's symbology in the image display control, set the layer's transparency, and control the visible scale range.

The ArcGIS Living Atlas of the World provides an easy way to explore many other beautiful and authoritative maps on hundreds of topics like this one.
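To make the Zonal Statistics as Table suggestion above concrete, here is a minimal arcpy sketch. It assumes a local extract of the risk raster and a watershed polygon layer with a HUC12 field; those file and field names are placeholders, not part of the service itself.

```python
import arcpy
from arcpy.sa import ZonalStatisticsAsTable

arcpy.CheckOutExtension("Spatial")  # Spatial Analyst is required for zonal tools

# Placeholder inputs: a local extract of the risk raster and zone polygons.
zones = "watersheds.shp"            # could also be counties, parcels, etc.
zone_field = "HUC12"                # field identifying each zone
risk_raster = "nidrm_risk_extract.tif"

# Summarize the categorical risk classes within each watershed; MAJORITY reports
# the most common risk class per zone, which suits a 2-bit class raster.
ZonalStatisticsAsTable(zones, zone_field, risk_raster,
                       "risk_by_watershed.dbf", "DATA", "MAJORITY")
```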
Gap Analysis Project (GAP) habitat maps are predictions of the spatial distribution of suitable environmental and land cover conditions within the United States for individual species. Mapped areas represent places where the environment is suitable for the species to occur (i.e. suitable to support one or more life history requirements for breeding, resting, or foraging), while areas not included in the map are those predicted to be unsuitable for the species. While the actual distributions of many species are likely to be habitat limited, suitable habitat will not always be occupied because of population dynamics and species interactions. Furthermore, these maps correspond to midscale characterizations of landscapes, but individual animals may deem areas to be unsuitable because of presence or absence of fine-scale features and characteristics that are not represented in our models (e.g. snags, vernal pools, shrubby undergrowth). These maps are intended to be used at a 1:100,000 or smaller map scale.

These habitat maps are created by applying a deductive habitat model to remotely sensed data layers within a species’ range. The deductive habitat models are built by compiling information on species’ habitat associations and entering it into a relational database. Information is compiled from the best available characterizations of species’ habitat, which include species accounts in books and databases and primary peer-reviewed literature. The literature references for each species are included in the "Species Habitat Model Report" and "Machine Readable Habitat Database Parameters" files attached to each habitat map item in the repository.

For all species, the compiled habitat information is used by a biologist to determine which of the ecological systems and land use classes represented in the National Gap Analysis Project’s (GAP) Land Cover Map Ver. 1.0 that species is associated with. The name of the biologist who conducted the literature review and assembled the modeling parameters is shown as the "editor" type contact for each habitat map item in the repository. For many species, information on other mapped factors that define the suitable environment is also entered into the database. These factors include elevation (i.e. minimum, maximum), proximity to water features, proximity to wetlands, level of human development, forest ecotone width, and forest edge; each of these factors corresponds to a data layer that is available during map production. The individual datasets used in the modeling process with these parameters are also made available in the ScienceBase Repository (see the end of this Summary section for details). The "Machine Readable Habitat Database Parameters" JSON file attached to each species habitat map item has an "input_layers" object that contains the specific parameter names and references (via Digital Object Identifier) to the input data used with that parameter. The specific parameters for each species were output from the database used in the modeling and mapping process to the "Species Habitat Model Report" and "Machine Readable Habitat Database Parameters" files attached to each habitat map item in the repository.

The maps are generated using a Python script that queries the model parameters in the database; reclassifies the GAP Land Cover Ver 1.0 and ancillary data layers within the species’ range; and combines the reclassified layers to produce the final 30 m resolution habitat map.
Map output is, therefore, not only a reflection of the ecological systems that are selected in the habitat model, but also of any other constraints in the model that are represented by the ancillary data layers.

Modeling regions were used to stratify the conterminous U.S. into six regions (Northwest, Southwest, Great Plains, Upper Midwest, Southeast, and Northeast). These regions allowed for efficient processing of the species distribution models on smaller, ecologically homogenous extents. The 2008 start date for the models represents the shift in focus from state and regional project efforts to a national one. At that point all of the datasets needed to be standardized across the national extent and the species list derived based on the current understanding of the taxonomy. The end date for the individual models represents when the species model was considered complete, and therefore reflects the current knowledge related to that species concept and the habitat requirements for the species.

Versioning, Naming Conventions and Codes: A composite version code is employed to allow the user to track the spatial extent, the date of the ground conditions, and the iteration of the data set for that extent/date. For example, CONUS_2001v1 represents the spatial extent of the conterminous US (CONUS), the ground condition year of 2001, and the first iteration (v1) for that extent/date. In many cases, a GAP species code is used in conjunction with the version code to identify specific data sets or files (e.g. the Cooper’s Hawk Habitat Map is named bCOHAx_CONUS_2001v1_HabMap). This collection represents the first complete compilation of terrestrial vertebrate species models for the conterminous U.S. based on 2001 ground conditions. The taxonomic concept for the species model being presented is identified through the Integrated Taxonomic Information System – Taxonomic Serial Number. To provide a link to the NatureServe species information, the NatureServe Element Code is provided for each species. The identifiers included for each species habitat map item in the repository include references to a vocabulary system in ScienceBase where definitions can be found for each type of identifier.

Source Datasets Used in Species Habitat Modeling:
Gap Analysis Project Species Range Maps - Species ranges were used as model delimiters in predicted distribution models. https://www.sciencebase.gov/catalog/item/5951527de4b062508e3b1e79
Hydrologic Units - Modified 12-digit hydrologic units were used as the spatial framework for species ranges. https://www.sciencebase.gov/catalog/item/56d496eee4b015c306f17a42
Modeling Regions - Used to stratify the conterminous U.S. into six ecologically homogeneous regions to facilitate efficient processing. https://www.sciencebase.gov/catalog/item/58b9b8cee4b03b285c07ddef
Land Cover - Species were linked to individual map units to document habitat affinity in two ways. Primary map units are those land cover types critical for nesting, rearing young, and/or optimal foraging. Secondary or auxiliary map units are those land cover types generally not critical for breeding, but typically used in conjunction with primary map units for foraging, roosting, and/or sub-optimal nesting locations. These map units are selected only when located within a specified distance from primary map units. https://www.sciencebase.gov/catalog/item/5540e2d7e4b0a658d79395db
Human Impact Avoidance - Buffers around urban areas and roads were used to identify areas that would be suitable for urban exploitative species and unsuitable for urban avoiding species. https://www.sciencebase.gov/catalog/item/5540e099e4b0a658d79395d6
Forest & Edge Habitats - The land cover map was used to derive datasets of forest interior and ecotones between forest and open habitats. Forest edge: https://www.sciencebase.gov/catalog/item/5540e3fce4b0a658d79395fe; Forest/Open Woodland/Shrubland: https://www.sciencebase.gov/catalog/item/5540e48fe4b0a658d7939600
Elevation Derivatives - Slope and aspect were used to constrain some of the southwestern models where those variables are good indicators of microclimates (moist north-facing slopes) and local topography (cliffs, flats). For species with a documented relationship to altitude, the elevation data was used to constrain the mapped distribution. Aspect: https://www.sciencebase.gov/catalog/item/5540ec40e4b0a658d7939628; Slope: https://www.sciencebase.gov/catalog/item/5540ebe2e4b0a658d7939626; Elevation: https://www.sciencebase.gov/catalog/item/5540e111e4b0a658d79395d9
Hydrology - A number of water-related data layers were used to refine the species distribution, including: water type (i.e. flowing, open/standing), distance to and from water, and stream flow and underlying gradient. The source for this data was the USGS National Hydrography Dataset (NHD) (USGS 2007). Hydrographic features were divided into three types: flowing water, open/standing water, and wet vegetation. https://www.sciencebase.gov/catalog/item/5540eb44e4b0a658d7939624
Canopy Cover - Some species are limited to open woodlands or dense forest; the National Land Cover canopy cover dataset was used to constrain the species models based on canopy density. https://www.sciencebase.gov/catalog/item/5540eca9e4b0a658d793962b
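The generation step described above (reclassify land cover within the range, then intersect with ancillary constraints) can be illustrated with a small raster sketch. This is not the GAP production script: the file names, the JSON keys beyond the documented "input_layers" object, and the constraint logic shown are simplifying assumptions.

```python
import json
import numpy as np
import rasterio

# Read the machine-readable model parameters (keys below are hypothetical).
with open("bCOHAx_CONUS_2001v1_params.json") as f:
    params = json.load(f)
suitable_classes = params["map_units"]                        # hypothetical key
elev_min, elev_max = params["elev_min"], params["elev_max"]   # hypothetical keys

# Assumes both rasters are already clipped to the species range and
# co-registered on the same 30 m grid.
with rasterio.open("gap_landcover_range_clip.tif") as lc_src, \
     rasterio.open("elevation_range_clip.tif") as dem_src:
    lc = lc_src.read(1)
    dem = dem_src.read(1)
    profile = lc_src.profile

# Reclassify: 1 where the land cover class is associated with the species.
habitat = np.isin(lc, suitable_classes).astype(np.uint8)

# Combine with one ancillary constraint (elevation limits from the model).
habitat &= ((dem >= elev_min) & (dem <= elev_max)).astype(np.uint8)

profile.update(dtype="uint8", count=1, nodata=0)
with rasterio.open("habitat_map.tif", "w", **profile) as dst:
    dst.write(habitat, 1)
```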
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For more information, see the Aquatic Significant Habitats Factsheet at https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=150855. The California Department of Fish and Wildlife’s (CDFW) Areas of Conservation Emphasis (ACE) is a compilation and analysis of the best-available statewide spatial information in California on biodiversity, rarity and endemism, harvested species, significant habitats, connectivity and wildlife movement, climate vulnerability, climate refugia, and other relevant data (e.g., other conservation priorities such as those identified in the State Wildlife Action Plan (SWAP), stressors, land ownership). ACE addresses both terrestrial and aquatic data. The ACE model combines and analyzes terrestrial information in a 2.5 square mile hexagon grid and aquatic information at the HUC12 watershed level across the state to produce a series of maps for use in non-regulatory evaluation of conservation priorities in California. The model addresses as many of CDFWs statewide conservation and recreational mandates as feasible using high quality data sources. High value areas statewide and in each USDA Ecoregion were identified. The ACE maps and data can be viewed in the ACE online map viewer, or downloaded for use in ArcGIS. For more detailed information see https://www.wildlife.ca.gov/Data/Analysis/ACE and https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=24326.
Public Domain Dedication (CC0 1.0): https://creativecommons.org/publicdomain/zero/1.0/
This comprehensive dataset offers an in-depth exploration into US travel check-ins from Instagram. It includes detailed data scraped from Instagram, such as the location of each check-in, the USIndex for each state, average temperature for each state per month, and crime rate per state. In addition to location and time information, this dataset also provides latitude and longitude coordinates for every entry. This extensive collection of data is invaluable for those interested in studying various aspects of movement within the United States. With detailed insights on factors like climate conditions and economic health of a region at a given point in time, this dataset can help uncover fascinating trends regarding how travelers choose their destinations and how they experience their journeys around the country
This Kaggle dataset - US Travel Check-Ins Analysis - provides valuable insights for travel researchers, marketers, and businesses in the travel industry. It contains the check-in location, the USIndex rating (economic health of each state), the average temperature, and the crime rate per state. The latitude and longitude of each check-in are also provided, adding geographic context to help you visualize the data.
This guide will show you how to use this dataset for your research or business venture.
Step 1: Prepare your data. First and foremost, it is important to cleanse your data before you can analyze it. Depending on what sort of analysis needs to be conducted (e.g., time series analysis), select the columns from the dataset that best match your needs and exclude any columns, such as dates or season-related data points, that are not relevant. Furthermore, variable formatting should be consistent across all instances in a variable/column category (elevation is a good example here). You can always double-check that everything is formatted correctly by running a quick summary on selected columns, for example with the df['var'].describe() command in Python, which returns descriptive statistics for an entire column, including mean values, quartile ranges, etc.
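A minimal sketch of Step 1 in pandas is shown below. The file name and column names are hypothetical placeholders; adjust them to the actual CSV in this dataset.

```python
import pandas as pd

# Hypothetical file and column names; adjust to the actual CSV in this dataset.
checkins = pd.read_csv("us_travel_checkins.csv")

# Keep only the columns relevant to the planned analysis and drop incomplete rows.
cols = ["state", "USIndex", "avg_temperature", "crime_rate", "latitude", "longitude"]
checkins = checkins[cols].dropna()

# Quick statistical summary of a single column, as suggested above.
print(checkins["avg_temperature"].describe())
```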
Step 2: Explore and analyze your data graphically. Once the data has been prepared properly, you can start visualizing it in order to gain better insights into any trends or patterns that may be present, especially when it is compared with other datasets or information sources such as weather forecasts or nationwide trend indicators. Grafana dashboards are a feasible solution when multiple datasets need to be compared, but depending on the type of graphs or charts being used, Excel worksheets can offer great customization options and flexibility along with various export file types (.csv, .jpeg, .pdf). Plotting markers onto map applications like the Google Maps API offers more geographical awareness, which can be useful when analyzing location-dependent variables; leveraging existing software and publicly available APIs gives you an advantage over manual inspection.
Step 3: Interpretation & Hypothesis Testing
After generating informative graphical interpretations from the exploratory visualizations, the next step is to test hypotheses based on the correlations observed between variables, for example whether distribution trends across regions point toward geographical areas where certain logistical processes could yield higher success ratios and greater customer satisfaction.
- Travel trends analysis: Using this dataset, researchers could track which areas of the US are popular destinations based on travel check-ins and spot any interesting trends or correlations in terms of geography, seasonal changes, economic health or crime rates.
- Predictive Modeling: By using various features from this dataset such as average temperature, US Index and crime rate, predictors could be developed to suggest how safe an area would feel to a tourist based on their current location and other predetermined variables they choose to input into the model.
- Trip Planning Tool: The dataset can also be used to develop a tool that quickly allows travelers to plan trips according to their preferences in terms of duration and budget as well a...
Open Access

This archive contains raw annual land cover maps, cropland abandonment maps, and accompanying derived data products to support: Crawford C.L., Yin, H., Radeloff, V.C., and Wilcove, D.S. 2022. Rural land abandonment is too ephemeral to provide major benefits for biodiversity and climate. Science Advances doi.org/10.1126/sciadv.abm8999. An archive of the analysis scripts developed for this project can be found at: https://github.com/chriscra/abandonment_trajectories (https://doi.org/10.5281/zenodo.6383127).

Note that the label '_2022_02_07' in many file names refers to the date of the primary analysis. 'dts' or 'dt' refer to 'data.tables', large .csv files that were manipulated using the data.table package in R (Dowle and Srinivasan 2021, http://r-datatable.com/). 'Rasters' refer to '.tif' files that were processed using the raster and terra packages in R (Hijmans, 2022; https://rspatial.org/terra/; https://rspatial.org/raster).

Data files fall into one of four categories of data derived during our analysis of abandonment: observed, potential, maximum, or recultivation. Derived datasets also follow the same naming convention, though they are aggregated across sites. These four categories are as follows (using 'age_dts' for our site in Shaanxi Province, China as an example):
observed abandonment identified through our primary analysis, with a threshold of five years. These files do not have a specific label beyond the description of the file and the date of analysis (e.g., shaanxi_age_2022_02_07.csv);
potential abandonment for a scenario without any recultivation, in which abandoned croplands are left abandoned from the year of initial abandonment through the end of the time series, with the label '_potential' (e.g., shaanxi_potential_age_2022_02_07.csv);
maximum age of abandonment over the course of the time series, with the label '_max' (e.g., shaanxi_max_age_2022_02_07.csv);
recultivation periods, corresponding to the lengths of recultivation periods following abandonment, given the label '_recult' (e.g., shaanxi_recult_age_2022_02_07.csv).

This archive includes multiple .zip files, the contents of which are described below:

age_dts.zip - Maps of abandonment age (i.e., how long each pixel has been abandoned for, as of that year, also referred to as length, duration, etc.), for each year between 1987-2017 for all 11 sites. These maps are stored as .csv files, where each row is a pixel, the first two columns refer to the x and y coordinates (in terms of longitude and latitude), and subsequent columns contain the abandonment age values for an individual year (where years are labeled with 'y' followed by the year, e.g., 'y1987'). Maps are given with a latitude and longitude coordinate reference system. Folder contains observed age, potential age ('_potential'), maximum age ('_max'), and recultivation lengths ('_recult') for all sites. Maximum age .csv files include only three columns: x, y, and the maximum length (i.e., 'max age', in years) for each pixel throughout the entire time series (1987-2017). Files were produced using the custom functions 'cc_filter_abn_dt()', 'cc_calc_max_age()', 'cc_calc_potential_age()', and 'cc_calc_recult_age()'; see '_util/_util_functions.R'.

age_rasters.zip - Maps of abandonment age (i.e., how long each pixel has been abandoned for), for each year between 1987-2017 for all 11 sites.
Maps are stored as .tif files, where each band corresponds to one of the 31 years in our analysis (1987-2017), in ascending order (i.e., the first layer is 1987 and the 31st layer is 2017). Folder contains observed age, potential age (“_potential”), and maximum age (“_max”) rasters for all sites. Maximum age rasters include just one band (“layer”). These rasters match the corresponding .csv files contained in 'age_dts.zip.” derived_data.zip - summary datasets created throughout this analysis, listed below. diff.zip - .csv files for each of our eleven sites containing the year-to-year lagged differences in abandonment age (i.e., length of time abandoned) for each pixel. The rows correspond to a single pixel of land, and the columns refer to the year the difference is in reference to. These rows do not have longitude or latitude values associated with them; however, rows correspond to the same rows in the .csv files in 'input_data.tables.zip' and 'age_dts.zip.' These files were produced using the custom function 'cc_diff_dt()' (much like the base R function 'diff()'), contained within the custom function 'cc_filter_abn_dt()' (see '_util/_util_functions.R'). Folder contains diff files for observed abandonment, potential abandonment (“_potential”), and recultivation lengths (“_recult”) for all sites. input_dts.zip - annual land cover maps for eleven sites with four land cover classes (see below), adapted from Yin et al. 2020 Remote Sensing of Environment (https://doi.org/10.1016/j.rse.2020.111873). Like “age_dts,” these maps are stored as .csv files, where each row is a pixel and the first two columns refer to x and y coordinates (in terms of longitude and latitude). Subsequent columns contain the land cover class for an individual year (e.g., 'y1987'). Note that these maps were recoded from Yin et al. 2020 so that land cover classification was consistent across sites (see below). This contains two files for each site: the raw land cover maps from Yin et al. 2020 (after recoding), and a “clean” version produced by applying 5- and 8-year temporal filters to the raw input (see custom function “cc_temporal_filter_lc(),” in “_util/_util_functions.R” and “1_prep_r_to_dt.R”). These files correspond to those in 'input_rasters.zip,' and serve as the primary inputs for the analysis. input_rasters.zip - annual land cover maps for eleven sites with four land cover classes (see below), adapted from Yin et al. 2020 Remote Sensing of Environment. Maps are stored as '.tif' files, where each band corresponds one of the 31 years in our analysis (1987-2017), in ascending order (i.e., the first layer is 1987 and the 31st layer is 2017). Maps are given with a latitude and longitude coordinate reference system. Note that these maps were recoded so that land cover classes matched across sites (see below). Contains two files for each site: the raw land cover maps (after recoding), and a “clean” version that has been processed with 5- and 8-year temporal filters (see above). These files match those in 'input_dts.zip.' length.zip - .csv files containing the length (i.e., age or duration, in years) of each distinct individual period of abandonment at each site. This folder contains length files for observed and potential abandonment, as well as recultivation lengths. Produced using the custom function 'cc_filter_abn_dt()' and “cc_extract_length();” see '_util/_util_functions.R.' 
derived_data.zip contains the following files: 'site_df.csv' - a simple .csv containing descriptive information for each of our eleven sites, along with the original land cover codes used by Yin et al. 2020 (updated so that all eleven sites in how land cover classes were coded; see below). Primary derived datasets for both observed abandonment (“area_dat”) and potential abandonment (“potential_area_dat”). area_dat - Shows the area (in ha) in each land cover class at each site in each year (1987-2017), along with the area of cropland abandoned in each year following a five-year abandonment threshold (abandoned for >=5 years) or no threshold (abandoned for >=1 years). Produced using custom functions 'cc_calc_area_per_lc_abn()' via 'cc_summarize_abn_dts()'. See scripts 'cluster/2_analyze_abn.R' and '_util/_util_functions.R.' persistence_dat - A .csv containing the area of cropland abandoned (ha) for a given 'cohort' of abandoned cropland (i.e., a group of cropland abandoned in the same year, also called 'year_abn') in a specific year. This area is also given as a proportion of the initial area abandoned in each cohort, or the area of each cohort when it was first classified as abandoned at year 5 ('initial_area_abn'). The 'age' is given as the number of years since a given cohort of abandoned cropland was last actively cultivated, and 'time' is marked relative to the 5th year, when our five-year definition first classifies that land as abandoned (and where the proportion of abandoned land remaining abandoned is 1). Produced using custom functions 'cc_calc_persistence()' via 'cc_summarize_abn_dts()'. See scripts 'cluster/2_analyze_abn.R' and '_util/_util_functions.R.' This serves as the main input for our linear models of recultivation (“decay”) trajectories. turnover_dat - A .csv showing the annual gross gain, annual gross loss, and annual net change in the area (in ha) of abandoned cropland at each site in each year of the time series. Produced using custom functions 'cc_calc_abn_diff()' via 'cc_summarize_abn_dts()' (see '_util/_util_functions.R'), implemented in 'cluster/2_analyze_abn.R.' This file is only produced for observed abandonment. Area summary files (for observed abandonment only) area_summary_df - Contains a range of summary values relating to the area of cropland abandonment for each of our eleven sites. All area values are given in hectares (ha) unless stated otherwise. It contains 16 variables as columns, including 1) 'site,' 2) 'total_site_area_ha_2017' - the total site area (ha) in 2017, 3) 'cropland_area_1987' - the area in cropland in 1987 (ha), 4) 'area_abn_ha_2017' -
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For more information, see the Terrestrial Significant Habitats Factsheet at https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=150834.
The California Department of Fish and Wildlife’s (CDFW) Areas of Conservation Emphasis (ACE) is a compilation and analysis of the best-available statewide spatial information in California on biodiversity, rarity and endemism, harvested species, significant habitats, connectivity and wildlife movement, climate vulnerability, climate refugia, and other relevant data (e.g., other conservation priorities such as those identified in the State Wildlife Action Plan (SWAP), stressors, land ownership). ACE addresses both terrestrial and aquatic data. The ACE model combines and analyzes terrestrial information in a 2.5 square mile hexagon grid and aquatic information at the HUC12 watershed level across the state to produce a series of maps for use in non-regulatory evaluation of conservation priorities in California. The model addresses as many of CDFWs statewide conservation and recreational mandates as feasible using high quality data sources. High value areas statewide and in each USDA Ecoregion were identified. The ACE maps and data can be viewed in the ACE online map viewer, or downloaded for use in ArcGIS. For more detailed information see https://www.wildlife.ca.gov/Data/Analysis/ACE and https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=24326.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains shapefiles and rasters that summarize the results of a stochastic analysis of temperatures at depth in the Appalachian Basin states of New York, Pennsylvania, and West Virginia. This analysis provides an update to the temperature-at-depth maps provided in the Geothermal Play Fairway Analysis of the Appalachian Basin (GPFA-AB) Thermal Quality Analysis (GDR repository 879: https://gdr.openei.org/submissions/879). This dataset improves upon the GPFA-AB dataset by considering several additional uncertainties in the temperature-at-depth calculations, including geologic properties and thermal properties. A Monte Carlo analysis of these uncertain properties and the GPFA-AB estimated surface heat flow was used to predict temperatures at depth using a 1-D heat conduction model. In this data submission, temperatures are provided for depths from 1-5 km in 0.5 km increments. The mean, standard deviation, and selected quantiles of temperatures at these depths are provided as shapefiles with attribute tables that contain the data. Rasters are provided for the mean and standard deviation data. Figures and maps that summarize the data are also provided. For the pixel corresponding to Cornell University, Ithaca, NY, a .csv file containing the 10,000 temperature-depth profiles estimated from the Monte Carlo analysis is provided. These data are summarized in a figure containing violin plots that illustrate the probability of obtaining certain temperatures at depths below Cornell.
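The Monte Carlo approach described above can be illustrated with a much-simplified sketch: sample uncertain surface heat flow and thermal conductivity, then propagate them through steady-state 1-D conduction. This is only an illustration of the idea; the actual GPFA-AB analysis uses layered geologic and thermal properties, and the parameter distributions below are assumptions, not project values.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000                                   # realizations (the dataset uses 10,000)

# Illustrative (not project) parameter distributions for a single map pixel:
q0 = rng.normal(60e-3, 8e-3, n)              # surface heat flow [W/m^2]
k = rng.normal(2.7, 0.3, n)                  # thermal conductivity [W/(m K)]
t_surf = 10.0                                # mean annual surface temperature [deg C]

depths = np.arange(1.0, 5.5, 0.5) * 1000.0   # 1-5 km in 0.5 km increments, in meters

# Simplified steady-state 1-D conduction, ignoring radiogenic heat production:
# T(z) = T_surface + q0 * z / k, evaluated for every realization and depth.
temps = t_surf + np.outer(q0 / k, depths)    # shape (n, n_depths)

for z, mean, std in zip(depths, temps.mean(axis=0), temps.std(axis=0)):
    print(f"{z / 1000:.1f} km: {mean:.1f} +/- {std:.1f} deg C")
```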
A. SUMMARY
This dataset maps 2020 census tracts to Analysis Neighborhoods. The Department of Public Health and the Mayor’s Office of Housing and Community Development, with support from the Planning Department, originally created the 41 Analysis Neighborhoods by grouping 2010 Census tracts, using common real estate and residents’ definitions for the purpose of providing consistency in the analysis and reporting of socio-economic, demographic, and environmental data, and data on City-funded programs and services. They are not codified in the Planning Code nor the Administrative Code.

B. HOW THE DATASET IS CREATED
This dataset is produced by mapping the 2020 Census tracts to Analysis Neighborhoods.

C. UPDATE PROCESS
This dataset is static. Changes to the census tract boundaries are tracked in multiple datasets. See here for the 2010 census tracts assigned to neighborhoods.

D. HOW TO USE THIS DATASET
This boundary file can be joined to other census datasets on GEOID, which is the primary key for census tracts in the dataset.

E. RELATED DATASET
2020 census tract boundaries for San Francisco can be found here.
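A minimal join sketch is shown below. The GEOID key is documented above; the file names, the analysis_neighborhood field name, and the total_pop column are hypothetical placeholders.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical file and column names; the GEOID join key is documented above.
tracts = gpd.read_file("analysis_neighborhoods_2020_tracts.geojson")
acs = pd.read_csv("acs_2020_tract_estimates.csv", dtype={"GEOID": str})

# Join census attributes to the tract boundaries on GEOID.
joined = tracts.merge(acs, on="GEOID", how="left")

# Aggregate a tract-level estimate up to Analysis Neighborhoods.
by_hood = (joined[["analysis_neighborhood", "geometry", "total_pop"]]
           .dissolve(by="analysis_neighborhood", aggfunc="sum"))
print(by_hood.head())
```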
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For more information, see the Terrestrial Endemic Species Index Factsheet at https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=150816. The user can view a list of species potentially present in each hexagon in the ACE online map viewer https://apps.wildlife.ca.gov/ace/. Note that the names of some rare or endemic species, such as those at risk of over-collection, have been suppressed from the list of species names per hexagon, but are still included in the species counts. The California Department of Fish and Wildlife’s (CDFW) Areas of Conservation Emphasis (ACE) is a compilation and analysis of the best-available statewide spatial information in California on biodiversity, rarity and endemism, harvested species, significant habitats, connectivity and wildlife movement, climate vulnerability, climate refugia, and other relevant data (e.g., other conservation priorities such as those identified in the State Wildlife Action Plan (SWAP), stressors, land ownership). ACE addresses both terrestrial and aquatic data. The ACE model combines and analyzes terrestrial information in a 2.5 square mile hexagon grid and aquatic information at the HUC12 watershed level across the state to produce a series of maps for use in non-regulatory evaluation of conservation priorities in California. The model addresses as many of CDFWs statewide conservation and recreational mandates as feasible using high quality data sources. High value areas statewide and in each USDA Ecoregion were identified. The ACE maps and data can be viewed in the ACE online map viewer, or downloaded for use in ArcGIS. For more detailed information see https://www.wildlife.ca.gov/Data/Analysis/ACE and https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=24326.
Global COVID-19 surveys conducted by National Statistical Offices. This dataset has several columns that contain different types of information. Here's a brief explanation of each column:
1. **Country**: This column likely contains the names of the countries for which the survey data is collected. Each row represents data related to a specific country.
2. **Category**: This column might contain information about the type or category of the survey. It could include categories such as healthcare, economic impact, public sentiment, etc. This helps in categorizing the surveys.
3. **Title and Link**: These columns may contain the title or name of the specific survey and a link to the source or webpage where more information about the survey can be found. The link can be useful for referencing the original source of the data.
4. **Description**: This column likely contains a brief description or summary of the survey's objectives, methodology, or key findings. It provides additional context for the survey data.
5. **Source**: This column may contain information about the organization or agency that conducted the survey. It's essential for understanding the authority behind the data.
6. **Date Added**: This column probably contains the date when the survey data was added to the dataset. This helps track the freshness of the data and can be useful for historical analysis.
With this dataset, you can perform various types of analysis, including but not limited to:
Country-based analysis: You can analyze survey data for specific countries to understand the impact of COVID-19 in different regions.
Category-based analysis: You can group surveys by category and analyze trends or patterns related to healthcare, economics, or public sentiment.
Temporal analysis: You can examine how survey data has evolved over time by using the "Date Added" column to track changes and trends.
Source-based analysis: You can assess the reliability and credibility of the data by considering the source of the surveys.
Data visualization: Create visual representations like charts, graphs, and maps to make the data more understandable and informative.
Before conducting any analysis, it's essential to clean and preprocess the data, handle missing values, and ensure data consistency. Additionally, consider the research questions or insights you want to gain from the dataset, which will guide your analysis approach.
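A short pandas sketch of the category-based and temporal analyses mentioned above follows. The file name is a placeholder; the column names come from the description of the dataset.

```python
import pandas as pd

# Hypothetical file name; column names follow the description above.
surveys = pd.read_csv("covid19_nso_surveys.csv", parse_dates=["Date Added"])

# Category-based analysis: how many surveys fall into each category.
print(surveys.groupby("Category")["Title"].count())

# Temporal analysis: number of surveys added per month across all countries.
print(surveys.set_index("Date Added").resample("M").size())
```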
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For more information, see the Aquatic Native Species Richness Factsheet at https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=150852. The California Department of Fish and Wildlife’s (CDFW) Areas of Conservation Emphasis (ACE) is a compilation and analysis of the best-available statewide spatial information in California on biodiversity, rarity and endemism, harvested species, significant habitats, connectivity and wildlife movement, climate vulnerability, climate refugia, and other relevant data (e.g., other conservation priorities such as those identified in the State Wildlife Action Plan (SWAP), stressors, land ownership). ACE addresses both terrestrial and aquatic data. The ACE model combines and analyzes terrestrial information in a 2.5 square mile hexagon grid and aquatic information at the HUC12 watershed level across the state to produce a series of maps for use in non-regulatory evaluation of conservation priorities in California. The model addresses as many of CDFWs statewide conservation and recreational mandates as feasible using high quality data sources. High value areas statewide and in each USDA Ecoregion were identified. The ACE maps and data can be viewed in the ACE online map viewer, or downloaded for use in ArcGIS. For more detailed information see https://www.wildlife.ca.gov/Data/Analysis/ACE and https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=24326.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Summary
This dataset provides the most accurate and comprehensive geospatial information on wind turbines in South Africa as of 2025. It includes precise turbine coordinates, detailed technical attributes, and spatially harmonized metadata across 42 wind farms. The dataset contains 1,487 individual turbine entries with validated information on turbine type, rated capacity, rotor diameter, commissioning year, and administrative regions. It was compiled by integrating OpenStreetMap (OSM) data, satellite imagery from Google and Bing, a RetinaNet-based deep learning model for coordinate correction, and manual verification.
Data Structure
Format: GeoJSON
Coordinate Reference System (CRS): WGS 84 (EPSG:4326)
Number of features: 1,487
Geometry type: Point (turbine locations)
Key attributes:
id: Unique internal identifier
osm_id: Reference ID from OpenStreetMap
gid, country, type1, name1, type2, name2: Administrative region (based on GADM)
farm_name: Name of the wind farm
commissioning_year: Year the turbine was commissioned
number_of_turbines: Total number of turbines at the wind farm
total_farm_capacity: Total installed capacity of the wind farm (MW)
capacity_per_turbine: Rated power per turbine (MW)
turbine_type: Manufacturer and model of the turbine
geometry: Point geometry (longitude, latitude)
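A minimal sketch of reading and summarizing the layer with GeoPandas; the local file name is an assumption, while the attribute names follow the listing above:

```python
import geopandas as gpd

# Load the turbine point layer (file name is an assumption).
gdf = gpd.read_file("wind_turbines_south_africa.geojson")

# The layer should be in WGS 84 (EPSG:4326), per the metadata above.
print(gdf.crs)
print(len(gdf))  # expected: 1,487 turbine points

# Installed capacity per wind farm, summed from the per-turbine rating (MW).
capacity_by_farm = (
    gdf.groupby("farm_name")["capacity_per_turbine"]
       .sum()
       .sort_values(ascending=False)
)
print(capacity_by_farm.head(10))

# Count of turbines by manufacturer/model.
print(gdf["turbine_type"].value_counts().head(10))
```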
Publication Abstract
Accurate and detailed spatial data on wind energy infrastructure is essential for renewable energy planning, grid integration, and system analysis. However, publicly available datasets often suffer from limited spatial accuracy, missing attributes, and inconsistent metadata. To address these challenges, this study presents a harmonized and spatially refined dataset of wind turbines in South Africa, combining OpenStreetMap (OSM) data with high-resolution satellite imagery, deep learning-based coordinate correction, and manual curation. The dataset includes 1487 turbines across 42 wind farms, representing over 3.9 GW of installed capacity as of 2025. Of this, more than 3.6 GW is currently operational. The Geo-Coordinates were validated and corrected using a RetinaNet-based object detection model applied to both Google and Bing satellite imagery. Instead of relying solely on spatial precision, the curation process emphasized attribute completeness and consistency. Through systematic verification and cross-referencing with multiple public sources, the final dataset achieves a high level of attribute completeness and internal consistency across all turbines, including turbine type, rated capacity, and commissioning year. The resulting dataset is the most accurate and comprehensive publicly available dataset on wind turbines in South Africa to date. It provides a robust foundation for spatial analysis, energy modeling, and policy assessment related to wind energy development. The dataset is publicly available.
Citation Notification
If you use this dataset, please cite the following publication:
Kleebauer, M.; Karamanski, S.; Callies, D.; Braun, M. A Wind Turbines Dataset for South Africa: OpenStreetMap Data, Deep Learning Based Geo-Coordinate Correction and Capacity Analysis. ISPRS Int. J. Geo-Inf. 2025, 14, 232. https://doi.org/10.3390/ijgi14060232
The MAPS Model Location Time Series (MOLTS) is one of the model output datasets provided in the Southern Great Plains - 1997 (SGP97). The full MAPS MOLTS dataset covers most of North America east of the Rocky Mountains (283 locations). MOLTS are hourly time series output at selected locations that contain values for various surface parameters and 'sounding' profiles at MAPS model levels and are derived from the MAPS model output. The MOLTS output files were converted into Joint Office for Science Support (JOSS) Quality Control Format (QCF), the same format used for atmospheric rawinsonde soundings processed by JOSS. The MOLTS output provided by JOSS online includes only the initial analysis output (i.e., no forecast MOLTS) and only state parameters (pressure, altitude, temperature, humidity, and wind). The full output, including the forecast MOLTS and all output parameters, in its original format (Binary Universal Form for the Representation of meteorological data, or BUFR) is available from the National Center for Atmospheric Research (NCAR)/Scientific Computing Division.
The Forecast Systems Laboratory (FSL) operates the MAPS model with a resolution of 40 km and 40 vertical levels. The MAPS analysis and forecast fields are generated every 3 hours at 0000, 0300, 0600, 0900, 1200, 1500, 1800, and 2100 UTC daily. MOLTS are hourly vertical profile and surface time series derived from the MAPS model output. The complete MOLTS output includes six informational items, 16 parameters for each level, and 27 parameters at the surface. Output is available each hour beginning at the initial analysis (the only output available from JOSS) and ending at the 48 hour forecast.
JOSS converts the raw format files into JOSS QCF format, which is the same format used for atmospheric sounding data such as National Weather Service (NWS) soundings. JOSS calculated the total wind speed and direction from the u and v wind components, the mixing ratio from the specific humidity (Pruppacher and Klett 1980), and the dew point from the mixing ratio (Wallace and Hobbs 1977). The relative humidity was then calculated from the dew point (Bolton 1980). JOSS did not conduct any quality control on this output.
The header records (15 total records) contain the output type, project ID, the location of the nearest station to the MOLTS location (this can be a rawinsonde station, an Atmospheric Radiation Measurement (ARM)/Cloud and Radiation Testbed (CART) station, a wind profiler station, a surface station, or just the nearest town), the location of the MOLTS output, and the valid time for the MOLTS output. The first five header lines contain information identifying the sounding and have a rigidly defined form. The following six header lines are used for auxiliary information and comments about the sounding, and they vary significantly from dataset to dataset. The last three header records contain header information for the data columns: line 13 holds the field names, line 14 the field units, and line 15 contains dashes ('-' characters) delineating the extent of each field.
Resources in this dataset: Resource Title: GeoData catalog record. File Name: Web Page, url: https://geodata.nal.usda.gov/geonetwork/srv/eng/catalog.search#/metadata/2ad09880-6439-440c-9829-c4653ec12a4f
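The derived quantities described above follow standard textbook conversions. Below is a minimal sketch, assuming u/v wind components in m/s, pressure in hPa, specific humidity in kg/kg, and temperature in °C; the exact constants and formulations JOSS used (Pruppacher and Klett 1980; Wallace and Hobbs 1977; Bolton 1980) may differ slightly:

```python
import numpy as np

def wind_speed_direction(u, v):
    """Total wind speed (m/s) and meteorological direction (deg the wind blows FROM)."""
    speed = np.hypot(u, v)
    direction = np.degrees(np.arctan2(-u, -v)) % 360.0
    return speed, direction

def mixing_ratio_from_specific_humidity(q):
    """Mixing ratio (kg/kg) from specific humidity q (kg/kg)."""
    return q / (1.0 - q)

def dewpoint_from_mixing_ratio(w, p_hpa):
    """Dew point (deg C) from mixing ratio (kg/kg) and pressure (hPa), via an inverted Bolton-style formula."""
    e = w * p_hpa / (0.622 + w)            # vapour pressure in hPa
    ln_ratio = np.log(e / 6.112)
    return 243.5 * ln_ratio / (17.67 - ln_ratio)

def relative_humidity(t_c, td_c):
    """Relative humidity (%) from temperature and dew point (deg C)."""
    es = 6.112 * np.exp(17.67 * t_c / (t_c + 243.5))
    e = 6.112 * np.exp(17.67 * td_c / (td_c + 243.5))
    return 100.0 * e / es

# Example: a 5 m/s wind blowing from the northwest (u eastward, v northward).
print(wind_speed_direction(3.54, -3.54))   # roughly (5.0, 315.0)
```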
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
This comprehensive dataset includes every line from all of William Shakespeare’s plays, categorized by play, genre, character, and more. It is an invaluable resource for those interested in literary analysis, natural language processing, and the historical study of one of the most significant figures in English literature. The dataset consists of 108,093 rows and 9 columns, capturing lines from various plays by William Shakespeare. Here’s a breakdown of the dataset structure and its contents:
This dataset offers a plethora of possibilities for anyone interested in delving deep into the linguistic and thematic elements of Shakespeare's works, with ready-to-use data for various levels of analysis.
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY
The Department of Public Health and the Mayor's Office of Housing and Community Development, with support from the Planning Department, created these 41 neighborhoods by grouping 2010 Census tracts, using common real estate and residents' definitions, for the purpose of providing consistency in the analysis and reporting of socio-economic, demographic, and environmental data, and data on City-funded programs and services. These neighborhoods are not codified in the Planning Code nor the Administrative Code, although this map is referenced in Planning Code Section 415 as the "American Community Survey Neighborhood Profile Boundaries Map." Note: These are NOT statistical boundaries, as they are not controlled for population size. This is also NOT an official map of neighborhood boundaries in SF but an aggregation of Census tracts, and it should be used in conjunction with other spatial boundaries for decision making.
B. HOW THE DATASET IS CREATED
This dataset is produced by assigning Census tracts to neighborhoods based on existing neighborhood definitions used by Planning and MOHCD. A qualitative assessment is made to identify the appropriate neighborhood for a given tract based on understanding of population distribution and significant landmarks. Once all tracts have been assigned a neighborhood, the tracts are dissolved to produce this dataset, Analysis Neighborhoods.
C. UPDATE PROCESS
This dataset is static. Changes to the analysis neighborhood boundaries will be evaluated as needed by the Analysis Neighborhood working group, which is led by DataSF and the Planning Department and includes staff from various other city departments. Contact us with any questions.
D. HOW TO USE THIS DATASET
Downloading this dataset and opening it in Excel may cause some of the data values to be lost or not display properly (particularly the Analysis Neighborhood column). For a simple list of Analysis Neighborhoods without geographic coordinates, click here: https://data.sfgov.org/resource/xfcw-9evu.csv?$select=nhood (see the sketch below).
E. RELATED DATASETS
2020 Census tracts assigned a neighborhood
2010 Census tracts assigned a neighborhood
View this dataset on ArcGIS Online
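A minimal sketch of pulling the plain neighborhood list from the endpoint given in section D; this assumes the Socrata endpoint is still live and returns one row per neighborhood:

```python
import pandas as pd

# Simple list of Analysis Neighborhoods, no geometry, from the URL in section D.
url = "https://data.sfgov.org/resource/xfcw-9evu.csv?$select=nhood"
nhoods = pd.read_csv(url)

print(len(nhoods))                               # 41 if one row per neighborhood
print(sorted(nhoods["nhood"].dropna().unique())) # alphabetical neighborhood names
```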
By US Open Data Portal, data.gov [source]
This dataset contains stroke mortality data among US adults (35+) by state/territory and county. It lets you explore the health of people within your own state or region, across genders and ethnicities. Thanks to 3-year averages, age standardization, and spatial smoothing, the statistics are reliable even for small counties. Data sources such as the National Vital Statistics System provide the information needed to build a detailed picture of a population's cardiovascular health. Interactive maps created from these data, covering heart disease risk, death rates, and hospital bed availability across the country, offer a perspective on how effective healthcare initiatives are in the places where people live. Understanding the cardiovascular conditions affecting communities today is a first step toward improving public health.
This dataset contains stroke mortality data among US adults (35+) by state/territory and county. It can help identify areas where stroke mortality is high and where interventions to reduce mortality may be warranted.
To access the dataset, download it from Kaggle. The dataset consists of 18 columns, including the year, location description, geographic level, data source, data value class, the topic (age-standardized stroke mortality rates), and labels for the stratification categories and stratifications used within the given age group. The last 3 columns contain geographic coordinates for each location (Y_lat and X_lon) as well as an overall georeferenced column (Georeferenced Column).
Once you have downloaded the dataset there are a few ways you can go about using it:
- You can perform a descriptive analysis on any particular column using methods such as summary statistics or distributions graphs;
- You can create your own maps or other visual representation based on the latitude/longitude columns;
- You could look at differences between states and counties/areas within states by subsetting out certain areas;
- Using statistical testing methods you could create inferential analyses that may lead to insights on why some areas seem more prone to higher levels of stroke mortality than others
- Track county-level stroke mortality trends among US adults (35+) over time.
- Identify regions of higher stroke mortality risk and use that information to inform targeted, preventative health policies and interventions.
- Analyze differences in stroke mortality rates by gender, race/ethnicity, or geographic location to identify potential disparities in care access or outcomes for certain demographic groups
If you use this dataset in your research, please credit the original authors.
Data Source: Unknown License - please check the dataset description for more information.
File: csv-1.csv

| Column name | Description |
|:---|:---|
| Year | Year of the data. (Integer) |
| LocationAbbr | Abbreviation of the state or territory. (String) |
| LocationDesc | Name of the state or territory. (String) |
| GeographicLevel | Level of geographic detail. (String) |
| DataSource | Source of the data. (String) |
| Class | Classification of the data. (String) |
| Topic | Topic of the data. (String) |
| Data_Value | Numeric value associated with the topic. (Float) |
| Data_Value_Unit | Unit used to express the data value. (String) |
| Data_Value_Type | Type of data value. (String) |
| Data_Value_Footnote_Symbol | Symbol associated with the data value footnote. (String) |
| StratificationCategory1 | First category of stratification. (String) |
| Stratification1 | First stratifica... |
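A minimal sketch of the analyses listed above, using the columns documented in the table; the file name csv-1.csv comes from the listing, while the example state, geographic-level value, and slicing choices are assumptions:

```python
import pandas as pd

df = pd.read_csv("csv-1.csv")

# Descriptive statistics of the stroke mortality values.
print(df["Data_Value"].describe())

# Subset one state's county-level records and compare counties (values are assumptions).
ca = df[(df["LocationAbbr"] == "CA") & (df["GeographicLevel"] == "County")]
print(
    ca.groupby("LocationDesc")["Data_Value"]
      .mean()
      .sort_values(ascending=False)
      .head(10)
)

# Compare stratification groups (e.g. by gender or race/ethnicity).
print(df.groupby(["StratificationCategory1", "Stratification1"])["Data_Value"].mean())
```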