46 datasets found
  1. Simulation Files (.prj and .cvf) for Virus Particle Exposure in Residences...

    • catalog.data.gov
    Updated Mar 14, 2025
    Cite
    National Institute of Standards and Technology (2025). Simulation Files (.prj and .cvf) for Virus Particle Exposure in Residences (ViPER) Webtool [Dataset]. https://catalog.data.gov/dataset/simulation-files-prj-and-cvf-for-virus-particle-exposure-in-residences-viper-webtool
    Explore at:
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    This dataset comprises the .prj and .cvf files used to build the database for the Virus Particle Exposure in Residences (ViPER) Webtool, a single-zone indoor air quality and ventilation analysis tool developed by the National Institute of Standards and Technology (NIST).

  2. ESRI Projection file for 1km and 2.5km grids

    • springernature.figshare.com
    txt
    Updated Nov 27, 2020
    Cite
    Annette Menzel; Tongli Wang; Andreas Hamann; Maurizio Marchi; Dante Castellanos-Acuña; Duncan Ray (2020). ESRI Projection file for 1km and 2.5km grids [Dataset]. http://doi.org/10.6084/m9.figshare.11827830.v1
    Explore at:
    Dataset updated
    Nov 27, 2020
    Dataset provided by
    figshare
    Authors
    Annette Menzel; Tongli Wang; Andreas Hamann; Maurizio Marchi; Dante Castellanos-Acuña; Duncan Ray
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Duplicate the Projection.prj file and rename the duplicate to the same name as the ASCII grid, e.g. MAT.asc and MAT.prj. When MAT.asc is imported into ESRI ArcGIS or QGIS, the GIS system will automatically pick up the correct grid projection.
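    Because every ASCII grid needs an identically named copy of the same .prj file, this step is easy to script. A minimal sketch in Python follows; the folder and file names are illustrative assumptions, not part of the dataset.

    ```python
    import shutil
    from pathlib import Path

    # Copy the shared projection file next to every ASCII grid so that ArcGIS/QGIS
    # picks up the projection automatically (folder and file names are examples).
    grid_dir = Path("grids")                   # assumed folder holding MAT.asc, MAP.asc, ...
    source_prj = grid_dir / "Projection.prj"   # the projection file from this dataset

    for asc in grid_dir.glob("*.asc"):
        target = asc.with_suffix(".prj")       # e.g. MAT.asc -> MAT.prj
        if not target.exists():
            shutil.copyfile(source_prj, target)
            print(f"created {target.name}")
    ```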

  3. Process-based water temperature predictions in the Midwest US: 1 Spatial...

    • datasets.ai
    • data.usgs.gov
    • +1more
    55
    Updated Sep 11, 2024
    Cite
    Department of the Interior (2024). Process-based water temperature predictions in the Midwest US: 1 Spatial data (GIS polygons for 7,150 lakes) [Dataset]. https://datasets.ai/datasets/process-based-water-temperature-predictions-in-the-midwest-us-1-spatial-data-gis-polygons-
    Explore at:
    Dataset updated
    Sep 11, 2024
    Dataset authored and provided by
    Department of the Interior
    Area covered
    Midwestern United States
    Description

    This dataset provides shapefile outlines of the 7,150 lakes that had temperature modeled as part of this study. The format is a shapefile for all lakes combined (.shp, .shx, .dbf, and .prj files). A csv file of lake metadata is also included. This dataset is part of a larger data release of lake temperature model inputs and outputs for 7,150 lakes in the U.S. states of Minnesota and Wisconsin (http://dx.doi.org/10.5066/P9CA6XP8).

  4. Process-guided deep learning water temperature predictions: 1 Spatial data...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Jun 15, 2024
    + more versions
    Cite
    Climate Adaptation Science Centers (2024). Process-guided deep learning water temperature predictions: 1 Spatial data (GIS polygons for 68 lakes) [Dataset]. https://catalog.data.gov/dataset/process-guided-deep-learning-water-temperature-predictions-1-spatial-data-gis-polygons-for
    Explore at:
    Dataset updated
    Jun 15, 2024
    Dataset provided by
    Climate Adaptation Science Centers
    Description

    This dataset provides shapefile outlines of the 68 lakes where temperature was modeled as part of this study. The format is a shapefile for all lakes combined (.shp, .shx, .dbf, and .prj files). This dataset is part of a larger data release of lake temperature model inputs and outputs for 68 lakes in the U.S. states of Minnesota and Wisconsin (http://dx.doi.org/10.5066/P9AQPIVD).

  5. Dataset of the paper: "How do Hugging Face Models Document Datasets, Bias,...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 16, 2024
    Cite
    Pepe, Federica (2024). Dataset of the paper: "How do Hugging Face Models Document Datasets, Bias, and Licenses? An Empirical Study" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8200098
    Explore at:
    Dataset updated
    Jan 16, 2024
    Dataset provided by
    Pepe, Federica
    BAVOTA, Gabriele
    Nardone, Vittoria
    Di Penta, Massimiliano
    Mastropaolo, Antonio
    Canfora, Gerardo
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This replication package contains datasets and scripts related to the paper: "*How do Hugging Face Models Document Datasets, Bias, and Licenses? An Empirical Study*"

    Root directory

    • statistics.r: R script used to compute the correlation between usage and downloads, and the RQ1/RQ2 inter-rater agreements
    • modelsInfo.zip: zip file containing all the downloaded model cards (in JSON format)
    • script: directory containing all the scripts used to collect and process data. For further details, see README file inside the script directory.

    Dataset

    • Dataset/Dataset_HF-models-list.csv: list of HF models analyzed
    • Dataset/Dataset_github-prj-list.txt: list of GitHub projects using the transformers library
    • Dataset/Dataset_github-Prj_model-Used.csv: contains usage pairs: project, model
    • Dataset/Dataset_prj-num-models-reused.csv: number of models used by each GitHub project
    • Dataset/Dataset_model-download_num-prj_correlation.csv contains, for each model used by GitHub projects: the name, the task, the number of reusing projects, and the number of downloads

    RQ1

    • RQ1/RQ1_dataset-list.txt: list of HF datasets
    • RQ1/RQ1_datasetSample.csv: sample set of models used for the manual analysis of datasets
    • RQ1/RQ1_analyzeDatasetTags.py: Python script to analyze model tags for the presence of datasets. It requires unzipping modelsInfo.zip into a directory with the same name (modelsInfo) at the root of the replication package folder. It writes its output to stdout; redirect it to a file to be analyzed by the RQ2/countDataset.py script
    • RQ1/RQ1_countDataset.py: given the output of RQ2/analyzeDatasetTags.py (passed as argument) produces, for each model, a list of Booleans indicating whether (i) the model only declares HF datasets, (ii) the model only declares external datasets, (iii) the model declares both, and (iv) the model is part of the sample for the manual analysis
    • RQ1/RQ1_datasetTags.csv: output of RQ2/analyzeDatasetTags.py
    • RQ1/RQ1_dataset_usage_count.csv: output of RQ2/countDataset.py

    RQ2

    • RQ2/tableBias.pdf: table detailing the number of occurrences of different types of bias by model Task
    • RQ2/RQ2_bias_classification_sheet.csv: results of the manual labeling
    • RQ2/RQ2_isBiased.csv: file to compute the inter-rater agreement of whether or not a model documents Bias
    • RQ2/RQ2_biasAgrLabels.csv: file to compute the inter-rater agreement related to bias categories
    • RQ2/RQ2_final_bias_categories_with_levels.csv: for each model in the sample, this file lists (i) the bias leaf category, (ii) the first-level category, and (iii) the intermediate category

    RQ3

    • RQ3/RQ3_LicenseValidation.csv: manual validation of a sample of licenses
    • RQ3/RQ3_{NETWORK-RESTRICTIVE|RESTRICTIVE|WEAK-RESTRICTIVE|PERMISSIVE}-license-list.txt: lists of licenses with different permissiveness
    • RQ3/RQ3_prjs_license.csv: for each project linked to models, among other fields it indicates the license tag and name
    • RQ3/RQ3_models_license.csv: for each model, indicates among other pieces of info, whether the model has a license, and if yes what kind of license
    • RQ3/RQ3_model-prj-license_contingency_table.csv: usage contingency table between projects' licenses (columns) and models' licenses (rows)
    • RQ3/RQ3_models_prjs_licenses_with_type.csv: pairs project-model, with their respective licenses and permissiveness level

    scripts

    Contains the scripts used to mine Hugging Face and GitHub. Details are in the enclosed README

  6. Analysis, Modeling, and Simulation (AMS) Testbed Development and Evaluation...

    • catalog.data.gov
    • data.bts.gov
    • +3more
    Updated Dec 7, 2023
    Cite
    Federal Highway Administration (2023). Analysis, Modeling, and Simulation (AMS) Testbed Development and Evaluation to Support Dynamic Mobility Applications (DMA) and Active Transportation and Demand Management (ATDM) Programs: Dallas Testbed Analysis Plan [supporting datasets] [Dataset]. https://catalog.data.gov/dataset/analysis-modeling-and-simulation-ams-testbed-development-and-evaluation-to-support-dynamic-d4e77
    Explore at:
    Dataset updated
    Dec 7, 2023
    Dataset provided by
    Federal Highway Administration (https://highways.dot.gov/)
    Area covered
    Dallas
    Description

    The datasets in this zip file are in support of Intelligent Transportation Systems Joint Program Office (ITS JPO) report FHWA-JPO-16-385, "Analysis, Modeling, and Simulation (AMS) Testbed Development and Evaluation to Support Dynamic Mobility Applications (DMA) and Active Transportation and Demand Management (ATDM) Programs — Evaluation Report for ATDM Program," https://rosap.ntl.bts.gov/view/dot/32520 and FHWA-JPO-16-373, "Analysis, modeling, and simulation (AMS) testbed development and evaluation to support dynamic mobility applications (DMA) and active transportation and demand management (ATDM) programs: Dallas testbed analysis plan," https://rosap.ntl.bts.gov/view/dot/32106. The files in this zip file are specifically related to the Dallas Testbed. The compressed zip files total 2.2 GB in size. The files have been uploaded as-is; no further documentation was supplied by NTL. All located .docx files were converted to .pdf document files, which are an open, archival format; these PDFs were then added to the zip file alongside the original .docx files. These files can be unzipped using any zip compression/decompression software.

    This zip file contains files in the following formats: .pdf document files which can be read using any pdf reader; .csv text files which can be read using any text editor; .txt text files which can be read using any text editor; .docx document files which can be read in Microsoft Word and some other word processing programs; .xlsx spreadsheet files which can be read in Microsoft Excel and some other spreadsheet programs; .dat data files which may be text or multimedia; as well as GIS or mapping files in the following formats: .mxd, .dbf, .prj, .sbn, .shp, .shp.xml, which may be opened in ArcGIS or other GIS software. [software requirements] These files were last accessed in 2017.

  7. Data from: Neighborhoods in New York

    • kaggle.com
    zip
    Updated Jul 23, 2017
    Cite
    Jack Cook (2017). Neighborhoods in New York [Dataset]. https://www.kaggle.com/jackcook/neighborhoods-in-new-york
    Explore at:
    zip (1069387 bytes)
    Dataset updated
    Jul 23, 2017
    Authors
    Jack Cook
    Area covered
    New York
    Description

    Context

    This dataset contains shapefiles outlining 558 neighborhoods in 50 major cities in New York state, notably including Albany, Buffalo, Ithaca, New York City, Rochester, and Syracuse. This adds context to your datasets by identifying the neighborhood of any locations you have, as coordinates on their own don't carry a lot of information.

    Content


    Four files are included containing data about the shapes: an SHX file, a DBF file, an SHP file, and a PRJ file. Including all of them in your input data is necessary, as they each contain pieces of the data; one file alone will not have everything that you need.

    Since none of these files are plaintext, it can be a little difficult to get set up with them. I highly recommend using mapshaper.org to get started; this site will show you the boundaries drawn on a plane, as well as allow you to export the files in a number of different formats (e.g. GeoJSON, CSV) if you are unable to use them in the format they are provided in. Personally, though, I have found it easier to work with the shapefile format.

    To get started with the shapefile in R, you can use the rgdal and rgeos packages. To see an example of these being used, be sure to check out my kernel, "Incorporating neighborhoods into your model".
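    If you work in Python rather than R, geopandas reads the same shapefile directly; a minimal sketch is below. The file path is an assumption, and the commented-out spatial join only illustrates the neighborhood-tagging idea described above.

    ```python
    import geopandas as gpd

    # Read the neighborhood polygons; assumes the .shp, .shx, .dbf and .prj files
    # sit together in the same folder (the path below is illustrative).
    neighborhoods = gpd.read_file("neighborhoods-in-new-york/neighborhoods.shp")

    print(neighborhoods.crs)      # projection picked up from the .prj file
    print(neighborhoods.head())   # one row per neighborhood polygon

    # Example idea: tag points (e.g. taxi pickups) with their neighborhood.
    # pickups = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df.lon, df.lat), crs="EPSG:4326")
    # tagged = gpd.sjoin(pickups, neighborhoods.to_crs("EPSG:4326"), how="left", predicate="within")
    ```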

    Acknowledgements

    These files were provided by Zillow and are available under a Creative Commons license.


    Inspiration

    I'll be using these in the NYC Taxi Trip Duration competition to add context to the pickup and dropoff locations of the taxi rides and hopefully greatly improve my predictions.

  8. NOAA Office for Coastal Management Benthic Habitat Data, Long Island Sound,...

    • s.cnmilf.com
    • datasets.ai
    • +3more
    Updated Jul 1, 2025
    + more versions
    Cite
    (Point of Contact) (2025). NOAA Office for Coastal Management Benthic Habitat Data, Long Island Sound, Jamaica Bay, and Lower Bay of NY/NJ Harbor, NY, 1994-2002 (NCEI Accession 0089467) [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/noaa-office-for-coastal-management-benthic-habitat-data-long-island-sound-jamaica-bay-and-lower
    Explore at:
    Dataset updated
    Jul 1, 2025
    Dataset provided by
    (Point of Contact)
    Area covered
    Long Island, Jamaica Bay, Long Island Sound, New York, New York
    Description

    These data are a collection of benthic habitat data from studies conducted in the coastal Long Island Sound, NY region, provided as GIS shapefiles (.shp, .dbf, .shx, and .prj files) with associated Federal Geographic Data Committee (FGDC) metadata. Generalized browse graphics were generated by the NODC and are included with the data. Individual subdirectories include data as follows: the 2002 Long Island South Shore Estuary Benthic Habitat Polygon Data Set; 1995 benthic grab, sediment grab, and sediment profile image GIS point data files from inland harbor bays (Jamaica Bay); and 1994-1995 benthic grab, sediment grab, and sediment profile image GIS point data files from lower inland harbor bays.

  9. Metabolism estimates for 356 U.S. rivers (2007-2017): 2a. Site coordinates

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Jul 6, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). Metabolism estimates for 356 U.S. rivers (2007-2017): 2a. Site coordinates [Dataset]. https://catalog.data.gov/dataset/metabolism-estimates-for-356-u-s-rivers-2007-2017-2a-site-coordinates
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    U.S. Geological Survey
    Area covered
    United States
    Description

    This dataset provides site locations as shapefile points. The format is a shapefile for all sites combined (.shp, .shx, .dbf, and .prj files). This dataset is part of a larger data release of metabolism model inputs and outputs for 356 streams and rivers across the United States (https://doi.org/10.5066/F70864KX). The complete release includes: modeled estimates of gross primary productivity, ecosystem respiration, and the gas exchange coefficient; model input data and alternative input data; model fit and diagnostic information; site catchment boundaries and site point locations; and potential predictors of metabolism such as discharge and light availability.

  10. Walleye Thermal Optical Habitat Area (TOHA) of selected Minnesota lakes: 1...

    • data.usgs.gov
    • catalog.data.gov
    Updated Jul 24, 2024
    Cite
    Jordan Read; Gretchen Hansen; Hayley Corson-Dosch; Alison Appling; Kelsey Vitense; Samantha Oliver; Lindsay Platt (2024). Walleye Thermal Optical Habitat Area (TOHA) of selected Minnesota lakes: 1 Lake information for 881 lakes [Dataset]. http://doi.org/10.5066/P9PPHJE2
    Explore at:
    Dataset updated
    Jul 24, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Jordan Read; Gretchen Hansen; Hayley Corson-Dosch; Alison Appling; Kelsey Vitense; Samantha Oliver; Lindsay Platt
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Time period covered
    Dec 1, 1979 - Dec 31, 2018
    Area covered
    Minnesota
    Description

    This dataset provides shapefile outlines of the 881 lakes that had temperature modeled as part of this study. The format is a shapefile for all lakes combined (.shp, .shx, .dbf, and .prj files). A csv file of lake metadata is also included. This dataset is part of a larger data release of lake temperature model inputs and outputs for 881 lakes in the U.S. state of Minnesota (https://doi.org/10.5066/P9PPHJE2).

  11. Niagara Open Data

    • catalog.civicdataecosystem.org
    Updated May 13, 2025
    Cite
    (2025). Niagara Open Data [Dataset]. https://catalog.civicdataecosystem.org/dataset/niagara-open-data
    Explore at:
    Dataset updated
    May 13, 2025
    Description

    The Ontario government, generates and maintains thousands of datasets. Since 2012, we have shared data with Ontarians via a data catalogue. Open data is data that is shared with the public. Click here to learn more about open data and why Ontario releases it. Ontario’s Open Data Directive states that all data must be open, unless there is good reason for it to remain confidential. Ontario’s Chief Digital and Data Officer also has the authority to make certain datasets available publicly. Datasets listed in the catalogue that are not open will have one of the following labels: If you want to use data you find in the catalogue, that data must have a licence – a set of rules that describes how you can use it. A licence: Most of the data available in the catalogue is released under Ontario’s Open Government Licence. However, each dataset may be shared with the public under other kinds of licences or no licence at all. If a dataset doesn’t have a licence, you don’t have the right to use the data. If you have questions about how you can use a specific dataset, please contact us. The Ontario Data Catalogue endeavors to publish open data in a machine readable format. For machine readable datasets, you can simply retrieve the file you need using the file URL. The Ontario Data Catalogue is built on CKAN, which means the catalogue has the following features you can use when building applications. APIs (Application programming interfaces) let software applications communicate directly with each other. If you are using the catalogue in a software application, you might want to extract data from the catalogue through the catalogue API. Note: All Datastore API requests to the Ontario Data Catalogue must be made server-side. The catalogue's collection of dataset metadata (and dataset files) is searchable through the CKAN API. The Ontario Data Catalogue has more than just CKAN's documented search fields. You can also search these custom fields. You can also use the CKAN API to retrieve metadata about a particular dataset and check for updated files. Read the complete documentation for CKAN's API. Some of the open data in the Ontario Data Catalogue is available through the Datastore API. You can also search and access the machine-readable open data that is available in the catalogue. How to use the API feature: Read the complete documentation for CKAN's Datastore API. The Ontario Data Catalogue contains a record for each dataset that the Government of Ontario possesses. Some of these datasets will be available to you as open data. Others will not be available to you. This is because the Government of Ontario is unable to share data that would break the law or put someone's safety at risk. You can search for a dataset with a word that might describe a dataset or topic. Use words like “taxes” or “hospital locations” to discover what datasets the catalogue contains. You can search for a dataset from 3 spots on the catalogue: the homepage, the dataset search page, or the menu bar available across the catalogue. On the dataset search page, you can also filter your search results. You can select filters on the left hand side of the page to limit your search for datasets with your favourite file format, datasets that are updated weekly, datasets released by a particular organization, or datasets that are released under a specific licence. Go to the dataset search page to see the filters that are available to make your search easier. 
You can also do a quick search by selecting one of the catalogue’s categories on the homepage. These categories can help you see the types of data we have on key topic areas. When you find the dataset you are looking for, click on it to go to the dataset record. Each dataset record will tell you whether the data is available, and, if so, tell you about the data available. An open dataset might contain several data files. These files might represent different periods of time, different sub-sets of the dataset, different regions, language translations, or other breakdowns. You can select a file and either download it or preview it. Make sure to read the licence agreement to make sure you have permission to use it the way you want. Read more about previewing data. A non-open dataset may be not available for many reasons. Read more about non-open data. Read more about restricted data. Data that is non-open may still be subject to freedom of information requests. The catalogue has tools that enable all users to visualize the data in the catalogue without leaving the catalogue – no additional software needed. Have a look at our walk-through of how to make a chart in the catalogue. Get automatic notifications when datasets are updated. You can choose to get notifications for individual datasets, an organization’s datasets or the full catalogue. You don’t have to provide and personal information – just subscribe to our feeds using any feed reader you like using the corresponding notification web addresses. Copy those addresses and paste them into your reader. Your feed reader will let you know when the catalogue has been updated. The catalogue provides open data in several file formats (e.g., spreadsheets, geospatial data, etc). Learn about each format and how you can access and use the data each file contains. A file that has a list of items and values separated by commas without formatting (e.g. colours, italics, etc.) or extra visual features. This format provides just the data that you would display in a table. XLSX (Excel) files may be converted to CSV so they can be opened in a text editor. How to access the data: Open with any spreadsheet software application (e.g., Open Office Calc, Microsoft Excel) or text editor. Note: This format is considered machine-readable, it can be easily processed and used by a computer. Files that have visual formatting (e.g. bolded headers and colour-coded rows) can be hard for machines to understand, these elements make a file more human-readable and less machine-readable. A file that provides information without formatted text or extra visual features that may not follow a pattern of separated values like a CSV. How to access the data: Open with any word processor or text editor available on your device (e.g., Microsoft Word, Notepad). A spreadsheet file that may also include charts, graphs, and formatting. How to access the data: Open with a spreadsheet software application that supports this format (e.g., Open Office Calc, Microsoft Excel). Data can be converted to a CSV for a non-proprietary format of the same data without formatted text or extra visual features. A shapefile provides geographic information that can be used to create a map or perform geospatial analysis based on location, points/lines and other data about the shape and features of the area. It includes required files (.shp, .shx, .dbt) and might include corresponding files (e.g., .prj). How to access the data: Open with a geographic information system (GIS) software program (e.g., QGIS). 
A package of files and folders. The package can contain any number of different file types. How to access the data: Open with an unzipping software application (e.g., WinZIP, 7Zip). Note: If a ZIP file contains .shp, .shx, and .dbt file types, it is an ArcGIS ZIP: a package of shapefiles which provide information to create maps or perform geospatial analysis that can be opened with ArcGIS (a geographic information system software program). A file that provides information related to a geographic area (e.g., phone number, address, average rainfall, number of owl sightings in 2011 etc.) and its geospatial location (i.e., points/lines). How to access the data: Open using a GIS software application to create a map or do geospatial analysis. It can also be opened with a text editor to view raw information. Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. A text-based format for sharing data in a machine-readable way that can store data with more unconventional structures such as complex lists. How to access the data: Open with any text editor (e.g., Notepad) or access through a browser. Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. A text-based format to store and organize data in a machine-readable way that can store data with more unconventional structures (not just data organized in tables). How to access the data: Open with any text editor (e.g., Notepad). Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. A file that provides information related to an area (e.g., phone number, address, average rainfall, number of owl sightings in 2011 etc.) and its geospatial location (i.e., points/lines). How to access the data: Open with a geospatial software application that supports the KML format (e.g., Google Earth). Note: This format is machine-readable, and it can be easily processed and used by a computer. Human-readable data (including visual formatting) is easy for users to read and understand. This format contains files with data from tables used for statistical analysis and data visualization of Statistics Canada census data. How to access the data: Open with the Beyond 20/20 application. A database which links and combines data from different files or applications (including HTML, XML, Excel, etc.). The database file can be converted to a CSV/TXT to make the data machine-readable, but human-readable formatting will be lost. How to access the data: Open with Microsoft Office Access (a database management system used to develop application software). A file that keeps the original layout and
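    The CKAN search API mentioned in the description above can be queried directly from code. A minimal sketch, assuming the catalogue is reachable at data.ontario.ca (the host, query term, and row count are illustrative; package_search is a standard CKAN action):

    ```python
    import requests

    BASE = "https://data.ontario.ca/api/3/action"   # assumed CKAN endpoint for the catalogue

    # Search dataset metadata, then list each match with its available file formats.
    resp = requests.get(f"{BASE}/package_search",
                        params={"q": "hospital locations", "rows": 5})
    resp.raise_for_status()
    result = resp.json()["result"]

    print("datasets matching:", result["count"])
    for pkg in result["results"]:
        # each resource record carries a format (CSV, SHP, XLSX, ...) and a download URL
        formats = {res.get("format") for res in pkg.get("resources", [])}
        print(pkg["title"], "-", ", ".join(sorted(f for f in formats if f)))
    ```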

  12. ZIP+4. Complete dataset based on US postal data consisting of plus 35...

    • datarade.ai
    .json, .csv, .txt
    Updated Aug 9, 2022
    Cite
    Geojunxion (2022). ZIP+4. Complete dataset based on US postal data consisting of plus 35 millions of polygons​ [Dataset]. https://datarade.ai/data-products/zip-4-complete-dataset-based-on-us-postal-data-consisting-of-geojunxion
    Explore at:
    Dataset updated
    Aug 9, 2022
    Dataset provided by
    GeoJunxion (http://www.geojunxion.com/)
    Authors
    Geojunxion
    Area covered
    United States
    Description

    GeoJunxion's ZIP+4 is a complete dataset based on US postal data consisting of more than 35 million polygons. The dataset is not just a table of point data that can be downloaded as a CSV or other text file, as with other suppliers. The data can be delivered as a shapefile through a single raw data delivery or through an API.

    The January 2021 USPS data source has significantly changed since the previous delivery. Some States have sizably lower ZIP+4 totals across all counties when compared with previous deliveries due to USPS parcelpoint cleanup, while other States have a significant increase in ZIP+4 totals across all counties due to cleanup and other rezoning. California and North Carolina in particular have several new ZIP5s, contributing to the increase in distinct ZIPs and ZIP+4s​.

    GeoJunxion's ZIP+4 data can be used as an additional layer on an existing map to run customer or other analyses, e.g. who is my customer and who is not, or what is the density of my customer base in a certain ZIP+4.

    Because the data are polygons, information can be put into visual context, which is useful for complex overviews and management decisions. CRM data can be enriched with the ZIP+4 to provide more detailed customer information.

    Key specifications:

    • Topologized ZIP polygons
    • GeoJunxion ZIP+4 polygons follow USPS postal codes
    • ZIP+4 code polygons with ZIP5 attributes and state codes
    • Overlapping ZIP+4 boundaries for multiple ZIP+4 addresses in one area
    • Updated USPS source (January 2021)
    • Distinct ZIP5 codes: 34 731
    • Distinct ZIP+4 codes: 35 146 957

    The ZIP+4 polygons are delivered in Esri shapefile format. This format allows the storage of geometry and attribute information for each of the features.

    The four components of the shapefile data are:

    • .shp – stores the geometry of the features
    • .shx – stores an index of the feature geometry
    • .dbf – stores attribute information relating to individual features
    • .prj – stores projection information associated with the features

    ​Current release version 2021. Earlier versions from previous years available on request.

  13. The Geography of Oxia Planum 01 Geography and Quad Grids

    • ordo.open.ac.uk
    • figshare.com
    zip
    Updated Jun 2, 2023
    Cite
    Peter Fawdon; Peter Grindrod; Csilla Orgel; Elliot Sefton-Nash; Solmaz Adeli; Matt Balme; Gabriele Cremonese; Joel Davis; Alessandro Frigeri; Ernst Hauber; Laetitia Le Deit; Damien Loizeau; Andrea Nass; Adam Parks-Bowen; Cathy Quantin-Nataf; Nick Thomas; Jorge L. Vago; Matthieu Volat (2023). The Geography of Oxia Planum 01 Geography and Quad Grids [Dataset]. http://doi.org/10.21954/ou.rd.16451205.v1
    Explore at:
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    The Open University
    Authors
    Peter Fawdon; Peter Grindrod; Csilla Orgel; Elliot Sefton-Nash; Solmaz Adeli; Matt Balme; Gabriele Cremonese; Joel Davis; Alessandro Frigeri; Ernst Hauber; Laetitia Le Deit; Damien Loizeau; Andrea Nass; Adam Parks-Bowen; Cathy Quantin-Nataf; Nick Thomas; Jorge L. Vago; Matthieu Volat
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set provides a grid of quads and projection information to be used for rover operations, and the informal geographic naming convention for the regional geography of Oxia Planum. Both are subject to update prior to the landed mission.

    Contents: This data set contains 4 shapefiles and 1 zipped folder.

    • OxiaPlanum_GeographicFeatures_2021_08_26: point shapefile with the names of geographic features, last updated at the date indicated
    • OxiaPlanum_GeographicRegions_2021_08_26: polygon shapefile with the outlines of geographic regions fitted to the master quad grid, last updated at the date indicated
    • OxiaPlanum_QuadGrid_1km: polygon shapefile of the 1 km quad grid that will be used for the ExoMars rover mission
    • OxiaPlanum_Origin_clong_335_45E_18_20N: the center point of Oxia Planum as defined by the Rover Operations and Control Center, used as the origin point for the quad grid
    • CRS_PRJ_Equirectangular_OxiaPlanum_Mars2000.zip: zip folder containing the projection information used for all the data associated with this study, saved in ESRI projection (.prj) and well-known text (.wkt) formats

    Guide to individual files (example):

    • OxiaPlanum_QuadGrid_1km.cpg: text display information
    • OxiaPlanum_QuadGrid_1km.dbf: database file
    • OxiaPlanum_QuadGrid_1km.prj: projection information
    • OxiaPlanum_QuadGrid_1km.sbx: spatial index file
    • OxiaPlanum_QuadGrid_1km.shp: shape file data

  14. Data from: Changes in the building stock of DaNang between 2015 and 2017

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 9, 2020
    Cite
    Warth, Gebhard (2020). Changes in the building stock of DaNang between 2015 and 2017 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3757709
    Explore at:
    Dataset updated
    May 9, 2020
    Dataset provided by
    Bachofer, Felix
    Hochschild, Volker
    Bui, Tram
    Tran, Hao
    Braun, Andreas
    Warth, Gebhard
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Da Nang, Da Nang
    Description

    Description

    This dataset consists of two vector files which show the change in the building stock of the City of DaNang retrieved from satellite image analysis. Buildings were first identified from a Pléiades satellite image from 24.10.2015 and classified into 9 categories in a semi-automatic workflow described by Warth et al. (2019) and Vetter-Gindele et al. (2019).

    In a second step, these buildings were inspected for changes by visual interpretation of a second Pléiades satellite image acquired on 13.08.2017. Changes were also classified into 5 categories and aggregated by administrative wards (first dataset: adm) and by a hexagon grid of 250 meter length (second dataset: hex).

    The full workflow of the generation of this dataset, including a detailed description of its contents and a discussion of its potential use, is published by Braun et al. 2020: Changes in the building stock of DaNang between 2015 and 2017.

    Contents

    Both datasets (adm and hex) are stored as ESRI shapefiles which can be used in common Geographic Information Systems (GIS) and consist of the following parts:

    shp: polygon geometries (geometries of the administrative boundaries and hexagons)

    dbf: attribute table (containing the number of buildings per class for 2015 and 2017 and the underlying changes, e.g. number of new buildings, number of demolished buildings, etc.)

    shx: index file combining the geometries with the attributes

    cpg: encoding of the attributes (UTF-8)

    prj: spatial reference of the datasets (UTM zone 49 North, EPSG:32649) for ArcGIS

    qpj: spatial reference of the datasets (UTM zone 49 North, EPSG:32649) for QGIS

    lyr: symbology suggestion for the polygons (predefined is the number of local-type shophouses in 2017) for ArcGIS

    qml: symbology suggestion for the polygons (predefined is the number of new buildings between 2015 and 2017) for QGIS

    Citation and documentation

    To cite this dataset, please refer to the publication

    Braun, A.; Warth, G.; Bachofer, F.; Quynh Bui, T.T.; Tran, H.; Hochschild, V. (2020): Changes in the Building Stock of Da Nang between 2015 and 2017. Data, 5, 42. doi:10.3390/data5020042

    This article contains a detailed description of the dataset, the defined building type classes and the types of changes which were analyzed. Furthermore, the article makes recommendations on the use of the datasets and discusses potential error sources.

  15. Data release: Process-based predictions of lake water temperature in the...

    • datasets.ai
    • data.usgs.gov
    • +3more
    55
    Updated Sep 28, 2024
    Cite
    Department of the Interior (2024). Data release: Process-based predictions of lake water temperature in the Midwest US [Dataset]. https://datasets.ai/datasets/data-release-process-based-predictions-of-lake-water-temperature-in-the-midwest-us-36d1e
    Explore at:
    Dataset updated
    Sep 28, 2024
    Dataset authored and provided by
    Department of the Interior
    Area covered
    Midwestern United States, United States
    Description

    Climate change has been shown to influence lake temperatures in different ways. To better understand the diversity of lake responses to climate change and give managers tools to manage individual lakes, we focused on improving prediction accuracy for daily water temperature profiles in 7,150 lakes in Minnesota and Wisconsin during 1980-2019.

    The data are organized into these items:

    1. Spatial data - A lake metadata file, and one shapefile of polygons for all 7,150 lakes in this study (.shp, .shx, .dbf, and .prj files)
    2. Model configurations - Model parameters and metadata used to configure models (1 JSON file, with metadata for each of 7,150 lakes, and one zip file with each lake's glm2.nml file)
    3. Temperature observations - Data formatted as model inputs for training, calibrating, or evaluating temperature models
    4. Model inputs - Data used to drive predictive models (35 zip files with ice-flags; 35 zip files with daily meteorological data)
    5. Prediction data - Predictions from calibrated and uncalibrated PB models (35 zip files)
    6. Predicted habitat - Data formatted for ecological use

    This study was funded by the Department of the Interior Northeast and North Central Climate Adaptation Science Centers. Access to computing facilities was provided by USGS Core Science Analytics and Synthesis Advanced Research Computing, USGS Yeti Supercomputer (https://doi.org/10.5066/F7D798MJ).

  16. Dataset of passive microwave SSM / I and SSMIS brightness temperature in...

    • tpdc.ac.cn
    • data.tpdc.ac.cn
    zip
    Updated May 5, 2022
    + more versions
    Cite
    Snow National (2022). Dataset of passive microwave SSM / I and SSMIS brightness temperature in China (1987-2015) [Dataset]. https://www.tpdc.ac.cn/view/googleSearch/dataDetail?metadataId=fc525ad4-035b-4bc8-9120-da0405edda02
    Explore at:
    Dataset updated
    May 5, 2022
    Dataset provided by
    Tanzania Petroleum Development Corporation (http://tpdc.co.tz/)
    Authors
    Snow National
    Area covered
    Description

    This dataset mainly includes the twice-daily (ascending/descending orbit) brightness temperature (K) of the space-borne microwave radiometers SSM/I and SSMIS carried by the US Defense Meteorological Satellite Program satellites (DMSP-F08, DMSP-F11, DMSP-F13, and DMSP-F17), with time coverage from September 15, 1987 to December 31, 2015. The SSM/I brightness temperatures of DMSP-F08, DMSP-F11 and DMSP-F13 include 7 channels: 19.35H, 19.35V, 22.24V, 37.05H, 37.05V, 85.50H and 85.50V; the SSMIS brightness temperature observations of DMSP-F17 consist of seven channels: 19.35H, 19.35V, 22.24V, 37.05H, 37.05V, 91.66H and 91.66V. DMSP-F08 brightness temperatures cover September 15, 1987 to December 31, 1991; DMSP-F11 covers January 1, 1992 to December 31, 1995; DMSP-F13 covers January 1, 1996 to April 29, 2009; and DMSP-F17 covers January 1, 2009 to December 31, 2015.
    1. File format and naming: The brightness temperature is stored by year, each directory is composed of remote sensing data files for each frequency, and the SSMIS data also contain a .TIM time information file. The data file names and their naming rules are as follows: EASE-Fnn-ML/HyyyydddA/D.subset.ccH/V (remote sensing data); EASE-Fnn-ML/HyyyydddA/D.subset.TIM (time information file). Here EASE stands for the EASE-Grid projection method; Fnn stands for the satellite number (F08, F11, F13, F17); ML/H stands for multi-channel low resolution or multi-channel high resolution; yyyy represents the year; ddd represents the Julian day of the year (1-365/366); A/D stands for ascending (A) or descending (D); subset represents brightness temperature data over China; cc represents the frequency (19.35 GHz, 22.24 GHz, 37.05 GHz, 85.50 GHz, 91.66 GHz); and H/V stands for horizontal (H) or vertical (V) polarization.
    2. Coordinate system and projection: The projection of this data set is EASE-Grid, an equal-area secant cylindrical projection with double standard parallels at 30° north and south. For more information about EASE-Grid, please refer to http://www.ncgia.ucsb.edu/globalgrids-book/ease_grid/. If you need to convert the EASE-Grid projection to a geographic projection, refer to the ease2geo.prj file, whose content is as follows:
    Input
    projection cylindrical
    units meters
    parameters 6371228 6371228
    1 /* Enter projection type (1, 2, or 3)
    0 00 00 /* Longitude of central meridian
    30 00 00 /* Latitude of standard parallel
    Output
    Projection GEOGRAPHIC
    Spheroid KRASovsky
    Units dd
    parameters
    end
    3. Data format: Stored as binary integers in a 308 * 166 grid, with each value occupying 2 bytes. The stored values are the brightness temperature * 10; after reading the data, divide by 10 to get the real brightness temperature.
    4. Data resolution: Spatial resolution: 25.067525 km, and 12.5 km for SSM/I 85 GHz and SSMIS 91 GHz. Temporal resolution: daily, from 1978 to 2015.
    5. Spatial range: Longitude: 60.1°-140.0° east; Latitude: 14.9°-55.0° north.
    6. Data reading: Remote sensing image data files in each set of data can be opened in ArcMap, ENVI and ERDAS software.
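    A minimal sketch for reading one of these binary files in Python follows. The byte order and row/column orientation are not stated in the documentation, so the little-endian dtype and the 166 x 308 reshape below are assumptions to verify against a known image; the file name is illustrative.

    ```python
    import numpy as np

    N_COLS, N_ROWS = 308, 166                   # grid size given in the description
    path = "EASE-F13-ML1996001A.subset.19H"     # illustrative file name

    raw = np.fromfile(path, dtype="<i2")        # 2-byte integers, little-endian assumed
    tb = raw.reshape(N_ROWS, N_COLS) / 10.0     # stored values are brightness temperature * 10 (K)

    print(tb.shape, float(tb.min()), float(tb.max()))
    ```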

  17. Data from: Global prevalence of non-perennial rivers and streams

    • figshare.com
    zip
    Updated Jun 3, 2021
    Cite
    Mathis Messager; Bernhard Lehner (2021). Global prevalence of non-perennial rivers and streams [Dataset]. http://doi.org/10.6084/m9.figshare.14633022.v1
    Explore at:
    Dataset updated
    Jun 3, 2021
    Dataset provided by
    figshare (http://figshare.com/)
    Authors
    Mathis Messager; Bernhard Lehner
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Global prevalence of non-perennial rivers and streams
    June 2021
    Prepared by Mathis L. Messager (mathis.messager@mail.mcgill.ca) and Bernhard Lehner (bernhard.lehner@mcgill.ca)

    Contents: 1. Overview and background; 2. Repository content; 3. Data format and projection; 4. License and citations (4.1 License agreement, 4.2 Citations and acknowledgements)

    1. Overview and background
    This documentation describes the data produced for the research article: Messager, M. L., Lehner, B., Cockburn, C., Lamouroux, N., Pella, H., Snelder, T., Tockner, K., Trautmann, T., Watt, C. & Datry, T. (2021). Global prevalence of non-perennial rivers and streams. Nature. https://doi.org/10.1038/s41586-021-03565-5
    In this study, we developed a statistical Random Forest model to produce the first reach-scale estimate of the global distribution of non-perennial rivers and streams. For this purpose, we linked quality-checked observed streamflow data from 5,615 gauging stations (on 4,428 perennial and 1,187 non-perennial reaches) with 113 candidate environmental predictors available globally. Predictors included variables describing climate, physiography, land cover, soil, geology, and groundwater as well as estimates of long-term naturalised (i.e., without anthropogenic water use in the form of abstractions or impoundments) mean monthly and mean annual flow (MAF), derived from a global hydrological model (WaterGAP 2.2; Müller Schmied et al. 2014). Following model training and validation, we predicted the probability of flow intermittence for all river reaches in the RiverATLAS database (Linke et al. 2019), a digital representation of the global river network at high spatial resolution.
    The data repository includes two datasets resulting from this study:
    1. a geometric network of the global river system where each river segment is associated with (i) 113 hydro-environmental predictors used in model development and predictions, and (ii) the probability and class of flow intermittence predicted by the model.
    2. point locations of the 5,516 gauging stations used in model training/testing, where each station is associated with a line segment representing a reach in the river network, and a set of metadata.
    These datasets have been generated with source code located at messamat.github.io/globalirmap/. Note that, although several attributes initially included in RiverATLAS version 1.0 have been updated for this study, the dataset provided here is not an established new version of RiverATLAS.

    2. Repository content
    The data repository has the following structure (for usage, see section 3. Data format and projection; GIRES stands for Global Intermittent Rivers and Ephemeral Streams):
    • GIRES_v10_gdb.zip/ : file geodatabase in ESRI® geodatabase format containing two feature classes (zipped)
      - GIRES_v10_rivers : river network lines
      - GIRES_v10_stations : points with streamflow summary statistics and metadata
    • GIRES_v10_shp.zip/ : directory containing ten shapefiles (zipped); same content as GIRES_v10_gdb.zip for users that cannot read ESRI geodatabases (tiled by region due to size limitations)
      - GIRES_v10_rivers_af.shp : Africa
      - GIRES_v10_rivers_ar.shp : North American Arctic
      - GIRES_v10_rivers_as.shp : Asia
      - GIRES_v10_rivers_au.shp : Australasia
      - GIRES_v10_rivers_eu.shp : Europe
      - GIRES_v10_rivers_gr.shp : Greenland
      - GIRES_v10_rivers_na.shp : North America
      - GIRES_v10_rivers_sa.shp : South America
      - GIRES_v10_rivers_si.shp : Siberia
      - GIRES_v10_stations.shp : points with streamflow summary statistics and metadata
    • Other_technical_documentations.zip/ : directory containing three documentation files (zipped)
      - HydroATLAS_TechDoc_v10.pdf : documentation for the river network framework
      - RiverATLAS_Catalog_v10.pdf : documentation for the river network hydro-environmental attributes
      - Readme_GSIM_part1.txt : documentation for gauging stations from the Global Streamflow Indices and Metadata (GSIM) archive
    • README_Technical_documentation_GIRES_v10.pdf : full documentation for this repository

    3. Data format and projection
    The geometric network (lines) and gauging stations (points) datasets are distributed both in ESRI® file geodatabase and shapefile formats. The file geodatabase contains all data and is the prime, recommended format. Shapefiles are provided as a copy for users that cannot read the geodatabase. Each shapefile consists of five main files (.dbf, .sbn, .sbx, .shp, .shx), and projection information is provided in an ASCII text file (.prj). The attribute table can be accessed as a stand-alone file in dBASE format (.dbf), which is included in the shapefile format. These datasets are available electronically in compressed zip file format. To use the data files, the zip files must first be decompressed.
    All data layers are provided in geographic (latitude/longitude) projection, referenced to datum WGS84. In ESRI® software this projection is defined by the geographic coordinate system GCS_WGS_1984 and datum D_WGS_1984 (EPSG: 4326).

    4. License and citations
    4.1 License agreement: This documentation and datasets are licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC-BY-4.0 License). For all regulations regarding license grants, copyright, redistribution restrictions, required attributions, disclaimer of warranty, indemnification, liability, waiver of damages, and a precise definition of licensed materials, please refer to the License Agreement (https://creativecommons.org/licenses/by/4.0/legalcode). For a human-readable summary of the license, please see https://creativecommons.org/licenses/by/4.0/.
    4.2 Citations and acknowledgements: Citations and acknowledgements of this dataset should be made as follows: Messager, M. L., Lehner, B., Cockburn, C., Lamouroux, N., Pella, H., Snelder, T., Tockner, K., Trautmann, T., Watt, C. & Datry, T. (2021). Global prevalence of non-perennial rivers and streams. Nature. https://doi.org/10.1038/s41586-021-03565-5. We kindly ask users to cite this study in any published material produced using it. If possible, online links to this repository (https://doi.org/10.6084/m9.figshare.14633022) should also be provided.

  18. Metabolism estimates for 356 U.S. rivers (2007-2017): 2b. Site catchment...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). Metabolism estimates for 356 U.S. rivers (2007-2017): 2b. Site catchment boundaries [Dataset]. https://catalog.data.gov/dataset/metabolism-estimates-for-356-u-s-rivers-2007-2017-2b-site-catchment-boundaries
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    U.S. Geological Survey
    Area covered
    United States
    Description

    This dataset provides shapefile outlines of the catchments contributing to sites where metabolism was or could have been estimated. The format is a shapefile for all sites combined (.shp, .shx, .dbf, and .prj files). This dataset is part of a larger data release of metabolism model inputs and outputs for 356 streams and rivers across the United States (https://doi.org/10.5066/F70864KX). The complete release includes: modeled estimates of gross primary productivity, ecosystem respiration, and the gas exchange coefficient; model input data and alternative input data; model fit and diagnostic information; site catchment boundaries and site point locations; and potential predictors of metabolism such as discharge and light availability.

  19. Geodatabase of ultramafic soils of the Americas

    • dataone.org
    • data.niaid.nih.gov
    • +3more
    Updated Apr 18, 2024
    Cite
    Catherine Hulshof (2024). Geodatabase of ultramafic soils of the Americas [Dataset]. http://doi.org/10.5061/dryad.4xgxd25gj
    Explore at:
    Dataset updated
    Apr 18, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Catherine Hulshof
    Time period covered
    Jan 1, 2023
    Area covered
    Americas
    Description

    This is a compiled geospatial dataset in ESRI polygon shapefile format of ultramafic soils of the Americas showing the location of ultramafic soils in Canada, the United States of America, Mexico, Guatemala, Cuba, Dominican Republic, Puerto Rico, Costa Rica, Colombia, Argentina, Chile, Venezuela, Ecuador, Brazil, Suriname, French Guiana, and Bolivia. The R code used to compile the dataset as well as an image of the compiled dataset are also included. The data are derived from ten geospatial datasets. Original datasets were subset to include only ultramafic areas, datasets were assigned a common projection (WGS84), attribute tables were reconciled to a common set of fields, and the datasets were combined.

    README: Geodatabase of ultramafic soils of the Americas

    Author: Catherine Hulshof, Virginia Commonwealth University, cmhulshof@vcu.edu

    Abstract: This is a compiled geospatial dataset in ESRI polygon shapefile format of ultramafic soils of many countries in the Americas showing the location of ultramafic soils in Canada, the United States of America, Guatemala, Cuba, Dominican Republic, Puerto Rico, Costa Rica, Colombia, Argentina, Chile, Venezuela, Ecuador, Brazil, Suriname, French Guiana, and Bolivia. The data are derived from nine geospatial datasets. Original datasets were subset to include only ultramafic areas, datasets were assigned a common projection (WGS84), attribute tables were reconciled to a common set of fields, and the datasets were combined.

    Contents: The data are in ESRI shapefile format and thus have four components with extensions .shp, .shx, .prj, and .dbf. The .shp file contains the feature geometries, the .prj file contains the geographic coordin...

  20. COVID-19 US County JHU Data & Demographics

    • kaggle.com
    Updated Mar 1, 2023
    Cite
    Heads or Tails (2023). COVID-19 US County JHU Data & Demographics [Dataset]. https://www.kaggle.com/headsortails/covid19-us-county-jhu-data-demographics/code
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 1, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Heads or Tails
    Area covered
    United States
    Description

    Context

    The United States has recently become the country with the most reported cases of 2019 Novel Coronavirus (COVID-19). This dataset contains daily updated numbers of reported cases & deaths in the US at the state and county level, as provided by the Johns Hopkins University. In addition, I provide matching demographic information for US counties.

    Content

    The dataset consists of two main csv files: covid_us_county.csv and us_county.csv. See the column descriptions below for more detailed information. In addition, I've added US county shape files for geospatial plots: us_county.shp/dbf/prj/shx.

    • covid_us_county.csv: COVID-19 cases and deaths which will be updated daily. The data is provided by the Johns Hopkins University through their excellent github repo. I combined the separate "confirmed cases" and "deaths" files into a single table, removed a few (I think to be) redundant geo identifier columns, and reshaped the data into long format with a single date column. The earliest recorded cases are from 2020-01-22.

    • us_counties.csv: Demographic information on the US county level based on the (most recent) 2014-18 release of the American Community Survey. Derived via the great tidycensus package.

    Column Description

    COVID-19 dataset covid_us_county.csv:

    • fips: County code in numeric format (i.e. no leading zeros). A small number of cases have NA values here, but can still be used for state-wise aggregation. Currently, this only affects the states of Massachusetts and Missouri.

    • county: Name of the US county. This is NA for the (aggregated counts of the) territories of American Samoa, Guam, Northern Mariana Islands, Puerto Rico, and Virgin Islands.

    • state: Name of US state or territory.

    • state_code: Two letter abbreviation of US state (e.g. "CA" for "California"). This feature has NA values for the territories listed above.

    • lat and long: coordinates of the county or territory.

    • date: Reporting date.

    • cases & deaths: Cumulative numbers for cases & deaths.

    Demographic dataset us_counties.csv:

    • fips, county, state, state_code: same as above. The county names are slightly different; mostly the difference is that this dataset has the word "County" added. I recommend joining on fips (see the sketch after this list).

    • male & female: Population numbers for male and female.

    • population: Total population for the county. Provided as a convenience feature; it is always the sum of male + female.

    • female_percentage: Another convenience feature: female / population in percent.

    • median_age: Overall median age for the county.
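
    A minimal sketch of that join in pandas (the column selection and the per-100k example are illustrative, not part of the dataset):

    ```python
    import pandas as pd

    # Join daily case counts to county demographics on the fips code.
    covid = pd.read_csv("covid_us_county.csv", parse_dates=["date"])
    demo = pd.read_csv("us_counties.csv")

    merged = covid.merge(
        demo[["fips", "population", "median_age"]],
        on="fips",
        how="left",   # left join keeps rows with missing fips (territories, unassigned)
    )

    # Example: cumulative cases per 100,000 residents on the latest reporting date.
    latest = merged[merged["date"] == merged["date"].max()].copy()
    latest["cases_per_100k"] = latest["cases"] / latest["population"] * 100_000
    print(latest[["county", "state", "cases", "cases_per_100k"]].head())
    ```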

    Acknowledgements

    Data provided for educational and academic research purposes by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE).

    Licence

    The github repo states that:

    This GitHub repo and its contents herein, including all data, mapping, and analysis, copyright 2020 Johns Hopkins University, all rights reserved, is provided to the public strictly for educational and academic research purposes. The Website relies upon publicly available data from multiple sources, that do not always agree. The Johns Hopkins University hereby disclaims any and all representations and warranties with respect to the Website, including accuracy, fitness for use, and merchantability. Reliance on the Website for medical guidance or use of the Website in commerce is strictly prohibited.
    

    Version history

    • In version 1, a small number of cases had values of `county == "Unassigned"`. Those have been superseded.
    • Version 5: added US county shape files