100+ datasets found
  1. EPA FRS Facilities Combined File CSV Download for the State of Arkansas

    • catalog.data.gov
    Updated Nov 29, 2020
    Cite
    U.S. EPA Office of Environmental Information (OEI) - Office of Information Collection (OIC) (2020). EPA FRS Facilities Combined File CSV Download for the State of Arkansas [Dataset]. https://catalog.data.gov/dataset/epa-frs-facilities-combined-file-csv-download-for-the-state-of-arkansas
    Dataset updated
    Nov 29, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Area covered
    Arkansas
    Description

    The Facility Registry System (FRS) identifies facilities, sites, or places subject to environmental regulation or of environmental interest to EPA programs or delegated states. Using vigorous verification and data management procedures, FRS integrates facility data from program national systems, state master facility records, tribal partners, and other federal agencies and provides the Agency with a centrally managed, single source of comprehensive and authoritative information on facilities.

  2. Indian Latitude and Longitude

    • kaggle.com
    Updated Jan 29, 2021
    Cite
    Anurag Peddi (2021). Indian Latitude and Longitude [Dataset]. https://www.kaggle.com/datasets/anurag1817/indian-latlong
    Dataset updated
    Jan 29, 2021
    Dataset provided by
    Kaggle
    Authors
    Anurag Peddi
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    India
    Description

    A dataset containing the latitude and longitude of the 29 Indian states.

  3. EPA FRS Facilities Combined File CSV Download for the State of Texas

    • catalog.data.gov
    Updated Nov 29, 2020
    Cite
    U.S. EPA Office of Environmental Information (OEI) - Office of Information Collection (OIC) (2020). EPA FRS Facilities Combined File CSV Download for the State of Texas [Dataset]. https://catalog.data.gov/dataset/epa-frs-facilities-combined-file-csv-download-for-the-state-of-texas
    Dataset updated
    Nov 29, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Area covered
    Texas
    Description

    The Facility Registry System (FRS) identifies facilities, sites, or places subject to environmental regulation or of environmental interest to EPA programs or delegated states. Using vigorous verification and data management procedures, FRS integrates facility data from program national systems, state master facility records, tribal partners, and other federal agencies and provides the Agency with a centrally managed, single source of comprehensive and authoritative information on facilities.

  4. The New York Times Coronavirus (Covid-19) Cases and Deaths in the United States

    • data.amerigeoss.org
    csv
    Updated Mar 30, 2023
    Cite
    UN Humanitarian Data Exchange (2023). The New York Times Coronavirus (Covid-19) Cases and Deaths in the United States [Dataset]. https://data.amerigeoss.org/sl/dataset/nyt-covid-19-data
    Available download formats: csv
    Dataset updated
    Mar 30, 2023
    Dataset provided by
    UN Humanitarian Data Exchange
    Area covered
    United States
    Description

    The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.

    Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.

    We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.

    The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.

    United States Data

    Data on cumulative coronavirus cases and deaths can be found in two files for states and counties.

    Each row of data reports cumulative counts based on our best reporting up to the moment we publish an update. We do our best to revise earlier entries in the data when we receive new information.

    Both files contain FIPS codes, a standard geographic identifier, to make it easier for an analyst to combine this data with other data sets like a map file or population data.

    State-Level Data

    State-level data can be found in the us-states.csv file.

    date,state,fips,cases,deaths
    2020-01-21,Washington,53,1,0
    ...
    

    County-Level Data

    County-level data can be found in the us-counties.csv file.

    date,county,state,fips,cases,deaths
    2020-01-21,Snohomish,Washington,53061,1,0
    ...
    

    In some cases, the geographies where cases are reported do not map to standard county boundaries. See the list of geographic exceptions for more detail on these.
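    Because each row reports a cumulative count, day-over-day increases have to be derived by differencing consecutive rows for the same FIPS code. The following is a minimal sketch of that step using the us-states.csv column layout shown above. The function name `daily_new_cases` and the second and third sample rows are hypothetical and only illustrate the calculation; they are not part of the published data.

```python
import csv
import io
from collections import defaultdict

# Sample in the us-states.csv layout. Only the first data row comes from the
# description above; the other two rows are made-up values for illustration.
SAMPLE = """date,state,fips,cases,deaths
2020-01-21,Washington,53,1,0
2020-01-22,Washington,53,1,0
2020-01-23,Washington,53,3,0
"""


def daily_new_cases(csv_text):
    """Turn cumulative per-state case counts into day-over-day increases."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # ISO dates sort correctly as strings; FIPS stays a string so that
    # leading zeros are preserved.
    rows.sort(key=lambda r: (r["fips"], r["date"]))

    previous = defaultdict(int)  # last cumulative count seen per FIPS code
    result = []
    for row in rows:
        cumulative = int(row["cases"])
        result.append((row["date"], row["state"], cumulative - previous[row["fips"]]))
        previous[row["fips"]] = cumulative
    return result


for date, state, new_cases in daily_new_cases(SAMPLE):
    print(date, state, new_cases)
```

    Since earlier entries may be revised when new information arrives, a difference computed this way can be negative; how to treat such rows is left to the analyst.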

    GitHub Repository

    This dataset contains COVID-19 data for the United States of America made available by The New York Times on GitHub at https://github.com/nytimes/covid-19-data

  5. 2015-2016 NSDUH State Estimates – Individual Excel and CSV Files by Outcome

    • catalog.data.gov
    • odgavaprod.ogopendata.com
    Updated Sep 7, 2025
    Cite
    Substance Abuse and Mental Health Services Administration (2025). 2015-2016 NSDUH State Estimates – Individual Excel and CSV Files by Outcome [Dataset]. https://catalog.data.gov/dataset/2015-2016-nsduh-state-estimates-individual-excel-and-csv-files-by-outcome
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    Substance Abuse and Mental Health Services Administration (https://www.samhsa.gov/)
    Description

    2015-2016 NSDUH State Estimates – Individual Excel and CSV Files by Outcome

  6. US States Ranked by Population 2024

    • kaggle.com
    Updated Jul 4, 2024
    Cite
    Ibrar Ali (2024). US States Ranked by Population 2024 [Dataset]. https://www.kaggle.com/datasets/dataanalyst001/us-states-ranked-by-population-2024
    Dataset updated
    Jul 4, 2024
    Dataset provided by
    Kaggle
    Authors
    Ibrar Ali
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    This dataset provides a detailed overview of the population statistics for each U.S. state for the years 2023 and 2024. It includes the population count, growth rate, percentage of the U.S. population, and population density per square mile.

  7. All U.S State Of The Union Speeches (1790-2019)

    • kaggle.com
    zip
    Updated Sep 11, 2019
    Cite
    jyron (2019). All U.S State Of The Union Speeches (1790-2019) [Dataset]. https://www.kaggle.com/datasets/jyronw/us-state-of-the-union-addresses-1790-2019
    Available download formats: zip (3645771 bytes)
    Dataset updated
    Sep 11, 2019
    Authors
    jyron
    Area covered
    United States
    Description

    Context

    The State of the Union Address (S.O.T.U) is an annual message delivered by the President of the United States to a joint session of the United States Congress at the beginning of each calendar year in office. The message typically includes a budget message and an economic report of the nation, and also allows the President to propose a legislative agenda and national priorities.

    Content

    This dataset is a CSV file with columns President, Year, Title, and Text. The Text column contains a list of string-formatted sentences that make up the text of each S.O.T.U.
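    As a rough illustration of how such a file might be read, the sketch below parses the four columns and turns the Text cell back into a list of sentences. It assumes the list is serialized as a Python-style list literal, which the description does not confirm; the sample row and the `load_speeches` helper are hypothetical.

```python
import ast
import csv
import io

# Made-up row in the described column layout; the real file's serialization
# of the Text column may differ from this assumed list-literal form.
SAMPLE = '''President,Year,Title,Text
Example President,1790,Example Address,"['First sentence.', 'Second sentence.']"
'''


def load_speeches(csv_text):
    """Parse rows and convert the Text cell into a list of sentence strings."""
    speeches = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        speeches.append({
            "president": row["President"],
            "year": int(row["Year"]),
            "title": row["Title"],
            # literal_eval only evaluates literals, so it is safer than eval
            "sentences": ast.literal_eval(row["Text"]),
        })
    return speeches


speeches = load_speeches(SAMPLE)
print(speeches[0]["year"], len(speeches[0]["sentences"]))
```

    Full-length speeches can produce very large cells; if the csv module reports a field size error, `csv.field_size_limit` can be used to raise the limit.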

    Acknowledgements

    Thanks Wikidata! - Data sourced from wikidata pages: https://www.wikidata.org/w/index.php?title=Q28371311&oldid=992890506

    Inspiration

    • How does Presidential Popularity relate to S.O.T.U sentiment analysis for a given year?
    • How has the vocabulary of presidents changed throughout the 200+ year document history?
    • Determine the significant historical events occurring during a given year based on the address of that year, or of future/preceding years.
  8. Coronavirus (Covid-19) Data in the United States

    • github.com
    • openicpsr.org
    • +2more
    csv
    Cite
    New York Times, Coronavirus (Covid-19) Data in the United States [Dataset]. https://github.com/nytimes/covid-19-data
    Available download formats: csv
    Dataset provided by
    New York Times
    License

    https://github.com/nytimes/covid-19-data/blob/master/LICENSE

    Description

    The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.

    Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.

    We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.

    The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.

  9. united-states-license-plate-dataset

    • huggingface.co
    Updated Jul 1, 2025
    Cite
    Unidata (2025). united-states-license-plate-dataset [Dataset]. https://huggingface.co/datasets/UniDataPro/united-states-license-plate-dataset
    Dataset updated
    Jul 1, 2025
    Authors
    Unidata
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Dataset of license plate recognition

    Dataset offers 89,986 images of vehicles featuring license plates from the USA, making it an excellent resource for tasks involving OCR (Optical Character Recognition), license plate identification, and vehicle registration data extraction. Each image is accompanied by a CSV file that provides the corresponding plate text and country code, ideal for developing and testing text recognition systems. With this dataset, researchers and developers can… See the full description on the dataset page: https://huggingface.co/datasets/UniDataPro/united-states-license-plate-dataset.

  10. State Class Transition Spreadsheet (Area of Land Transition into Each Class per Year, per Scenario)

    • data.usgs.gov
    Updated Jun 24, 2021
    Cite
    Tamara Wilson; Elliott Matchett; Kristin Byrd; Erin Conlisk; Matthew Reiter; Lorraine Flint; Alan Flint; Monica Moritsch; Cynthia Wallace (2021). State Class Transition Spreadsheet (Area of Land Transition into Each Class per Year, per Scenario) [Dataset]. http://doi.org/10.5066/P9BSZM8R
    Dataset updated
    Jun 24, 2021
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Tamara Wilson; Elliott Matchett; Kristin Byrd; Erin Conlisk; Matthew Reiter; Lorraine Flint; Alan Flint; Monica Moritsch; Cynthia Wallace
    License

    U.S. Government Works, https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    2011 - 2101
    Description

    This spreadsheet dataset (.csv file) contains annual modeled output of land-use and land-cover change transitions in square kilometers (km2) by specified transition group, scenario, timestep, WEAP hydrologic zone, and 4 sub-regions within the broader California Central Valley, modeled using the LUCAS ST-SIM for the period 2011-2101 across 5 future scenarios. Four of the scenarios were developed as part of the Central Valley Landscape Conservation Project. The 4 original scenarios include a Bad-Business-As-Usual (BBAU; high water availability, poor management), California Dreamin’ (DREAM; high water availability, good management), Central Valley Dustbowl (DUST; low water availability, poor management), and Everyone Equally Miserable (EEM; low water availability, good management). These scenarios represent alternative plausible futures, capturing a range of climate variability, land management activities, and habitat restoration goals. We parameterized our models based on close inte ...

  11. EPA FRS Facilities Single File CSV Download for the State of Wisconsin

    • catalog.data.gov
    Updated Nov 29, 2020
    Cite
    U.S. EPA Office of Environmental Information (OEI) - Office of Information Collection (OIC) (2020). EPA FRS Facilities Single File CSV Download for the State of Wisconsin [Dataset]. https://catalog.data.gov/dataset/epa-frs-facilities-single-file-csv-download-for-the-state-of-wisconsin
    Dataset updated
    Nov 29, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Area covered
    Wisconsin
    Description

    The Facility Registry System (FRS) identifies facilities, sites, or places subject to environmental regulation or of environmental interest to EPA programs or delegated states. Using vigorous verification and data management procedures, FRS integrates facility data from program national systems, state master facility records, tribal partners, and other federal agencies and provides the Agency with a centrally managed, single source of comprehensive and authoritative information on facilities.

  12. Data for: Of the first five US states with food waste bans, Massachusetts alone has reduced landfill waste

    • datadryad.org
    • search.dataone.org
    zip
    Updated Aug 27, 2024
    Cite
    Fiorentia Zoi Anglou; Robert Evan Sanders; Ioannis Stamatopoulos (2024). Data for: Of the first five US states with food waste bans, Massachusetts alone has reduced landfill waste [Dataset]. http://doi.org/10.5061/dryad.bzkh189h4
    Available download formats: zip
    Dataset updated
    Aug 27, 2024
    Dataset provided by
    Dryad
    Authors
    Fiorentia Zoi Anglou; Robert Evan Sanders; Ioannis Stamatopoulos
    Time period covered
    Jan 4, 2024
    Area covered
    Massachusetts, United States
    Description

    The raw data for this paper were received from individual states as PDF or Excel files. (For each state there might be several PDF or Excel files for each year.) In the data we uploaded on GitHub, we transferred these raw data (the various PDF and Excel files) into a single CSV file and created a standardized waste outcome: state-generated municipal solid waste (MSW) disposal. In the README file, we include more details regarding all the other supporting data and code we have used.

  13. DeepBase: A Deep Learning-based Daily Baseflow Data across the United States

    • springernature.figshare.com
    txt
    Updated Jan 8, 2025
    Cite
    Parnian Ghaneei; Hamid Moradkhani (2025). DeepBase: A Deep Learning-based Daily Baseflow Data across the United States [Dataset]. http://doi.org/10.6084/m9.figshare.27312927.v1
    Available download formats: txt
    Dataset updated
    Jan 8, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Parnian Ghaneei; Hamid Moradkhani
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Daily baseflow data, along with input datasets for 1661 basins for the hydrological years from 1981 to 2022, can be downloaded in CSV format from the DeepBase repository on FigShare. The baseflow data files for the basins are zipped into archives named ‘Daily_Baseflow_Cluster[cluster_number].zip’, corresponding to their respective clusters. All the static inputs for the 1661 basins are provided in a CSV file named ‘Static_Inputs.csv’. The statistical attributes of the static inputs, calculated for each cluster, are provided in the file ‘14Clusters_statistics.csv’. All the dynamic forcings for the 1661 basins are provided in CSV files named ‘Daymet_[basin_id].csv’ and are zipped into an archive named ‘Daily_DayMet_Forcings.zip’. The USGS gauge IDs of the training basins (referred to as gauged basins) are provided in ‘530basins_ids.txt’. The associated shapefiles for each cluster, including the polygons of the basins, titled ‘DeepBase_Clusters.zip’, along with the PDF version of the cluster map, titled ‘DeepBase_Clusters_map.pdf’, are accessible via the DeepBase repository.

  14. Hierarchy of addresses RÚIAN data distributed by the country in the CSV format

    • geoportal.gov.cz
    • data.gov.cz
    • +1more
    xml
    Updated Sep 19, 2025
    Cite
    (2025). Hierarchy of addresses RÚIAN data distributed by the country in the CSV format [Dataset]. https://geoportal.gov.cz/php/micka/record/basic/CZ-00025712-CUZK_SERIES-MD_RUIAN-CSV-HIE-ST?dlang=eng
    Available download formats: xml
    Dataset updated
    Sep 19, 2025
    Variables measured
    https://www.slovnikcuzk.eu/termin.php?tid=3663&l=obec, https://www.slovnikcuzk.eu/termin.php?tid=4007&l=adresa, https://www.slovnikcuzk.eu/termin.php?tid=3930&l=cast-obce, http://inspire.ec.europa.eu/metadata-codelist/SpatialScope/national, https://www.slovnikcuzk.eu/termin.php?tid=2057&l=popisne-cislo-budovy, https://www.slovnikcuzk.eu/termin.php?tid=2022&l=evidencni-cislo-budovy, https://www.slovnikcuzk.eu/termin.php?tid=2050&l=orientacni-cislo-budovy, https://www.slovnikcuzk.eu/termin.php?tid=3782&l=uzemne-spravni-jednotka, https://www.slovnikcuzk.eu/termin.php?tid=1236&l=definicni-bod--reprezentacni-bod--centroid
    Description

    Dataset contains information on the relationship between selected territorial elements and units of territorial registration. Data is specified in seven CSV files for the whole Czech Republic.

    • File adresni-mista-vazby-cr.csv contains links of address points to the following elements: street, municipality part, town district (MOMC), Prague city district (MOP), town district of Prague (SPRAVOBV), municipality, municipality with an authorized municipal office (POU), municipality with extended competence (ORP), higher territorial self-governing entity (VÚSC) and election district (VO).
    • File vazby-cr.csv contains links between elements municipality part, municipality, POU, ORP, VUSC, cohesion region (REGSOUDR) up to the element of state.
    • File vazby-hlm-praha.csv contains modularity of elements in the city of Prague: MOMC, SPRAVOBV, municipality, POU, ORP, VUSC, REGSOUDR and state.
    • File vazby-katastr-uzemi-cr.csv contains modularity of basic urban units (ZSJ) into cadastral units (KATUZ) and municipalities.
    • File vazby-momc-statutarni-mesta.csv contains modularity of territorial elements in territorially structured statutory cities: MOMC, MOP, municipality, POU, ORP, VUSC, REGSOUDR and state.
    • File vazby-okresy-cr.csv contains links between elements of municipality part, municipality, county, region (old, defined in 1960) and state.
    • File vazby-ulice-obce-s-ulicni-siti.csv contains links of streets to the municipality.

    Dataset is provided as Open Data (licence CC-BY 4.0). Data is based on RÚIAN (Register of Territorial Identification, Addresses and Real Estates). Files are created during the first day of each month with data valid to the last day of the previous month. The whole dataset is compressed (ZIP) for downloading. More in the Act No. 111/2009 Coll., on the Basic Registers, in Decree No. 359/2011 Coll., on the Basic Register of Territorial Identification, Addresses and Real Estates.

  15. HarDWR - Raw Water Rights Records

    • osti.gov
    Updated Oct 31, 2020
    Cite
    Caccese, Robert; Fisher-Vanden, Karen; Fowler, Lara; Grogan, Danielle; Lammers, Richard; Lisk, Matthew; Olmstead, Sheila; Peklak, Darrah; Zheng, Jiameng; Zuidema, Shan (2020). HarDWR - Raw Water Rights Records [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/2475305
    Dataset updated
    Oct 31, 2020
    Dataset provided by
    USDOE Office of Science (SC), Biological and Environmental Research (BER)
    MultiSector Dynamics - Living, Intuitive, Value-adding, Environment
    Authors
    Caccese, Robert; Fisher-Vanden, Karen; Fowler, Lara; Grogan, Danielle; Lammers, Richard; Lisk, Matthew; Olmstead, Sheila; Peklak, Darrah; Zheng, Jiameng; Zuidema, Shan
    Description

    A dataset within the Harmonized Database of Western U.S. Water Rights (HarDWR). For a detailed description of the database, please see the meta-record v2.0.

    Changelog

    v2.0 - Switched source data from collecting records from each state independently to using the WestDAAT dataset
    v1.0 - Initial public release

    Description

    In order to hold a water right in the western United States, an entity (e.g., an individual, corporation, municipality, sovereign government, or non-profit) must register a physical document with the state's water regulatory agency. State water agencies each maintain their own database containing all registered water right documents within the state, along with relevant metadata such as the point of diversion and place of use of the water. All western U.S. states have digitized their individual water rights databases, as well as geospatial data defining the areas in which water rights are managed. Each state maintains and provides their own water rights data in accordance with individual state regulations and standards. In addition, while all states make their water rights publicly available, each provides their records in unique formats, meaning that file types, field availability, and terms vary from state to state. This leads to additional challenges to managing resources which cross state lines, or conducting consistent multi-state water analyses. For the first version of HarDWR, we collected the water rights databases from 11 Western States of the United States. In order to perform regional analyses with the collected data, the raw records had to be harmonized into one single format. The Water Data Exchange (WaDE) is a program dedicated to the sharing of water-related data for the Western U.S. in a singular consistent format.
    Created by the Western States Water Council (WSWC) to facilitate the collection and dissemination of water data among WSWC's member states and the public, WaDE provides an important service for those interested in water resource planning and management in their focus region. Of the services which WaDE provides, one of the most interesting is the WestDAAT dataset, which is a collection of water rights data provided by the 18 WSWC member states that have been standardized into a single format, much like we had done on a more limited scale with HarDWR v1. For this version of HarDWR we decided to use WestDAAT, specifically a snapshot created in February 2024, as our water rights source data. A full explanation of the benefits gained from this switch can be found in the description of the updated Harmonized Water Rights Records v2.0, but in short it has allowed us to focus more of our efforts on answering research questions and gaining a more realistic understanding of how water rights are allocated. For more information on how the data for WestDAAT was collected, please see the WaDE data summary.

    Terms of Use

    While WaDE works directly with the state agencies to collect and standardize the water rights records, the ultimate authority for the water rights data remains the individual states. Each state, and their respective water right authorities, have made their water right records available for non-commercial reference uses. In addition, the states make no guarantees as to the completeness, accuracy, or timeliness of their respective databases, let alone the modifications which we, the authors of this paper, have made to the collected records. None of the states should be held liable for using this data outside of its intended use. As several of the states update their water rights databases daily, the information provided here is not the latest possible, and should not be used for legal purposes. WestDAAT itself has irregular updates.
    Additional questions about the data the source states provided should be directed to the respective state agencies (see methods.csv and organization.csv files described below). In addition, although the data presented here was not collected directly from the states, several states requested specifically worded disclaimers when sharing their data. These disclaimers are included here as an acknowledgement of where the water rights data is primarily sourced.

    Colorado: "The data made available here has been modified for use from its original source, which is the State of Colorado. THE STATE OF COLORADO MAKES NO REPRESENTATIONS OR WARRANTY AS TO THE COMPLETENESS, ACCURACY, TIMELINESS, OR CONTENT OF ANY DATA MADE AVAILABLE THROUGH THIS SITE. THE STATE OF COLORADO EXPRESSLY DISCLAIMS ALL WARRANTIES, WHETHER EXPRESS OR IMPLIED, INCLUDING ANY IMPLIED WARRANTIES OF MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. The data is subject to change as modifications and updates are complete. It is understood that the information contained in the Web feed is being used at one's own risk."

    Montana: "The Montana State Library provides this product/service for informational purposes only. The Library did not produce it for, nor is it suitable for legal, engineering, or surveying purposes. Consumers of this information should review or consult the primary data and information sources to ascertain the viability of the information for their purposes. The Library provides these data in good faith but does not represent or warrant its accuracy, adequacy, or completeness. In no event shall the Library be liable for any incorrect results or analysis; any direct, indirect, special, or consequential damages to any party; or any lost profits arising out of or in connection with the use or the inability to use the data or the services provided. The Library makes these data and services available as a convenience to the public, and for no other purpose.
    The Library reserves the right to change or revise published data and/or services at any time."

    Oregon: "This product is for informational purposes and may not have been prepared for, or be suitable for legal, engineering, or surveying purposes. Users of this information should review or consult the primary data and information sources to ascertain the usability of the information."

    File Descriptions

    The unmodified February 2024 WestDAAT snapshot is composed of nine files. Below is a brief description of each file, as well as how they were utilized for HarDWR.

    • WaDEDataDictionaryTerms.xlsx: As the file's name implies, this is a data dictionary for all of the below named files. This file describes the column names for each of the following files, with the exception of citation.txt, which does not have any columns. The descriptions for each file are divided by tab, with the same name as their associated file, within this document.
    • allocationamount.csv: The "main" file of the group, it contains the water right records for each state. Of particular note, each water right is broken down into one or more water allocations. Allocations may be withdrawn from one or more locations, or even multiple allocations associated with a particular location. This is a more subtle and realistic representation of how water is used than what was available in the first version of HarDWR. For the records from some states, this can mean that multiple allocations listed under a single right will appear as rows within this file.
    • citation.txt: A combination of contact information for WaDE personnel, disclaimer about how the data should be used, and guidelines for citing WestDAAT.
    • methods.csv: A file describing the source and method by which WaDE collected water rights data from each state.
    • organization.csv: A file listing the water rights authoritative agencies for each state.
    • sites.csv: This file provides the geographic, and other descriptors, of the physical location of allocations, called 'sites'. To reiterate, it is possible for one allocation to be associated with multiple sites, as well as one site to be associated with multiple allocations. The two descriptors which we were most interested in were the site's coordinates, as well as whether the site was classified as a Point of Diversion (POD) or a Place of Use (POU). As a general rule, PODs are geographic points, while POUs are areas typically represented as property boundaries or irregularly shaped polygons.
    • sites_pouGeometry.csv: For those allocations with a POU site, this file contains the defining points for the associated polygons.
    • variables.csv: A file describing the units in which an allocation's water amount is reported within WestDAAT. This information is essentially a repeat of the 'AllocationFlow_CFS' and 'AllocationVolume_AF' columns within allocationamount.csv, at least for our purposes.
    • watersources: This file describes the source of water from which each site extracts. For our purposes, this table was used to determine whether the water came from Surface Water, Groundwater, or Unspecified Water.

  16. US Department of Veterans Affairs - State Summary_Connecticut

    • datalumos.org
    delimited
    Updated Apr 29, 2025
    + more versions
    United States Department of Veterans Affairs (2025). US Department of Veterans Affairs - State Summary_Connecticut [Dataset]. http://doi.org/10.3886/E228163V1
    Available download formats
    delimited
    Dataset updated
    Apr 29, 2025
    Dataset authored and provided by
    United States Department of Veterans Affairs (http://va.gov/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    2019 - 2021
    Area covered
    Connecticut
    Description

    Veteran data in .csv files. Includes population/demographic data of age distribution, period of service, income, and education. Also includes population projections. Compares Connecticut to national data.

  17. Postal Codes Dataset for United States, US

    • datahub.io
    csv
    Updated Oct 1, 2024
    (2024). Postal Codes Dataset for United States, US [Dataset]. https://datahub.io/logistics/postal-codes-us
    Available download formats
    csv
    Dataset updated
    Oct 1, 2024
    Area covered
    United States
    Description

    Postal Codes Dataset for United States, US including name of the city, town, or place, various administrative divisions and alternative city names.

  18. PIPr: A Dataset of Public Infrastructure as Code Programs

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 28, 2023
    Salvaneschi, Guido (2023). PIPr: A Dataset of Public Infrastructure as Code Programs [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8262770
    Dataset updated
    Nov 28, 2023
    Dataset provided by
    Sokolowski, Daniel
    Spielmann, David
    Salvaneschi, Guido
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Description

    Programming Languages Infrastructure as Code (PL-IaC) enables IaC programs written in general-purpose programming languages like Python and TypeScript. The currently available PL-IaC solutions are Pulumi and the Cloud Development Kits (CDKs) of Amazon Web Services (AWS) and Terraform. This dataset provides metadata and initial analyses of all public GitHub repositories in August 2022 with an IaC program, including their programming languages, applied testing techniques, and licenses. Further, we provide a shallow copy of the head state of those 7104 repositories whose licenses permit redistribution. The dataset is available under the Open Data Commons Attribution License (ODC-By) v1.0. Contents:

    • metadata.zip: The dataset metadata and analysis results as CSV files.
    • scripts-and-logs.zip: Scripts and logs of the dataset creation.
    • LICENSE: The Open Data Commons Attribution License (ODC-By) v1.0 text.
    • README.md: This document.
    • redistributable-repositiories.zip: Shallow copies of the head state of all redistributable repositories with an IaC program.

    This artifact is part of the ProTI Infrastructure as Code testing project: https://proti-iac.github.io.

    Metadata

    The dataset's metadata comprises three tabular CSV files containing metadata about all analyzed repositories, IaC programs, and testing source code files.

    repositories.csv:

    • ID (integer): GitHub repository ID
    • url (string): GitHub repository URL
    • downloaded (boolean): Whether cloning the repository succeeded
    • name (string): Repository name
    • description (string): Repository description
    • licenses (string, list of strings): Repository licenses
    • redistributable (boolean): Whether the repository's licenses permit redistribution
    • created (string, date & time): Time of the repository's creation
    • updated (string, date & time): Time of the last update to the repository
    • pushed (string, date & time): Time of the last push to the repository
    • fork (boolean): Whether the repository is a fork
    • forks (integer): Number of forks
    • archive (boolean): Whether the repository is archived
    • programs (string, list of strings): Project file path of each IaC program in the repository

    programs.csv:

    • ID (string): Project file path of the IaC program
    • repository (integer): GitHub repository ID of the repository containing the IaC program
    • directory (string): Path of the directory containing the IaC program's project file
    • solution (string, enum): PL-IaC solution of the IaC program ("AWS CDK", "CDKTF", "Pulumi")
    • language (string, enum): Programming language of the IaC program (enum values: "csharp", "go", "haskell", "java", "javascript", "python", "typescript", "yaml")
    • name (string): IaC program name
    • description (string): IaC program description
    • runtime (string): Runtime string of the IaC program
    • testing (string, list of enum): Testing techniques of the IaC program (enum values: "awscdk", "awscdk_assert", "awscdk_snapshot", "cdktf", "cdktf_snapshot", "cdktf_tf", "pulumi_crossguard", "pulumi_integration", "pulumi_unit", "pulumi_unit_mocking")
    • tests (string, list of strings): File paths of IaC program's tests

    testing-files.csv:

    • file (string): Testing file path
    • language (string, enum): Programming language of the testing file (enum values: "csharp", "go", "java", "javascript", "python", "typescript")
    • techniques (string, list of enum): Testing techniques used in the testing file (enum values: "awscdk", "awscdk_assert", "awscdk_snapshot", "cdktf", "cdktf_snapshot", "cdktf_tf", "pulumi_crossguard", "pulumi_integration", "pulumi_unit", "pulumi_unit_mocking")
    • keywords (string, list of enum): Keywords found in the testing file (enum values: "/go/auto", "/testing/integration", "@AfterAll", "@BeforeAll", "@Test", "@aws-cdk", "@aws-cdk/assert", "@pulumi.runtime.test", "@pulumi/", "@pulumi/policy", "@pulumi/pulumi/automation", "Amazon.CDK", "Amazon.CDK.Assertions", "Assertions_", "HashiCorp.Cdktf", "IMocks", "Moq", "NUnit", "PolicyPack(", "ProgramTest", "Pulumi", "Pulumi.Automation", "PulumiTest", "ResourceValidationArgs", "ResourceValidationPolicy", "SnapshotTest()", "StackValidationPolicy", "Testing", "Testing_ToBeValidTerraform(", "ToBeValidTerraform(", "Verifier.Verify(", "WithMocks(", "[Fact]", "[TestClass]", "[TestFixture]", "[TestMethod]", "[Test]", "afterAll(", "assertions", "automation", "aws-cdk-lib", "aws-cdk-lib/assert", "aws_cdk", "aws_cdk.assertions", "awscdk", "beforeAll(", "cdktf", "com.pulumi", "def test_", "describe(", "github.com/aws/aws-cdk-go/awscdk", "github.com/hashicorp/terraform-cdk-go/cdktf", "github.com/pulumi/pulumi", "integration", "junit", "pulumi", "pulumi.runtime.setMocks(", "pulumi.runtime.set_mocks(", "pulumi_policy", "pytest", "setMocks(", "set_mocks(", "snapshot", "software.amazon.awscdk.assertions", "stretchr", "test(", "testing", "toBeValidTerraform(", "toMatchInlineSnapshot(", "toMatchSnapshot(", "to_be_valid_terraform(", "unittest", "withMocks(")
    • program (string): Project file path of the testing file's IaC program

    Dataset Creation

    scripts-and-logs.zip contains all scripts and logs of the creation of this dataset.
    In it, executions/executions.log documents the commands that generated this dataset in detail. At a high level, the dataset was created as follows:

    1. A list of all repositories with a PL-IaC program configuration file was created using search-repositories.py (documented below). The execution took two weeks due to the non-deterministic nature of GitHub's REST API, causing excessive retries.
    2. A shallow copy of the head of all repositories was downloaded using download-repositories.py (documented below).
    3. Using analysis.ipynb, the repositories were analyzed for the programs' metadata, including the used programming languages and licenses.
    4. Based on the analysis, all repositories with at least one IaC program and a redistributable license were packaged into redistributable-repositiories.zip, excluding any node_modules and .git directories.

    Searching Repositories

    The repositories are searched through search-repositories.py and saved in a CSV file. The script takes these arguments in the following order:

    1. GitHub access token.
    2. Name of the CSV output file.
    3. Filename to search for.
    4. File extensions to search for, separated by commas.
    5. Min file size for the search (for all files: 0).
    6. Max file size for the search or * for unlimited (for all files: *).

    Pulumi projects have a Pulumi.yaml or Pulumi.yml (case-sensitive file name) file in their root folder, i.e., (3) is Pulumi and (4) is yml,yaml. https://www.pulumi.com/docs/intro/concepts/project/

    AWS CDK projects have a cdk.json (case-sensitive file name) file in their root folder, i.e., (3) is cdk and (4) is json. https://docs.aws.amazon.com/cdk/v2/guide/cli.html

    CDK for Terraform (CDKTF) projects have a cdktf.json (case-sensitive file name) file in their root folder, i.e., (3) is cdktf and (4) is json. https://www.terraform.io/cdktf/create-and-deploy/project-setup

    Limitations

    The script uses the GitHub code search API and inherits its limitations:

    • Only forks with more stars than the parent repository are included.
    • Only the repositories' default branches are considered.
    • Only files smaller than 384 KB are searchable.
    • Only repositories with fewer than 500,000 files are considered.
    • Only repositories that have had activity or have been returned in search results in the last year are considered.

    More details: https://docs.github.com/en/search-github/searching-on-github/searching-code

    The results of the GitHub code search API are not stable. However, the generally more robust GraphQL API does not support searching for files in repositories: https://stackoverflow.com/questions/45382069/search-for-code-in-github-using-graphql-v4-api

    Downloading Repositories

    download-repositories.py downloads all repositories in CSV files generated through search-repositories.py and generates an overview CSV file of the downloads. The script takes these arguments in the following order:

    1. Name of the repositories CSV files generated through search-repositories.py, separated by commas.
    2. Output directory to download the repositories to.
    3. Name of the CSV output file.

    The script only downloads a shallow recursive copy of the HEAD of the repo, i.e., only the main branch's most recent state, including submodules, without the rest of the git history. Each repository is downloaded to a subfolder named by the repository's ID.
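The argument order for search-repositories.py and the shallow recursive download can be sketched roughly as follows. This is an illustrative sketch, not the dataset's actual scripts: the helper names `search_args` and `shallow_clone_cmd`, the `SEARCH_TARGETS` table, and the `python` launcher are assumptions, and the `git clone` flags are only one plausible way to obtain a shallow copy of HEAD including submodules.

```python
from pathlib import Path

# Hypothetical lookup table: values for arguments (3) and (4) per PL-IaC
# solution, taken from the description of the search script above.
SEARCH_TARGETS = {
    "Pulumi": ("Pulumi", "yml,yaml"),
    "AWS CDK": ("cdk", "json"),
    "CDKTF": ("cdktf", "json"),
}


def search_args(token, output_csv, solution, min_size="0", max_size="*"):
    """Build an argument list in the documented order (1) to (6).

    Assumption: the script is launched as `python search-repositories.py`.
    The defaults 0 and * correspond to "all files".
    """
    filename, extensions = SEARCH_TARGETS[solution]
    return [
        "python", "search-repositories.py",
        token,          # (1) GitHub access token
        output_csv,     # (2) name of the CSV output file
        filename,       # (3) filename to search for
        extensions,     # (4) comma-separated file extensions
        str(min_size),  # (5) min file size
        str(max_size),  # (6) max file size, or * for unlimited
    ]


def shallow_clone_cmd(repo_url, repo_id, output_dir):
    """Build a git command for a shallow, recursive clone of HEAD.

    Assumption: the real download script may use different flags. The
    target is a subfolder named by the repository's ID.
    """
    target = Path(output_dir) / str(repo_id)
    return [
        "git", "clone",
        "--depth", "1",          # only the most recent commit
        "--recurse-submodules",  # include submodules
        "--shallow-submodules",  # keep submodules shallow as well
        repo_url,
        str(target),
    ]
```

Both helpers only build command lists; passing one to `subprocess.run(cmd, check=True)` would execute it, which for the search command requires a valid GitHub token.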

  19. Metadata record for: A rasterized building footprint dataset for the United...

    • springernature.figshare.com
    txt
    Updated Jun 1, 2023
    Scientific Data Curation Team (2023). Metadata record for: A rasterized building footprint dataset for the United States [Dataset]. http://doi.org/10.6084/m9.figshare.12444776.v1
    Available download formats
    txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    figshare
    Figshare (http://figshare.com/)
    Authors
    Scientific Data Curation Team
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    United States
    Description

    This dataset contains key characteristics about the data described in the Data Descriptor A rasterized building footprint dataset for the United States. Contents:

    1. human readable metadata summary table in CSV format
    2. machine readable metadata file in JSON format

  20. AOI polygon fire statistics CSV files

    • nwcc-nrcs.hub.arcgis.com
    Updated Nov 24, 2024
    USDA NRCS ArcGIS Online (2024). AOI polygon fire statistics CSV files [Dataset]. https://nwcc-nrcs.hub.arcgis.com/datasets/6928853d9d7c450a84984d4c66f95e9c
    Dataset updated
    Nov 24, 2024
    Dataset provided by
    Natural Resources Conservation Service (http://www.nrcs.usda.gov/)
    United States Department of Agriculture (http://usda.gov/)
    Authors
    USDA NRCS ArcGIS Online
    Description

    Annual and time-period fire statistics in CSV format for the AOIs of the NWCC active forecast stations. The statistics are based on NIFC historical and current fire perimeters and MTBS burn severity data. This release contains NIFC data from 1996 to current (July 10, 2025) and MTBS data from 1996 to 2022. Annual statistics were generated for the period 1996 to 2025. Time-period statistics were generated from 1998 to 2022 at 5-year intervals. The time periods are: 2018-2022 (last 5 years), 2013-2022 (last 10 years), 2008-2022 (last 15 years), 2003-2022 (last 20 years), and 1998-2022 (last 25 years).
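The five time periods listed above follow a simple rule: each one ends in 2022 and reaches back a multiple of 5 years, counted inclusively. A minimal sketch of that arithmetic, where the function name `trailing_periods` and its parameters are illustrative and not part of the dataset:

```python
def trailing_periods(end_year=2022, step=5, count=5):
    """Return inclusive (start, end) year pairs for trailing windows.

    The k-th window spans step * k years ending at end_year, so its
    start year is end_year - step * k + 1.
    """
    return [(end_year - step * k + 1, end_year) for k in range(1, count + 1)]


for start, end in trailing_periods():
    print(f"{start}-{end} (last {end - start + 1} years)")
```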
