100+ datasets found

United States COVID-19 County Level of Community Transmission Historical...
catalog.data.gov
Updated Oct 19, 2022
Cite
Centers for Disease Control and Prevention (2022). United States COVID-19 County Level of Community Transmission Historical Changes [Dataset]. https://catalog.data.gov/dataset/united-states-covid-19-county-level-of-community-transmission-historical-changes
Dataset updated
Oct 19, 2022
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Area covered
United States
Description
Announcement: Beginning October 20, 2022, CDC will report and publish aggregate case and death data from jurisdictional and state partners on a weekly basis rather than daily. As a result, community transmission levels data reported on data.cdc.gov will be updated weekly on Thursdays, typically by 8 PM ET, instead of daily.

This public use dataset has 7 data elements reflecting historical data for community transmission levels for all available counties. It contains historical data for the county level of community transmission and includes updated data submitted by states and jurisdictions. Each day, the dataset is appended to contain the most recent day's data. This dataset includes data from January 1, 2021. Transmission level is set to low, moderate, substantial, or high using the calculation rules below.

Currently, CDC provides the public with two versions of COVID-19 county-level community transmission level data: this dataset with the levels for each county from January 1, 2021 (Historical Changes dataset) and a dataset with the levels as originally posted (Originally Posted dataset), updated daily with the most recent day’s data.

Methods for calculating county level of community transmission indicator

The County Level of Community Transmission indicator uses two metrics: (1) total new COVID-19 cases per 100,000 persons in the last 7 days and (2) percentage of positive SARS-CoV-2 diagnostic nucleic acid amplification tests (NAAT) in the last 7 days. For each of these metrics, CDC classifies transmission values as low, moderate, substantial, or high (see below). If the values for each of these two metrics differ (e.g., one indicates moderate and the other low), then the higher of the two should be used for decision-making.

CDC core metrics of and thresholds for community transmission levels of SARS-CoV-2

Total New Case Rate Metric: "New cases per 100,000 persons in the past 7 days" is calculated by adding the number of new cases in the county (or other administrative level) in the last 7 days, dividing by the population in the county (or other administrative level), and multiplying by 100,000. "New cases per 100,000 persons in the past 7 days" is considered to have a transmission level of Low (0-9.99); Moderate (10.00-49.99); Substantial (50.00-99.99); and High (greater than or equal to 100.00).

Test Percent Positivity Metric: "Percentage of positive NAAT in the past 7 days" is calculated by dividing the number of positive tests in the county (or other administrative level) during the last 7 days by the total number of tests resulted over the last 7 days. "Percentage of positive NAAT in the past 7 days" is considered to have a transmission level of Low (less than 5.00); Moderate (5.00-7.99); Substantial (8.00-9.99); and High (greater than or equal to 10.00).

If the two metrics suggest different transmission levels, the higher level is selected. If one metric is missing, the other metric is used for the indicator.

Transmission categories include:

Low Transmission Threshold: Counties with fewer than 10 total cases per 100,000 population in the past 7 days, and a NAAT percent test positivity in the past 7 days below 5%;

Moderate Transmission Threshold: Counties with 10-49 total cases per 100,000 population in the past 7 days or a NAAT test percent positivity in the past 7 days of 5.0-7.99%;

Substantial Transmission Threshold: Counties with 50-99 total cases per 100,000 population in the past 7 days or a NAAT test percent positivity in the past 7 days of 8.0-9.99%;

High Transmission Threshold: Counties with 100
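The calculation rules in this description can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not CDC's own code: the function names (`cases_per_100k`, `transmission_level`) and the convention that positivity is passed as a percentage are assumptions made for the example.

```python
from typing import Optional

# Levels in increasing order of severity, as named in the description.
LEVELS = ["low", "moderate", "substantial", "high"]

# Lower bounds of moderate, substantial, and high, taken from the description.
CASE_RATE_CUTOFFS = (10.0, 50.0, 100.0)   # new cases per 100,000 in 7 days
POSITIVITY_CUTOFFS = (5.0, 8.0, 10.0)     # percent positive NAAT in 7 days


def cases_per_100k(new_cases_7d: int, population: int) -> float:
    """7-day new cases divided by population, scaled to 100,000 persons."""
    return new_cases_7d * 100_000 / population


def _classify(value: float, cutoffs) -> str:
    # Count how many lower bounds the value has reached.
    return LEVELS[sum(value >= c for c in cutoffs)]


def transmission_level(case_rate: Optional[float],
                       pct_positive: Optional[float]) -> Optional[str]:
    """Higher of the two metric levels; a missing metric is ignored."""
    levels = []
    if case_rate is not None:
        levels.append(_classify(case_rate, CASE_RATE_CUTOFFS))
    if pct_positive is not None:
        levels.append(_classify(pct_positive, POSITIVITY_CUTOFFS))
    if not levels:
        return None  # hypothetical choice: no data, no level
    return max(levels, key=LEVELS.index)
```

For instance, a county with a case rate of 12.0 and positivity of 4.0 would be classified as moderate under this sketch, because the higher of the two metric levels is used.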
Median Household Income Variation by Family Size in State Line City, IN:...
neilsberg.com
csv, json
Updated Jan 11, 2024
Cite
Neilsberg Research (2024). Median Household Income Variation by Family Size in State Line City, IN: Comparative analysis across 7 household sizes [Dataset]. https://www.neilsberg.com/research/datasets/1b7a0822-73fd-11ee-949f-3860777c1fe6/
Available download formats: csv, json
Dataset updated
Jan 11, 2024
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
State Line City
Variables measured
Household size, Median Household Income
Measurement technique
The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across 7 household sizes (mentioned above) following an initial analysis and categorization. Using this dataset, you can find out how household income varies with the size of the family unit. For additional information about these estimations, please contact us via email at research@neilsberg.com
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

The dataset presents median household incomes for various household sizes in State Line City, IN, as reported by the U.S. Census Bureau. The dataset highlights the variation in median household income with the size of the family unit, offering valuable insights into economic trends and disparities within different household sizes, aiding in data analysis and decision-making.

Key observations

Of the 7 household sizes (1 person to 7-or-more person households) reported by the census bureau, State Line City did not include 4, 5, 6, or 7-person households. Across the different household sizes in State Line City the mean income is $64,968, and the standard deviation is $33,244. The coefficient of variation (CV) is 51.17%. This high CV indicates high relative variability, suggesting that the incomes vary significantly across different sizes of households.
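The coefficient of variation quoted above is the standard deviation expressed as a percentage of the mean. A minimal check using the two reported figures (the function name is hypothetical, and the sketch takes the published mean and standard deviation as given rather than recomputing them from the underlying data):

```python
def coefficient_of_variation(mean: float, std_dev: float) -> float:
    """Standard deviation as a percentage of the mean."""
    return std_dev / mean * 100


# Figures reported in the key observations for State Line City.
cv = coefficient_of_variation(mean=64_968, std_dev=33_244)
print(f"{cv:.2f}%")  # prints "51.17%"
```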

In the most recent year, 2021, the smallest household size for which the bureau reported a median household income was 1-person households, with an income of $36,144. Income increased to $101,336 for 3-person households, the largest household size for which the bureau reported a median household income.

Figure: State Line City, IN median household income, by household size (in 2022 inflation-adjusted dollars). Image: https://i.neilsberg.com/ch/state-line-city-in-median-household-income-by-household-size.jpeg

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

Household Sizes:

1-person households

2-person households

3-person households

4-person households

5-person households

6-person households

7-or-more-person households

Variables / Data Columns

Household Size: This column lists 7 household sizes ranging from 1-person households to 7-or-more-person households (as mentioned above).

Median Household Income: Median household income, in 2022 inflation-adjusted dollars for the specific household size.

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

Recommended for further research

This dataset is a part of the main dataset for State Line City median household income.
Almanac API - Ranking by Geography Type within a State
catalog.data.gov
datadiscoverystudio.org
+5 more
Updated Mar 11, 2021
Cite
National Telecommunication and Information Administration, Department of Commerce (2021). Almanac API - Ranking by Geography Type within a State [Dataset]. https://catalog.data.gov/dataset/almanac-api-ranking-by-geography-type-within-a-state
Dataset updated
Mar 11, 2021
Dataset provided by
United States Department of Commerce (http://www.commerce.gov/)
Description
This API is designed to find the rankings by any geography type within the state with a specific census metric (population or household) and ranking metric (any of the metrics from provider, demographic, technology or speed). Only the top ten and bottom ten rankings are returned through the API if the result set is greater than 500; otherwise the full ranking list is returned.
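The truncation rule described for this API can be illustrated with a short sketch. It mimics only the behaviour stated in the description and is not the API's implementation; `trim_ranking` and its parameters are hypothetical names chosen for the example.

```python
def trim_ranking(ranked, limit=500, keep=10):
    """Return the top and bottom `keep` entries when the result set
    exceeds `limit`; otherwise return the full ranking list."""
    if len(ranked) > limit:
        return list(ranked[:keep]) + list(ranked[-keep:])
    return list(ranked)


# A result set of 501 entries is cut down to 20; one of 500 is left whole.
print(len(trim_ranking(list(range(501)))))  # prints 20
print(len(trim_ranking(list(range(500)))))  # prints 500
```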
United States COVID-19 Community Levels by County
data.cdc.gov
healthdata.gov
+1 more
application/rdfxml +5
Updated Nov 2, 2023
Cite
CDC COVID-19 Response (2023). United States COVID-19 Community Levels by County [Dataset]. https://data.cdc.gov/Public-Health-Surveillance/United-States-COVID-19-Community-Levels-by-County/3nnm-4jni
Available download formats: application/rdfxml, application/rssxml, csv, tsv, xml, json
Dataset updated
Nov 2, 2023
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Authors
CDC COVID-19 Response
License
https://www.usa.gov/government-works
Area covered
United States
Description
Reporting of Aggregate Case and Death Count data was discontinued May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. Although these data will continue to be publicly available, this dataset will no longer be updated.

This archived public use dataset has 11 data elements reflecting United States COVID-19 community levels for all available counties.

The COVID-19 community levels were developed using a combination of three metrics: new COVID-19 admissions per 100,000 population in the past 7 days, the percent of staffed inpatient beds occupied by COVID-19 patients, and total new COVID-19 cases per 100,000 population in the past 7 days. The COVID-19 community level was determined by the higher of the new admissions and inpatient beds metrics, based on the current level of new cases per 100,000 population in the past 7 days. New COVID-19 admissions and the percent of staffed inpatient beds occupied represent the current potential for strain on the health system. Data on new cases act as an early warning indicator of potential increases in health system strain in the event of a COVID-19 surge.

Using these data, the COVID-19 community level was classified as low, medium, or high.
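The "higher of two metrics, conditioned on the case rate" logic described above might be sketched as follows. The numeric cutoffs in this example do not appear in this listing; they are assumptions based on CDC's published Community Levels framework and should be verified against CDC documentation before any use. The function name is also hypothetical.

```python
LEVELS = ["low", "medium", "high"]


def _level(value, cutoffs, labels):
    # Count how many lower bounds the value has reached.
    return labels[sum(value >= c for c in cutoffs)]


def community_level(cases_per_100k, admissions_per_100k, bed_pct):
    """Higher of the admissions and inpatient-bed levels, with cutoffs
    chosen by the 7-day case rate (all cutoffs are ASSUMED values)."""
    if cases_per_100k < 200:
        # Assumed cutoffs when the case rate is below 200 per 100,000.
        adm = _level(admissions_per_100k, (10.0, 20.0), LEVELS)
        bed = _level(bed_pct, (10.0, 15.0), LEVELS)
    else:
        # Assumed cutoffs at 200 or more: the level is never "low".
        adm = _level(admissions_per_100k, (10.0,), LEVELS[1:])
        bed = _level(bed_pct, (10.0,), LEVELS[1:])
    return max(adm, bed, key=LEVELS.index)
```

Under these assumed cutoffs, a county with few admissions and low bed occupancy still moves from low to medium once its case rate reaches 200 per 100,000, which matches the description of case data as an early warning indicator.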

COVID-19 Community Levels were used to help communities and individuals make decisions based on their local context and their unique needs. Community vaccination coverage and other local information, like early alerts from surveillance, such as through wastewater or the number of emergency department visits for COVID-19, when available, can also inform decision making for health officials and individuals.

For the most accurate and up-to-date data for any county or state, visit the relevant health department website. COVID Data Tracker may display data that differ from state and local websites. This can be due to differences in how data were collected, how metrics were calculated, or the timing of web updates.

Archived Data Notes:

This dataset was renamed from "United States COVID-19 Community Levels by County as Originally Posted" to "United States COVID-19 Community Levels by County" on March 31, 2022.

March 31, 2022: Column name for county population was changed to “county_population”. No change was made to the data points previously released.

March 31, 2022: New column, “health_service_area_population”, was added to the dataset to denote the total population in the designated Health Service Area based on 2019 Census estimate.

March 31, 2022: FIPS codes for territories American Samoa, Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands were re-formatted to 5-digit numeric for records released on 3/3/2022 to be consistent with other records in the dataset.

March 31, 2022: Changes were made to the text fields in variables “county”, “state”, and “health_service_area” so the formats are consistent across releases.

March 31, 2022: The “%” sign was removed from the text field in column “covid_inpatient_bed_utilization”. No change was made to the data. As indicated in the column description, values in this column represent the percentage of staffed inpatient beds occupied by COVID-19 patients (7-day average).

March 31, 2022: Data values for columns “county_population”, “health_service_area_number”, and “health_service_area” were backfilled for records released on 2/24/2022. These columns were added starting the week of 3/3/2022, so the values were previously missing for records released the week prior.

April 7, 2022: Updates made to data released on 3/24/2022 for Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands to correct a data mapping error.

April 21, 2022: COVID-19 Community Level (CCL) data released for counties in Nebraska for the week of April 21, 2022 have 3 counties identified in the high category and 37 in the medium category. CDC has been working with state officials to verify the data submitted, as other data systems are not providing alerts for substantial increases in disease transmission or severity in the state.

May 26, 2022: COVID-19 Community Level (CCL) data released for McCracken County, KY for the week of May 5, 2022 have been updated to correct a data processing error. McCracken County, KY should have appeared in the low community level category during the week of May 5, 2022. This correction is reflected in this update.

May 26, 2022: COVID-19 Community Level (CCL) data released for several Florida counties for the week of May 19th, 2022, have been corrected for a data processing error. Of note, Broward, Miami-Dade, and Palm Beach Counties should have appeared in the high CCL category, and Osceola County should have appeared in the medium CCL category. These corrections are reflected in this update.

May 26, 2022: COVID-19 Community Level (CCL) data released for Orange County, New York for the week of May 26, 2022 displayed an erroneous case rate of zero and a CCL category of low due to a data source error. This county should have appeared in the medium CCL category.

June 2, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a data processing error. Tolland County, CT should have appeared in the medium community level category during the week of May 26, 2022. This correction is reflected in this update.

June 9, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a misspelling. The medium community level category for Tolland County, CT on the week of May 26, 2022 was misspelled as “meduim” in the data set. This correction is reflected in this update.

June 9, 2022: COVID-19 Community Level (CCL) data released for Mississippi counties for the week of June 9, 2022 should be interpreted with caution due to a reporting cadence change over the Memorial Day holiday that resulted in artificially inflated case rates in the state.

July 7, 2022: COVID-19 Community Level (CCL) data released for Rock County, Minnesota for the week of July 7, 2022 displayed an artificially low case rate and CCL category due to a data source error. This county should have appeared in the high CCL category.

July 14, 2022: COVID-19 Community Level (CCL) data released for Massachusetts counties for the week of July 14, 2022 should be interpreted with caution due to a reporting cadence change that resulted in lower than expected case rates and CCL categories in the state.

July 28, 2022: COVID-19 Community Level (CCL) data released for all Montana counties for the week of July 21, 2022 had case rates of 0 due to a reporting issue. The case rates have been corrected in this update.

July 28, 2022: COVID-19 Community Level (CCL) data released for Alaska for all weeks prior to July 21, 2022 included non-resident cases. The case rates for the time series have been corrected in this update.

July 28, 2022: A laboratory in Nevada reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate will be inflated in Clark County, NV for the week of July 28, 2022.

August 4, 2022: COVID-19 Community Level (CCL) data was updated on August 2, 2022 in error during performance testing. Data for the week of July 28, 2022 was changed during this update due to additional case and hospital data as a result of late reporting between July 28, 2022 and August 2, 2022. Since the purpose of this data set is to provide point-in-time views of COVID-19 Community Levels on Thursdays, any changes made to the data set during the August 2, 2022 update have been reverted in this update.

August 4, 2022: COVID-19 Community Level (CCL) case data for the week of July 28, 2022 were missing for 8 counties in Utah (Beaver County, Daggett County, Duchesne County, Garfield County, Iron County, Kane County, Uintah County, and Washington County) due to data collection issues. CDC and its partners have resolved the issue and the correction is reflected in this update.

August 4, 2022: Due to a reporting cadence change, case rates for all Alabama counties will be lower than expected. As a result, the CCL levels published on August 4, 2022 should be interpreted with caution.

August 11, 2022: COVID-19 Community Level (CCL) data for the week of August 4, 2022 for South Carolina have been updated to correct a data collection error that resulted in incorrect case data. CDC and its partners have resolved the issue and the correction is reflected in this update.

August 18, 2022: COVID-19 Community Level (CCL) data for the week of August 11, 2022 for Connecticut have been updated to correct a data ingestion error that inflated the CT case rates. CDC, in collaboration with CT, has resolved the issue and the correction is reflected in this update.

August 25, 2022: A laboratory in Tennessee reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate may be inflated in many counties and the CCLs published on August 25, 2022 should be interpreted with caution.

August 25, 2022: Due to a data source error, the 7-day case rate for St. Louis County, Missouri, is reported as zero in the COVID-19 Community Level data released on August 25, 2022. Therefore, the COVID-19 Community Level for this county should be interpreted with caution.

September 1, 2022: Due to a reporting issue, case rates for all Nebraska counties will include 6 days of data instead of 7 days in the COVID-19 Community Level (CCL) data released on September 1, 2022. Therefore, the CCLs for all Nebraska counties should be interpreted with caution.

September 8, 2022: Due to a data processing error, the case rate for Philadelphia County, Pennsylvania,
Dataset of books called Discover USA : experience the best of the USA
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called Discover USA : experience the best of the USA [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Discover+USA+%3A+experience+the+best+of+the+USA
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
This dataset is about books. It has 1 row and is filtered where the book is Discover USA : experience the best of the USA. It features 7 columns including author, publication date, language, and book publisher.
USA Cheque Image OCR Datasets
data.macgence.com
mp3
Updated May 31, 2024
Cite
Macgence (2024). USA Cheque Image OCR Datasets [Dataset]. https://data.macgence.com/dataset/usa-cheque-image-ocr-datasets
Available download formats: mp3
Dataset updated
May 31, 2024
Dataset authored and provided by
Macgence
License
https://data.macgence.com/terms-and-conditions
Time period covered
2025
Area covered
Worldwide
Variables measured
Outcome, Call Type, Transcriptions, Audio Recordings, Speaker Metadata, Conversation Topics
Description
High-quality USA cheque image OCR datasets designed for accurate text extraction and recognition. Perfect for training and testing advanced OCR solutions.
state in U.S. Ranked by Pacific Islander Population // 2025 Edition
neilsberg.com
csv, json
Updated Jan 23, 2025
Cite
Neilsberg Research (2025). state in U.S. Ranked by Pacific Islander Population // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/lists/states-in-united-states-by-pacific-islander-population/
Available download formats: csv, json
Dataset updated
Jan 23, 2025
Dataset authored and provided by
Neilsberg Research
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Variables measured
Pacific Islander Population, Pacific Islander Population as Percent of Total Population of state in United States, Pacific Islander Population as Percent of Total Pacific Islander Population of United States
Measurement technique
To measure the rank and respective trends, we initially gathered data from the five most recent American Community Survey (ACS) 5-Year Estimates. We then analyzed and categorized the data for each of the racial categories identified by the U.S. Census Bureau. Based on the required racial category classification, we calculated the rank. For geographies with no population reported for the chosen race, we did not assign a rank and excluded them from the list. It is possible that a small population exists but was not reported or captured due to limitations or variations in Census data collection and reporting. We ensured that the population estimates used in this dataset pertain exclusively to the identified racial categories and do not rely on any ethnicity classification, unless explicitly required. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
Dataset funded by
Neilsberg Research
Description
About this dataset

Context

This list ranks the 50 states in the United States by Native Hawaiian and Other Pacific Islander (NHPI) population, as estimated by the United States Census Bureau. It also highlights population changes in each state over the past five years.

Content

When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 5-Year Estimates, including:

2019-2023 American Community Survey 5-Year Estimates

2018-2022 American Community Survey 5-Year Estimates

2017-2021 American Community Survey 5-Year Estimates

2016-2020 American Community Survey 5-Year Estimates

2015-2019 American Community Survey 5-Year Estimates

Variables / Data Columns

Rank by Pacific Islander Population: This column displays the rank of each state in the United States by its Native Hawaiian and Other Pacific Islander (NHPI) population, using the most recent ACS data available.

state: The state for which the rank is shown in the previous column.

Pacific Islander Population: The Pacific Islander population of the state is shown in this column.

% of Total state Population: This shows what percentage of the total state population identifies as Pacific Islander. Please note that the sum of all percentages may not equal 100% due to rounding of values.

% of Total U.S. Pacific Islander Population: This tells us how much of the entire United States Pacific Islander population lives in that state. Please note that the sum of all percentages may not equal 100% due to rounding of values.

5 Year Rank Trend: This column displays the rank trend across the last 5 years.
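The rank and share columns described above, together with the exclusion rule stated under "Measurement technique", could be derived roughly as in the sketch below. It is an illustration only, not Neilsberg's method: the function name is hypothetical, the input values are made up, and tie handling (not specified in the source) simply follows input order.

```python
def rank_by_population(populations):
    """Rank geographies by population, largest first, skipping any
    geography with no population reported (None or 0)."""
    reported = {name: pop for name, pop in populations.items() if pop}
    total = sum(reported.values())
    ordered = sorted(reported.items(), key=lambda kv: kv[1], reverse=True)
    # Each row: (rank, name, population, percent of the reported total).
    return [
        (rank, name, pop, round(100 * pop / total, 2))
        for rank, (name, pop) in enumerate(ordered, start=1)
    ]


# Made-up figures for illustration; "C" has no reported population.
sample = {"A": 300, "B": 100, "C": None, "D": 600}
for row in rank_by_population(sample):
    print(row)
```

Because each share is rounded separately, the printed percentages in a real table may not add up to exactly 100, which is the caveat noted in the column descriptions.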

Good to know

Margin of Error

Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

Custom data

If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

Inspiration

The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
Counties
catalog.data.gov
datasets.ai
+3 more
Updated Jul 17, 2025
Cite
United States Census Bureau (USCB) (Point of Contact) (2025). Counties [Dataset]. https://catalog.data.gov/dataset/counties2
Dataset updated
Jul 17, 2025
Dataset provided by
United States Census Bureau (http://census.gov/)
Description
The Counties dataset was updated on October 31, 2023 from the United States Census Bureau (USCB) and is part of the U.S. Department of Transportation (USDOT)/Bureau of Transportation Statistics (BTS) National Transportation Atlas Database (NTAD). This resource is a member of a series. The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The primary legal divisions of most states are termed counties. In Louisiana, these divisions are known as parishes. In Alaska, which has no counties, the equivalent entities are the organized boroughs, city and boroughs, municipalities, and for the unorganized area, census areas. The latter are delineated cooperatively for statistical purposes by the State of Alaska and the Census Bureau. In four states (Maryland, Missouri, Nevada, and Virginia), there are one or more incorporated places that are independent of any county organization and thus constitute primary divisions of their states. These incorporated places are known as independent cities and are treated as equivalent entities for purposes of data presentation. The District of Columbia and Guam have no primary divisions, and each area is considered an equivalent entity for purposes of data presentation. The Census Bureau treats the following entities as equivalents of counties for purposes of data presentation: Municipios in Puerto Rico, Districts and Islands in American Samoa, Municipalities in the Commonwealth of the Northern Mariana Islands, and Islands in the U.S. Virgin Islands.
The entire area of the United States, Puerto Rico, and the Island Areas is covered by counties or equivalent entities. The boundaries for counties and equivalent entities are mostly as of January 1, 2023, as reported through the Census Bureau's Boundary and Annexation Survey (BAS). A data dictionary, or other source of attribute information, is accessible at https://doi.org/10.21949/1529015
National Hydrography Data - NHD and 3DHP
data.cnra.ca.gov
data.ca.gov
+2 more
Updated Jul 16, 2025
Cite
California Department of Water Resources (2025). National Hydrography Data - NHD and 3DHP [Dataset]. https://data.cnra.ca.gov/dataset/national-hydrography-dataset-nhd
Available download formats: pdf(4856863), zip(39288832), website, pdf(182651), pdf(1634485), pdf(437025), pdf(3684753), csv(12977), pdf(1175775), pdf, zip(4657694), zip(578260992), arcgis geoservices rest api, zip(13901824), pdf(9867020), pdf(3932070), web videos, zip(1647291), zip(15824984), zip(73817620), zip(128966494), zip(972664), zip(10029073), pdf(1436424)
Dataset updated
Jul 16, 2025
Dataset authored and provided by
California Department of Water Resources
License
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
Description
The USGS National Hydrography Dataset (NHD) downloadable data collection from The National Map (TNM) is a comprehensive set of digital spatial data that encodes information about naturally occurring and constructed bodies of surface water (lakes, ponds, and reservoirs), paths through which water flows (canals, ditches, streams, and rivers), and related entities such as point features (springs, wells, stream gages, and dams). The information encoded about these features includes classification and other characteristics, delineation, geographic name, position and related measures, a "reach code" through which other information can be related to the NHD, and the direction of water flow. The network of reach codes delineating water and transported material flow allows users to trace movement in upstream and downstream directions. In addition to this geographic information, the dataset contains metadata that supports the exchange of future updates and improvements to the data. The NHD supports many applications, such as making maps, geocoding observations, flow modeling, data maintenance, and stewardship. For additional information on NHD, go to https://www.usgs.gov/core-science-systems/ngp/national-hydrography.

DWR was the steward for NHD and Watershed Boundary Dataset (WBD) in California. We worked with other organizations to edit and improve NHD and WBD, using the business rules for California. California's NHD improvements were sent to USGS for incorporation into the national database. The most up-to-date products are accessible from the USGS website. Please note that the California portion of the National Hydrography Dataset is appropriate for use at the 1:24,000 scale.

For additional derivative products and resources, including the major features in geopackage format, please go to this page: https://data.cnra.ca.gov/dataset/nhd-major-features Archives of previous statewide extracts of the NHD going back to 2018 may be found at https://data.cnra.ca.gov/dataset/nhd-archive.

In September 2022, USGS officially notified DWR that the NHD would become static as USGS resources will be devoted to the transition to the new 3D Hydrography Program (3DHP). 3DHP will consist of LiDAR-derived hydrography at a higher resolution than NHD. Upon completion, 3DHP data will be easier to maintain, based on a modern data model and architecture, and better meet the requirements of users that were documented in the Hydrography Requirements and Benefits Study (2016). The initial releases of 3DHP include NHD data cross-walked into the 3DHP data model. It will take several years for the 3DHP to be built out for California. Please refer to the resources on this page for more information.

The FINAL, STATIC version of the National Hydrography Dataset for California was published for download by USGS on December 27, 2023. This dataset can no longer be edited by the state stewards. The next generation of national hydrography data is the USGS 3D Hydrography Program (3DHP).

Questions about the California stewardship of these datasets may be directed to nhd_stewardship@water.ca.gov.
USA Tax Form Image OCR Datasets
data.macgence.com
mp3
Updated Jun 1, 2024
Cite
Macgence (2024). USA Tax Form Image OCR Datasets [Dataset]. https://data.macgence.com/dataset/usa-tax-form-image-ocr-datasets
Explore at:
Available download formats: mp3
Dataset updated
Jun 1, 2024
Dataset authored and provided by
Macgence
License
https://data.macgence.com/terms-and-conditions
Time period covered
2025
Area covered
Worldwide, United States
Variables measured
Outcome, Call Type, Transcriptions, Audio Recordings, Speaker Metadata, Conversation Topics
Description
Efficiently extract data from USA tax forms with this high-quality OCR dataset. Ideal for automation, AI training, and streamlining tax data processing.
State of Utah Acquired LiDAR Data - Wasatch Front - Dataset - CKAN
nationaldataplatform.org
Updated Feb 28, 2024
+ more versions
Cite
(2024). State of Utah Acquired LiDAR Data - Wasatch Front - Dataset - CKAN [Dataset]. https://nationaldataplatform.org/catalog/dataset/state-of-utah-acquired-lidar-data-wasatch-front
Explore at:
Dataset updated
Feb 28, 2024
Area covered
Utah, Wasatch Front, Wasatch Range
Description
The State of Utah, including the Utah Automated Geographic Reference Center, Utah Geological Survey, and the Utah Division of Emergency Management, along with local and federal partners, including Salt Lake County and local cities, the Federal Emergency Management Agency, the U.S. Geological Survey, and the U.S. Environmental Protection Agency, have funded and collected over 8380 km² (3236 mi²) of high-resolution (0.5 or 1 meter) Lidar data across the state since 2011, in support of a diverse set of flood mapping, geologic, transportation, infrastructure, solar energy, and vegetation projects. The datasets include point cloud, first return digital surface model (DSM), and bare-earth digital terrain/elevation model (DEM) data, along with appropriate metadata (XML, project tile indexes, and area completion reports). This 0.5-meter 2013-2014 Wasatch Front dataset includes most of the Salt Lake and Utah Valleys (Utah), and the Wasatch (Utah and Idaho), and West Valley fault zones (Utah). Other recently acquired State of Utah data include the 2011 Utah Geological Survey Lidar dataset covering Cedar and Parowan Valleys, the east shore/wetlands of Great Salt Lake, the Hurricane fault zone, the west half of Ogden Valley, North Ogden, and part of the Wasatch Plateau in Utah.
30x30 Conserved Areas, Terrestrial (2024)
californianature.ca.gov
data.cnra.ca.gov
+3 more
Updated Aug 30, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CA Nature Organization (2024). 30x30 Conserved Areas, Terrestrial (2024) [Dataset]. https://www.californianature.ca.gov/datasets/30x30-conserved-areas-terrestrial-2024
Explore at:
Dataset updated
Aug 30, 2024
Dataset authored and provided by
CA Nature Organization
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered

Description
The Terrestrial 30x30 Conserved Areas map layer was developed by the CA Nature working group, providing a statewide perspective on areas managed for the protection or enhancement of biodiversity. Understanding the spatial distribution and extent of these durably protected and managed areas is a vital aspect of tracking and achieving the “30x30” goal of conserving 30% of California's lands and waters by 2030.

Terrestrial and Freshwater Data

• The California Protected Areas Database (CPAD), developed and managed by GreenInfo Network, is the most comprehensive collection of data on open space in California. CPAD data consists of Holdings, a single parcel or small group of parcels, such that the spatial features of CPAD correspond to ownership boundaries.
• The California Conservation Easement Database (CCED), managed by GreenInfo Network, aggregates data on lands with easements. Conservation Easements are legally recorded interests in land in which a landholder sells or relinquishes certain development rights to their land in perpetuity. Easements are often used to ensure that lands remain as open space, either as working farm or ranch lands, or areas for biodiversity protection. Easement restrictions typically remain with the land through changes in ownership.
• The Protected Areas Database of the United States (PAD-US), hosted by the United States Geological Survey (USGS), is developed in coordination with multiple federal, state, and non-governmental organization (NGO) partners. PAD-US, through the Gap Analysis Project (GAP), uses a numerical coding system in which GAP codes 1 and 2 correspond to management strategies with explicit emphasis on protection and enhancement of biodiversity. PAD-US is not specifically aligned to parcel boundaries and as such, boundaries represented within it may not align with other data sources.
• Numerous datasets representing designated boundaries for entities such as National Parks and Monuments, Wild and Scenic Rivers, Wilderness Areas, and others, were downloaded from publicly available sources, typically hosted by the managing agency.

Methodology

1. CPAD and CCED represent the most accurate location and ownership information for parcels in California which contribute to the preservation of open space and cultural and biological resources.
2. Superunits are collections of parcels (Holdings) within CPAD which share a name, manager, and access policy. Most Superunits are also managed with a generally consistent strategy for biodiversity conservation. Examples of Superunits include Yosemite National Park, Giant Sequoia National Monument, and Anza-Borrego Desert State Park.
3. Some Superunits, such as those owned and managed by the Bureau of Land Management, U.S. Forest Service, or National Park Service, are intersected by one or more designations, each of which may have a distinct management emphasis with regards to biodiversity. Examples of such designations are Wilderness Areas, Wild and Scenic Rivers, or National Monuments.
4. CPAD Superunits and CCED easements were intersected with all designation boundary files to create the operative spatial units for conservation analysis, henceforth 'Conservation Units,' which make up the Terrestrial 30x30 Conserved Areas map layer. Each easement was functionally considered to be a Superunit.
5. Each Conservation Unit was intersected with the PAD-US dataset in order to determine the management emphasis with respect to biodiversity, i.e., the GAP code. Because PAD-US is national in scope and not specifically parcel aligned with California assessors' surveys, a direct spatial extraction of GAP codes from PAD-US would leave tens of thousands of GAP code data slivers within the 30x30 Conserved Areas map. Consequently, a generalizing approach was adopted, such that any Conservation Unit with greater than 80% areal overlap with a single GAP code was uniformly assigned that code. Additionally, the total area of GAP codes 1 and 2 were summed for the remaining uncoded Conservation Units. If this sum was greater than 80% of the unit area, the Conservation Unit was coded as GAP 2.
6. Subsequent to this stage of analysis, certain Conservation Units remained uncoded, either due to the lack of a single GAP code (or combined GAP codes 1&2) overlapping 80% of the area, or because the area was not sufficiently represented in the PAD-US dataset.
7. These uncoded Conservation Units were then broken down into their constituent, finer resolution Holdings, which were then analyzed according to the above workflow.
8. Areas remaining uncoded following the two-step process of coding at the Superunit and then Holding levels were assigned a GAP code of 4. This is consistent with the definition of GAP Code 4: areas unknown to have a biodiversity management focus.
9. Greater than 90% of all areas in the Terrestrial 30x30 Conserved Areas map layer were GAP coded at the level of CPAD Superunits intersected by designation boundaries, the coarsest land units of analysis. By adopting these coarser analytical units, the Terrestrial 30X30 Conserved Areas map layer avoids hundreds of thousands of spatial slivers that result from intersecting designations with smaller, more numerous parcel records. In most cases, individual parcels reflect the management scenario and GAP status of the umbrella Superunit and other spatially coincident designations.

Tracking Conserved Areas

The total acreage of conserved areas will increase as California works towards its 30x30 goal. Some changes will be due to shifts in legal protection designations or management status of specific lands and waters. However, shifts may also result from new data representing improvements in our understanding of existing biodiversity conservation efforts. The California Nature Project is expected to generate a great deal of excitement regarding the state's trajectory towards achieving the 30x30 goal. We also expect it to spark discussion about how to shape that trajectory, and how to strategize and optimize outcomes.

We encourage landowners, managers, and stakeholders to investigate how their lands are represented in the Terrestrial 30X30 Conserved Areas Map Layer. This can be accomplished by using the Conserved Areas Explorer web application, developed by the CA Nature working group. Users can zoom into the locations they understand best and share their expertise with us to improve the data representing the status of conservation efforts at these sites. The Conserved Areas Explorer presents a tremendous opportunity to strengthen our existing data infrastructure and the channels of communication between land stewards and data curators, encouraging the transfer of knowledge and improving the quality of data.

CPAD, CCED, and PAD-US are built from the ground up. Data is derived from available parcel information and submissions from those who own and manage the land. So better data starts with you. Do boundary lines require updating? Is the GAP code inconsistent with a Holding’s conservation status? If land under your care can be better represented in the Terrestrial 30X30 Conserved Areas map layer, please use this link to initiate a review. The results of these reviews will inform updates to the California Protected Areas Database, California Conservation Easement Database, and PAD-US as appropriate for incorporation into future updates to CA Nature and tracking progress to 30x30.
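The generalizing rule in the methodology (a single GAP code covering more than 80% of a unit, then GAP 1 and 2 combined, then a second pass at the Holding level with GAP 4 as the fallback) can be sketched as follows. This is an illustrative sketch only, not the CA Nature working group's actual workflow: the function names, the dictionary layout, and the assumption that overlap areas have already been computed by a GIS intersection are all hypothetical.

```python
from typing import Dict, Optional, Tuple

THRESHOLD = 0.80  # "greater than 80%" areal overlap, as stated in the methodology


def assign_gap_code(unit_area: float, overlap_by_gap: Dict[int, float]) -> Optional[int]:
    """Return a GAP code for one spatial unit, or None if it stays uncoded.

    overlap_by_gap maps a GAP code to the area of the unit overlapping that
    code in PAD-US (hypothetical input; computed elsewhere by intersection).
    """
    if unit_area <= 0:
        return None
    # Rule 1: a single GAP code covering more than 80% of the unit wins.
    for code, area in overlap_by_gap.items():
        if area / unit_area > THRESHOLD:
            return code
    # Rule 2: GAP 1 and GAP 2 together covering more than 80% gives GAP 2.
    combined = overlap_by_gap.get(1, 0.0) + overlap_by_gap.get(2, 0.0)
    if combined / unit_area > THRESHOLD:
        return 2
    return None


def code_unit(
    unit_area: float,
    unit_overlap: Dict[int, float],
    holdings: Dict[str, Tuple[float, Dict[int, float]]],
) -> Dict[str, int]:
    """Two-pass coding: the Conservation Unit first, then its Holdings."""
    code = assign_gap_code(unit_area, unit_overlap)
    if code is not None:
        return {"unit": code}
    result = {}
    for name, (area, overlap) in holdings.items():
        holding_code = assign_gap_code(area, overlap)
        # Anything still uncoded after both passes falls back to GAP 4.
        result[name] = holding_code if holding_code is not None else 4
    return result
```

For example, a unit with 45% GAP 1 and 40% GAP 2 overlap has no single code above the threshold, but the combined 85% codes it as GAP 2.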
US Restaurant POI dataset with metadata
datarade.ai
.csv
Updated Jul 30, 2022
Cite
Geolytica (2022). US Restaurant POI dataset with metadata [Dataset]. https://datarade.ai/data-products/us-restaurant-poi-dataset-with-metadata-geolytica
Explore at:
Available download formats: .csv
Dataset updated
Jul 30, 2022
Dataset authored and provided by
Geolytica
Area covered
United States of America
Description
Point of Interest (POI) is defined as an entity (such as a business) at a ground location (point) which may be (of interest). We provide high-quality POI data that is fresh, consistent, customizable, easy to use and with high-density coverage for all countries of the world.

This is our process flow:

- Our machine learning systems continuously crawl for new POI data
- Our geoparsing and geocoding calculate their geo locations
- Our categorization systems clean up and standardize the datasets
- Our data pipeline API publishes the datasets on our data store

A new POI comes into existence. It could be a bar, a stadium, a museum, a restaurant, a cinema, a store, etc. In today's interconnected world, its information will appear very quickly in social media, pictures, websites, and press releases. Soon after that, our systems will pick it up.

POI Data is in constant flux. Every minute worldwide over 200 businesses will move, over 600 new businesses will open their doors and over 400 businesses will cease to exist. And over 94% of all businesses have a public online presence of some kind tracking such changes. When a business changes, their website and social media presence will change too. We'll then extract and merge the new information, thus creating the most accurate and up-to-date business information dataset across the globe.

We offer our customers perpetual data licenses for any dataset representing this ever changing information, downloaded at any given point in time. This makes our company's licensing model unique in the current Data as a Service - DaaS Industry. Our customers don't have to delete our data after the expiration of a certain "Term", regardless of whether the data was purchased as a one time snapshot, or via our data update pipeline.

Customers requiring regularly updated datasets may subscribe to our Annual subscription plans. Our data is continuously being refreshed, so subscription plans are recommended for those who need the most up-to-date data. The main differentiators between us and the competition are our flexible licensing terms and our data freshness.

Data samples may be downloaded at https://store.poidata.xyz/us
Air Quality-Lung Cancer Data
dataverse.harvard.edu
Updated Jan 31, 2020
Cite
Mithun Acharjee; Kumer Pial Das; Young S.Stanley (2020). Air Quality-Lung Cancer Data [Dataset]. http://doi.org/10.7910/DVN/HMOEJO
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/HMOEJO
Dataset updated
Jan 31, 2020
Dataset provided by
Harvard Dataverse
Authors
Mithun Acharjee; Kumer Pial Das; Young S.Stanley
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Data comes from two different sources. Population-based lung cancer incidence rates for the period 2010-2014 (most updated data) were abstracted from National Cancer Institute state cancer profiles (Schwartz et al. 1996). This national county-level database of cancer data is collected by state public health surveillance systems. The domain-specific county-level environmental quality index (EQI) data for the period 2000-2005 were abstracted from the United States Environmental Protection Agency (USEPA) profile. Complete descriptions of the datasets used in the EQI are provided in Lobdell’s paper (Lobdell 2011). Data were merged based on the Federal Information Processing Standards (FIPS) code. Out of 3144 counties in the United States, this study has information available for 2602 counties. Counties were excluded for the following reasons: data were not available for four states (Kansas, Michigan, Minnesota, and Nevada) due to state legislation and regulations that prohibit the release of county-level data to outside entities; counties whose lung cancer mortality information is missing were omitted from the data set; Union County, Florida, an outlier in terms of mortality information, was deleted from the data set; and, in the process of local control analysis, the study encountered two non-informative clusters (clusters 28 and 29; a non-informative cluster is one for which either treatment or control group information is missing), whose information was deleted from the data set for analysis. Three types of variables are used in this study: (i) lung cancer mortality as the outcome variable; (ii) a binary treatment indicator, PM2.5 high (greater than 10.59 mg/m3) vs. low (less than 10.59 mg/m3); and (iii) three potential X confounders for clustering, namely land EQI, sociodemographic EQI, and built EQI. For each index, higher values correspond to poorer environmental quality (Jagai et al. 2017). Because PM2.5 is one of the indicators used to measure air EQI, we do not consider the air EQI, to avoid confounding effects.
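The merge on FIPS code and the binary PM2.5 treatment indicator described above can be illustrated with a minimal sketch. The record layout and the field names (`fips`, `mortality`, `pm25`, `land_eqi`) are assumptions made for illustration and are not taken from the dataset's files; only the 10.59 cutoff comes from the description.

```python
PM25_CUTOFF = 10.59  # cutoff quoted in the description above


def merge_on_fips(cancer_rows, eqi_rows):
    """Inner-join two lists of dicts on a shared 'fips' key (hypothetical field).

    FIPS codes are kept as zero-padded strings so that '01001' is not
    silently turned into 1001. Counties with missing mortality are dropped,
    mirroring the omission rule in the description.
    """
    eqi_by_fips = {row["fips"]: row for row in eqi_rows}
    merged = []
    for row in cancer_rows:
        eqi = eqi_by_fips.get(row["fips"])
        if eqi is None or row.get("mortality") is None:
            continue
        record = {**eqi, **row}
        # Binary treatment: 1 = PM2.5 above the cutoff, 0 = otherwise.
        # The description does not say how a value exactly at the cutoff is
        # treated; this sketch assigns it to the low group.
        record["pm25_high"] = int(eqi["pm25"] > PM25_CUTOFF)
        merged.append(record)
    return merged
```

Keeping the join key as a string is a deliberate choice here: county FIPS codes for states such as Alabama begin with a zero, which numeric parsing would drop.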
USA Insurance Claim Image OCR Datasets
data.macgence.com
mp3
Updated Jun 3, 2024
Cite
Macgence (2024). USA Insurance Claim Image OCR Datasets [Dataset]. https://data.macgence.com/dataset/usa-insurance-claim-image-ocr-datasets
Explore at:
Available download formats: mp3
Dataset updated
Jun 3, 2024
Dataset authored and provided by
Macgence
License
https://data.macgence.com/terms-and-conditions
Time period covered
2025
Area covered
Worldwide
Variables measured
Outcome, Call Type, Transcriptions, Audio Recordings, Speaker Metadata, Conversation Topics
Description
Explore Macgence’s USA Insurance Claim OCR dataset—high-quality annotated images to train and evaluate document automation AI systems.
Johns Hopkins COVID-19 Case Tracker
data.world
csv, zip
Updated Aug 9, 2025
Cite
The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
Explore at:
Available download formats: zip, csv
Dataset updated
Aug 9, 2025
Authors
The Associated Press
Time period covered
Jan 22, 2020 - Mar 9, 2023
Area covered
Description
Updates

Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

CDC Weekly case and death counts (national and state level)

CDC County level cases and deaths

HHS New hospital admissions

CDC NowCast COVID variant proportions (national and regional level)

April 9, 2020

The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.

April 20, 2020

Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.

April 29, 2020

The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.

September 1, 2020

Johns Hopkins is now providing counts for the five New York City counties individually.

February 12, 2021

The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."

Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.

February 16, 2021

Johns Hopkins has reconciled Ohio's historical deaths data with the state.

Overview

The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.

This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

The AP is updating this dataset hourly at 45 minutes past the hour.

To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

Queries

Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic.

Filter cases by state here

Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac

Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases per capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true

Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.

Pull the 100 counties with the highest per-capita confirmed cases here

Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
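The hotspot queries above rely on a 7-day rolling average of new cases expressed per capita. The sketch below shows one way that figure could be computed for a single county or state; it is not AP's query, and the function name and input shape are hypothetical.

```python
def rolling_rate_per_100k(new_cases, population, window=7):
    """Rolling mean of daily new cases, scaled to a rate per 100,000 people.

    new_cases is a list of daily counts in date order. Days without a full
    window of history yield None rather than a partial average.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    rates = []
    for i in range(len(new_cases)):
        if i + 1 < window:
            rates.append(None)
            continue
        window_avg = sum(new_cases[i + 1 - window : i + 1]) / window
        rates.append(window_avg * 100_000 / population)
    return rates
```

As the caveat above notes, a per-capita ranking can push very small counties to the top, so the raw counts are worth keeping next to the rate.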

Interactive

The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

@(https://datawrapper.dwcdn.net/nRyaf/15/)

Interactive Embed Code

<iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>

Caveats

This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.

In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.

In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county".

This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.

Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.

Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.

The Urban/Rural classification scheme is from the Centers for Disease Control and Prevention's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

Johns Hopkins timeseries data

- Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
- Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here.
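Because the timeseries stores cumulative snapshots, daily figures have to be derived by differencing, and a downward revision by a source then shows up as a negative daily count. A minimal sketch of that derivation, with hypothetical function names:

```python
def daily_new_counts(cumulative):
    """Difference consecutive cumulative snapshots to get daily new counts."""
    return [cur - prev for prev, cur in zip(cumulative, cumulative[1:])]


def revision_days(cumulative):
    """Indexes into daily_new_counts() where the cumulative total went down.

    A negative value usually signals a historical correction by the source
    rather than a real-world event, so such days deserve a manual check.
    """
    return [i for i, delta in enumerate(daily_new_counts(cumulative)) if delta < 0]
```

Whether to clip negative days to zero, spread the correction over earlier dates, or leave them as reported is an analysis decision this sketch does not make.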

Attribution

This data should be credited to Johns Hopkins University's COVID-19 tracking project.
American College Catalog Study Database, 1975-2011 - Archival Version
search.gesis.org
Updated Feb 17, 2021
+ more versions
Cite
Brint, Steven (2021). American College Catalog Study Database, 1975-2011 - Archival Version [Dataset]. http://doi.org/10.3886/ICPSR34851
Explore at:
Unique identifier
https://doi.org/10.3886/ICPSR34851
Dataset updated
Feb 17, 2021
Dataset provided by
Inter-university Consortium for Political and Social Research https://www.icpsr.umich.edu/web/pages/
GESIS search
Authors
Brint, Steven
License
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de450955
Description
Abstract (en): The American College Catalog Study Database (CCS) contains academic data on 286 four-year colleges and universities in the United States. CCS is one of two databases produced by the Colleges and Universities 2000 project based at the University of California-Riverside. The CCS database comprises a sampled subset of institutions from the related Institutional Data Archive (IDA) on American Higher Education (ICPSR 34874). Coding for CCS was based on college catalogs obtained from College Source, Inc. The data are organized in a panel design, with measurements taken at five-year intervals: academic years 1975-76, 1980-81, 1985-86, 1990-91, 1995-96, 2000-01, 2005-06, and 2010-11. The database is based on information reported in each institution's college catalog, and includes data regarding changes in major academic units (schools and colleges), departments, interdisciplinary programs, and general education requirements. For schools and departments, changes in structure were coded, including new units, name changes, splits in units, units moved to new schools, reconstituted units, consolidated units, departments reduced to program status, and eliminated units. The American College Catalog Study Database (CCS) is intended to allow researchers to examine changes in the structure of institutionalized knowledge in four-year colleges and universities within the United States. For information on the study design, including detailed coding conventions, please see the Original P.I. Documentation section of the ICPSR Codebook. The data are not weighted. Dataset 1, Characteristics Variables, contains three weight variables (IDAWT, CCSWT, and CASEWEIGHT) which users may wish to apply during analysis. For additional information on weights, please see the Original P.I. Documentation section of the ICPSR Codebook. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. 
ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: Checked for undocumented or out-of-range codes. Response Rates: Approximately 75 percent of IDA institutions are included in CCS. For additional information on response rates, please see the Original P.I. Documentation section of the ICPSR Codebook. Four-year not-for-profit colleges and universities in the United States. Smallest Geographic Unit: state. CCS includes 286 institutions drawn from the IDA sample of 384 United States four-year colleges and universities. CCS contains every IDA institution for which a full set of catalogs could be located at the initiation of the project in 2000. CCS contains seven datasets that can be linked through an institutional identification number variable (PROJ_ID). Since the data are organized in a panel format, it is also necessary to use a second variable (YEAR) to link datasets. For a brief description of each CCS dataset, please see Appendix B within the Original P.I. Documentation section of the ICPSR Codebook. There are date discrepancies between the data and the Original P.I. Documentation. Study Time Periods and Collection Dates reflect dates that are present in the data. No additional information was provided. Please note that the related data collection featuring the Institutional Data Archive on American Higher Education, 1970-2011, will be available as ICPSR 34874. Additional information on the American College Catalog Study Database (CCS) and the Institutional Data Archive (IDA) database can be found on the Colleges and Universities 2000 Web site.
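Linking two CCS datasets therefore needs a composite key of PROJ_ID and YEAR, not PROJ_ID alone. The sketch below illustrates the idea on plain dictionaries; apart from PROJ_ID and YEAR, the column names in the usage example are hypothetical, and the code assumes each (PROJ_ID, YEAR) pair occurs at most once in the right-hand dataset.

```python
def link_panel(left_rows, right_rows, keys=("PROJ_ID", "YEAR")):
    """Inner-join two lists of dicts on a composite panel key.

    Joining on PROJ_ID alone would pair every year of one institution with
    every year of the same institution in the other file.
    """
    index = {tuple(row[k] for k in keys): row for row in right_rows}
    linked = []
    for row in left_rows:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            linked.append({**row, **match})
    return linked


# Usage with made-up columns N_DEPTS and N_SCHOOLS:
departments = [
    {"PROJ_ID": 1, "YEAR": 1975, "N_DEPTS": 30},
    {"PROJ_ID": 1, "YEAR": 1980, "N_DEPTS": 32},
]
schools = [
    {"PROJ_ID": 1, "YEAR": 1980, "N_SCHOOLS": 5},
]
linked = link_panel(departments, schools)
```

Here only the 1980 record appears in `linked`, because 1975 has no counterpart in the second dataset.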
PREDIK Data-Driven: Geospatial Data | USA | Tailor-made datasets: Foot...
datarade.ai
Updated Oct 13, 2021
Cite
Predik Data-driven (2021). PREDIK Data-Driven: Geospatial Data | USA | Tailor-made datasets: Foot traffic & Places Data [Dataset]. https://datarade.ai/data-products/predik-data-driven-geospatial-data-usa-tailor-made-datas-predik-data-driven
Explore at:
Available download formats: .json, .csv, .xls, .sql
Dataset updated
Oct 13, 2021
Dataset authored and provided by
Predik Data-driven
Area covered
United States
Description
This Location Data & Foot Traffic dataset, available for all countries, includes enriched raw mobility data and visitation at POIs to answer questions such as:

- How often do people visit a location? (daily, monthly, absolute, and averages)
- What type of places do they visit? (parks, schools, hospitals, etc.)
- Which social characteristics do people have in a certain POI? Breakdown by type: residents, workers, visitors.
- What's their mobility like during night hours and day hours?
- What's the frequency of the visits, partitioned by day of the week and hour of the day?

Extra insights

- Visitors' relative income level.
- Visitors' preferences as derived by their visits to shopping, parks, sports facilities, churches, among others.

Overview & Key Concepts

Each record corresponds to a ping from a mobile device, at a particular moment in time and at a particular latitude and longitude. We procure this data from reliable technology partners, which obtain it through partnerships with location-aware apps. The entire process is compliant with applicable privacy laws.

We clean and process these massive datasets with a number of complex, computer-intensive calculations to make them easier to use in different data science and machine learning applications, especially those related to understanding customer behavior.

Featured attributes of the data

Device speed: based on the distance between each observation and the previous one, we estimate the speed at which the device is moving. This is particularly useful to differentiate between vehicles, pedestrians, and stationary observations.

Night base of the device: we calculate the approximated location of where the device spends the night, which is usually their home neighborhood.

Day base of the device: we calculate the most common daylight location during weekdays, which is usually their work location.

Income level: we use the night neighborhood of the device, and intersect it with available socioeconomic data, to infer the device’s income level. Depending on the country, and the availability of good census data, this figure ranges from a relative wealth index to a currency-calculated income.

POI visited: we intersect each observation with a number of POI databases, to estimate check-ins to different locations. POI databases can vary significantly, in scope and depth, between countries.

Category of visited POI: for each observation that can be attributable to a POI, we also include a standardized location category (park, hospital, among others).

Coverage: Worldwide.
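As a rough illustration of the device speed attribute listed above, the sketch below estimates speed from the great-circle distance between each ping and the previous one. The provider's actual method is not documented here, so the haversine formula, the `(unix_seconds, lat, lon)` tuple layout, and the function names are all assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speeds_kmh(pings):
    """Estimate speed at each ping from the previous ping of the same device.

    pings: list of (unix_seconds, lat, lon) tuples sorted by time (assumed layout).
    Returns one value per ping; None where no speed can be computed.
    """
    if not pings:
        return []
    speeds = [None]  # the first ping has no predecessor
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(pings, pings[1:]):
        dt = t1 - t0
        if dt <= 0:
            speeds.append(None)  # duplicate or out-of-order timestamp
        else:
            metres_per_second = haversine_m(lat0, lon0, lat1, lon1) / dt
            speeds.append(metres_per_second * 3.6)  # m/s to km/h
    return speeds


# Hypothetical pings: 0.01 degrees of latitude (about 1.1 km) covered in 60 seconds.
example = speeds_kmh([(0, 0.0, 0.0), (60, 0.01, 0.0), (120, 0.01, 0.0)])
```

A threshold on the resulting values could then separate vehicles from pedestrians and stationary observations, though any specific cut-off would be a further assumption.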

Delivery schemas

We can deliver the data in three different formats:

Full dataset: one record per mobile ping. These datasets are very large, and should only be consumed by experienced teams with large computing budgets.

Visitation stream: one record per attributable visit. This dataset is considerably smaller than the full one but retains most of the more valuable elements in the dataset. It helps identify who visited a specific POI and characterize consumer behavior.

Audience profiles: one record per mobile device in a given period of time (usually monthly). The entire visitation stream is aggregated by category. This is the most condensed version of the dataset and is very useful to quickly understand the types of consumers in a particular area and to create cohorts of users.
Soil Moisture Profiles and Temperature Data from SoilSCAPE Sites, USA -...
data.nasa.gov
data.staging.idas-ds1.appdat.jsc.nasa.gov
Updated Apr 1, 2025
Cite
nasa.gov (2025). Soil Moisture Profiles and Temperature Data from SoilSCAPE Sites, USA - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/soil-moisture-profiles-and-temperature-data-from-soilscape-sites-usa-f2a77
Explore at:
Dataset updated
Apr 1, 2025
Dataset provided by
NASA (http://nasa.gov/)
Area covered
United States
Description
This data set contains in-situ soil moisture profile and soil temperature data collected at 20-minute intervals at SoilSCAPE (Soil moisture Sensing Controller and oPtimal Estimator) project sites in four states (California, Arizona, Oklahoma, and Michigan) in the United States. SoilSCAPE used wireless sensor technology to acquire high temporal resolution soil moisture and temperature data at up to 12 sites over varying durations since August 2011. At its maximum, the network consisted of over 200 wireless sensor installations (nodes), with a range of 6 to 27 nodes per site. The soil moisture sensors (EC-5 and 5-TM from Decagon Devices) were installed at three to four depths, nominally at 5, 20, and 50 cm below the surface. Soil conditions (e.g., hard soil or rocks) may have limited sensor placement. Temperature sensors were installed at 5 cm depth at six of the sites. Data collection started in August 2011 and continues at eight sites through the present. The data enables estimation of local-scale soil moisture at high temporal resolution and validation of remote sensing estimates of soil moisture at regional (airborne, e.g. NASA's Airborne Microwave Observation of Subcanopy and Subsurface Mission - AirMOSS) and national (spaceborne, e.g. NASA's Soil Moisture Active Passive - SMAP) scales.
Counties - United States of America
public.opendatasoft.com
bfortune.opendatasoft.com
csv, excel, geojson +1
Updated Jun 6, 2024
Cite
(2024). Counties - United States of America [Dataset]. https://public.opendatasoft.com/explore/dataset/georef-united-states-of-america-county/
Explore at:
Available download formats: excel, json, geojson, csv
Dataset updated
Jun 6, 2024
License
https://en.wikipedia.org/wiki/Public_domain
Area covered
United States
Description
This dataset is part of the Geographical repository maintained by Opendatasoft. This dataset contains data for counties and equivalent entities in the United States of America. The primary legal divisions of most states are termed counties. In Louisiana, these divisions are known as parishes. In Alaska, which has no counties, the equivalent entities are the organized boroughs, city and boroughs, municipalities, and for the unorganized area, census areas. The latter are delineated cooperatively for statistical purposes by the State of Alaska and the Census Bureau. In four states (Maryland, Missouri, Nevada, and Virginia), there are one or more incorporated places that are independent of any county organization and thus constitute primary divisions of their states. These incorporated places are known as independent cities and are treated as equivalent entities for purposes of data presentation. The District of Columbia and Guam have no primary divisions, and each area is considered an equivalent entity for purposes of data presentation. The Census Bureau treats the following entities as equivalents of counties for purposes of data presentation: Municipios in Puerto Rico, Districts and Islands in American Samoa, Municipalities in the Commonwealth of the Northern Mariana Islands, and Islands in the U.S. Virgin Islands. The entire area of the United States, Puerto Rico, and the Island Areas is covered by counties or equivalent entities.

Processors and tools are using this data.

Enhancements: Add ISO 3166-3 codes. Simplify geometries to provide better performance across the services. Add administrative hierarchy.

Cite

Centers for Disease Control and Prevention (2022). United States COVID-19 County Level of Community Transmission Historical Changes [Dataset]. https://catalog.data.gov/dataset/united-states-covid-19-county-level-of-community-transmission-historical-changes

United States COVID-19 County Level of Community Transmission Historical Changes

Explore at:

3 scholarly articles cite this dataset (View in Google Scholar)

Dataset updated

Oct 19, 2022

Dataset provided by

Centers for Disease Control and Prevention (http://www.cdc.gov/)

Area covered

United States

Description

Announcement: Beginning October 20, 2022, CDC will report and publish aggregate case and death data from jurisdictional and state partners on a weekly basis rather than daily. As a result, community transmission levels data reported on data.cdc.gov will be updated weekly on Thursdays, typically by 8 PM ET, instead of daily.

This public use dataset has 7 data elements reflecting historical data for community transmission levels for all available counties. This dataset contains historical data for the county level of community transmission and includes updated data submitted by states and jurisdictions. Each day, the dataset is appended to contain the most recent day's data. This dataset includes data from January 1, 2021. Transmission level is set to low, moderate, substantial, or high using the calculation rules below. Currently, CDC provides the public with two versions of COVID-19 county-level community transmission level data: this dataset with the levels for each county from January 1, 2021 (Historical Changes dataset) and a dataset with the levels as originally posted (Originally Posted dataset), updated daily with the most recent day’s data.

Methods for calculating county level of community transmission indicator

The County Level of Community Transmission indicator uses two metrics: (1) total new COVID-19 cases per 100,000 persons in the last 7 days and (2) percentage of positive SARS-CoV-2 diagnostic nucleic acid amplification tests (NAAT) in the last 7 days. For each of these metrics, CDC classifies transmission values as low, moderate, substantial, or high (below and here). If the values for each of these two metrics differ (e.g., one indicates moderate and the other low), then the higher of the two should be used for decision-making.
CDC core metrics of and thresholds for community transmission levels of SARS-CoV-2

Total New Case Rate Metric: "New cases per 100,000 persons in the past 7 days" is calculated by adding up the number of new cases in the county (or other administrative level) in the last 7 days, dividing by the population in the county (or other administrative level), and multiplying by 100,000. "New cases per 100,000 persons in the past 7 days" is considered to have transmission level of Low (0-9.99); Moderate (10.00-49.99); Substantial (50.00-99.99); and High (greater than or equal to 100.00).

Test Percent Positivity Metric: "Percentage of positive NAAT in the past 7 days" is calculated by dividing the number of positive tests in the county (or other administrative level) during the last 7 days by the total number of tests resulted over the last 7 days. "Percentage of positive NAAT in the past 7 days" is considered to have transmission level of Low (less than 5.00); Moderate (5.00-7.99); Substantial (8.00-9.99); and High (greater than or equal to 10.00).

If the two metrics suggest different transmission levels, the higher level is selected. If one metric is missing, the other metric is used for the indicator.

Transmission categories include:

Low Transmission Threshold: Counties with fewer than 10 total cases per 100,000 population in the past 7 days, and a NAAT percent test positivity in the past 7 days below 5%;

Moderate Transmission Threshold: Counties with 10-49 total cases per 100,000 population in the past 7 days or a NAAT test percent positivity in the past 7 days of 5.0-7.99%;

Substantial Transmission Threshold: Counties with 50-99 total cases per 100,000 population in the past 7 days or a NAAT test percent positivity in the past 7 days of 8.0-9.99%;

High Transmission Threshold: Counties with 100
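The calculation rules quoted above can be sketched in a few lines of Python. This is a minimal illustration, not CDC's implementation: the function names are hypothetical, and treating each band as a half-open interval (for example, a case rate of exactly 10.00 is moderate, and anything below it is low) is an assumption about values that fall between the published band edges such as 9.99 and 10.00.

```python
from bisect import bisect_right

LEVELS = ["low", "moderate", "substantial", "high"]

# Lower bounds of the moderate, substantial, and high bands from the description.
CASE_RATE_CUTS = [10.0, 50.0, 100.0]   # new cases per 100,000 in the past 7 days
POSITIVITY_CUTS = [5.0, 8.0, 10.0]     # percent positive NAAT in the past 7 days


def new_cases_per_100k(cases_7d, population):
    """7-day new case count scaled to a rate per 100,000 persons."""
    return cases_7d * 100_000 / population


def transmission_level(case_rate=None, pct_positive=None):
    """Return the higher of the two metric levels; use one metric if the other is missing."""
    ranks = []
    if case_rate is not None:
        ranks.append(bisect_right(CASE_RATE_CUTS, case_rate))
    if pct_positive is not None:
        ranks.append(bisect_right(POSITIVITY_CUTS, pct_positive))
    if not ranks:
        return None  # neither metric is available
    return LEVELS[max(ranks)]


# Hypothetical county: 30 new cases in 7 days, population 250,000, 8.5% positivity.
rate = new_cases_per_100k(30, 250_000)      # 12.0 per 100,000 (moderate band)
level = transmission_level(rate, 8.5)       # positivity is in the substantial band
```

Here the case rate alone would indicate moderate transmission, but the positivity metric falls in the substantial band, so the higher level is the one reported.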

