100+ datasets found
  1. contribute-a-dataset

    • huggingface.co
    Updated Jul 15, 2023
    + more versions
    Cite
    Huggingface Projects (2023). contribute-a-dataset [Dataset]. https://huggingface.co/datasets/huggingface-projects/contribute-a-dataset
    Dataset updated
    Jul 15, 2023
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Huggingface Projects
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    huggingface-projects/contribute-a-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  2. United States Government Spending To GDP

    • tradingeconomics.com
    • pl.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Cite
    TRADING ECONOMICS, United States Government Spending To GDP [Dataset]. https://tradingeconomics.com/united-states/government-spending-to-gdp
    Available download formats: excel, xml, csv, json
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Dec 31, 1900 - Dec 31, 2024
    Area covered
    United States
    Description

    Government spending in the United States was last recorded at 39.7 percent of GDP in 2024. This dataset provides actual values, historical data, forecasts, charts, statistics, an economic calendar, and news for United States Government Spending To GDP.

  3. Population Health (BRFSS: HRQOL)

    • kaggle.com
    Updated Dec 14, 2022
    Cite
    The Devastator (2022). Population Health (BRFSS: HRQOL) [Dataset]. https://www.kaggle.com/datasets/thedevastator/unlock-population-health-needs-with-brfss-hrqol
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 14, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Population Health (BRFSS: HRQOL)

    Examining Trends, Disparities and Determinants of Health in the US Population

    By Health [source]

    About this dataset

    The Behavioral Risk Factor Surveillance System (BRFSS) offers an expansive collection of data on the health-related quality of life (HRQOL) from 1993 to 2010. Over this time period, the Health-Related Quality of Life dataset consists of a comprehensive survey reflecting the health and well-being of non-institutionalized US adults aged 18 years or older. The data collected can help track and identify unmet population health needs, recognize trends, identify disparities in healthcare, determine determinants of public health, inform decision making and policy development, as well as evaluate programs within public healthcare services.

    The HRQOL surveillance system has developed a compact set of HRQOL measures such as a summary measure indicating unhealthy days which have been validated for population health surveillance purposes and have been widely implemented in practice since 1993. Within this study's dataset you will be able to access information such as year recorded, location abbreviations & descriptions, category & topic overviews, questions asked in surveys and much more detailed information including types & units regarding data values retrieved from respondents along with their sample sizes & geographical locations involved!

    How to use the dataset

    This dataset tracks the Health-Related Quality of Life (HRQOL) from 1993 to 2010 using data from the Behavioral Risk Factor Surveillance System (BRFSS). This dataset includes information on the year, location abbreviation, location description, type and unit of data value, sample size, category and topic of survey questions.

    Using this dataset on BRFSS: HRQOL data between 1993-2010 will allow for a variety of analyses related to population health needs. The compact set of HRQOL measures can be used to identify trends in population health needs as well as determine disparities among various locations. Additionally, responses to survey questions can be used to inform decision making and program and policy development in public health initiatives.

    Research Ideas

    • Analyzing trends in HRQOL over the years by location to identify disparities in health outcomes between different populations and develop targeted policy interventions.
    • Developing new models for predicting HRQOL indicators at a regional level, and using this information to inform medical practice and public health implementation efforts.
    • Using the data to understand differences between states in terms of their HRQOL scores and establish best practices for healthcare provision based on that understanding, including areas such as access to care, preventative care services availability, etc

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    See the dataset description for more information.

    Columns

    File: rows.csv

    | Column name                | Description                               |
    |:---------------------------|:------------------------------------------|
    | Year                       | Year of survey. (Integer)                 |
    | LocationAbbr               | Abbreviation of location. (String)        |
    | LocationDesc               | Description of location. (String)         |
    | Category                   | Category of survey. (String)              |
    | Topic                      | Topic of survey. (String)                 |
    | Question                   | Question asked in survey. (String)        |
    | DataSource                 | Source of data. (String)                  |
    | Data_Value_Unit            | Unit of data value. (String)              |
    | Data_Value_Type            | Type of data value. (String)              |
    | Data_Value_Footnote_Symbol | Footnote symbol for data value. (String)  |
    | Data_Value_Std_Err         | Standard error of the data value. (Float) |
    | Sample_Size                | Sample size used in sample. (Integer)     |
    | Break_Out                  | Break out categories used. (String)       |
    | Break_Out_Category         | Type break out assessed. (String)         |
    | **GeoLocation*...
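    The column types listed for rows.csv suggest a simple loading step. The sketch below is one possible way to read such a file with Python's standard csv module and coerce the Integer and Float columns; the helper names (`load_rows`, `_convert`) are hypothetical, the sample rows are made up for illustration, and treating blank cells as missing values is an assumption rather than something the dataset description states.

```python
import csv
import io

# Column types follow the column table for rows.csv; only listed columns are used.
INT_COLUMNS = ("Year", "Sample_Size")
FLOAT_COLUMNS = ("Data_Value_Std_Err",)


def _convert(value, caster):
    """Cast a CSV cell, mapping blank cells to None (an assumed convention)."""
    value = value.strip()
    return caster(value) if value else None


def load_rows(handle):
    """Read rows from an open CSV handle and coerce the typed columns."""
    rows = []
    for raw in csv.DictReader(handle):
        row = dict(raw)
        for name in INT_COLUMNS:
            row[name] = _convert(row[name], int)
        for name in FLOAT_COLUMNS:
            row[name] = _convert(row[name], float)
        rows.append(row)
    return rows


# Made-up sample rows for illustration only; these are not real BRFSS values.
SAMPLE = """Year,LocationAbbr,LocationDesc,Data_Value_Std_Err,Sample_Size
2009,AL,Alabama,0.5,1200
2010,AK,Alaska,,800
"""

rows = load_rows(io.StringIO(SAMPLE))
print(rows[0]["Year"], rows[1]["Data_Value_Std_Err"])
```

    For the real file, the same function could be called on a handle opened with open("rows.csv", newline=""), assuming the header names match the table exactly.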

  4. Data from: A U.S. Lead Exposure Hotspots Analysis

    • catalog.data.gov
    Updated Feb 17, 2024
    Cite
    U.S. EPA Office of Research and Development (ORD) (2024). A U.S. Lead Exposure Hotspots Analysis [Dataset]. https://catalog.data.gov/dataset/a-u-s-lead-exposure-hotspots-analysis
    Dataset updated
    Feb 17, 2024
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Area covered
    United States
    Description

    This is the dataset used for the U.S. lead exposure risk hotspot analysis in Zartarian et al., 2024, ES&T. The data dictionary files explain the contents of the 2 included zipped data folders. The Figures 1&2 zipped folder contains the data for Figures 1 and 2, the Supplement A figures, and the data for all tables in the paper. The Supplement B zipped folder contains the Random Forest modeling methodology in Supplement B and corresponding data. This folder also includes the full national dataset version of Random Forest model version 1 and 2 used in the analysis (in .csv format). This dataset is associated with the following publication: Zartarian Morrison, V., J. Xue, A. Poulakos, R. Tornero-Velez, L. Stanek, E. Snyder, V. Helms Garrison, K. Egan, and J. Courtney. A U.S. Lead Exposure Hotspots Analysis. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 7(7): 3311-3321, (2024).

  5. Miss America Titleholders

    • kaggle.com
    Updated Nov 17, 2022
    Cite
    The Devastator (2022). Miss America Titleholders [Dataset]. https://www.kaggle.com/datasets/thedevastator/miss-america-titleholders-a-comprehensive-datase
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 17, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Miss America Titleholders

    Miss America over the years

    About this dataset

    Every year, young women from across the United States compete for the title of Miss America. The competition is open to women between the ages of 17 and 25, and includes a talent portion, an interview, and a swimsuit competition (which was removed in 2018). The winner is crowned by the previous year's titleholder and goes on to tour the nation for about 20,000 miles a month, promoting her particular platform of interest.

    The Miss America dataset contains information on all Miss America titleholders from 1921 to 2022. It includes columns for the year of the pageant, the name of the crowned winner, her state or district represented, awards won, talent performed, and notes about her win.

    How to use the dataset

    This dataset contains information on Miss America titleholders from 1921 to 2022. The data includes the name of the winner, her state or district, the city she represented, her talent, and the year she won.

    Research Ideas

    • Miss America could be used to study changes in American culture over time. For example, the decline in the swimsuit competition could be seen as a sign of increasing body positivity in the US.
    • The dataset could be used to study the effect that winning Miss America has on a woman's career. Does winning lead to more opportunities?
    • The dataset could be used to study geographical patterns in Miss America winners. For example, are there any states that have produced more winners than others?

    Acknowledgements

    License

    License: Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) - You are free to: - Share - copy and redistribute the material in any medium or format for any purpose, even commercially. - Adapt - remix, transform, and build upon the material for any purpose, even commercially. - You must: - Give appropriate credit - Provide a link to the license, and indicate if changes were made. - ShareAlike - You must distribute your contributions under the same license as the original.

    Columns

    File: miss_america_titleholders.csv

    | Column name       | Description                                                            |
    |:------------------|:-----------------------------------------------------------------------|
    | year              | The year the Miss America pageant was held. (Integer)                  |
    | crowned           | The name of the Miss America titleholder. (String)                     |
    | winner            | The name of the Miss America winner. (String)                          |
    | state_or_district | The state or district represented by the Miss America winner. (String) |
    | city              | The city represented by the Miss America winner. (String)              |
    | awards            | The awards won by the Miss America winner. (String)                    |
    | talent            | The talent performed by the Miss America winner. (String)              |
    | notes             | Notes about the Miss America winner. (String)                          |

    File: eurovision_winners.csv

    | Column name | Description                                                              |
    |:------------|:-------------------------------------------------------------------------|
    | Year        | The year the pageant was held. (Integer)                                 |
    | Date        | The date the pageant was held. (Date)                                    |
    | Host City   | The city where the pageant was held. (String)                            |
    | Winner      | The name of the pageant winner. (String)                                 |
    | Song        | The song performed by the pageant winner. (String)                       |
    | Performer   | The name of the performer of the pageant winner's song. (String)         |
    | Points      | The number of points the pageant winner received. (Integer)              |
    | Margin      | The margin of points between the pageant winner and runner-up. (Integer) |
    | Runner-up   | The name of the pageant runner-up. (String)                              |
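    The geographical research idea for this dataset (which states have produced more winners than others) comes down to counting values in the state_or_district column. A minimal sketch, assuming the CSV header matches the column names listed for miss_america_titleholders.csv; the function name `winners_by_state` is hypothetical and the sample rows are placeholders, not real titleholders.

```python
import csv
import io
from collections import Counter


def winners_by_state(handle):
    """Count titleholders per value of the state_or_district column."""
    reader = csv.DictReader(handle)
    return Counter(row["state_or_district"].strip() for row in reader)


# Placeholder rows for illustration only; not taken from the dataset.
SAMPLE = """year,winner,state_or_district
1901,Winner A,State X
1902,Winner B,State Y
1903,Winner C,State X
"""

counts = winners_by_state(io.StringIO(SAMPLE))
print(counts.most_common(1))
```

    A Counter keeps the tally in one pass and offers most_common() for ranking, which is usually enough before reaching for a dataframe library.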

  6. Lead, SD Median Income by Age Groups Dataset: A Comprehensive Breakdown of...

    • neilsberg.com
    csv, json
    Updated Feb 25, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Lead, SD Median Income by Age Groups Dataset: A Comprehensive Breakdown of Lead Annual Median Income Across 4 Key Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e9414ab9-f353-11ef-8577-3860777c1fe6/
    Available download formats: json, csv
    Dataset updated
    Feb 25, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Lead
    Variables measured
    Income for householder under 25 years, Income for householder 65 years and over, Income for householder between 25 and 44 years, Income for householder between 45 and 64 years
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across four age groups (Under 25 years, 25 to 44 years, 45 to 64 years, and 65 years and over) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents the distribution of median household income among distinct age brackets of householders in Lead. Based on the latest 2019-2023 5-Year Estimates from the American Community Survey, it displays how income varies among householders of different ages in Lead. It showcases how household incomes typically rise as the head of the household gets older. The dataset can be utilized to gain insights into age-based household income trends and explore the variations in incomes across households.

    Key observations: Insights from 2023

    In terms of income distribution across age cohorts, in Lead, the median household income stands at $74,444 for householders within the 25 to 44 years age group, followed by $74,135 for the 45 to 64 years age group. Notably, householders within the 65 years and over age group had the lowest median household income at $44,219.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. All incomes have been adjusted for inflation and are presented in 2023-inflation-adjusted dollars.
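    The inflation adjustment mentioned here is, in general, a price-index ratio: a nominal amount is multiplied by the index for the target year and divided by the index for the source year. The sketch below shows only that general arithmetic; it is not Neilsberg's actual procedure, the function name is hypothetical, and the index values are invented placeholders rather than real R-CPI-U-RS figures.

```python
def adjust_for_inflation(nominal, index_source_year, index_target_year):
    """Express a nominal amount in target-year dollars using a price-index ratio."""
    if index_source_year <= 0:
        raise ValueError("price index must be positive")
    return nominal * index_target_year / index_source_year


# Hypothetical index values, chosen only to make the arithmetic easy to follow.
adjusted = adjust_for_inflation(
    50_000, index_source_year=100.0, index_target_year=125.0
)
print(adjusted)
```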

    Age groups classifications include:

    • Under 25 years
    • 25 to 44 years
    • 45 to 64 years
    • 65 years and over

    Variables / Data Columns

    • Age Of The Head Of Household: This column presents the age of the head of household
    • Median Household Income: Median household income, in 2023 inflation-adjusted dollars for the specific age group

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Lead median household income by age. You can refer to the same here.

  7. United States GDP Growth Contribution Consumer Spending

    • tradingeconomics.com
    • jp.tradingeconomics.com
    • +12 more
    csv, excel, json, xml
    Updated Mar 15, 2025
    Cite
    TRADING ECONOMICS (2025). United States GDP Growth Contribution Consumer Spending [Dataset]. https://tradingeconomics.com/united-states/gdp-growth-contribution-consumer-spending
    Available download formats: xml, excel, json, csv
    Dataset updated
    Mar 15, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Jun 30, 1947 - Mar 31, 2025
    Area covered
    United States
    Description

    GDP Growth Contribution Consumer Spending in the United States decreased to 1.21 percentage points in the first quarter of 2025 from 2.70 percentage points in the fourth quarter of 2024. This dataset includes a chart with historical data for the United States GDP Growth Contribution Consumer Spending.

  8. Data from: Contribution of Offshore Wind to the Power Grid: U.S. Air Quality...

    • catalog.data.gov
    • gimi9.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Contribution of Offshore Wind to the Power Grid: U.S. Air Quality Implications [Dataset]. https://catalog.data.gov/dataset/contribution-of-offshore-wind-to-the-power-grid-u-s-air-quality-implications
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Offshore wind (OSW) is an established technology in Europe, but it has not yet gained market share in the United States (U.S.). There is, however, increasing interest in and action supporting OSW development from many coastal states, predominantly along the Atlantic coast. As OSW grows in the U.S., as seems likely, it will displace existing and future generation assets. Depending on the energy resources used by those generators, emissions from the electric power sector will change. This research explores combinations of two energy sector drivers, OSW costs and carbon dioxide (CO2) mitigation stringency, to measure the changes in the energy mix and quantify OSW’s impact on the resulting emissions. This dataset is not publicly accessible because: This is a very large access file in a format specifically for the TIMES model. It can be accessed through the following means: Contact Carol Lenox at lenox.carol@epa.gov. Format: The data resides in a number of Excel files and in the corresponding TIMES model. This dataset is associated with the following publication: Browning, M., and C. Lenox. Contribution of Offshore Wind to the Power Grid: U.S. Air Quality Implications. Applied Energy. Elsevier B.V., Amsterdam, NETHERLANDS, 276: 115474, (2020).

  9. COVID19 Additional Data

    • kaggle.com
    Updated Apr 9, 2020
    Cite
    Orzhiang (2020). COVID19 Additional Data [Dataset]. https://www.kaggle.com/datasets/orzhiang/covid19-additional-data/versions/11
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 9, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Orzhiang
    Description

    This is a collection of datasets that I personally think are useful for analysing COVID19 data. Since all of the data comes from the internet and the majority originated from the World Bank, I am sure some Kaggle users have already uploaded similar data. However, I think compiling all of these data together makes my life (and perhaps yours) easier.

    The following are some remarks for the dataset (dataset title: description):

    • Other source of COVID19 Cases: https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset#time_series_covid_19_confirmed.csv
    • Mortality Table: https://www.kaggle.com/robikscube/world-health-organization-who-mortality-database
    • Economic Freedom Index: https://www.kaggle.com/lewisduncan93/the-economic-freedom-index
    • World Bank Development Indicators: https://www.kaggle.com/theworldbank/world-development-indicators
    • Weather Data: https://www.kaggle.com/hbfree/covid19formattedweatherjan22march24
    • Government Response: https://www.bsg.ox.ac.uk/research/research-projects/oxford-covid-19-government-response-tracker
    • Containment and Mitigation Measures: https://www.kaggle.com/paultimothymooney/covid-19-containment-and-mitigation-measures/
    • World Happiness Report: https://www.kaggle.com/londeen/world-happiness-report-2020
    • Weather Data 2: https://www.kaggle.com/noaa/gsod
    • US Data Prior to 2020-03-09: https://www.kaggle.com/johnjdavisiv/jhu-covid19-data-with-us-state-data-prior-to-mar-9
    • OECD Hospital Bed per 1000 inhabitants: https://www.kaggle.com/cpmpml/oecd-hospital-beds-per-1000-inhabitant
    • Covid 19 data by the US States: https://www.kaggle.com/scirpus/covid-by-state
    • COVID 19 Demographic predictors: https://www.kaggle.com/nightranger77/covid19-demographic-predictors
    • Country Info: https://www.kaggle.com/koryto/countryinfo
    • Population by location: https://www.kaggle.com/dgrechka/covid19-global-forecasting-locations-population
    • 00 COVID19 Country Mapping Table: A mapping table that serves as a link between the World Bank country name and country code and the country name used in the COVID19 Competition. It makes linking the COVID19 data and World Bank data much easier.
    • 01 Population_API_SP.POP.TOTL: https://data.worldbank.org/indicator/sp.pop.totl
    • 01_1 China Demographic Data: Source:
      http://www.chamiji.com/2019chinaprovincepopulation
      http://www.stats.gov.cn/tjsj/ndsj/2017/indexeh.htm
      http://data.stats.gov.cn/english/easyquery.htm?cn=C01
      http://www.gov.cn/test/2007-08/07/content_708271.htm
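
    The country mapping table described in this list exists to join COVID19 rows (keyed by country name) to World Bank rows (keyed by country code). A rough sketch of such a two-step lookup follows; every field name here (`covid_country`, `wb_country_code`, `country_code`, `population`, `country`) is a hypothetical stand-in, since the actual column names of the files are not given, and the records are placeholders.

```python
def link_by_country(covid_rows, mapping_rows, world_bank_rows):
    """Attach a World Bank field to COVID rows via a country-name mapping table."""
    # Step 1: COVID country name -> World Bank country code.
    name_to_code = {m["covid_country"]: m["wb_country_code"] for m in mapping_rows}
    # Step 2: World Bank country code -> World Bank record.
    wb_by_code = {w["country_code"]: w for w in world_bank_rows}

    linked = []
    for row in covid_rows:
        code = name_to_code.get(row["country"])
        wb = wb_by_code.get(code, {})
        # Unmapped countries keep their COVID fields and get None for the rest.
        linked.append(
            {**row, "wb_country_code": code, "population": wb.get("population")}
        )
    return linked


# Placeholder records for illustration only.
mapping = [{"covid_country": "Country A", "wb_country_code": "AAA"}]
world_bank = [{"country_code": "AAA", "population": 1000}]
covid = [{"country": "Country A", "cases": 10}, {"country": "Country B", "cases": 5}]

linked = link_by_country(covid, mapping, world_bank)
print(linked)
```

    Keeping unmatched rows (a left join) rather than dropping them makes gaps in the mapping table visible instead of silently shrinking the data.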

  10. United States GDP Growth Contribution Investment

    • tradingeconomics.com
    • pt.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Cite
    TRADING ECONOMICS, United States GDP Growth Contribution Investment [Dataset]. https://tradingeconomics.com/united-states/gdp-growth-contribution-investment
    Available download formats: csv, xml, json, excel
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Jun 30, 1947 - Mar 31, 2025
    Area covered
    United States
    Description

    GDP Growth Contribution Investment in the United States increased to 3.60 percentage points in the first quarter of 2025 from -1.03 percentage points in the fourth quarter of 2024. This dataset includes a chart with historical data for the United States GDP Growth Contribution Investment.

  11. United States COVID-19 Community Levels by County

    • data.cdc.gov
    • data.virginia.gov
    • +1 more
    application/rdfxml +5
    Updated Mar 3, 2022
    + more versions
    Cite
    CDC COVID-19 Response (2022). United States COVID-19 Community Levels by County [Dataset]. https://data.cdc.gov/Public-Health-Surveillance/United-States-COVID-19-Community-Levels-by-County/3nnm-4jni
    Available download formats: application/rdfxml, application/rssxml, csv, tsv, xml, json
    Dataset updated
    Mar 3, 2022
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Authors
    CDC COVID-19 Response
    License

    https://www.usa.gov/government-works

    Area covered
    United States
    Description

    Reporting of Aggregate Case and Death Count data was discontinued May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. Although these data will continue to be publicly available, this dataset will no longer be updated.

    This archived public use dataset has 11 data elements reflecting United States COVID-19 community levels for all available counties.

    The COVID-19 community levels were developed using a combination of three metrics — new COVID-19 admissions per 100,000 population in the past 7 days, the percent of staffed inpatient beds occupied by COVID-19 patients, and total new COVID-19 cases per 100,000 population in the past 7 days. The COVID-19 community level was determined by the higher of the new admissions and inpatient beds metrics, based on the current level of new cases per 100,000 population in the past 7 days. New COVID-19 admissions and the percent of staffed inpatient beds occupied represent the current potential for strain on the health system. Data on new cases acts as an early warning indicator of potential increases in health system strain in the event of a COVID-19 surge.

    Using these data, the COVID-19 community level was classified as low, medium, or high.
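
    The rule described above (take the higher of the admissions level and the inpatient-beds level, with cut-offs that depend on the case rate) can be sketched as a small function. The numeric cut-offs below are NOT given in this description; they are assumptions included only to make the example concrete and should be checked against CDC's published indicator table before any real use. The names `community_level`, `metric_level`, and `CUTOFFS` are hypothetical.

```python
LEVELS = ("low", "medium", "high")

# ASSUMPTION: every number below is illustrative and not taken from the
# description above; verify against CDC's published thresholds.
CASE_RATE_CUTOFF = 200.0  # new cases per 100,000 in the past 7 days

# (medium_from, high_from) for each metric, keyed by "case rate below cutoff?".
CUTOFFS = {
    True: {"admissions": (10.0, 20.0), "beds": (10.0, 15.0)},
    False: {"admissions": (0.0, 10.0), "beds": (0.0, 10.0)},
}


def metric_level(value, medium_from, high_from):
    """Return 0, 1, or 2 (low, medium, high) for a single metric."""
    if value >= high_from:
        return 2
    if value >= medium_from:
        return 1
    return 0


def community_level(cases_per_100k, admissions_per_100k, bed_pct):
    """Pick the higher of the two hospital metrics, tiered by case rate."""
    cut = CUTOFFS[cases_per_100k < CASE_RATE_CUTOFF]
    admissions = metric_level(admissions_per_100k, *cut["admissions"])
    beds = metric_level(bed_pct, *cut["beds"])
    return LEVELS[max(admissions, beds)]


print(community_level(50.0, 5.0, 3.0))
print(community_level(250.0, 5.0, 3.0))
```

    Under these assumed cut-offs, a county in the higher case-rate tier can never be classified as low, because the medium threshold for both hospital metrics starts at zero.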

    COVID-19 Community Levels were used to help communities and individuals make decisions based on their local context and their unique needs. Community vaccination coverage and other local information, like early alerts from surveillance, such as through wastewater or the number of emergency department visits for COVID-19, when available, can also inform decision making for health officials and individuals.

    For the most accurate and up-to-date data for any county or state, visit the relevant health department website. COVID Data Tracker may display data that differ from state and local websites. This can be due to differences in how data were collected, how metrics were calculated, or the timing of web updates.

    Archived Data Notes:

    This dataset was renamed from "United States COVID-19 Community Levels by County as Originally Posted" to "United States COVID-19 Community Levels by County" on March 31, 2022.

    March 31, 2022: Column name for county population was changed to “county_population”. No change was made to the data points previously released.

    March 31, 2022: New column, “health_service_area_population”, was added to the dataset to denote the total population in the designated Health Service Area based on 2019 Census estimate.

    March 31, 2022: FIPS codes for territories American Samoa, Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands were re-formatted to 5-digit numeric for records released on 3/3/2022 to be consistent with other records in the dataset.

    March 31, 2022: Changes were made to the text fields in variables “county”, “state”, and “health_service_area” so the formats are consistent across releases.

    March 31, 2022: The “%” sign was removed from the text field in column “covid_inpatient_bed_utilization”. No change was made to the data. As indicated in the column description, values in this column represent the percentage of staffed inpatient beds occupied by COVID-19 patients (7-day average).

    March 31, 2022: Data values for columns, “county_population”, “health_service_area_number”, and “health_service_area” were backfilled for records released on 2/24/2022. These columns were added since the week of 3/3/2022, thus the values were previously missing for records released the week prior.

    April 7, 2022: Updates made to data released on 3/24/2022 for Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands to correct a data mapping error.

    April 21, 2022: COVID-19 Community Level (CCL) data released for counties in Nebraska for the week of April 21, 2022 have 3 counties identified in the high category and 37 in the medium category. CDC has been working with state officials to verify the data submitted, as other data systems are not providing alerts for substantial increases in disease transmission or severity in the state.

    May 26, 2022: COVID-19 Community Level (CCL) data released for McCracken County, KY for the week of May 5, 2022 have been updated to correct a data processing error. McCracken County, KY should have appeared in the low community level category during the week of May 5, 2022. This correction is reflected in this update.

    May 26, 2022: COVID-19 Community Level (CCL) data released for several Florida counties for the week of May 19th, 2022, have been corrected for a data processing error. Of note, Broward, Miami-Dade, and Palm Beach Counties should have appeared in the high CCL category, and Osceola County should have appeared in the medium CCL category. These corrections are reflected in this update.

    May 26, 2022: COVID-19 Community Level (CCL) data released for Orange County, New York for the week of May 26, 2022 displayed an erroneous case rate of zero and a CCL category of low due to a data source error. This county should have appeared in the medium CCL category.

    June 2, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a data processing error. Tolland County, CT should have appeared in the medium community level category during the week of May 26, 2022. This correction is reflected in this update.

    June 9, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a misspelling. The medium community level category for Tolland County, CT on the week of May 26, 2022 was misspelled as “meduim” in the data set. This correction is reflected in this update.

    June 9, 2022: COVID-19 Community Level (CCL) data released for Mississippi counties for the week of June 9, 2022 should be interpreted with caution due to a reporting cadence change over the Memorial Day holiday that resulted in artificially inflated case rates in the state.

    July 7, 2022: COVID-19 Community Level (CCL) data released for Rock County, Minnesota for the week of July 7, 2022 displayed an artificially low case rate and CCL category due to a data source error. This county should have appeared in the high CCL category.

    July 14, 2022: COVID-19 Community Level (CCL) data released for Massachusetts counties for the week of July 14, 2022 should be interpreted with caution due to a reporting cadence change that resulted in lower than expected case rates and CCL categories in the state.

    July 28, 2022: COVID-19 Community Level (CCL) data released for all Montana counties for the week of July 21, 2022 had case rates of 0 due to a reporting issue. The case rates have been corrected in this update.

    July 28, 2022: COVID-19 Community Level (CCL) data released for Alaska for all weeks prior to July 21, 2022 included non-resident cases. The case rates for the time series have been corrected in this update.

    July 28, 2022: A laboratory in Nevada reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate will be inflated in Clark County, NV for the week of July 28, 2022.

    August 4, 2022: COVID-19 Community Level (CCL) data was updated on August 2, 2022 in error during performance testing. Data for the week of July 28, 2022 was changed during this update due to additional case and hospital data as a result of late reporting between July 28, 2022 and August 2, 2022. Since the purpose of this data set is to provide point-in-time views of COVID-19 Community Levels on Thursdays, any changes made to the data set during the August 2, 2022 update have been reverted in this update.

    August 4, 2022: For the week of July 28, 2022, case data were missing from the COVID-19 Community Level (CCL) data for 8 counties in Utah (Beaver County, Daggett County, Duchesne County, Garfield County, Iron County, Kane County, Uintah County, and Washington County) due to data collection issues. CDC and its partners have resolved the issue and the correction is reflected in this update.

    August 4, 2022: Due to a reporting cadence change, case rates for all Alabama counties will be lower than expected. As a result, the CCL levels published on August 4, 2022 should be interpreted with caution.

    August 11, 2022: COVID-19 Community Level (CCL) data for the week of August 4, 2022 for South Carolina have been updated to correct a data collection error that resulted in incorrect case data. CDC and its partners have resolved the issue and the correction is reflected in this update.

    August 18, 2022: COVID-19 Community Level (CCL) data for the week of August 11, 2022 for Connecticut have been updated to correct a data ingestion error that inflated the CT case rates. CDC, in collaboration with CT, has resolved the issue and the correction is reflected in this update.

    August 25, 2022: A laboratory in Tennessee reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate may be inflated in many counties and the CCLs published on August 25, 2022 should be interpreted with caution.

    August 25, 2022: Due to a data source error, the 7-day case rate for St. Louis County, Missouri, is reported as zero in the COVID-19 Community Level data released on August 25, 2022. Therefore, the COVID-19 Community Level for this county should be interpreted with caution.

    September 1, 2022: Due to a reporting issue, case rates for all Nebraska counties will include 6 days of data instead of 7 days in the COVID-19 Community Level (CCL) data released on September 1, 2022. Therefore, the CCLs for all Nebraska counties should be interpreted with caution.

    September 8, 2022: Due to a data processing error, the case rate for Philadelphia County, Pennsylvania,

  12. Community Water System and Contributing Area Characteristics

    • datasets.ai
    • catalog.data.gov
    Updated Sep 11, 2024
    Cite
    U.S. Environmental Protection Agency (2024). Community Water System and Contributing Area Characteristics [Dataset]. https://datasets.ai/datasets/community-water-system-and-contributing-area-characteristics
    Explore at:
    Dataset updated
    Sep 11, 2024
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Authors
    U.S. Environmental Protection Agency
    Description

    Operational, financial, and land use data to estimate drinking water treatment cost functions for 2006 calendar year. Data are organized for surface water and groundwater systems. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: US EPA's Office of Water, Office of Science and Technology, Engineering and Analysis Division is the holder of the survey data. US EPA's Office of Water maintains a database of point coordinates for surface water intakes and wells used for public water supply. Format: Our dataset is described in detail in Section 3 of the paper. We include a link to the 2006 Community Water System Survey that excludes the identifiers. Other data are confidential business information.

    This dataset is associated with the following publication: Price, J., and M. Heberling. The Effects of Agricultural and Urban Land Use on Drinking Water Treatment Costs: An Analysis of United States Community Water Systems. Water Economics and Policy. World Scientific Publishing Co. Pte. Ltd., 5 Toh Tuck Link, SINGAPORE, 6(4): 2050008, (2020).

  13. Gift Contributions to Reduce the Public Debt

    • fiscaldata.treasury.gov
    csv, json, xml
    Updated May 1, 2020
    + more versions
    Cite
    U.S. DEPARTMENT OF THE TREASURY (2020). Gift Contributions to Reduce the Public Debt [Dataset]. https://fiscaldata.treasury.gov/datasets/gift-contributions-reduce-debt-held-by-public/
    Explore at:
    Available download formats: xml, json, csv
    Dataset updated
    May 1, 2020
    Dataset provided by
    United States Department of the Treasury (https://treasury.gov/)
    Authors
    U.S. DEPARTMENT OF THE TREASURY
    Time period covered
    Sep 30, 1996 - Apr 30, 2025
    Description

    This dataset provides the monthly total of gift contributions received by the U.S. Treasury that were donated to reduce the public debt. These donations can include money, outstanding government obligations, and property that is sold for cash.

  14. Industrial Energy End Use in the U.S

    • kaggle.com
    Updated Dec 14, 2022
    Cite
    The Devastator (2022). Industrial Energy End Use in the U.S [Dataset]. https://www.kaggle.com/datasets/thedevastator/unlocking-industrial-energy-end-use-in-the-u-s
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 14, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Industrial Energy End Use in the U.S

    Facility-Level Combustion Energy Data

    By US Open Data Portal, data.gov [source]

    About this dataset

    This dataset contains in-depth facility-level information on industrial combustion energy use in the United States. It provides an essential resource for understanding consumption patterns across different sectors and industries, as reported by large emitters (>25,000 metric tons CO2e per year) under the U.S. EPA's Greenhouse Gas Reporting Program (GHGRP). Our records have been calculated using EPA default emissions factors and contain data on fuel type, location (latitude, longitude), combustion unit type and energy end use classified by manufacturing NAICS code. Additionally, our dataset reveals valuable insight into the thermal spectrum of low-temperature energy use from a 2010 Energy Information Administration Manufacturing Energy Consumption Survey (MECS). This information is critical to assessing industrial trends of energy consumption in manufacturing sectors and can serve as an informative baseline for efficient or renewable alternative plans of operation at these facilities. With this dataset you're just a few clicks away from analyzing research questions related to consumption levels across industries, waste issues associated with unconstrained fossil fuel burning practices and their environmental impacts

    How to use the dataset

    This dataset provides detailed information on industrial combustion energy end use in the United States. Knowing how certain industries use fuel can be valuable for those interested in reducing energy consumption and its associated environmental impacts.

    • To make the most out of this dataset, users should first become familiar with what's included by looking at the columns and their respective definitions. After becoming familiar with the data, users should start to explore areas of interest such as Fuel Type, Report Year, Primary NAICS Code, Emissions Indicators etc. The more granular and specific details you can focus on will help build a stronger analysis from which to draw conclusions from your data set.

    • Next steps could include filtering your data set down by region or end user type (such as direct related processes or indirect support activities). Segmenting your data set further can allow you to identify trends between fuel type used in different regions or compare emissions indicators between different processes within manufacturing industries etc. By taking a closer look through this lens you may be able to find valuable insights that can help inform better decision making when it comes to reducing energy consumption throughout industry in both public and private sectors alike.

    • If you are less interested in specific industry trends than in general patterns among large emitters across regions, it may help to group similar records together and take averages over larger samples that better represent total production across an area or multiple states (the timeline varies depending on needs). This approach could reveal correlations between economic productivity metrics and industrial energy use over time, which could in turn support more formal investigations into where resource efficiency is improving in certain industries or areas of production compared with less efficient sectors or regions, all from what is already present here.

    By leveraging the information provided within this dataset users have access to many opportunities for finding all sorts of interesting yet practical insights which can have important impacts far beyond understanding just another singular statistic alone; so happy digging!

    Research Ideas

    • Analyzing the trends in combustion energy uses by region across different industries.
    • Predicting the potential of transitioning to clean and renewable sources of energy considering the current end-uses and their magnitude based on this data.
    • Creating an interactive web map application to visualize multiple industrial sites, including their energy sources and emissions data from this dataset combined with other sources (EPA’s GHGRP, MECS survey, etc)

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    **License: [CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication](https://creativecommons...

  15. United States GDP Growth Contribution Exports

    • tradingeconomics.com
    • de.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Cite
    TRADING ECONOMICS, United States GDP Growth Contribution Exports [Dataset]. https://tradingeconomics.com/united-states/gdp-growth-contribution-exports
    Explore at:
    Available download formats: json, xml, excel, csv
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 30, 1947 - Mar 31, 2025
    Area covered
    United States
    Description

    GDP Growth Contribution Exports in the United States increased to 0.19 percentage points in the first quarter of 2025 from -0.01 percentage points in the fourth quarter of 2024. This dataset includes a chart with historical data for the United States GDP Growth Contribution Exports.

  16. South Lead Hill, AR Age Group Population Dataset: A Complete Breakdown of...

    • neilsberg.com
    csv, json
    Updated Jul 24, 2024
    + more versions
    Cite
    Neilsberg Research (2024). South Lead Hill, AR Age Group Population Dataset: A Complete Breakdown of South Lead Hill Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/aab9310f-4983-11ef-ae5d-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Arkansas, South Lead Hill
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the age groups. For age groups we divided it into roughly a 5 year bucket for ages between 0 and 85. For over 85, we aggregated data into a single group for all ages. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the South Lead Hill population distribution across 18 age groups. It lists the population in each age group along with that group's percentage of the total population of South Lead Hill. The dataset can be used to understand the population distribution of South Lead Hill by age. For example, using this dataset, we can identify the largest age group in South Lead Hill.

    Key observations

    The largest age group in South Lead Hill, AR was the 45 to 49 years group, with a population of 8 (13.56%), according to the ACS 2018-2022 5-Year Estimates. At the same time, the smallest age group in South Lead Hill, AR was the 80 to 84 years group, with a population of 0 (0%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: This column displays the age group in consideration
    • Population: The population for the specific age group in South Lead Hill is shown in this column.
    • % of Total Population: This column displays the population of each age group as a proportion of South Lead Hill total population. Please note that the sum of all percentages may not equal one due to rounding of values.
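    The bucketing and percentage columns described above can be sketched in a few lines. This is only an illustration, not Neilsberg's actual procedure: the function names and the sample ages are made up, and the ACS estimates themselves are published as counts per group rather than as individual ages.

```python
# Illustrative sketch: assign ages to the 18 age groups listed above and
# compute each group's population and percentage of the total.
# All names and sample values here are hypothetical.

def age_group_label(age: int) -> str:
    """Map an age in whole years to one of the 18 age-group labels."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 5:
        return "Under 5 years"
    if age >= 85:
        return "85 years and over"
    low = (age // 5) * 5  # lower bound of the 5-year bucket
    return f"{low} to {low + 4} years"


def tabulate(ages):
    """Return {label: (population, percent_of_total)} for the given ages."""
    counts = {}
    for age in ages:
        label = age_group_label(age)
        counts[label] = counts.get(label, 0) + 1
    total = sum(counts.values())
    # Percentages are rounded to two decimals, so they may not sum to 100.
    return {label: (n, round(100 * n / total, 2)) for label, n in counts.items()}


sample_ages = [3, 47, 45, 49, 86, 12]  # made-up individuals
table = tabulate(sample_ages)
```

    Because each percentage is rounded independently, the column total can drift slightly from 100, which is the rounding caveat noted for the last column.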

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for South Lead Hill Population by Age. You can refer to it here.

  17. Native American Facial Timeline Dataset | Facial Images from Past

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Native American Facial Timeline Dataset | Facial Images from Past [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-historical-native-american
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    United States
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Native American Facial Images from Past Dataset, meticulously curated to enhance face recognition models and support the development of advanced biometric identification systems, KYC models, and other facial recognition technologies.

    Facial Image Data

    This dataset comprises over 5,000 images, divided into participant-wise sets, with each set including:

    Historical Images: 22 different high-quality historical images per individual, spanning a 10-year timeline.
    Enrollment Image: One modern high-quality image for reference.

    Diversity and Representation

    The dataset includes contributions from a diverse network of individuals across Native American countries:

    Geographical Representation: Participants from countries including USA, Canada, Mexico and more.
    Demographics: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    File Format: The dataset contains images in JPEG and HEIC file formats.

    Quality and Conditions

    To ensure high utility and robustness, all images are captured under varying conditions:

    Lighting Conditions: Images are taken in different lighting environments to ensure variability and realism.
    Backgrounds: A variety of backgrounds are available to enhance model generalization.
    Device Quality: Photos are taken using the latest mobile devices to ensure high resolution and clarity.

    Metadata

    Each image set is accompanied by detailed metadata for each participant, including:

    Participant Identifier
    File Name
    Age at the time of capture
    Gender
    Country
    Demographic Information
    File Format

    This metadata is essential for training models that can accurately recognize and identify Native American faces across different demographics and conditions.

    Usage and Applications

    This facial image dataset is ideal for various applications in the field of computer vision, including but not limited to:

    Facial Recognition Models: Improving the accuracy and reliability of facial recognition systems.
    KYC Models: Streamlining the identity verification processes for financial and other services.
    Biometric Identity Systems: Developing robust biometric identification solutions.
    Age Prediction Models: Training models to accurately predict the age of individuals based on facial features.
    Generative AI Models: Training generative AI models to create realistic and diverse synthetic facial images.

    Secure and Ethical Collection

    Data Security: Data was securely stored and processed within our platform, ensuring data security and confidentiality.
    Ethical Guidelines: The biometric data collection process adhered to strict ethical guidelines, ensuring the privacy and consent of all participants.
    Participant Consent: All participants were informed of the purpose of collection and potential use of the data, as agreed through written consent.

  18. Satellite US Construction Materials Dataset Package (Cemex, Vulcan, Martin...

    • datarade.ai
    .csv
    Updated Jan 18, 2023
    Cite
    Space Know (2023). Satellite US Construction Materials Dataset Package (Cemex, Vulcan, Martin Marietta) [Dataset]. https://datarade.ai/data-products/satellite-us-construction-materials-dataset-package-cemex-v-space-know
    Explore at:
    Available download formats: .csv
    Dataset updated
    Jan 18, 2023
    Dataset authored and provided by
    Space Know
    Area covered
    United States
    Description

    This dataset package is focused on U.S. construction materials and three construction companies: Cemex, Martin Marietta & Vulcan.

    In this package, SpaceKnow tracks manufacturing and processing facilities for construction material products all over the US. By tracking these facilities, we are able to give you near-real-time data on spending on these materials, which helps to predict residential and commercial real estate construction and spending in the US.

    The dataset includes 40 indices focused on asphalt, cement, concrete, and building materials in general. You can look forward to receiving country-level and regional data (activity in the North, East, West, and South of the country) and the aforementioned company data.

    SpaceKnow uses satellite (SAR) data to capture activity at building material manufacturing and processing facilities in the US.

    Data is updated daily, has an average lag of 4-6 days, and history back to 2017.

    The insights provide you with level and change data for refineries, storage, manufacturing, logistics, and employee parking-based locations.

    SpaceKnow offers 3 delivery options: CSV, API, and Insights Dashboard

    Available Indices

    Companies:

    • Cemex (CX): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates
    • Martin Marietta (MLM): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates
    • Vulcan (VMC): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates

    USA Indices:

    • Aggregates USA
    • Asphalt USA
    • Cement USA
    • Cement Refinery USA
    • Cement Storage USA
    • Concrete USA
    • Construction Materials USA
    • Construction Mining USA
    • Construction Parking Lots USA
    • Construction Materials Transfer Hub US
    • Cement - Midwest, Northeast, South, West
    • Cement Refinery - Midwest, Northeast, South, West
    • Cement Storage - Midwest, Northeast, South, West

    Why get SpaceKnow's U.S Construction Materials Package?

    Monitor Construction Market Trends: Near-real-time insights into the construction industry allow clients to understand and anticipate market trends better.

    Track Company Performance: Monitor operational activities, such as the volume of sales.

    Assess Risk: Use satellite activity data to assess the risks associated with investing in the construction industry.

    Index Methodology Summary

    Continuous Feed Index (CFI) is a daily aggregation of the area of metallic objects in square meters. There are two types of CFI indices. The CFI-R index gives the data in levels: it shows how many square meters are covered by metallic objects (for example, employee cars at a facility). The CFI-S index gives the change in data: it shows how many square meters have changed within the locations between two consecutive satellite images.

    How to interpret the data

    SpaceKnow indices can be compared with the related economic indicators or KPIs. If the economic indicator is in monthly terms, perform a 30-day rolling sum and pick the last day of the month to compare with the economic indicator. Each data point will reflect approximately the sum of the month. If the economic indicator is in quarterly terms, perform a 90-day rolling sum and pick the last day of the 90-day window to compare with the economic indicator. Each data point will reflect approximately the sum of the quarter.
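    The monthly comparison described above can be sketched as follows. This is a minimal illustration under assumptions, not SpaceKnow's own tooling: the function names are hypothetical, the daily values are invented, and the series is assumed to have exactly one observation per calendar day.

```python
from datetime import date, timedelta

# Illustrative sketch: 30-day rolling sum of a daily index, then keep the
# value on the last observed day of each month. All values are made up.

def rolling_sum(series, window):
    """series: list of (date, value) sorted by date, one row per day.
    Returns (date, sum of the last `window` values) once a full window exists."""
    out = []
    running = 0.0
    for i, (day, value) in enumerate(series):
        running += value
        if i >= window:
            running -= series[i - window][1]  # drop the value leaving the window
        if i >= window - 1:
            out.append((day, running))
    return out


def month_end_points(rolled):
    """Keep the last available observation in each calendar month.
    For an incomplete final month this is the latest day, not the month end."""
    last = {}
    for day, total in rolled:
        last[(day.year, day.month)] = (day, total)
    return [last[key] for key in sorted(last)]


# Hypothetical daily index: a constant 1.0 from Jan 1 to Mar 31, 2023.
start = date(2023, 1, 1)
daily = [(start + timedelta(days=i), 1.0) for i in range(90)]
monthly = month_end_points(rolling_sum(daily, window=30))
```

    For a quarterly indicator, the same two functions apply with `window=90`, keeping the last day of each 90-day window instead of each month.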

    Where the data comes from

    SpaceKnow brings you the data edge by applying machine learning and AI algorithms to synthetic aperture radar and optical satellite imagery. The company’s infrastructure searches and downloads new imagery every day, and the computations of the data take place within less than 24 hours.

    In contrast to traditional economic data, which are released in monthly and quarterly terms, SpaceKnow data is high-frequency and available daily. It is possible to observe the latest movements in the construction industry with just a 4-6 day lag, on average.

    The construction materials data help you to estimate the performance of the construction sector and the business activity of the selected companies.

    The foundation of delivering high-quality data is based on the success of defining each location to observe and extract the data. All locations are thoroughly researched and validated by an in-house team of annotators and data analysts.

    See below how our Construction Materials index performs against the US Non-residential construction spending benchmark

    Each individual location is precisely defined to avoid noise in the data, which may arise from traffic or changing vegetation due to seasonal reasons.

    SpaceKnow uses radar imagery and its own unique algorithms, so the indices do not lose their significance in bad weather conditions such as rain or heavy clouds.

    → Reach out to get free trial

    ...

  19. Health and Retirement Study (HRS)

    • search.dataone.org
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Health and Retirement Study (HRS) [Dataset]. http://doi.org/10.7910/DVN/ELEKOY
    Explore at:
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the health and retirement study (hrs) with r

    the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original.

    figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle.

    but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.

    the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

    this new github repository contains five scripts:

    1992 - 2010 download HRS microdata.R
    • loop through every year and every file, download, then unzip everything in one big party

    import longitudinal RAND contributed files.R
    • create a SQLite database (.db) on the local disk
    • load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

    longitudinal RAND - analysis examples.R
    • connect to the sql database created by the 'import longitudinal RAND contributed files' program
    • create two database-backed complex sample survey objects, using a taylor-series linearization design
    • perform a mountain of analysis examples with wave weights from two different points in the panel

    import example HRS file.R
    • load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html)
    • parse through the IF block at the bottom of the sas importation script, blank out a number of variables
    • save the file as an R data file (.rda) for fast loading later

    replicate 2002 regression.R
    • connect to the sql database created by the 'import longitudinal RAND contributed files' program
    • create a database-backed complex sample survey object, using a taylor-series linearization design
    • exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document

    click here to view these five scripts

    for more detail about the health and retirement study (hrs), visit:
    • michigan's hrs homepage
    • rand's hrs homepage
    • the hrs wikipedia page
    • a running list of publications using hrs

    notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself.

    confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
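    the chunked-import idea described above (load a large file into a SQLite database a piece at a time so it never has to fit in ram) can be sketched as follows. this is not the author's R script: it is a python illustration using only the standard library, and the table name, column names, and sample rows are all hypothetical.

```python
import csv
import io
import sqlite3
from itertools import islice

# Illustrative sketch of chunked loading into SQLite. The original workflow
# uses R; every table name, column name, and value here is made up.

def load_rows_in_chunks(conn, table, rows, columns, chunk_size=1000):
    """Insert `rows` (an iterator of sequences) into `table`, chunk by chunk,
    so only `chunk_size` rows are held in memory at any time."""
    col_defs = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
    inserted = 0
    while True:
        chunk = list(islice(rows, chunk_size))  # pull the next chunk only
        if not chunk:
            break
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', chunk)
        conn.commit()
        inserted += len(chunk)
    return inserted


# Hypothetical stand-in for a large file on disk.
raw = io.StringIO("respondent_id,wave,weight\n1,1,0.5\n2,1,1.5\n3,1,0\n")
reader = csv.reader(raw)
header = next(reader)

# An in-memory database keeps the demo self-contained; pass a file path
# such as "hrs.db" (hypothetical) to write to the local disk instead.
conn = sqlite3.connect(":memory:")
n_rows = load_rows_in_chunks(conn, "rand_demo", reader, header, chunk_size=2)
```

    with `chunk_size=2`, the three sample rows go in as two batches; a real import would use a much larger chunk size tuned to available memory.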

  20. VDEQ Springs FIELD MEASUREMENTS

    • data.virginia.gov
    Updated Aug 31, 2023
    Cite
    Virginia Department of Environmental Quality (2023). VDEQ Springs FIELD MEASUREMENTS [Dataset]. https://data.virginia.gov/dataset/vdeq-springs-field-measurements
    Explore at:
    Available download formats: zip, csv, kml, html, xlsx, txt, arcgis geoservices rest api, geojson, gdb, gpkg
    Dataset updated
    Aug 31, 2023
    Dataset authored and provided by
    Virginia Department of Environmental Quality (https://deq.virginia.gov/)
    Description
    The VDEQ Spring SITES database contains data describing the geographic locations and site attributes of natural springs throughout the commonwealth. This data coverage continues to evolve and contains only spring locations known to exist with a reasonable degree of certainty on the date of publication. The dataset does not replace site specific inventorying or receptor surveys but can be used as a starting point. VDEQ's initial geospatial dataset of approximately 325 springs was formed in 2008 by digitizing historical spring information sheets created by State Water Control Board geologists in the 1970s through early 1990s. Additional data has been consolidated from the EPA STORET database, the U.S. Geological Survey's Ground Water Site Inventory (GWSI) and Geographic Names Information System (GNIS), the Virginia Department of Health SDWIS database, the Virginia DEQ Virginia Water Use Data Set (VWUDS), the Commonwealth of Virginia Division of Water Resources and Power Bulletin No. 1: "Springs of Virginia" by Collins et al., 1930 as well as several VDWR&P Surface Water Supply bulletins from the 1940's - 1950's. A 1992 Virginia Department of Game and Inland Fisheries / Virginia Tech sponsored study by Helfrich et al. titled "Evaluation of the Natural Springs of Virginia: Fisheries Management Implications", a 2004 Rockbridge County groundwater resources report written by Frits van der Leeden, and several smaller datasets from consultants and citizens were evaluated and added to the database when confidence in locational accuracy was high or could be verified with aerial or LIDAR imagery. 
Significant contributions have been made throughout the years by VDEQ Groundwater Characterization staff site visits as well as other geologists working in the region including: Matt Heller at Virginia Division of Geology and Mineral Resources (VDMME), Wil Orndorff at the Virginia Department of Conservation and Recreation Karst Program (VDCR), and David Nelms and Dan Doctor of the U.S. Geological Survey (USGS). Substantial effort has been made to improve locational accuracy and remove duplication present between data sources. Hundreds of spring locations that were originally obtained using topographic maps or unknown methods were updated to sub-meter locational accuracy using post-processed differential GPS (PPGPS) and through the use of several generations of aerial imagery (2002-2017) obtained from Virginia's Geographic Information Network (VGIN) and 1-meter LIDAR, where available. Scores of new spring locations were also obtained by systematic quadrangle-by-quadrangle analysis in areas of the Shenandoah Valley where 1-meter LIDAR datasets were obtained from the U.S. Geological Survey. Future improvements to the dataset will result when statewide 1-meter LIDAR datasets become available and through continued field work by DEQ staff and other contributors working in the region. Please do not hesitate to contact the author to correct mistakes or to contribute to the database.

    The VDEQ Spring FIELD MEASUREMENTS database contains data describing field derived physio-chemical properties of spring discharges measured throughout the Commonwealth of Virginia. Field visits compiled in this dataset were performed from 1928 to 2019 by geologists with the State Water Control Board, the Virginia Division of Water and Power, the Virginia Department of Environmental Quality, and the U.S. Geological Survey with contributions from other sources as noted. Values of -9999 indicate that measurements were not performed for the referenced parameter. Please do not hesitate to contact the author to add data to the database or correct errors.
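Because -9999 marks a measurement that was not performed, it should be converted to a missing value before any statistics are computed; otherwise it would badly skew averages. A minimal Python sketch of that handling follows. The column names (`site_id`, `temp_c`, `ph`) and the inline sample rows are hypothetical and are not taken from the actual download.

```python
import csv
import io

NO_MEASUREMENT = -9999.0  # sentinel described in the dataset notes


def parse_measurement(raw):
    """Return the value as a float, or None when it is the -9999 sentinel."""
    value = float(raw)
    return None if value == NO_MEASUREMENT else value


def mean_ignoring_missing(values):
    """Average only the values that were actually measured."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Hypothetical rows standing in for the real field-measurement table.
sample = io.StringIO(
    "site_id,temp_c,ph\n"
    "S1,12.5,7.1\n"
    "S2,-9999,6.9\n"
    "S3,13.5,-9999\n"
)
rows = list(csv.DictReader(sample))
temps = [parse_measurement(r["temp_c"]) for r in rows]
mean_temp = mean_ignoring_missing(temps)
```

Here the second site's temperature is treated as missing, so the mean is taken over the two measured values only.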


    The VDEQ_Spring_WQ database is a geodatabase containing groundwater sample information collected from springs throughout Virginia. Sample-specific information includes: location and site information, measured field parameters, and lab-verified quantifications of major ionic concentrations, trace element concentrations, nutrient concentrations, and radiological data. The VDEQ_Spring_WQ database is a subset of the VDEQ GWCHEM database, which is a flat-file geodatabase containing groundwater sample information from groundwater wells and springs throughout Virginia. Sample information has been correlated via DEQ Well # and projected using coordinates in the VDEQ_Spring_SITES database. The GWCHEM database is comprised of historic groundwater sample data originally archived in the United States Geological Survey (USGS) National Water Information System (NWIS) and the Environmental Protection Agency (EPA) Storage and Retrieval (STORET) data warehouse. Archived STORET data originated as groundwater sample data collected and uploaded by Virginia State Water Control Board personnel. While groundwater sample data in the STORET data warehouse are static, new groundwater sample data are periodically uploaded to NWIS, and spring laboratory WQ data reflect NWIS as downloaded on 9/30/2019. Recent groundwater sample data collected by Virginia Department of Environmental Quality (DEQ) personnel as part of the Ambient Groundwater Sampling Program are entered into the database as lab results are made available by the Division of Consolidated Laboratory Services (DCLS). When possible, charge balances were calculated for samples with reported values for major ions including (at a minimum) calcium, magnesium, potassium, sodium, bicarbonate, chloride, and sulfate. Reported values for Nitrate as N, carbonate, and fluoride were included in the charge balance calculation when available. 
Field determined values for bicarbonate and carbonate were used in the charge balance calculation when available. For much of the legacy DEQ groundwater sample data, bicarbonate values were derived from lab reported values of alkalinity (as mg/L CaCO3) under the assumption that there was no contribution by carbonate to the reported alkalinity value. Charge balance values are reported in the "Charge Balance" column of the GWCHEM geodatabase. The closer the charge balance value is to unity (1), the lower the assumed charge balance error.

In order to preserve the numerical capabilities of the database, non-numeric lab qualifiers were given the following numeric identifiers:
- (minus sign) = less than the concentration specified to the right of the sign
-11110 = estimated
-22220 = presence verified but not quantified
-33330 = radchem non-detect, below sslc
-4440 = analyzed for but not detected
-55550 = greater than the concentration to the right of the zero
-66660 = sample held beyond normal holding time
-77770 = quality control failure. Data not valid.
-88880 = sample held beyond normal holding time. Sample analyzed for but not detected. Value stored is limit of detection for process in use.
-11120 = Value reported is less than the criteria of detection.
-9999 = no data (parameter not quantified)
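The charge balance described above can be illustrated with a short Python sketch. The description does not give the exact formula, so this assumes the value is the ratio of total cation to total anion charge in milliequivalents per liter, which is one reading consistent with "closer to unity". It uses only the seven minimum ions (nitrate, carbonate, and fluoride are left out), and the sample concentrations are invented for illustration. The bicarbonate conversion applies the stated assumption that carbonate contributes nothing to alkalinity.

```python
# Equivalent weights in mg/meq: molar mass divided by the ion's charge.
EQUIVALENT_WEIGHTS = {
    "calcium": 40.078 / 2,
    "magnesium": 24.305 / 2,
    "sodium": 22.990,
    "potassium": 39.098,
    "bicarbonate": 61.017,
    "chloride": 35.453,
    "sulfate": 96.06 / 2,
}
CATIONS = ("calcium", "magnesium", "sodium", "potassium")
ANIONS = ("bicarbonate", "chloride", "sulfate")


def bicarbonate_from_alkalinity(alkalinity_mg_l_caco3):
    """Convert alkalinity (mg/L as CaCO3) to bicarbonate (mg/L).

    Assumes all alkalinity is bicarbonate: 1 meq of alkalinity equals
    50.04 mg CaCO3 and 61.017 mg HCO3-.
    """
    return alkalinity_mg_l_caco3 * 61.017 / 50.04


def meq_per_l(sample, ions):
    """Sum the charge carried by `ions`, given concentrations in mg/L."""
    return sum(sample[ion] / EQUIVALENT_WEIGHTS[ion] for ion in ions)


def charge_balance_ratio(sample):
    """Assumed definition: cation meq/L divided by anion meq/L."""
    return meq_per_l(sample, CATIONS) / meq_per_l(sample, ANIONS)


# Invented concentrations (mg/L), chosen so the charges balance.
sample = {
    "calcium": 40.078,
    "magnesium": 12.1525,
    "sodium": 22.990,
    "potassium": 3.9098,
    "bicarbonate": bicarbonate_from_alkalinity(150.12),
    "chloride": 35.453,
    "sulfate": 4.803,
}
ratio = charge_balance_ratio(sample)
```

In this made-up sample the cations and anions each carry about 4.1 meq/L, so the ratio comes out at essentially 1.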

    A more in-depth description and hydrogeologic analysis of the database can be found here
    An in-depth data fact sheet can be found here
Cite
Huggingface Projects (2023). contribute-a-dataset [Dataset]. https://huggingface.co/datasets/huggingface-projects/contribute-a-dataset

contribute-a-dataset

huggingface-projects/contribute-a-dataset

Explore at:
227 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jul 15, 2023
Dataset provided by
Hugging Face (https://huggingface.co/)
Authors
Huggingface Projects
License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

Description

huggingface-projects/contribute-a-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
