54 datasets found
  1. US Health Insurance Dataset

    • kaggle.com
    Updated Feb 16, 2020
    Cite
    Anirban Datta (2020). US Health Insurance Dataset [Dataset]. https://www.kaggle.com/datasets/teertha/ushealthinsurancedataset/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 16, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Anirban Datta
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The venerable insurance industry is no stranger to data-driven decision making. Yet in today's rapidly transforming digital landscape, insurance is struggling to adapt to and benefit from new technologies compared to other industries, even within the BFSI sphere (the banking sector, for example). Some of the unique challenges the insurance business faces are: extremely complex underwriting rule-sets that differ radically across product lines; many non-KYC environments that lack a centralized customer information base; a complex relationship with consumers in traditional risk underwriting, where customer centricity sometimes runs counter to business profit; and the inertia of regulatory compliance.

    Despite this, emergent technologies like AI and blockchain have brought radical change to insurance, and data analytics sits at the core of this transformation. We can identify four key factors behind the emergence of analytics as a crucial part of InsurTech:

    • Big Data: The explosion of unstructured data in the form of images, videos, text, emails, social media
    • AI: The recent advances in Machine Learning and Deep Learning that enable businesses to gain insight, do predictive analytics, and build cost- and time-efficient innovative solutions
    • Real-time Processing: The ability to process information in real time from various data feeds (e.g., social media, news)
    • Increased Computing Power: a complex ecosystem of new analytics vendors and solutions that enable carriers to combine data sources, external insights, and advanced modeling techniques in order to glean insights that were not possible before.

    This dataset supports a simple yet illuminating study of risk underwriting in health insurance: the interplay of various attributes of the insured and how they affect the insurance premium.

    Content

    This dataset contains 1338 rows of insured data, where the Insurance charges are given against the following attributes of the insured: Age, Sex, BMI, Number of Children, Smoker and Region. There are no missing or undefined values in the dataset.

    Inspiration

    This relatively simple dataset should be an excellent starting point for EDA, statistical analysis, hypothesis testing, and training linear regression models to predict insurance premium charges.

    Proposed Tasks:

    • Exploratory Data Analytics
    • Statistical hypothesis testing
    • Statistical Modeling
    • Linear Regression
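
    As a rough illustration of the linear regression task proposed above, the sketch below fits a one-predictor least-squares line by hand. The function name `fit_simple_ols` and the (age, charges) pairs are made up for illustration; they are not rows from the dataset, and a real analysis would use all attributes.

```python
from statistics import mean


def fit_simple_ols(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up (age, charges) pairs for illustration only; NOT rows from the dataset.
ages = [20, 30, 40, 50, 60]
charges = [2500.0, 4500.0, 6500.0, 8500.0, 10500.0]

intercept, slope = fit_simple_ols(ages, charges)
predicted_at_45 = intercept + slope * 45
print(f"charges ~ {intercept:.1f} + {slope:.1f} * age")
```

    On the real data, categorical attributes such as smoker and region would need to be encoded before fitting a multi-predictor model.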

  2. Data from: No Health Insurance

    • boco-health-and-human-services-data-hub-bouldercounty.hub.arcgis.com
    Updated Mar 8, 2023
    Cite
    Boulder County (2023). No Health Insurance [Dataset]. https://boco-health-and-human-services-data-hub-bouldercounty.hub.arcgis.com/datasets/no-health-insurance-1
    Dataset updated
    Mar 8, 2023
    Dataset authored and provided by
    Boulder County
    Description

    Used for the Human Services Index story map. Data are from the American Community Survey 5-year estimates, 2019-2023 (table S2701). The percentage of the civilian noninstitutionalized population with no health insurance was calculated for each census tract.
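
    The per-tract calculation described above amounts to a simple ratio. The sketch below shows one way it could be computed; the tract identifiers and counts are hypothetical, and the handling of zero-population tracts is an assumption, not something the source specifies.

```python
def percent_uninsured(uninsured, population):
    """Percentage of the population with no health insurance.

    Returns None when the population is zero (an assumed convention).
    """
    if population <= 0:
        return None
    return 100.0 * uninsured / population


# Hypothetical tracts: id -> (uninsured count, civilian noninstitutionalized population).
tracts = {"tract_A": (120, 2400), "tract_B": (0, 0)}

percent_by_tract = {
    tract_id: percent_uninsured(uninsured, population)
    for tract_id, (uninsured, population) in tracts.items()
}
print(percent_by_tract)
```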

  3. ACS Health Insurance Coverage Variables - Centroids

    • coronavirus-resources.esri.com
    • covid-hub.gio.georgia.gov
    • +3more
    Updated Dec 7, 2018
    Cite
    Esri (2018). ACS Health Insurance Coverage Variables - Centroids [Dataset]. https://coronavirus-resources.esri.com/maps/7c69956008bb4019bbbe67ed9fb05dbb
    Dataset updated
    Dec 7, 2018
    Dataset authored and provided by
    Esri (http://esri.com/)
    Description

    This layer shows health insurance coverage by type and by age group. This is shown by tract, county, and state centroids. This service is updated annually to contain the most currently released American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the count and percent uninsured. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.

    • Current Vintage: 2019-2023
    • ACS Table(s): B27010 (Not all lines of this ACS table are available in this feature layer.)
    • Data downloaded from: Census Bureau's API for American Community Survey
    • Date of API call: December 12, 2024
    • National Figures: data.census.gov

    The United States Census Bureau's American Community Survey (ACS):

    • About the Survey
    • Geography & ACS
    • Technical Documentation
    • News & Updates

    This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.

    Data Note from the Census:

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

    Data Processing Notes:

    • This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Click here to learn more about ACS data releases.
    • Boundaries come from the US Census TIGER geodatabases, specifically, the National Sub-State Geography Database (named tlgdb_(year)_a_us_substategeo.gdb). Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines erased for cartographic and mapping purposes. For census tracts, the water cutouts are derived from a subset of the 2020 Areal Hydrography boundaries offered by TIGER. Water bodies and rivers which are 50 million square meters or larger (mid to large sized water bodies) are erased from the tract level boundaries, as well as additional important features. For state and county boundaries, the water and coastlines are derived from the coastlines of the 2023 500k TIGER Cartographic Boundary Shapefiles. These are erased to more accurately portray the coastlines and Great Lakes. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).
    • The States layer contains 52 records: all US states, Washington D.C., and Puerto Rico.
    • Census tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).
    • Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
    • Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page.
    • Negative values (e.g., -4444...) have been set to null, with the exception of -5555... which has been set to zero. These negative values exist in the raw API data to indicate the following situations:
        • The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
        • Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
        • The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.
        • The estimate is controlled. A statistical test for sampling variability is not appropriate.
        • The data for this geographic area cannot be displayed because the number of sample cases is too small.
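
    The negative-value handling described above can be sketched as a small cleaning function. Because the description elides the actual sentinel codes ("-4444...", "-5555..."), the function takes the zero-mapped code as a parameter, and the values used in the example are stand-ins only. The name `clean_acs_value` is hypothetical.

```python
def clean_acs_value(value, zero_sentinel):
    """Map raw API values to cleaned values.

    - the one sentinel passed as zero_sentinel becomes 0
    - any other negative value becomes None (null)
    - everything else passes through unchanged
    """
    if value is None:
        return None
    if value == zero_sentinel:
        return 0
    if value < 0:
        return None
    return value


# Stand-in codes for illustration; the real sentinel values are elided in the
# description above and must be taken from the Census API documentation.
PLACEHOLDER_ZERO_CODE = -5
PLACEHOLDER_NULL_CODE = -4

raw_values = [1200, PLACEHOLDER_NULL_CODE, PLACEHOLDER_ZERO_CODE, 0]
cleaned = [clean_acs_value(v, PLACEHOLDER_ZERO_CODE) for v in raw_values]
print(cleaned)
```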

  4. Dataplex: United Healthcare Transparency in Coverage | 76,000+ US Employers...

    • datarade.ai
    .json
    Updated Jan 1, 2025
    Cite
    Dataplex (2025). Dataplex: United Healthcare Transparency in Coverage | 76,000+ US Employers | Insurance Data | Ideal for Healthcare Cost Analysis [Dataset]. https://datarade.ai/data-products/dataplex-united-healthcare-transparency-in-coverage-76-000-dataplex
    Explore at:
    Available download formats: .json
    Dataset updated
    Jan 1, 2025
    Dataset authored and provided by
    Dataplex
    Area covered
    United States of America
    Description

    United Healthcare Transparency in Coverage Dataset

    Unlock the power of healthcare pricing transparency with our comprehensive United Healthcare Transparency in Coverage dataset. This invaluable resource provides unparalleled insights into healthcare costs, enabling data-driven decision-making for insurers, employers, researchers, and policymakers.

    Key Features:

    • Extensive Coverage: Access detailed pricing information for a wide range of medical procedures and services across the United States, covering approximately 76,000 employers.
    • Granular Data: Analyze costs at the provider, plan, and employer levels, allowing for in-depth comparisons and trend analysis.
    • Massive Scale: Over 400TB of data generated monthly, providing a wealth of information for comprehensive analysis.
    • Historical Perspective: Track pricing changes over time to identify patterns and forecast future trends.
    • Regular Updates: Stay current with the latest pricing information, ensuring your analyses are always based on the most recent data.

    Detailed Data Points:

    For each of the 76,000 employers, the dataset includes:

    1. In-network negotiated rates for covered items and services
    2. Historical out-of-network allowed amounts and billed charges
    3. Cost-sharing information for specific items and services
    4. Pricing data for medical procedures and services across providers, plans, and employers

    Use Cases

    For Insurers:

    • Benchmark your rates against competitors
    • Optimize network design and provider contracting
    • Develop more competitive and cost-effective insurance products

    For Employers:

    • Make informed decisions about health plan offerings
    • Negotiate better rates with insurers and providers
    • Implement cost-saving strategies for employee healthcare

    For Researchers:

    • Conduct in-depth studies on healthcare pricing variations
    • Analyze the impact of policy changes on healthcare costs
    • Investigate regional differences in healthcare pricing

    For Policymakers:

    • Develop evidence-based healthcare policies
    • Monitor the effectiveness of price transparency initiatives
    • Identify areas for potential cost-saving interventions

    Data Delivery

    Our flexible data delivery options ensure you receive the information you need in the most convenient format:

    • Custom Extracts: We can provide targeted datasets focusing on specific regions, procedures, or time periods.
    • Regular Reports: Receive scheduled updates tailored to your specific requirements.

    Why Choose Our Dataset?

    1. Expertise: Our team has extensive experience in healthcare data retrieval and analysis, ensuring high-quality, reliable data.
    2. Customization: We can tailor the dataset to meet your specific needs, whether you're interested in particular companies, regions, or procedures.
    3. Scalability: Our infrastructure is designed to handle the massive scale of this dataset (400TB+ monthly), allowing us to provide comprehensive coverage without compromise.
    4. Support: Our dedicated team is available to assist with data interpretation and technical support.

    Harness the power of healthcare pricing transparency to drive your business forward. Contact us today to discuss how our United Healthcare Transparency in Coverage dataset can meet your specific needs and unlock valuable insights for your organization.

  5. Medical Insurance Cost Dataset

    • kaggle.com
    Updated Aug 24, 2025
    Cite
    Mosap Abdel-Ghany (2025). Medical Insurance Cost Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/12853160
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 24, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mosap Abdel-Ghany
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains medical insurance cost information for 1338 individuals. It includes demographic and health-related variables such as age, sex, BMI, number of children, smoking status, and residential region in the US. The target variable is charges, which represents the medical insurance cost billed to the individual.

    The dataset is commonly used for:

    • Regression modeling
    • Health economics research
    • Insurance pricing analysis
    • Machine learning education and tutorials

    Columns

    • age: Age of primary beneficiary (int)
    • sex: Gender of beneficiary (male, female)
    • bmi: Body Mass Index, a measure of body fat based on height and weight (float)
    • children: Number of children covered by health insurance (int)
    • smoker: Smoking status of the beneficiary (yes, no)
    • region: Residential region in the US (northeast, northwest, southeast, southwest)
    • charges: Medical insurance cost billed to the beneficiary (float)
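
    The column list above maps naturally onto a typed record. The sketch below is one possible way to parse and validate a row; the class name `InsuranceRecord`, the header order, and the sample row are assumptions for illustration, not taken from the dataset file.

```python
import csv
import io
from dataclasses import dataclass

REGIONS = {"northeast", "northwest", "southeast", "southwest"}


@dataclass
class InsuranceRecord:
    age: int
    sex: str
    bmi: float
    children: int
    smoker: bool
    region: str
    charges: float


def parse_record(row):
    """Convert one CSV row (a dict of strings) into a typed, validated record."""
    if row["sex"] not in {"male", "female"}:
        raise ValueError(f"unexpected sex value: {row['sex']!r}")
    if row["smoker"] not in {"yes", "no"}:
        raise ValueError(f"unexpected smoker value: {row['smoker']!r}")
    if row["region"] not in REGIONS:
        raise ValueError(f"unexpected region value: {row['region']!r}")
    return InsuranceRecord(
        age=int(row["age"]),
        sex=row["sex"],
        bmi=float(row["bmi"]),
        children=int(row["children"]),
        smoker=row["smoker"] == "yes",
        region=row["region"],
        charges=float(row["charges"]),
    )


# Made-up sample row for illustration; the header order is an assumption.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
35,male,26.5,2,no,northwest,5000.25
"""

records = [parse_record(row) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(records[0])
```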

    Potential Uses

    • Build predictive models for medical costs
    • Explore how smoking and BMI impact charges
    • Teach students about regression and feature engineering
    • Analyze healthcare affordability trends

  6. uninsured state

    • gis-for-racialequity.hub.arcgis.com
    Updated May 10, 2017
    Cite
    Urban Observatory by Esri (2017). uninsured state [Dataset]. https://gis-for-racialequity.hub.arcgis.com/datasets/UrbanObservatory::uninsured-state
    Dataset updated
    May 10, 2017
    Dataset provided by
    Esri (http://esri.com/)
    Authors
    Urban Observatory by Esri
    Description

    This layer shows the percentage of people without health insurance in the U.S. by state and county, from American Community Survey 5-year estimates: 2011-2015 (Table GCT2701). The map switches from state data to county data as the map zooms in. The national average was 13.0%, down from approximately 20% in 2005.

    A person’s ability to access health services has a profound effect on every aspect of his or her health. Many Americans do not have a primary care provider (PCP) or health center where they can receive regular medical services. People without medical insurance are more likely to lack a usual source of medical care, such as a PCP, and are more likely to skip routine medical care due to costs, increasing their risk for serious and disabling health conditions. When they do access health services, they are often burdened with large medical bills and out-of-pocket expenses. Increasing access to both routine medical care and medical insurance are vital steps in improving the health of all Americans.

  7. Uninsured Population Census Data 1-year estimates 2017-Current Statewide...

    • data.pa.gov
    csv, xlsx, xml
    Updated Aug 20, 2020
    Cite
    Pennsylvania Department of Human Services (DHS) (2020). Uninsured Population Census Data 1-year estimates 2017-Current Statewide Human Services and Insurance [Dataset]. https://data.pa.gov/Health/Uninsured-Population-Census-Data-1-year-estimates-/kq4j-u8v5
    Explore at:
    Available download formats: csv, xml, xlsx
    Dataset updated
    Aug 20, 2020
    Dataset provided by
    Pennsylvania Department of Human Services (https://www.pa.gov/agencies/dhs.html)
    Authors
    Pennsylvania Department of Human Services (DHS)
    License

    https://www.usa.gov/government-works

    Description

    The American Community Survey (ACS) helps local officials, community leaders, and businesses understand the changes taking place in their communities. It is the premier source for detailed population and housing information about our nation. This dataset provides estimates for Health Insurance Coverage in Pennsylvania and is summarized from summary table S2701: SELECTED CHARACTERISTICS OF HEALTH INSURANCE COVERAGE IN THE UNITED STATES.

    A blank cell within the dataset indicates that either no sample observations or too few sample observations were available to compute the statistic for that area.

    Margin of error (MOE). Some ACS products provide an MOE instead of confidence intervals. An MOE is the difference between an estimate and its upper or lower confidence bounds. Confidence bounds can be created by adding the margin of error to the estimate (for the upper bound) and subtracting the margin of error from the estimate (for the lower bound). All published ACS margins of error are based on a 90-percent confidence level.
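
    The bound construction described above is a one-line calculation. The sketch below uses a made-up estimate and MOE; the function name `confidence_bounds` is hypothetical.

```python
def confidence_bounds(estimate, margin_of_error):
    """Return (lower, upper) bounds: estimate minus MOE and estimate plus MOE.

    With published ACS margins of error, these are 90-percent bounds.
    """
    if margin_of_error < 0:
        raise ValueError("margin of error must be non-negative")
    return estimate - margin_of_error, estimate + margin_of_error


# Made-up values for illustration: an estimate of 1,500 uninsured people
# with a margin of error of 200.
lower, upper = confidence_bounds(1500, 200)
print(lower, upper)
```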

    While an ACS 1-year estimate includes information collected over a 12-month period, an ACS 5-year estimate includes data collected over a 60-month period. In the case of ACS 1-year estimates, the period is the calendar year (e.g., the 2015 ACS covers the period from January 2015 through December 2015).

  8. Uninsured Population Census Data 5-year estimates for release years...

    • data.pa.gov
    Updated Aug 21, 2020
    Cite
    Pennsylvania Department of Human Services (DHS) (2020). Uninsured Population Census Data 5-year estimates for release years 2017-Current County Human Services and Insurance [Dataset]. https://data.pa.gov/widgets/neqb-cw4e?mobile_redirect=true
    Explore at:
    Available download formats: xlsx, kmz, xml, kml, csv, application/geo+json
    Dataset updated
    Aug 21, 2020
    Dataset authored and provided by
    Pennsylvania Department of Human Services (DHS)
    License

    U.S. Government Works (https://www.usa.gov/government-works)
    License information was derived automatically

    Description

    The American Community Survey (ACS) helps local officials, community leaders, and businesses understand the changes taking place in their communities. It is the premier source for detailed population and housing information about our nation. This dataset provides estimates by county for Health Insurance Coverage and is summarized from summary table S2701: SELECTED CHARACTERISTICS OF HEALTH INSURANCE COVERAGE IN THE UNITED STATES. The 5-year estimates are used to provide detail on every county in Pennsylvania and include breakouts by Age, Gender, Race, Ethnicity, Household Income, and the Ratio of Income to Poverty.

    A blank cell within the dataset indicates that either no sample observations or too few sample observations were available to compute the statistic for that area.

    Margin of error (MOE). Some ACS products provide an MOE instead of confidence intervals. An MOE is the difference between an estimate and its upper or lower confidence bounds. Confidence bounds can be created by adding the margin of error to the estimate (for the upper bound) and subtracting the margin of error from the estimate (for the lower bound). All published ACS margins of error are based on a 90-percent confidence level.

    While an ACS 1-year estimate includes information collected over a 12-month period, an ACS 5-year estimate includes data collected over a 60-month period. In the case of ACS 1-year estimates, the period is the calendar year (e.g., the 2015 ACS covers the period from January 2015 through December 2015).

    In the case of ACS multiyear estimates, the period is 5 calendar years (e.g., the 2011–2015 ACS estimates cover the period from January 2011 through December 2015). Therefore, ACS estimates based on data collected from 2011–2015 should not be labeled “2013,” even though that is the midpoint of the 5-year period.

    Multiyear estimates should be labeled to indicate clearly the full period of time (e.g., “The child poverty rate in 2011–2015 was X percent.”). They do not describe any specific day, month, or year within that time period.
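
    The labeling rule above (name the full period, never the midpoint) can be expressed as a small helper. This is only a sketch; the function name `acs_period_label` and its parameters are hypothetical.

```python
def acs_period_label(end_year, span_years=5):
    """Label an ACS estimate by its full collection period.

    A 1-year estimate is labeled with its calendar year; a multiyear
    estimate is labeled with the first and last year of the period.
    """
    if span_years < 1:
        raise ValueError("span_years must be at least 1")
    if span_years == 1:
        return str(end_year)
    start_year = end_year - span_years + 1
    return f"{start_year}–{end_year}"


print(acs_period_label(2015))                # 5-year estimate ending in 2015
print(acs_period_label(2015, span_years=1))  # 1-year estimate for 2015
```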

  9. Where are the Uninsured?

    • data.amerigeoss.org
    esri rest, html
    Updated Jul 22, 2020
    Cite
    ESRI (2020). Where are the Uninsured? [Dataset]. https://data.amerigeoss.org/sk/dataset/where-are-the-uninsured
    Explore at:
    Available download formats: html, esri rest
    Dataset updated
    Jul 22, 2020
    Dataset provided by
    Esri (http://esri.com/)
    Description

    Local, state, tribal, and federal agencies use health insurance coverage data to plan government programs, determine eligibility criteria, and encourage eligible people to participate in health insurance programs. This map shows where those with no health insurance live. Map opens in Houston, TX. Use the bookmarks or search to see other cities. Zoom out to see map render data for counties and states.


    Size of symbol depicts the count of those who are uninsured, color depicts the percent of those who are uninsured. Pop-up displays percentage by age group.

    This map uses these hosted feature layers containing the most recent American Community Survey data. These layers are part of the ArcGIS Living Atlas, and are updated every year when the American Community Survey releases new estimates, so values in the map always reflect the newest data available.

  10. 2024 American Community Survey: B27001 | Health Insurance Coverage Status by...

    • data.census.gov
    Cite
    ACS, 2024 American Community Survey: B27001 | Health Insurance Coverage Status by Sex by Age (ACS 1-Year Estimates Detailed Tables) [Dataset]. https://data.census.gov/table/ACSDT1Y2024.B27001?q=Health+Insurance&g=040XX00US21
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    ACS
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    2024
    Description

    Key Table Information

    • Table Title: Health Insurance Coverage Status by Sex by Age
    • Table ID: ACSDT1Y2024.B27001
    • Survey/Program: American Community Survey
    • Year: 2024
    • Dataset: ACS 1-Year Estimates Detailed Tables
    • Source: U.S. Census Bureau, 2024 American Community Survey, 1-Year Estimates
    • Dataset Universe: The dataset universe of the American Community Survey (ACS) is the U.S. resident population and housing. For more information about ACS residence rules, see the ACS Design and Methodology Report. Note that each table describes the specific universe of interest for that set of estimates.

    Methodology

    • Unit(s) of Observation: American Community Survey (ACS) data are collected from individuals living in housing units and group quarters, and about housing units whether occupied or vacant. For more information about ACS sampling and data collection, see the ACS Design and Methodology Report.
    • Geography Coverage: ACS data generally reflect the geographic boundaries of legal and statistical areas as of January 1 of the estimate year. For more information, see Geography Boundaries by Year. Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on 2020 Census data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
    • Sampling: The ACS consists of two separate samples: housing unit addresses and group quarters facilities. Independent housing unit address samples are selected for each county or county-equivalent in the U.S. and Puerto Rico, with sampling rates depending on a measure of size for the area. For more information on sampling in the ACS, see the Accuracy of the Data document.
    • Confidentiality: The Census Bureau has modified or suppressed some estimates in ACS data products to protect respondents' confidentiality. Title 13 United States Code, Section 9, prohibits the Census Bureau from publishing results in which an individual's data can be identified. For more information on confidentiality protection in the ACS, see the Accuracy of the Data document.
    • Technical Documentation/Methodology: Information about the American Community Survey (ACS) can be found on the ACS website. Supporting documentation including code lists, subject definitions, data accuracy, and statistical testing, and a full list of ACS tables and table shells (without estimates) can be found on the Technical Documentation section of the ACS website. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

      Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.

      Users must consider potential differences in geographic boundaries, questionnaire content or coding, or other methodological issues when comparing ACS data from different years. Statistically significant differences shown in ACS Comparison Profiles, or in data users' own analysis, may be the result of these differences and thus might not necessarily reflect changes to the social, economic, housing, or demographic characteristics being compared. For more information, see Comparing ACS Data.
    • Weights: ACS estimates are obtained from a raking ratio estimation procedure that results in the assignment of two sets of weights: a weight to each sample person record and a weight to each sample housing unit record. Estimates of person characteristics are based on the person weight. Estimates of family, household, and housing unit characteristics are based on the housing unit weight. For any given geographic area, a characteristic total is estimated by summing the weights assigned to the persons, households, families or housing units possessing the characteristic in the geographic area. For more information on weighting and estimation in the ACS, see the Accuracy of the Data document.

      Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, the decennial census is the official source of population totals for April 1st of each decennial year. In between censuses, the Census Bureau's Population Estimates Program produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns...
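
    The weighting description above says a characteristic total is the sum of the weights of the sample units possessing that characteristic. The sketch below illustrates that step only (not the raking procedure that produces the weights); the record layout, field names, and values are made up.

```python
def weighted_total(records, has_characteristic):
    """Sum the weights of the records that possess the characteristic."""
    return sum(r["weight"] for r in records if has_characteristic(r))


# Made-up sample person records; "weight" stands in for the person weight.
sample_persons = [
    {"insured": False, "weight": 110},
    {"insured": True, "weight": 95},
    {"insured": False, "weight": 130},
    {"insured": True, "weight": 120},
]

uninsured_total = weighted_total(sample_persons, lambda r: not r["insured"])
population_total = weighted_total(sample_persons, lambda r: True)
print(uninsured_total, population_total)
```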

  11. Medicaid and CHIP enrollees who received mental health or SUD services

    • catalog.data.gov
    • healthdata.gov
    • +1more
    Updated Jul 11, 2025
    Cite
    Centers for Medicare & Medicaid Services (2025). Medicaid and CHIP enrollees who received mental health or SUD services [Dataset]. https://catalog.data.gov/dataset/medicaid-and-chip-enrollees-who-received-mental-health-or-sud-services
    Dataset updated
    Jul 11, 2025
    Dataset provided by
    Centers for Medicare & Medicaid Services
    Description

    This data set includes annual counts and percentages of Medicaid and Children’s Health Insurance Program (CHIP) enrollees who received mental health (MH) or substance use disorder (SUD) services, overall and by six subpopulation topics: age group, sex or gender identity, race and ethnicity, urban or rural residence, eligibility category, and primary language. These results were generated using Transformed Medicaid Statistical Information System (T-MSIS) Analytic Files (TAF) Release 1 data and the Race/Ethnicity Imputation Companion File. This data set includes Medicaid and CHIP enrollees in all 50 states, the District of Columbia, Puerto Rico, and the U.S. Virgin Islands, ages 12 to 64 at the end of the calendar year, who were not dually eligible for Medicare and were continuously enrolled with comprehensive benefits for 12 months, with no more than one gap in enrollment exceeding 45 days. Enrollees who received services for both an MH condition and SUD in the year are counted toward both condition categories. Enrollees in Guam, American Samoa, the Northern Mariana Islands, and select states with TAF data quality issues are not included. Results shown for the race and ethnicity subpopulation topic exclude enrollees in the U.S. Virgin Islands. Results shown for the primary language subpopulation topic exclude select states with data quality issues with the primary language variable in TAF. Some rows in the data set have a value of "DS," which indicates that data were suppressed according to the Centers for Medicare & Medicaid Services’ Cell Suppression Policy for values between 1 and 10. This data set is based on the brief: "Medicaid and CHIP enrollees who received mental health or SUD services in 2020." Enrollees are assigned to an age group subpopulation using age as of December 31st of the calendar year. Enrollees are assigned to a sex or gender identity subpopulation using their latest reported sex in the calendar year. 
Enrollees are assigned to a race and ethnicity subpopulation using the state-reported race and ethnicity information in TAF when it is available and of good quality; if it is missing or unreliable, race and ethnicity is indirectly estimated using an enhanced version of Bayesian Improved Surname Geocoding (BISG) (Race and ethnicity of the national Medicaid and CHIP population in 2020). Enrollees are assigned to an urban or rural subpopulation based on the 2010 Rural-Urban Commuting Area (RUCA) code associated with their home or mailing address ZIP code in TAF (Rural Medicaid and CHIP enrollees in 2020). Enrollees are assigned to an eligibility category subpopulation using their latest reported eligibility group code, CHIP code, and age in the calendar year. Enrollees are assigned to a primary language subpopulation based on their reported ISO language code in TAF (English/missing, Spanish, and all other language codes) (Primary Language). Please refer to the full brief for additional context about the methodology and detailed findings. Future updates to this data set will include more recent data years as the TAF data become available.
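A minimal sketch of how the "DS" suppression marker described above might be handled when loading the counts. The column names (`subpopulation`, `enrollee_count`) and the sample rows are hypothetical and not taken from the CMS file; the only detail drawn from the description is that "DS" stands in for suppressed values between 1 and 10.

```python
import csv
import io

# Hypothetical sample rows; the real CMS file's column names may differ.
SAMPLE = """subpopulation,enrollee_count
Age 12-17,1520
Age 18-24,DS
Age 25-34,"2,310"
"""


def parse_count(raw):
    """Return an int, or None when the cell is suppressed ("DS")."""
    value = raw.strip()
    if value == "DS":
        # Suppressed: the true value is unknown (between 1 and 10), not zero.
        return None
    return int(value.replace(",", ""))


def load_counts(text):
    """Map each subpopulation label to its parsed count."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["subpopulation"]: parse_count(row["enrollee_count"]) for row in reader}


counts = load_counts(SAMPLE)
suppressed = [name for name, value in counts.items() if value is None]
known_total = sum(value for value in counts.values() if value is not None)
```

Keeping suppressed cells as `None` rather than 0 avoids understating totals silently; any sum over such a column is a lower bound.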

  12. Supplementary Material for: Survival Disparities in Multiple Myeloma by...

    • karger.figshare.com
    • datasetcatalog.nlm.nih.gov
    tiff
    Updated May 31, 2023
    Cite
    Huang C.; Liu H.; Jia L.; Lu M.; Hu S. (2023). Supplementary Material for: Survival Disparities in Multiple Myeloma by Health Insurance Status among US Non-Elderly Adults: A SEER-Based Comparative Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.14338931.v1
    Available download formats: tiff
    Dataset updated
    May 31, 2023
    Dataset provided by
    Karger Publishers
    Authors
    Huang C.; Liu H.; Jia L.; Lu M.; Hu S.
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background/Aim: The impacts of health insurance status on survival outcomes in multiple myeloma (MM) have not been addressed in depth. The present study was conducted to identify definite relationships of cancer-specific survival (CSS) and overall survival (OS) with health insurance status in MM patients.

    Methods: MM patients aged 18–64 years with complete insurance records between January 1, 2007, and December 31, 2016, were identified from 18 Surveillance, Epidemiology, and End Results (SEER) Database registries. Health insurance status was categorized as uninsured, any Medicaid, insured, and insured (no specifics). Relationships of health insurance status with OS/CSS were assessed through Kaplan-Meier analysis and uni-/multivariate Cox regressions, reported as hazard ratios with 95% confidence intervals. Potential baseline confounding was adjusted using multiple propensity score (mPS).

    Results: A total of 17,981 patients were included, of whom 68.3% had private insurance and only 4.9% were uninsured. The log-rank test showed a significant association between health insurance status and OS/CSS among MM patients. Compared with privately insured patients, uninsured patients and those with Medicaid coverage tended to have poorer OS/CSS both in multivariate Cox regression and in the mPS-adjusted model (uninsured vs. private insurance [OS/CSS]: 1.33 [1.20–1.48]/1.13 [1.00–1.28] and 1.45 [1.25–1.69]/1.18 [1.04–1.33], respectively; Medicaid coverage vs. private insurance [OS/CSS]: 1.67 [1.56–1.78]/1.25 [1.16–1.36] and 1.76 [1.62–1.90]/1.23 [1.13–1.35], respectively).

    Conclusions: Our observational study of exposure-outcome associations suggests that insufficient or no insurance is moderately linked with OS among MM patients aged 18–64 years. Wide insurance coverage and health-care availability may strengthen some disparate outcomes. In the future, prospective cohort research is needed to further clarify concrete risks with insurance type, owing to the lack of definite division of insurance data in SEER.

  13. 2024 American Community Survey: C27001A | Health Insurance Coverage Status...

    • data.census.gov
    + more versions
    Cite
    ACS, 2024 American Community Survey: C27001A | Health Insurance Coverage Status by Age (White Alone) (ACS 1-Year Estimates Detailed Tables) [Dataset]. https://data.census.gov/table/ACSDT1Y2024.C27001A?q=Table+C27001A-I
    Dataset provided by
    United States Census Bureau: http://census.gov/
    Authors
    ACS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2024
    Description

    Key Table Information
    Table Title: Health Insurance Coverage Status by Age (White Alone)
    Table ID: ACSDT1Y2024.C27001A
    Survey/Program: American Community Survey
    Year: 2024
    Dataset: ACS 1-Year Estimates Detailed Tables
    Source: U.S. Census Bureau, 2024 American Community Survey, 1-Year Estimates
    Dataset Universe: The dataset universe of the American Community Survey (ACS) is the U.S. resident population and housing. For more information about ACS residence rules, see the ACS Design and Methodology Report. Note that each table describes the specific universe of interest for that set of estimates.

    Methodology
    Unit(s) of Observation: American Community Survey (ACS) data are collected from individuals living in housing units and group quarters, and about housing units whether occupied or vacant. For more information about ACS sampling and data collection, see the ACS Design and Methodology Report.

    Geography Coverage: ACS data generally reflect the geographic boundaries of legal and statistical areas as of January 1 of the estimate year. For more information, see Geography Boundaries by Year. Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on 2020 Census data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

    Sampling: The ACS consists of two separate samples: housing unit addresses and group quarters facilities. Independent housing unit address samples are selected for each county or county-equivalent in the U.S. and Puerto Rico, with sampling rates depending on a measure of size for the area. For more information on sampling in the ACS, see the Accuracy of the Data document.

    Confidentiality: The Census Bureau has modified or suppressed some estimates in ACS data products to protect respondents' confidentiality. Title 13 United States Code, Section 9, prohibits the Census Bureau from publishing results in which an individual's data can be identified. For more information on confidentiality protection in the ACS, see the Accuracy of the Data document.

    Technical Documentation/Methodology: Information about the American Community Survey (ACS) can be found on the ACS website. Supporting documentation including code lists, subject definitions, data accuracy, and statistical testing, and a full list of ACS tables and table shells (without estimates) can be found on the Technical Documentation section of the ACS website. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.

    Users must consider potential differences in geographic boundaries, questionnaire content or coding, or other methodological issues when comparing ACS data from different years. Statistically significant differences shown in ACS Comparison Profiles, or in data users' own analysis, may be the result of these differences and thus might not necessarily reflect changes to the social, economic, housing, or demographic characteristics being compared. For more information, see Comparing ACS Data.

    Weights: ACS estimates are obtained from a raking ratio estimation procedure that results in the assignment of two sets of weights: a weight to each sample person record and a weight to each sample housing unit record. Estimates of person characteristics are based on the person weight. Estimates of family, household, and housing unit characteristics are based on the housing unit weight. For any given geographic area, a characteristic total is estimated by summing the weights assigned to the persons, households, families or housing units possessing the characteristic in the geographic area. For more information on weighting and estimation in the ACS, see the Accuracy of the Data document.

    Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, the decennial census is the official source of population totals for April 1st of each decennial year. In between censuses, the Census Bureau's Population Estimates Program produces and disseminates the official estimates of the population for the nation, states, counties, cities, a...
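The margin-of-error interpretation and the weighting rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not Census Bureau code: the record layout (`weight`, `insured`) and the sample values are hypothetical, and the 1.645 factor is the standard z-score for a 90 percent confidence level.

```python
# z-score for a 90 percent confidence level, matching the 90 percent MOE in the text.
Z_90 = 1.645


def confidence_bounds(estimate, moe):
    """Lower and upper bounds: estimate minus and plus the margin of error."""
    return estimate - moe, estimate + moe


def standard_error(moe, z=Z_90):
    """Convert a 90 percent margin of error to a standard error."""
    return moe / z


def weighted_total(records, has_characteristic):
    """Estimate a total by summing the weights of records with the characteristic."""
    return sum(r["weight"] for r in records if has_characteristic(r))


# Hypothetical person records; real ACS microdata uses different field names.
people = [
    {"weight": 120, "insured": True},
    {"weight": 95, "insured": False},
    {"weight": 140, "insured": True},
]

uninsured_total = weighted_total(people, lambda r: not r["insured"])
lower, upper = confidence_bounds(1000, 50)
```

Each sampled person stands in for `weight` people in the population, which is why the total is a sum of weights rather than a count of rows.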

  14. ACS Health Insurance Coverage Variables - Boundaries

    • data-isdh.opendata.arcgis.com
    • vaccine-confidence-program-cdcvax.hub.arcgis.com
    • +4more
    Updated Dec 7, 2018
    + more versions
    Cite
    Esri (2018). ACS Health Insurance Coverage Variables - Boundaries [Dataset]. https://data-isdh.opendata.arcgis.com/maps/a1574f4bb84f4da78b60fa0c8616eaa1
    Dataset updated
    Dec 7, 2018
    Dataset authored and provided by
    Esri: http://esri.com/
    Description

    This layer shows health insurance coverage by type and by age group. This is shown by tract, county, and state boundaries. This service is updated annually to contain the most currently released American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the percent uninsured. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.

    Current Vintage: 2019-2023
    ACS Table(s): B27010 (Not all lines of this ACS table are available in this feature layer.)
    Data downloaded from: Census Bureau's API for American Community Survey
    Date of API call: December 12, 2024
    National Figures: data.census.gov

    The United States Census Bureau's American Community Survey (ACS):
    • About the Survey
    • Geography & ACS
    • Technical Documentation
    • News & Updates

    This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.

    Data Note from the Census:
    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

    Data Processing Notes:
    • This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Click here to learn more about ACS data releases.
    • Boundaries come from the US Census TIGER geodatabases, specifically, the National Sub-State Geography Database (named tlgdb_(year)_a_us_substategeo.gdb). Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines erased for cartographic and mapping purposes. For census tracts, the water cutouts are derived from a subset of the 2020 Areal Hydrography boundaries offered by TIGER. Water bodies and rivers which are 50 million square meters or larger (mid to large sized water bodies) are erased from the tract level boundaries, as well as additional important features. For state and county boundaries, the water and coastlines are derived from the coastlines of the 2023 500k TIGER Cartographic Boundary Shapefiles. These are erased to more accurately portray the coastlines and Great Lakes. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).
    • The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico
    • Census tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).
    • Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
    • Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page.
    • Negative values (e.g., -4444...) have been set to null, with the exception of -5555... which has been set to zero. These negative values exist in the raw API data to indicate the following situations:
        • The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
        • Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
        • The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.
        • The estimate is controlled. A statistical test for sampling variability is not appropriate.
        • The data for this geographic area cannot be displayed because the number of sample cases is too small.
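The description says the "_calc_" percentages and their margins of error follow ACS specifications but does not spell them out. As a hedged sketch, the block below uses the commonly cited Census Bureau approximation for the margin of error of a derived proportion, falling back to the ratio form when the term under the square root would be negative. The function name and sample numbers are hypothetical, and whether this layer uses exactly this formula is an assumption.

```python
import math


def percent_with_moe(num, num_moe, den, den_moe):
    """Return (percent, percent MOE) for num/den, or (None, None) if den is 0.

    Assumed formula (proportion approximation):
        MOE_p = sqrt(MOE_num^2 - p^2 * MOE_den^2) / den
    If the radicand is negative, the ratio form (plus sign) is used instead.
    """
    if den == 0:
        return None, None
    p = num / den
    radicand = num_moe ** 2 - (p ** 2) * (den_moe ** 2)
    if radicand < 0:
        radicand = num_moe ** 2 + (p ** 2) * (den_moe ** 2)
    moe = math.sqrt(radicand) / den
    return 100 * p, 100 * moe


# Hypothetical tract: 200 uninsured (MOE 30) out of 1,000 people (MOE 40).
pct, pct_moe = percent_with_moe(200, 30, 1000, 40)
```

The derived margin of error inherits the 90 percent confidence level of its inputs, so it can be compared directly with the published margins.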

  15. Health Insurance Coverage 2018-2022 - COUNTIES

    • hub.arcgis.com
    • mce-data-uscensus.hub.arcgis.com
    Updated Feb 4, 2024
    Cite
    US Census Bureau (2024). Health Insurance Coverage 2018-2022 - COUNTIES [Dataset]. https://hub.arcgis.com/maps/595b3ef2fd6b4731aace199b6999bf1c
    Dataset updated
    Feb 4, 2024
    Dataset provided by
    United States Census Bureau: http://census.gov/
    Authors
    US Census Bureau
    Description

    This layer shows Health Insurance Coverage. This is shown by state and county boundaries. This service contains the 2018-2022 release of data from the American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show Percent of Population with No Health Insurance Coverage. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.

    Current Vintage: 2018-2022
    ACS Table(s): B27010, DP03
    Data downloaded from: Census Bureau's API for American Community Survey
    Date of API call: January 18, 2024
    National Figures: data.census.gov

    The United States Census Bureau's American Community Survey (ACS):
    • About the Survey
    • Geography & ACS
    • Technical Documentation
    • News & Updates

    This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.

    Data Note from the Census:
    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

    Data Processing Notes:
    • Boundaries come from the Cartographic Boundaries via US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates, and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).
    • The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico. The Counties (and equivalent) layer contains 3221 records - all counties and equivalent, Washington D.C., and Puerto Rico municipios. See Areas Published.
    • Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
    • Field alias names were created based on the Table Shells.
    • Margin of error (MOE) values of -555555555 in the API (or "*****" (five asterisks) on data.census.gov) are displayed as 0 in this dataset. The estimates associated with these MOEs have been controlled to independent counts in the ACS weighting and have zero sampling error. So, the MOEs are effectively zeroes, and are treated as zeroes in MOE calculations.
    • Other negative values on the API, such as -222222222, -666666666, -888888888, and -999999999, all represent estimates or MOEs that can't be calculated or can't be published, usually due to small sample sizes. All of these are rendered in this dataset as null (blank) values.
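The sentinel handling described in the processing notes can be sketched as follows. The constant and function names are hypothetical; the sentinel values and their treatment (zero for -555555555, null for the others listed) come from the description above.

```python
# Sentinel for a controlled estimate: the MOE is treated as zero.
ZERO_MOE_SENTINEL = -555555555

# Sentinels named in the description for values that cannot be calculated or published.
NULL_SENTINELS = {-222222222, -666666666, -888888888, -999999999}


def clean_api_value(value):
    """Map raw Census API sentinels to 0 or None; pass other values through."""
    if value == ZERO_MOE_SENTINEL:
        return 0
    if value in NULL_SENTINELS:
        return None
    return value


# Hypothetical raw MOE column as it might arrive from the API.
raw_moes = [125, -555555555, -666666666, 48, -222222222]
cleaned = [clean_api_value(v) for v in raw_moes]
```

Only the sentinels the description names are mapped here; any other negative value is passed through unchanged so that it stays visible for inspection.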

  16. Current Population Survey (CPS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the current population survey (cps) annual social and economic supplement (asec) with r

    the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

    despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

    the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

    this new github repository contains three scripts:

    2005-2012 asec - download all microdata.R
    • download the fixed-width file containing household, family, and person records
    • import by separating this file into three tables, then merge 'em together at the person-level
    • download the fixed-width file containing the person-level replicate weights
    • merge the rectangular person-level file with the replicate weights, then store it in a sql database
    • create a new variable - one - in the data table

    2012 asec - analysis examples.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • perform a boatload of analysis examples

    replicate census estimates - 2011.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • match the sas output shown in the png file below

    2011 asec replicate weight sas output.png
    • statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

    click here to view these three scripts

    for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
    • the census bureau's current population survey page
    • the bureau of labor statistics' current population survey page
    • the current population survey's wikipedia article

    notes:

    interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

    as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

    confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
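The "rectangular file" idea and the subtract-a-year rule above can be illustrated with a short sketch. The author's scripts are written in R; this is a Python illustration of the same merge, not a translation of them, and every field name here (`household_id`, `family_id`, `person_id`, and the sample values) is hypothetical.

```python
def rectangularize(households, families, persons):
    """Attach household- and family-level fields to each person record."""
    hh_by_id = {h["household_id"]: h for h in households}
    fam_by_id = {(f["household_id"], f["family_id"]): f for f in families}
    rows = []
    for person in persons:
        row = {}
        row.update(hh_by_id[person["household_id"]])
        row.update(fam_by_id[(person["household_id"], person["family_id"])])
        row.update(person)  # person-level fields win on any name clash
        row["one"] = 1      # constant column, handy for counting records
        rows.append(row)
    return rows


def reference_year(file_year):
    """Income and coverage questions refer to the year before the file's label."""
    return file_year - 1


# Hypothetical miniature tables standing in for the three record types.
households = [{"household_id": 1, "state": "WA"}]
families = [{"household_id": 1, "family_id": 1, "family_income": 52000}]
persons = [
    {"household_id": 1, "family_id": 1, "person_id": 1, "age": 34},
    {"household_id": 1, "family_id": 1, "person_id": 2, "age": 8},
]

rows = rectangularize(households, families, persons)
```

The result has one row per person, so person-level weights can be applied directly without any further joins.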

  17. Socioeconomic Demographics

    • data.dumfriesva.gov
    • data.virginia.gov
    • +1more
    application/rdfxml +5
    Updated Jan 12, 2022
    + more versions
    Cite
    U.S. Census (2022). Socioeconomic Demographics [Dataset]. https://data.dumfriesva.gov/Government/Socioeconomic-Demographics/cgre-23vp
    Available download formats: csv, application/rssxml, application/rdfxml, xml, json, tsv
    Dataset updated
    Jan 12, 2022
    Dataset authored and provided by
    U.S. Census
    Description

    This data set includes socioeconomic factors within the Town of Dumfries, such as people in the labor force and people without health insurance, among others. This information comes from the most recent U.S. Census provided by the United States Census Bureau. Data will be updated in line with the U.S. Census schedule. https://data.census.gov/cedsci/profile?g=1600000US5123760

  18. Disability and Health Insurance - Seattle Neighborhoods

    • data.seattle.gov
    • catalog.data.gov
    • +1more
    csv, xlsx, xml
    Updated Oct 22, 2024
    + more versions
    Cite
    (2024). Disability and Health Insurance - Seattle Neighborhoods [Dataset]. https://data.seattle.gov/dataset/Disability-and-Health-Insurance-Seattle-Neighborho/nxn5-xp4j
    Available download formats: xml, csv, xlsx
    Dataset updated
    Oct 22, 2024
    Area covered
    Seattle
    Description

    Table from the American Community Survey (ACS) 5-year series on disabilities and health insurance related topics for City of Seattle Council Districts, Comprehensive Plan Growth Areas and Community Reporting Areas. Table includes C21007 Age by Veteran Status by Poverty Status in the Past 12 Months by Disability Status, B27010 Types of Health Insurance Coverage by Age, B22010 Receipt of Food Stamps/SNAP by Disability Status for Households. Data is pulled from block group tables for the most recent ACS vintage and summarized to the neighborhoods based on block group assignment.
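A minimal sketch of summarizing block-group values to neighborhoods by assignment, as the paragraph above describes. The GEOIDs, neighborhood names, and field names are hypothetical. Combining margins of error as the square root of the sum of squares is the usual approximation for a sum of ACS estimates; whether the Seattle table uses it is an assumption.

```python
import math
from collections import defaultdict


def summarize_to_neighborhoods(block_groups, assignment):
    """Return {neighborhood: (summed estimate, approximate MOE of the sum)}."""
    totals = defaultdict(float)
    squared_moes = defaultdict(float)
    for bg in block_groups:
        hood = assignment[bg["geoid"]]
        totals[hood] += bg["estimate"]
        squared_moes[hood] += bg["moe"] ** 2
    return {hood: (totals[hood], math.sqrt(squared_moes[hood])) for hood in totals}


# Hypothetical block groups and their neighborhood assignment.
block_groups = [
    {"geoid": "bg-001", "estimate": 10, "moe": 3},
    {"geoid": "bg-002", "estimate": 20, "moe": 4},
    {"geoid": "bg-003", "estimate": 7, "moe": 2},
]
assignment = {"bg-001": "Area A", "bg-002": "Area A", "bg-003": "Area B"}

by_neighborhood = summarize_to_neighborhoods(block_groups, assignment)
```

Because each block group maps to exactly one neighborhood in this sketch, no estimate is split or double counted.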


    Table created for and used in the Neighborhood Profiles application.

    Vintages: 2023
    ACS Table(s): C21007, B27010, B22010


    The United States Census Bureau's American Community Survey (ACS):
    This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.

    Data Note from the Census:
    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

    Data Processing Notes:
    • Boundaries come from the US Census TIGER geodatabases, specifically, the National Sub-State Geography Database (named tlgdb_(year)_a_us_substategeo.gdb). Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines erased for cartographic and mapping purposes. For census tracts, the water cutouts are derived from a subset of the 2020 Areal Hydrography boundaries offered by TIGER. Water bodies and rivers which are 50 million square meters or larger (mid to large sized water bodies) are erased from the tract level boundaries, as well as additional important features. For state and county boundaries, the water and coastlines are derived from the coastlines of the 2020 500k TIGER Cartographic Boundary Shapefiles. These are erased to more accurately portray the coastlines and Great Lakes. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).
    • The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico
    • Census tracts with no population that occur in areas of water, such as oceans, are removed from this data

  19. Data from: Comparative effectiveness of generic and brand-name medication...

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    • +1more
    Updated Mar 13, 2019
    Cite
    Desai, Rishi J.; Dejene, Sara; Raofi, Saeid; Fischer, Michael A.; Connolly, John G.; Kesselheim, Aaron S.; Khan, Nazleen F.; Gagne, Joshua J.; Rogers, James R.; Sarpatwari, Ameet; Lii, Joyce; Dutcher, Sarah K.; Bohn, Justin (2019). Comparative effectiveness of generic and brand-name medication use: A database study of US health insurance claims [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000094887
    Dataset updated
    Mar 13, 2019
    Authors
    Desai, Rishi J.; Dejene, Sara; Raofi, Saeid; Fischer, Michael A.; Connolly, John G.; Kesselheim, Aaron S.; Khan, Nazleen F.; Gagne, Joshua J.; Rogers, James R.; Sarpatwari, Ameet; Lii, Joyce; Dutcher, Sarah K.; Bohn, Justin
    Description

    Background: To the extent that outcomes are mediated through negative perceptions of generics (the nocebo effect), observational studies comparing brand-name and generic drugs are susceptible to bias favoring the brand-name drugs. We used authorized generic (AG) products, which are identical in composition and appearance to brand-name products but are marketed as generics, as a control group to address this bias in an evaluation aiming to compare the effectiveness of generic versus brand medications.

    Methods and findings: For commercial health insurance enrollees from the US, administrative claims data were derived from 2 databases: (1) Optum Clinformatics Data Mart (years: 2004–2013) and (2) Truven MarketScan (years: 2003–2015). For a total of 8 drug products, the following groups were compared using a cohort study design: (1) patients switching from brand-name products to AGs versus generics, and patients initiating treatment with AGs versus generics, where AG use proxied brand-name use, addressing negative perception bias, and (2) patients initiating generic versus brand-name products (bias-prone direct comparison) and patients initiating AG versus brand-name products (negative control). Using Cox proportional hazards regression after 1:1 propensity-score matching, we compared a composite cardiovascular endpoint (for amlodipine, amlodipine-benazepril, and quinapril), non-vertebral fracture (for alendronate and calcitonin), psychiatric hospitalization rate (for sertraline and escitalopram), and insulin initiation (for glipizide) between the groups. Inverse variance meta-analytic methods were used to pool adjusted hazard ratios (HRs) for each comparison between the 2 databases. Across 8 products, 2,264,774 matched pairs of patients were included in the comparisons of AGs versus generics. A majority (12 out of 16) of the clinical endpoint estimates showed similar outcomes between AGs and generics. Among the other 4 estimates that did have significantly different outcomes, 3 suggested improved outcomes with generics and 1 favored AGs (patients switching from amlodipine brand-name: HR [95% CI] 0.92 [0.88–0.97]). The comparison between generic and brand-name initiators involved 1,313,161 matched pairs, and no differences in outcomes were noted for alendronate, calcitonin, glipizide, or quinapril. We observed a lower risk of the composite cardiovascular endpoint with generics versus brand-name products for amlodipine and amlodipine-benazepril (HR [95% CI]: 0.91 [0.84–0.99] and 0.84 [0.76–0.94], respectively). For escitalopram and sertraline, we observed higher rates of psychiatric hospitalizations with generics (HR [95% CI]: 1.05 [1.01–1.10] and 1.07 [1.01–1.14], respectively). The negative control comparisons also indicated potentially higher rates of similar magnitude with AG compared to brand-name initiation for escitalopram and sertraline (HR [95% CI]: 1.06 [0.98–1.13] and 1.11 [1.05–1.18], respectively), suggesting that the differences observed between brand and generic users in these outcomes are likely explained by either residual confounding or generic perception bias. Limitations of this study include potential residual confounding due to the unavailability of certain clinical parameters in administrative claims data and the inability to evaluate surrogate outcomes, such as immediate changes in blood pressure, upon switching from brand products to generics.

    Conclusions: In this study, we observed that use of generics was associated with comparable clinical outcomes to use of brand-name products. These results could help in promoting educational interventions aimed at increasing patient and provider confidence in the ability of generic medicines to manage chronic diseases.
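The abstract states that inverse variance methods were used to pool adjusted hazard ratios across the 2 databases, but it does not say whether a fixed-effect or random-effects model was applied. The sketch below assumes the simpler fixed-effect form: each log(HR) is weighted by the inverse of its variance, and the standard error is recovered from the reported 95% CI. The function names and the per-database inputs are hypothetical and chosen only for illustration; they are not values from the study.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def se_from_ci(lower, upper):
    """Standard error of log(HR), recovered from a 95% CI given on the HR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def pool_fixed_effect(estimates):
    """Pool (hr, ci_lower, ci_upper) tuples with inverse-variance weights.

    Returns the pooled (hr, ci_lower, ci_upper). Fixed-effect pooling is an
    assumption here; the source does not specify the model.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for hr, lower, upper in estimates:
        weight = 1.0 / se_from_ci(lower, upper) ** 2  # weight = 1 / variance
        total_weight += weight
        weighted_sum += weight * math.log(hr)
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - Z_95 * pooled_se),
        math.exp(pooled_log + Z_95 * pooled_se),
    )


# Hypothetical per-database estimates (NOT taken from the study).
database_a = (0.90, 0.82, 0.99)
database_b = (0.94, 0.88, 1.00)

hr, lower, upper = pool_fixed_effect([database_a, database_b])
print(f"pooled HR {hr:.2f} [{lower:.2f}-{upper:.2f}]")
```

The pooled estimate always lies between the individual estimates, and its interval is narrower than either input because the weights add.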

  20. Cervical Cancer Risk Classification - Dataset - CKAN

    • data.poltekkes-smg.ac.id
    Updated Oct 7, 2024
    Cite
    (2024). Cervical Cancer Risk Classification - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/cervical-cancer-risk-classification
    Dataset updated
    Oct 7, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cervical Cancer Risk Factors for Biopsy: This dataset was obtained from the UCI Repository, which is gratefully acknowledged. This file contains a list of risk factors for cervical cancer leading to a biopsy examination. About 11,000 new cases of invasive cervical cancer are diagnosed each year in the U.S. However, the number of new cervical cancer cases has been declining steadily over the past decades. Although it is the most preventable type of cancer, each year cervical cancer kills about 4,000 women in the U.S. and about 300,000 women worldwide. In the United States, cervical cancer mortality rates plunged by 74% from 1955 - 1992 thanks to increased screening and early detection with the Pap test.

    AGE: Fifty percent of cervical cancer diagnoses occur in women ages 35 - 54, and about 20% occur in women over 65 years of age. The median age of diagnosis is 48 years. About 15% of women develop cervical cancer between the ages of 20 - 30. Cervical cancer is extremely rare in women younger than age 20. However, many young women become infected with multiple types of human papilloma virus, which then can increase their risk of getting cervical cancer in the future. Young women with early abnormal changes who do not have regular examinations are at high risk for localized cancer by the time they are age 40, and for invasive cancer by age 50.

    SOCIOECONOMIC AND ETHNIC FACTORS: Although the rate of cervical cancer has declined among both Caucasian and African-American women over the past decades, it remains much more prevalent in African-Americans, whose death rates are twice as high as those of Caucasian women. Hispanic American women have more than twice the risk of invasive cervical cancer as Caucasian women, also due to a lower rate of screening. These differences, however, are almost certainly due to social and economic differences. Numerous studies report that high poverty levels are linked with low screening rates. In addition, lack of health insurance, limited transportation, and language difficulties hinder a poor woman’s access to screening services.

    HIGH SEXUAL ACTIVITY: Human papilloma virus (HPV) is the main risk factor for cervical cancer. In adults, the most important risk factor for HPV is sexual activity with an infected person. Women most at risk for cervical cancer are those with a history of multiple sexual partners, sexual intercourse at age 17 years or younger, or both. A woman who has never been sexually active has a very low risk for developing cervical cancer. Sexual activity with multiple partners increases the likelihood of many other sexually transmitted infections (chlamydia, gonorrhea, syphilis). Studies have found an association between chlamydia and cervical cancer risk, including the possibility that chlamydia may prolong HPV infection.

    FAMILY HISTORY: Women have a higher risk of cervical cancer if they have a first-degree relative (mother, sister) who has had cervical cancer.

    USE OF ORAL CONTRACEPTIVES: Studies have reported a strong association between cervical cancer and long-term use of oral contraception (OC). Women who take birth control pills for more than 5 - 10 years appear to have a much higher risk of HPV infection (up to four times higher) than those who do not use OCs. (Women taking OCs for fewer than 5 years do not have a significantly higher risk.) The reasons for this risk from OC use are not entirely clear. Women who use OCs may be less likely to use a diaphragm, condoms, or other methods that offer some protection against sexually transmitted diseases, including HPV. Some research also suggests that the hormones in OCs might help the virus enter the genetic material of cervical cells.

    HAVING MANY CHILDREN: Studies indicate that having many children increases the risk for developing cervical cancer, particularly in women infected with HPV.

    SMOKING: Smoking is associated with a higher risk for precancerous changes (dysplasia) in the cervix and for progression to invasive cervical cancer, especially for women infected with HPV.

    IMMUNOSUPPRESSION: Women with weak immune systems (such as those with HIV/AIDS) are more susceptible to acquiring HPV. Immunocompromised patients are also at higher risk for having cervical precancer develop rapidly into invasive cancer.

    DIETHYLSTILBESTROL (DES): From 1938 - 1971, diethylstilbestrol (DES), an estrogen-related drug, was widely prescribed to pregnant women to help prevent miscarriages. The daughters of these women face a higher risk for cervical cancer. DES is no longer prescribed.

Cite
Anirban Datta (2020). US Health Insurance Dataset [Dataset]. https://www.kaggle.com/datasets/teertha/ushealthinsurancedataset/code

US Health Insurance Dataset

Insurance Premium Charges in US with important details for risk underwriting.

Dataset updated
Feb 16, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Anirban Datta
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Context

The venerable insurance industry is no stranger to data-driven decision making. Yet in today's rapidly transforming digital landscape, insurance is struggling to adapt to and benefit from new technologies compared with other industries, even within the BFSI sphere (the banking sector, for example). Its unique challenges include extremely complex underwriting rule sets that differ radically across product lines, many non-KYC environments that lack a centralized customer information base, a complex relationship with consumers in traditional risk underwriting, where customer centricity sometimes runs counter to business profit, and the inertia of regulatory compliance.

Despite this, emerging technologies such as AI and blockchain have brought radical change to insurance, and data analytics sits at the core of this transformation. We can identify four key factors behind the emergence of analytics as a crucial part of InsurTech:

  • Big Data: The explosion of unstructured data in the form of images, videos, text, emails, and social media
  • AI: Recent advances in machine learning and deep learning that enable businesses to gain insight, perform predictive analytics, and build cost- and time-efficient innovative solutions
  • Real-time Processing: The ability to process information in real time through various data feeds (e.g., social media, news)
  • Increased Computing Power: A complex ecosystem of new analytics vendors and solutions that enables carriers to combine data sources, external insights, and advanced modeling techniques to glean insights that were not possible before

This dataset supports a simple yet illuminating study of risk underwriting in health insurance: how the various attributes of the insured interact and how they affect the insurance premium.

Content

This dataset contains 1338 rows of insured data, where the Insurance charges are given against the following attributes of the insured: Age, Sex, BMI, Number of Children, Smoker and Region. There are no missing or undefined values in the dataset.

Inspiration

This relatively simple dataset should be an excellent starting point for EDA, statistical analysis, and hypothesis testing, and for training linear regression models to predict insurance premium charges.

Proposed Tasks:

  • Exploratory Data Analytics
  • Statistical hypothesis testing
  • Statistical Modeling
  • Linear Regression
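As a starting point for the proposed tasks, the sketch below walks through a minimal version of the workflow using only the Python standard library: a missing-value check, a group comparison of charges by smoking status, and a one-variable least-squares fit. The column names (`age`, `sex`, `bmi`, `children`, `smoker`, `region`, `charges`), the `yes`/`no` coding of `smoker`, and the helper names are assumptions, and the inline rows are made up for illustration rather than taken from the dataset. To use the real data, pass the contents of the downloaded CSV file to `load_rows` instead.

```python
import csv
import io
from statistics import mean

# Made-up rows for illustration only (NOT from the dataset).
# The header names are an assumption about the file's columns.
SAMPLE_CSV = """age,sex,bmi,children,smoker,region,charges
25,female,22.5,0,no,northeast,3000.0
40,male,30.1,2,no,southwest,7000.0
35,female,27.3,1,yes,southeast,21000.0
55,male,33.0,3,yes,northwest,40000.0
60,female,25.8,0,no,northeast,13000.0
"""


def load_rows(text):
    """Parse CSV text into typed dicts, rejecting rows with missing values."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        if any(not isinstance(v, str) or v.strip() == "" for v in raw.values()):
            raise ValueError(f"missing or malformed value in row: {raw}")
        rows.append({
            "age": int(raw["age"]),
            "sex": raw["sex"],
            "bmi": float(raw["bmi"]),
            "children": int(raw["children"]),
            "smoker": raw["smoker"],
            "region": raw["region"],
            "charges": float(raw["charges"]),
        })
    return rows


def mean_charges_by_smoker(rows):
    """Average charges per smoker category (a first EDA step)."""
    groups = {}
    for row in rows:
        groups.setdefault(row["smoker"], []).append(row["charges"])
    return {label: mean(values) for label, values in groups.items()}


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


rows = load_rows(SAMPLE_CSV)
print(mean_charges_by_smoker(rows))
slope, intercept = fit_line([r["age"] for r in rows], [r["charges"] for r in rows])
print(f"charges ~ {slope:.1f} * age + {intercept:.1f}")
```

A single-predictor fit keeps the example short. A fuller model would encode the categorical columns (sex, smoker, region) as indicator variables and fit all attributes together, for instance with a dedicated regression library.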
