2 datasets found
  1. Wealth Segmentation of U.S. ZIP Codes Based on IRS

    • kaggle.com
    Updated Jul 9, 2025
    Cite
    Namrata_Nyam (2025). Wealth Segmentation of U.S. ZIP Codes Based on IRS [Dataset]. http://doi.org/10.34740/kaggle/dsv/12424277
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Namrata_Nyam
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Wealth Segmentation of U.S. ZIP Codes Based on IRS Data

    This dataset provides a wealth-tier classification of U.S. ZIP codes for high-income brackets, built from IRS income data with multivariate KMeans clustering. It can help with regional targeting, CRM enrichment, market analysis, or any data science task that benefits from understanding how high incomes are distributed across the U.S.

    💡 Source

    • IRS SOI ZIP Code Data (open source)
    • Aggregated across AGI brackets (stub 3–6)
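As a rough illustration of the aggregation step, the sketch below sums per-bracket amounts into one row per ZIP code, keeping only `agi_stub` values 3 to 6. The sample rows and the `aggregate_by_zip` helper are hypothetical; the column names follow the dataset's column list, and the preprocessing actually used by the dataset author is not documented here.

```python
from collections import defaultdict

# Hypothetical sample rows shaped like the IRS SOI ZIP code file:
# one row per (zipcode, agi_stub) pair. All values are made up.
rows = [
    {"zipcode": "35004", "agi_stub": 2, "A00100": 100.0, "A00200": 80.0},
    {"zipcode": "35004", "agi_stub": 3, "A00100": 200.0, "A00200": 150.0},
    {"zipcode": "35004", "agi_stub": 6, "A00100": 900.0, "A00200": 400.0},
    {"zipcode": "90210", "agi_stub": 5, "A00100": 500.0, "A00200": 300.0},
]

def aggregate_by_zip(rows, stubs=range(3, 7), amount_cols=("A00100", "A00200")):
    """Sum the amount columns per ZIP code over the selected AGI brackets."""
    totals = defaultdict(lambda: dict.fromkeys(amount_cols, 0.0))
    for row in rows:
        if row["agi_stub"] in stubs:  # skip brackets outside 3-6
            for col in amount_cols:
                totals[row["zipcode"]][col] += row[col]
    return dict(totals)

zip_totals = aggregate_by_zip(rows)
```

Rows for brackets 1 and 2 are dropped before summing, so each ZIP code ends up with a single set of totals covering only the selected brackets.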

    🧠 What’s Inside

    Each row represents a ZIP code with:

    • AGI (A00100), Total Income (A02650)
    • Capital Gains, Business Income, Tax Paid
    • Cluster assignment (0–2)
    • Wealth Tier label: Low, Medium, or High

    The cluster assignments are refined using distance to cluster centroids in normalized feature space to improve accuracy.
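The description does not say how the clustering or the centroid-distance refinement was implemented, so the following is only a minimal sketch of the general idea: standardize the features, run a plain k-means (Lloyd's algorithm), and compute each point's distance to its assigned centroid. The helper names, the first-k-points initialization, and the sample feature values are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev

def standardize(points):
    """Z-score each feature column so all features share a comparable scale."""
    cols = list(zip(*points))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(x - m) / s for x, m, s in zip(p, mus, sigmas)] for p in points]

def nearest_centroid(point, centroids):
    """Return (index, distance) of the closest centroid by Euclidean distance."""
    dists = [math.dist(point, c) for c in centroids]
    idx = min(range(len(dists)), key=dists.__getitem__)
    return idx, dists[idx]

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm with a simple deterministic initialization."""
    centroids = [list(p) for p in points[:k]]  # assumption: first k points as seeds
    for _ in range(iters):
        labels = [nearest_centroid(p, centroids)[0] for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties out
                centroids[j] = [mean(col) for col in zip(*members)]
    labels = [nearest_centroid(p, centroids)[0] for p in points]
    return labels, centroids

# Hypothetical ZIP-level features: [AGI, wages], in arbitrary units.
features = [[50, 40], [55, 42], [300, 120], [320, 130], [900, 300], [950, 310]]
scaled = standardize(features)
labels, centroids = kmeans(scaled, k=3)
# Distance of each ZIP code to its own centroid; large values flag borderline cases.
distances = [math.dist(p, centroids[lab]) for p, lab in zip(scaled, labels)]
```

Standardizing first matters because raw dollar amounts differ by orders of magnitude across columns, and an unscaled Euclidean distance would be dominated by the largest one.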

    💼 Use Cases

    • Segmenting markets for B2B/B2C outreach
    • CRM lead enrichment
    • Territory planning and resource allocation
    • Visualization and dashboard overlays

    Column        Description
    zipcode       U.S. ZIP code
    STATEFIPS     Federal Information Processing Standard (FIPS) code for the state
    STATE         U.S. state abbreviation (e.g., AL, CA)
    agi_stub      Adjusted Gross Income bracket (1 = <$25K, ..., 6 = $200K+)
    A00100        Adjusted Gross Income
    A02650        Total income from all sources
    A10600        Total tax payments
    A00200        Wages and salaries
    MARS2         Count of married joint returns
    N2            Number of dependents
    A00900        Business/professional net income
    mars1         Count of single returns
    A26270        Partnership and S-Corp income
    A09400        Self-employment tax
    MARS4         Head of household returns
    A85300        Net investment income
    A00600        Ordinary dividends
    A04475        Qualified business income deduction
    A00650        Qualified dividends
    A18500        Real estate taxes paid
    Cluster       Numeric cluster ID (0 = High, 1 = Medium, 2 = Low)
    Wealth_Tier   Human-readable wealth tier label
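The column list gives the cluster-to-tier mapping as 0 = High, 1 = Medium, 2 = Low. A small sketch of how a consumer might apply that mapping and check that `Cluster` and `Wealth_Tier` agree is shown here; the helper names and the sample rows are hypothetical.

```python
# Mapping taken from the column list: 0 = High, 1 = Medium, 2 = Low.
CLUSTER_TO_TIER = {0: "High", 1: "Medium", 2: "Low"}

def tier_for(cluster_id):
    """Translate a numeric Cluster value into its Wealth_Tier label."""
    return CLUSTER_TO_TIER[int(cluster_id)]

def inconsistent_rows(rows):
    """Return rows whose Wealth_Tier disagrees with their Cluster value."""
    return [r for r in rows if tier_for(r["Cluster"]) != r["Wealth_Tier"]]

# Hypothetical rows for illustration only.
sample = [
    {"zipcode": "10001", "Cluster": 0, "Wealth_Tier": "High"},
    {"zipcode": "10002", "Cluster": "2", "Wealth_Tier": "Low"},
    {"zipcode": "10003", "Cluster": 1, "Wealth_Tier": "High"},
]
bad = inconsistent_rows(sample)
```

The `int(...)` conversion is there because cluster IDs read from a CSV file often arrive as strings.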

    📬 Contact

    Created by Namrata Nyamagoudar (LinkedIn) for open-source analysis and enrichment use cases.

  2. Zukunftserwartungen und Zukunftsverhalten (1987)

    • search.gesis.org
    • datacatalogue.cessda.eu
    Updated Apr 13, 2010
    Cite
    EMNID, Bielefeld (2010). Zukunftserwartungen und Zukunftsverhalten (1987) [Dataset]. http://doi.org/10.4232/1.1719
    Dataset updated
    Apr 13, 2010
    Dataset provided by
    GESIS search
    GESIS Data Archive
    Authors
    EMNID, Bielefeld
    License

    https://www.gesis.org/en/institute/data-usage-terms

    Description

    Ideas about the political, economic and social development in the future.

    Topics: fear of the future; importance of selected areas of life in the future; future more worth living in due to technology and science; thoughts about the future of the country; personal influence on the future; greatest dangers in the future; religious, philosophical and political value systems with increasing significance in the future; social groupings becoming more powerful in the Federal Republic; personal wish for the next 20 years; political power bloc with increasing significance in the future; peace strategy with greatest probabilities of success; estimate of time worked each week in the year 2000; technical further development and influences on the development of the job market (scale); attitude to flexible handling of working hours and leisure time; forms of investment, accumulation of assets and payment with prospects for future; type of energy with prospects for the future; assessment of the future energy need of industrial nations; development of energy prices; most important means of transport; new transport systems; the ocean as provider of raw material or energy; most important areas of threat to the environment; assumed change of environmental quality in the future; adequate environmental protection measures; form of school and instruction methods with increasing significance; vocational training or study as preferred alternatives; assessment of general political interest in the future; expectation of more leisure time and athletic activity for health reasons; types of sport with increasing numbers of supporters; expected increase of vacation in foreign countries or at home; means of vacation travel with the greatest chances for growth; obstacles for long-distance vacation travel; preferred offerings to strengthen the attractiveness of local recreational areas; estimated change of the share of vacation costs in household income in the future; reasons for vacation with increasing significance in the future; personal hobbies and hobbies with increasing significance; most important characteristics of the equipment of stores and demands when shopping in the future; world regions with strongly growing or declining future population figures; expected development of the desire for children in the future and average number of children; expected development of the problem of guest workers; future eating habits; biological or ecological or conventional agricultural production with chances for the future; most important sources of nutrition in the future; most healthful source of nutrition; development of nutrition habits; attitude to natural cosmetics; health complaints with increasing significance; expected development of the frequency of visits to the doctor and future conduct with minor complaints; primary causes for physical complaints in the future; judgement on the desirability of selected medical cures; expected increase of addiction and drug problems; judgement on the development of information technology; technical progress as advantage or disadvantage for humanity; trust in data protection; preferred form of housing in the future; preference for one's own home in the country or city apartment; expected development of the housing supply; expected development of personal car use with increasing gas prices and increasing air pollution; party preference (Sunday question) and behavior at the polls in the last Federal Parliament election; employment in the civil service.

    Also encoded were: ZIP (postal) code and identification of interviewer.

Wealth Segmentation of U.S. ZIP Codes Based on IRS

Clustered income and tax insights by ZIP code for B2B/B2C targeting
