Summary File 1 Data Profile 1 (SF1 Table DP-1) for Census Tracts in the Minneapolis-St. Paul 7 County metropolitan area is a subset of the profile of general demographic characteristics for 2000 prepared by the U.S. Census Bureau.
This table (DP-1) includes: Sex and Age, Race, Race alone or in combination with one or more other races, Hispanic or Latino and Race, Relationship, Household by Type, Housing Occupancy, Housing Tenure.
US Census 2000 Demographic Profiles: 100-percent and Sample Data
The profile includes four tables (DP-1 thru DP-4) that provide various demographic, social, economic, and housing characteristics for the United States, states, counties, minor civil divisions in selected states, places, metropolitan areas, American Indian and Alaska Native areas, Hawaiian home lands and congressional districts (106th Congress). It includes 100-percent and sample data from Census 2000. The DP-1 table is available as part of the Summary File 1 (SF 1) dataset, and the other three tables are available as part of the Summary File 3 (SF 3) dataset.
The US Census provides DP-1 thru DP-4 data at the Census tract level through their DataFinder search engine. However, since the Metropolitan Council and MetroGIS participants are interested in all Census tracts within the seven county metropolitan area, it was quicker to take the raw Census SF-1 and SF-3 data at tract levels and recreate the DP1-4 variables using the appropriate formula for each DP variable. This file lists the formulas used to create the DP variables.
A random sample of households was invited to participate in this survey. In the dataset, each row holds the data for one respondent and each column holds one question. The numbers represent a scale option from the survey, such as 1=Excellent, 2=Good, 3=Fair, 4=Poor. The question stem, response option, and scale information for each field can be found in the "variable labels" and "value labels" sheets. VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Bloomington from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile. This means that some records were given more weight and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as can be found in the report delivered to Bloomington. The easiest way to replicate these results is likely to create pivot tables and use the sum of the "wt" field rather than a count of responses.
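The weighting note above can be illustrated with a minimal sketch, assuming a question column named "q1" (hypothetical) alongside the "wt" field. The rows are made up for illustration; the point is only that shares are computed from the sum of "wt" rather than from a count of rows.

```python
from collections import defaultdict

# Made-up respondent rows. "q1" is an assumed question column
# (1=Excellent, 2=Good, 3=Fair, 4=Poor); "wt" is the weight field.
rows = [
    {"q1": 1, "wt": 1.5},
    {"q1": 2, "wt": 0.5},
    {"q1": 2, "wt": 1.0},
    {"q1": 4, "wt": 1.0},
]

def weighted_shares(records, question, weight="wt"):
    """Return each response option's share of the total weight."""
    totals = defaultdict(float)
    for rec in records:
        totals[rec[question]] += rec[weight]
    grand_total = sum(totals.values())
    return {option: total / grand_total for option, total in totals.items()}

def unweighted_shares(records, question):
    """Return each response option's share of the row count, for comparison."""
    counts = defaultdict(int)
    for rec in records:
        counts[rec[question]] += 1
    return {option: n / len(records) for option, n in counts.items()}

shares = weighted_shares(rows, "q1")
raw_shares = unweighted_shares(rows, "q1")
```

In this toy data, option 1 accounts for one quarter of the rows but a larger share of the total weight, which is the kind of difference that makes unweighted results disagree with the published report.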
Summary File 4 is repeated or iterated for the total population and 335 additional population groups: 132 race groups, 78 American Indian and Alaska Native tribe categories, 39 Hispanic or Latino groups, and 86 ancestry groups. Tables for any population group excluded from SF 2 because the group's total population in a specific geographic area did not meet the SF 2 threshold of 100 people are excluded from SF 4. Tables in SF 4 shown for any of the above population groups will only be shown if there are at least 50 unweighted sample cases in a specific geographic area. The same threshold of 50 unweighted sample cases also applies to ancestry iterations. In an iterated file such as SF 4, the universes households, families, and occupied housing units are classified by the race or ethnic group of the householder. The universe subfamilies is classified by the race or ethnic group of the reference person for the subfamily. In a husband/wife subfamily, the reference person is the husband; in a parent/child subfamily, the reference person is always the parent. The universes population in households, population in families, and population in subfamilies are classified by the race or ethnic group of the individuals within the household, family, or subfamily without regard to the race or ethnicity of the householder. Notes follow selected tables to make the classification of the universe clear. In any population table where there is no note, the universe classification is always based on the race or ethnicity of the person. In all housing tables, the universe classification is based on the race or ethnicity of the householder.
The State Legislative District Summary File (Sample) (SLDSAMPLE) contains the sample data, which is the information compiled from the questions asked of a sample of all people and housing units. Population items include basic population totals; urban and rural; households and families; marital status; grandparents as caregivers; language and ability to speak English; ancestry; place of birth, citizenship status, and year of entry; migration; place of work; journey to work (commuting); school enrollment and educational attainment; veteran status; disability; employment status; industry, occupation, and class of worker; income; and poverty status. Housing items include basic housing totals; urban and rural; number of rooms; number of bedrooms; year moved into unit; household size and occupants per room; units in structure; year structure built; heating fuel; telephone service; plumbing and kitchen facilities; vehicles available; value of home; monthly rent; and shelter costs. The file contains subject content identical to that shown in Summary File 3 (SF 3).
https://www.ontario.ca/page/open-government-licence-ontario
Data includes: board and school information, grade 3 and 6 EQAO student achievements for reading, writing and mathematics, and grade 9 mathematics EQAO and OSSLT. Data excludes private schools, Education and Community Partnership Programs (ECPP), summer, night and continuing education schools.
How Are We Protecting Privacy?
Results for OnSIS and Statistics Canada variables are suppressed based on school population size to better protect student privacy. In order to achieve this additional level of protection, the Ministry has used a methodology that randomly rounds a percentage either up or down depending on school enrolment. In order to protect privacy, the ministry does not publicly report on data when there are fewer than 10 individuals represented.
The information in the School Information Finder is the most current available to the Ministry of Education at this time, as reported by schools, school boards, EQAO and Statistics Canada. The information is updated as frequently as possible.
This information is also available on the Ministry of Education's School Information Finder website by individual school.
Descriptions for some of the data types can be found in our glossary.
School/school board and school authority contact information are updated and maintained by school boards and may not be the most current version. For the most recent information please visit: https://data.ontario.ca/dataset/ontario-public-school-contact-information.
This collection contains individual-level and 1-percent national sample data from the 1960 Census of Population and Housing conducted by the Census Bureau. It consists of a representative sample of the records from the 1960 sample questionnaires. The data are stored in 30 separate files, containing in total over two million records, organized by state. Some files contain the sampled records of several states while other files contain all or part of the sample for a single state. There are two types of records stored in the data files: one for households and one for persons. Each household record is followed by a variable number of person records, one for each of the household members. Data items in this collection include the individual responses to the basic social, demographic, and economic questions asked of the population in the 1960 Census of Population and Housing. Data are provided on household characteristics and features such as the number of persons in household, number of rooms and bedrooms, and the availability of hot and cold piped water, flush toilet, bathtub or shower, sewage disposal, and plumbing facilities. Additional information is provided on tenure, gross rent, year the housing structure was built, and value and location of the structure, as well as the presence of air conditioners, radio, telephone, and television in the house, and ownership of an automobile. Other demographic variables provide information on age, sex, marital status, race, place of birth, nationality, education, occupation, employment status, income, and veteran status. The data files were obtained by ICPSR from the Center for Social Analysis, Columbia University. (Source: downloaded from ICPSR 7/13/10)
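The hierarchical layout described above (each household record followed by its person records) can be read with a simple grouping pass. The sketch below is illustrative only: it assumes a one-character record-type flag in the first column ("H" or "P"), which is a hypothetical convention and not the documented layout of these files. Consult the codebook for the real record-type position and codes.

```python
def group_households(lines):
    """Group a hierarchical file into (household, [persons]) pairs.

    Assumes (hypothetically) that the first character of each line
    marks the record type: "H" for household, "P" for person.
    """
    households = []
    for line in lines:
        if not line:
            continue  # skip blank lines
        record_type, payload = line[0], line[1:]
        if record_type == "H":
            households.append((payload, []))
        elif record_type == "P":
            if not households:
                raise ValueError("person record before any household record")
            households[-1][1].append(payload)
        else:
            raise ValueError(f"unknown record type: {record_type!r}")
    return households

# Made-up records standing in for real fixed-width lines.
sample = ["H0001", "Pann", "Pbob", "H0002", "Pcat"]
grouped = group_households(sample)
```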
Please Note: This dataset is part of the historical CISER Data Archive Collection and is also available at ICPSR at https://doi.org/10.3886/ICPSR07756.v1. We highly recommend using the ICPSR version as they may make this dataset available in multiple data formats in the future.
The dataset is a relational dataset of 8,000 households, representing a sample of the population of an imaginary middle-income country. The dataset contains two data files: one with variables at the household level, the other one with variables at the individual level. It includes variables that are typically collected in population censuses (demography, education, occupation, dwelling characteristics, fertility, mortality, and migration) and in household surveys (household expenditure, anthropometric data for children, assets ownership). The data only includes ordinary households (no community households). The dataset was created using REaLTabFormer, a model that leverages deep learning methods. The dataset was created for the purpose of training and simulation and is not intended to be representative of any specific country.
The full-population dataset (with about 10 million individuals) is also distributed as open data.
The dataset is a synthetic dataset for an imaginary country. It was created to represent the population of this country by province (equivalent to admin1) and by urban/rural areas of residence.
Household, Individual
The dataset is a fully-synthetic dataset representative of the resident population of ordinary households for an imaginary middle-income country.
ssd
The sample size was set to 8,000 households. The fixed number of households to be selected from each enumeration area was set to 25. In a first stage, the number of enumeration areas to be selected in each stratum was calculated, proportional to the size of each stratum (stratification by geo_1 and urban/rural). Then 25 households were randomly selected within each enumeration area. The R script used to draw the sample is provided as an external resource.
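The two-stage design above can be sketched as follows. This is not the R script mentioned in the text; it is a minimal illustration that assumes largest-remainder rounding for the proportional allocation and simple random selection of households within an enumeration area. The stratum names, stratum sizes, and household IDs are hypothetical.

```python
import random

HOUSEHOLDS_PER_EA = 25
SAMPLE_SIZE = 8000
N_EAS = SAMPLE_SIZE // HOUSEHOLDS_PER_EA  # 320 enumeration areas in total

def allocate_eas(stratum_sizes, n_eas):
    """Allocate n_eas across strata in proportion to stratum size.

    Uses largest-remainder rounding (an assumption) so the
    allocations always add up to n_eas.
    """
    total = sum(stratum_sizes.values())
    quotas = {s: n_eas * size / total for s, size in stratum_sizes.items()}
    alloc = {s: int(q) for s, q in quotas.items()}
    leftover = n_eas - sum(alloc.values())
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc

# Hypothetical strata: geo_1 crossed with urban/rural, sized in households.
strata = {
    ("prov_a", "urban"): 500_000,
    ("prov_a", "rural"): 300_000,
    ("prov_b", "urban"): 150_000,
    ("prov_b", "rural"): 50_000,
}
allocation = allocate_eas(strata, N_EAS)

# Second stage: draw 25 households without replacement within one EA.
rng = random.Random(42)
ea_households = list(range(180))  # hypothetical household IDs in one EA
selected = rng.sample(ea_households, HOUSEHOLDS_PER_EA)
```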
other
The dataset is a synthetic dataset. Although the variables it contains are variables typically collected from sample surveys or population censuses, no questionnaire is available for this dataset. A "fake" questionnaire was however created for the sample dataset extracted from this dataset, to be used as training material.
The synthetic data generation process included a set of "validators" (consistency checks, based on which synthetic observations were assessed and rejected/replaced when needed). Also, some post-processing was applied to the data to result in the distributed data files.
This is a synthetic dataset; the "response rate" is 100%.
The American Indian and Alaska Native Summary File (AIANSF) contains data on population characteristics, such as sex, age, average household size, household type, and relationship to householder. The file also includes housing characteristics, such as tenure (whether a housing unit is owner-occupied or renter-occupied) and age of householder for occupied housing units. Selected aggregates and medians also are provided. A complete listing of subjects in the AIANSF is found in Chapter 3, Subject Locator. The layout of the tables in the AIANSF is similar to that in Summary File 2 (SF 2). These data are presented in 47 population tables (identified with a "PCT") and 14 housing tables (identified with an "HCT") shown down to the census tract level; and 10 population tables (identified with a "PCO") shown down to the county level, for a total of 71 tables. Each table is iterated for the total population, the total American Indian and Alaska Native population alone, the total American Indian and Alaska Native population alone or in combination, and 1,567 detailed tribes and tribal groupings. Tribes or tribal groupings are included on the iterations list if they met a threshold of at least 100 people in the 2010 Census. In addition, the presentation of AIANSF tables for any of the tribes and tribal groupings is subject to a population threshold of 100 or more people in a given geography. That is, if there are fewer than 100 people in a specific population group in a specific geographic area, their population and housing characteristics data are not available for that geographic area in the AIANSF. See Appendix H, Characteristic Iterations, for more information.
Summary File 1 Data Profile 1 (SF1 Table DP-1) for cities and townships in Minnesota is a subset of the profile of general demographic characteristics for 2000 prepared by the U.S. Census Bureau.
This table includes: Sex and Age, Race, Race alone or in combination with one or more other races, Hispanic or Latino and Race, Relationship, Household by Type, Housing Occupancy, Housing Tenure.
US Census 2000 Demographic Profiles: 100-percent and Sample Data
A profile includes four tables that provide various demographic, social, economic, and housing characteristics for the United States, states, counties, minor civil divisions in selected states, places, metropolitan areas, American Indian and Alaska Native areas, Hawaiian home lands and congressional districts (106th Congress). It includes 100-percent and sample data from Census 2000.
The Demographic Profile consists of four tables (DP-1 thru DP-4). For Census 2000 data, the DP-1 table is available as part of the Summary File 1 (SF 1) dataset, and the other three tables are available as part of the Summary File 3 (SF 3) dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Malta by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Malta. The dataset can be used to understand the population distribution of Malta by gender and age. For example, it can identify the largest age group for both men and women in Malta. It can also show how the gender ratio changes from birth to the most senior age group, and the male-to-female ratio in each age group.
Key observations
Largest age group (population): Male # 35-39 years (52) | Female # 25-29 years (62). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Age groups:
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Malta Population by Gender.
This API returns a search for the demographic information for a particular geography type and geography name.
By Centers for Disease Control and Prevention [source]
This dataset offers an in-depth look into the National Health and Nutrition Examination Survey (NHANES), which provides valuable insights on various health indicators throughout the United States. It includes important information such as the year when data was collected, location of the survey, data source and value, priority areas of focus, category and topic related to the survey, break out categories of data values, geographic location coordinates and other key indicators. Discover patterns in mortality rates from cardiovascular disease, or analyze whether pregnant women are more likely to report poor health than those who are not expecting, with this NHANES dataset, a collection for understanding personal health behaviors.
Step 1: Understand the Data Format - Before beginning to work with NHANES data, you should become familiar with the different columns in the dataset. Each column contains a specific type of information about the data such as year collected, geographic location abbreviations and descriptions, sources used for collecting data, priority areas assigned by researchers or institutions associated with understanding health trends in a given area or population group as well as indicator values related to nutrition/health.
Step 2: Choose an Indicator - Once you understand what is included in each column and what type of values correspond to each field, select the indicator(s) you would like to plot or visualize against the demographic and geographic characteristics represented in the NHANES data. Selecting an appropriate indicator helps narrow down your search criteria when analyzing health and nutrition trends over time, in different locations, or among different demographic groups.
Step 3: Utilizing Subsets - When narrowing down your search criteria it may be beneficial to break up large datasets into smaller subsets that focus on a single area or topic for study (e.g., nutrition trends among rural communities). This lets users drill down on the topics that are relevant to their research objectives without losing the broader context available from an analysis of the full dataset, which contains all fields for all locations and years.
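The subsetting step can be sketched with the standard library alone. The snippet below is a minimal illustration, not the dataset's actual schema: LocationAbbr is a field named in this description, while Year, Indicator, and Data_Value are assumed column names, and every value is made up.

```python
import csv
import io

# Tiny stand-in for the real file. Column names other than LocationAbbr
# are assumptions, and the values are made up for illustration.
raw = """Year,LocationAbbr,Indicator,Data_Value
2011,US,Example indicator A,12.5
2012,US,Example indicator A,11.9
2012,US,Example indicator B,40.2
"""

def subset(rows, **criteria):
    """Keep the rows whose columns equal every given criterion."""
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]

rows = list(csv.DictReader(io.StringIO(raw)))
indicator_a = subset(rows, Indicator="Example indicator A")
values = [float(r["Data_Value"]) for r in indicator_a]
```

Filtering on more than one column at once, for example `subset(rows, LocationAbbr="US", Year="2012")`, narrows the subset further while the full list in `rows` stays available for context.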
- Creating a health calculator to help people measure their health risk. The indicator and data value fields can be used to create an algorithm that will generate a personalized label for each user's health status.
- Developing a visual representation of the nutritional habits of different populations based on the DataSource, LocationAbbr, and PriorityArea fields from this dataset.
- Employing machine learning to discern patterns in the data or predict potential health risks in different regions or populations by using the GeoLocation field as inputs for geographic analysis.
If you use this dataset in your research, please credit the original authors. Data Source
Unknown License - Please check the dataset description for more information.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This API returns a search for the demographic information for a particular geography type and geography ID
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Demographic data for all samples, including sample location, sex, length, age estimates, and the associated age class from pulp:tooth area ratio, GLG, or life history data. Calculated pulp:tooth area ratios are also listed. (XLSX)
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The DSS Payment Demographic data set is made up of:
Selected DSS payment data by
Geography: state/territory, electorate, postcode, LGA and SA2 (for 2015 onwards)
Demographic: age, sex and Indigenous/non-Indigenous
Duration on Payment (Working Age & Pensions)
Duration on Income Support (Working Age, Carer payment & Disability Support Pension)
Rate (Working Age & Pensions)
Earnings (Working Age & Pensions)
Age Pension assets data
JobSeeker Payment and Youth Allowance (other) Principal Carers
Activity Tested Recipients by Partial Capacity to Work (NSA, PPS & YAO)
Exits within 3, 6 and 12 months (Newstart Allowance/JobSeeker Payment, Parenting Payment, Sickness Allowance & Youth Allowance)
Disability Support Pension by medical condition
Care Receiver by medical conditions
Commonwealth Rent Assistance by Payment type and Income Unit type have been added from March 2017. For further information about Commonwealth Rent Assistance and Income Units see the Data Descriptions and Glossary included in the dataset.
From December 2022, the "DSS Expanded Benefit and Payment Recipient Demographics – quarterly data" publication has introduced expanded reporting populations for income support recipients. As a result, the reporting population for JobSeeker Payment and Special Benefit has changed to include recipients who are current but on zero rate of payment and those who are suspended from payment. The reporting population for ABSTUDY, Austudy, Parenting Payment and Youth Allowance has changed to include those who are suspended from payment. The expanded report will replace the standard report after June 2023.
Additional data for DSS Expanded Benefit and Payment Recipient Demographics – quarterly data includes:
• A new contents page to help users locate information within the spreadsheet
• Additional data for the ‘Suspended’ population in the ‘Payment by Rate’ tab to enable users to calculate the old reporting rules.
• Additional information on the Employment Earning by ‘Income Free Area’ tab.
From December 2022, Services Australia have implemented a change in the Centrelink payment system to recognise gender other than the sex assigned at birth or during infancy, or as a gender which is not exclusively male or female. To protect the privacy of individuals and comply with confidentialisation policy, persons identifying as ‘non-binary’ will initially be grouped with ‘females’ in the period immediately following implementation of this change. The Department will monitor the implications of this change and will publish the ‘non-binary’ gender category as soon as privacy and confidentialisation considerations allow.
Local Government Area has been updated to reflect the Australian Statistical Geography Standard (ASGS) 2022 boundaries from June 2023.
Commonwealth Electorate Division has been updated to reflect the Australian Statistical Geography Standard (ASGS) 2021 boundaries from June 2023.
SA2 has been updated to reflect the Australian Statistical Geography Standard (ASGS) 2021 boundaries from June 2023.
From December 2021, the following are included in the report:
selected payments by work capacity, by various demographic breakdowns
rental type and homeownership
Family Tax Benefit recipients and children by payment type
Commonwealth Rent Assistance by proportion eligible for the maximum rate
an age breakdown for Age Pension recipients
For further information, please see the Glossary.
From June 2021, data on the Paid Parental Leave Scheme is included yearly in June releases. This includes both Parental Leave Pay and Dad and Partner Pay, across multiple breakdowns. Please see Glossary for further information.
From March 2017 the DSS demographic dataset will include top 25 countries of birth. For further information see the glossary.
From March 2016 machine readable files containing the three geographic breakdowns have also been published for use in National Map, links to these datasets are below:
Pre June 2014 Quarter Data contains:
Selected DSS payment data by
Geography: state/territory; electorate; postcode and LGA
Demographic: age, sex and Indigenous/non-Indigenous
Note: JobSeeker Payment replaced Newstart Allowance and other working age payments from 20 March 2020, for further details see: https://www.dss.gov.au/benefits-payments/jobseeker-payment
For data on DSS payment demographics as at June 2013 or earlier, the department has published data which was produced annually. Data is provided by payment type and contains time series, state, gender, age range, and various other demographics. Links to these publications are below:
Concession card data in the March and June 2020 quarters have been re-stated to address an over-count in reported cardholder numbers.
28/06/2024 – The March 2024 and December 2023 reports were republished with updated data in the ‘Carer Receivers by Med Condition’ section, updates are exclusive to the ‘Care Receivers of Carer Payment recipients’ table, under ‘Intellectual / Learning’ and ‘Circulatory System’ conditions only.
The L2 Voter and Demographic Dataset includes demographic and voter history tables for all 50 states and the District of Columbia. The dataset is built from publicly available government records about voter registration and election participation. These records indicate whether a person voted in an election or not, but they do not record whom that person voted for. Voter registration and election participation data are augmented by demographic information from outside data sources.
To create this file, L2 processes registered voter data on an ongoing basis for all 50 states and the District of Columbia, with refreshes of the underlying state voter data typically at least every six months and refreshes of telephone numbers and National Change of Address processing approximately every 30 to 60 days. These data are standardized and enhanced with proprietary commercial data and modeling codes and consist of approximately 185,000,000 records nationwide.
For each state, there are two available tables: demographic and voter history. The demographic and voter tables can be joined on the LALVOTERID variable. One can also use the LALVOTERID variable to link the L2 Voter and Demographic Dataset with the L2 Consumer Dataset.
In addition, the LALVOTERID variable can be used to validate the state. For example, let's look at the LALVOTERID = LALCA3169443. The characters in the fourth and fifth positions of this identifier are 'CA' (California). The second way to validate the state is by using the RESIDENCE_ADDRESSES_STATE variable, which should have a value of 'CA' (California).
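The join and the two state checks described above can be sketched in a few lines. LALVOTERID and RESIDENCE_ADDRESSES_STATE are the variable names given in this description; the voter history column "voted_general_2020" and the row contents are hypothetical stand-ins for the real tables.

```python
def state_from_voter_id(voter_id):
    """Return the two-letter state code in positions four and five."""
    return voter_id[3:5]

# Hypothetical one-row tables for illustration.
demographic = [
    {"LALVOTERID": "LALCA3169443", "RESIDENCE_ADDRESSES_STATE": "CA"},
]
history = [
    {"LALVOTERID": "LALCA3169443", "voted_general_2020": "Y"},
]

# Join the two tables on LALVOTERID (inner join).
history_by_id = {row["LALVOTERID"]: row for row in history}
joined = [
    {**demo, **history_by_id[demo["LALVOTERID"]]}
    for demo in demographic
    if demo["LALVOTERID"] in history_by_id
]

# Flag rows where the ID's state code disagrees with the address state.
mismatches = [
    row["LALVOTERID"]
    for row in joined
    if state_from_voter_id(row["LALVOTERID"]) != row["RESIDENCE_ADDRESSES_STATE"]
]
```

An empty `mismatches` list means both validation methods agree for every joined record.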
The date appended to each table name represents when the data was last updated. These dates will differ state by state because states update their voter files at different cadences.
The demographic files use 698 consistent variables. For more information about these variables, see 2025-01-10-VM2-File-Layout.xlsx.
The voter history files have different variables depending on the state. The 2025-08-05-L2-Voter-Dictionaries.tar.gz file contains .csv data dictionaries for each state's demographic and voter files. While the demographic file data dictionaries should mirror the 2025-01-10-VM2-File-Layout.xlsx file, the voter file data dictionaries will be unique to each state.
2025-04-24-National-File-Notes.pdf contains L2 Voter and Demographic Dataset ("National File") release notes from 2018 to 2025.
2025-08-05-L2-Voter-Fill-Rate.tar.gz contains .tab files tracking the percent of non-null values for any given field.
In this project, we aim to analyze and gain insights into the performance of students based on various factors that influence their academic achievements. We have collected data related to students' demographic information, family background, and their exam scores in different subjects.
Key Objectives:
Performance Evaluation: Evaluate and understand the academic performance of students by analyzing their scores in various subjects.
Identifying Underlying Factors: Investigate factors that might contribute to variations in student performance, such as parental education, family size, and student attendance.
Visualizing Insights: Create data visualizations to present the findings effectively and intuitively.
Dataset Details:
Analysis Highlights:
We will perform a comprehensive analysis of the dataset, including data cleaning, exploration, and visualization to gain insights into various aspects of student performance.
By employing statistical methods and machine learning techniques, we will determine the significant factors that affect student performance.
Why This Matters:
Understanding the factors that influence student performance is crucial for educators, policymakers, and parents. This analysis can help in making informed decisions to improve educational outcomes and provide support where it is most needed.
Acknowledgments:
We would like to express our gratitude to [mention any data sources or collaborators] for making this dataset available.
Please Note:
This project is meant for educational and analytical purposes. The dataset used is fictitious and does not represent any specific educational institution or individuals.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides comprehensive customer data suitable for segmentation analysis. It includes anonymized demographic, transactional, and behavioral attributes, allowing for detailed exploration of customer segments. Leveraging this dataset, marketers, data scientists, and business analysts can uncover valuable insights to optimize targeted marketing strategies and enhance customer engagement. Whether you're looking to understand customer behavior or improve campaign effectiveness, this dataset offers a rich resource for actionable insights and informed decision-making.
- Anonymized demographic, transactional, and behavioral data.
- Suitable for customer segmentation analysis.
- Opportunities to optimize targeted marketing strategies.
- Valuable insights for improving campaign effectiveness.
- Ideal for marketers, data scientists, and business analysts.

- Segmenting customers based on demographic attributes.
- Analyzing purchase behavior to identify high-value customer segments.
- Optimizing marketing campaigns for targeted engagement.
- Understanding customer preferences and tailoring product offerings accordingly.
- Evaluating the effectiveness of marketing strategies and iterating for improvement.

Explore this dataset to unlock actionable insights and drive success in your marketing initiatives!
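To make the first two use cases above concrete, here is a minimal sketch that groups customers into age bands and compares average spend per band. The field names (`customer_id`, `age`, `total_spend`), the band boundaries, and every value are assumptions made up for illustration; the dataset's real column names may differ.

```python
from collections import defaultdict

# Hypothetical customer records; not taken from the dataset.
customers = [
    {"customer_id": 1, "age": 23, "total_spend": 120.0},
    {"customer_id": 2, "age": 37, "total_spend": 940.0},
    {"customer_id": 3, "age": 41, "total_spend": 610.0},
    {"customer_id": 4, "age": 29, "total_spend": 80.0},
    {"customer_id": 5, "age": 58, "total_spend": 1500.0},
]

def age_band(age):
    """Assign a coarse demographic segment (band edges are arbitrary)."""
    if age < 30:
        return "under_30"
    if age < 50:
        return "30_to_49"
    return "50_plus"

# Collect spend values per segment.
spend_by_band = defaultdict(list)
for c in customers:
    spend_by_band[age_band(c["age"])].append(c["total_spend"])

# Average spend per segment, and the segment with the highest average.
avg_spend = {band: sum(v) / len(v) for band, v in spend_by_band.items()}
high_value_band = max(avg_spend, key=avg_spend.get)

print(avg_spend)
print("Highest average spend:", high_value_band)
```

In practice the segments would more likely come from a clustering method or an RFM-style scoring; fixed age bands are used here only because they are easy to read.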
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This comprehensive dataset provides detailed educational attainment and demographic analysis across all 50 US states from 2021-2023, specifically designed for tech companies planning strategic market entry and product launch decisions.
Column Name | Data Type | Description | Example Value |
---|---|---|---|
NAME | String | Full US state name | "Massachusetts" |
total_population_25plus | Integer | Total population aged 25 and above | 4,975,152 |
bachelors_degree | Integer | Number of individuals with bachelor's degrees | 1,261,847 |
masters_degree | Integer | Number of individuals with master's degrees | 788,243 |
professional_degree | Integer | Number of individuals with professional degrees (JD, MD, etc.) | 157,762 |
doctoral_degree | Integer | Number of individuals with doctoral degrees (PhD, EdD, etc.) | 169,357 |
median_household_income | Integer | Median household income in USD | $99,858 |
total_households | Float | Total number of households (in millions) | 2.41 |
state | Integer | Numeric state identifier (1-50) | 25 |
year | Integer | Data collection year | 2023 |
college_graduates | Integer | Total college graduates (bachelor's + advanced degrees) | 2,377,209 |
college_graduate_percentage | Float | Percentage of population with college degrees | 47.78% |
graduate_degree_holders | Integer | Total with master's, professional, or doctoral degrees | 1,115,362 |
graduate_degree_percentage | Float | Percentage with graduate-level degrees | 22.42% |
advanced_degree_percentage | Float | Percentage with professional or doctoral degrees | 3.40% |
education_score | Float | Composite education ranking score | 28.76 |
education_rank | Integer | State ranking based on education score (1-50, 1=highest) | 1 |
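
The derived columns in the table can be recomputed from the raw degree counts. The sketch below does this for the Massachusetts example values and is only a consistency check under the definitions given in the Description column; the function name `derive_education_columns` is hypothetical. `advanced_degree_percentage` and `education_score` are left out because the table does not state an unambiguous formula for them.

```python
# Example values copied from the Massachusetts (2023) row of the table.
row = {
    "total_population_25plus": 4_975_152,
    "bachelors_degree": 1_261_847,
    "masters_degree": 788_243,
    "professional_degree": 157_762,
    "doctoral_degree": 169_357,
}

def derive_education_columns(r):
    """Recompute derived counts and percentages from the raw degree counts."""
    # Graduate degrees: master's + professional + doctoral.
    graduate = r["masters_degree"] + r["professional_degree"] + r["doctoral_degree"]
    # College graduates: bachelor's + all graduate degrees.
    college = r["bachelors_degree"] + graduate
    population = r["total_population_25plus"]
    return {
        "college_graduates": college,
        "college_graduate_percentage": round(100 * college / population, 2),
        "graduate_degree_holders": graduate,
        "graduate_degree_percentage": round(100 * graduate / population, 2),
    }

derived = derive_education_columns(row)
print(derived)
```

With these inputs the recomputed counts and percentages agree with the example values in the table. The listed `advanced_degree_percentage` of 3.40% appears to correspond to doctoral degrees alone (169,357 of 4,975,152); professional plus doctoral degrees together would come to roughly 6.58%, so that column's definition is worth checking against the data before use.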
The dataset reveals that Massachusetts consistently ranks #1 in education metrics, with:
- 47.78% of residents aged 25 and over holding a college degree (2023)
- 22.42% holding a graduate degree
- a median household income of $99,858
- an education score of 28.76
The dataset is well suited to identifying premium tech markets and highly educated consumer bases for sophisticated technology products.
This dataset is ideal for data scientists, market researchers, business analysts, and tech companies looking to make data-driven decisions about market entry, customer targeting, and regional strategy.
https://www.icpsr.umich.edu/web/ICPSR/studies/38008/terms
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who do and do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete the Youth Interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Units (PSUs) and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the civilian, noninstitutionalized population at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort. 
At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the civilian, noninstitutionalized population at the time of Wave 7. This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This second replenishment sample was combined for estimation and analysis purposes with Wave 7 adult and youth respondents from the Wave 4 Cohort who were at least age 15 and in the civilian, noninstitutionalized population at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts. Dataset 0001 (DS0001) contains the data from the Public-Use File Master Linkage File (PUF-MLF). This file contains 93 variables and 82,139 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Public-Use Files and Special Collection Public-Use Files. Dataset 0002 (DS0002) contains the data from the Restricted-Use File Master Linkage File (RUF-MLF). This file contains 198 variables and 82,139 cases. The file provides a master list of every person's unique identification number and what type of respondent they were in each wave for data that are available in the Restricted-Use Files, Special Collection Restricted-Use Files, and Biomarker Restricted-Use Files.