The Bureau of the Census has released Census 2000 Summary File 1 (SF1) 100-Percent data. The file includes the following population items: sex, age, race, Hispanic or Latino origin, household relationship, and household and family characteristics. Housing items include occupancy status and tenure (whether the unit is owner or renter occupied). SF1 does not include information on incomes, poverty status, overcrowded housing, or age of housing. These topics will be covered in Summary File 3. Data are available for states, counties, county subdivisions, places, census tracts, block groups, and, where applicable, American Indian and Alaska Native Areas and Hawaiian Home Lands.
The SF1 data are available on the Bureau's web site and may be retrieved from American FactFinder as tables, lists, or maps. Users may also download a set of compressed ASCII files for each state via the Bureau's FTP server. There are over 8,000 data items available for each geographic area. The full listing of these data items is available here as a downloadable compressed database file named TABLES.ZIP. The uncompressed file is in FoxPro database file (dbf) format and may be imported into ACCESS, EXCEL, and other software formats.
While all of this information is useful, the Office of Community Planning and Development has downloaded selected information for all states and areas and is making this information available on the CPD web pages. The tables and data items selected are those used in the CDBG and HOME allocation formulas, plus topics most pertinent to the Comprehensive Housing Affordability Strategy (CHAS), the Consolidated Plan, and similar overall economic and community development plans. The information is contained in five compressed (zipped) dbf tables for each state. When uncompressed, the tables are ready for use with FoxPro, and they can be imported into ACCESS, EXCEL, and other spreadsheet, GIS, and database software. The data are at the block group summary level.
The first two characters of the file name are the state abbreviation. The next two letters are BG for block group. The last part of the file name describes the contents. Each record is labeled with the code and name of the city and county in which it is located so that the data can be summarized to higher-level geography. The GEO file contains standard Census Bureau geographic identifiers for each block group, such as the metropolitan area code and congressional district code. The only data items included in this table are total population and total housing units. POP1 and POP2 contain selected population variables, and selected housing items are in the HU file. The MA05 table is only for use by State CDBG grantees for reporting the racial composition of beneficiaries of Area Benefit activities. The complete package for a state consists of the dictionary file named TABLES and the five data files for the state. The logical record number (LOGRECNO) links the records across tables.
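As a rough illustration of the naming scheme and the LOGRECNO link described above, the following Python sketch builds the expected file names for one state and joins records from two tables. The exact file-name pattern (state abbreviation, then BG, then the content suffix, with a .dbf extension and no separators) is an assumption, as are all field names other than LOGRECNO; the sample records are made up.

```python
# Content suffixes named in the description above.
TABLE_SUFFIXES = ["GEO", "POP1", "POP2", "HU", "MA05"]


def state_file_names(state_abbr):
    """Build the five data-file names for one state.

    Assumes the pattern <state>BG<suffix>.dbf with no separators;
    the real files may differ in case or punctuation.
    """
    return [f"{state_abbr.upper()}BG{suffix}.dbf" for suffix in TABLE_SUFFIXES]


def join_on_logrecno(left_rows, right_rows):
    """Inner-join two lists of dict records on the LOGRECNO field."""
    right_by_key = {row["LOGRECNO"]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = right_by_key.get(row["LOGRECNO"])
        if match is not None:
            # Fields from the right-hand record are added to a copy of the left one.
            joined.append({**row, **match})
    return joined


# Made-up records; PLACE_NAME and OCCUPIED are hypothetical field names.
geo = [
    {"LOGRECNO": "0000101", "PLACE_NAME": "Example City"},
    {"LOGRECNO": "0000102", "PLACE_NAME": "Example City"},
]
hu = [
    {"LOGRECNO": "0000101", "OCCUPIED": 120},
    {"LOGRECNO": "0000102", "OCCUPIED": 95},
]

merged = join_on_logrecno(geo, hu)
```

Once the tables are joined, the city and county labels on each record can serve as grouping keys for summarizing block groups to higher-level geography.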
Added 32,000 more locations. For information on data calculations, please refer to the methodology pdf document. Information on how to calculate the data yourself is also provided, as well as how to buy data for $1.29.
The database contains 32,000 records on US Household Income Statistics & Geo Locations. The field description of the database is documented in the attached pdf file. To access all 348,893 records at a scale roughly equivalent to a neighborhood (census tract), see the link below and make sure to upvote. Enjoy!
The dataset was originally developed for real estate and business investment research. Income is a vital element when determining both quality and socioeconomic features of a given geographic location. The following data was derived from over 36,000 files and covers 348,893 location records.
Only proper citing is required; please see the documentation for details. Have fun!
Golden Oak Research Group, LLC. “U.S. Income Database Kaggle”. Publication: 5, August 2017. Accessed, day, month year.
2011-2015 ACS 5-Year Documentation was provided by the U.S. Census Reports. Retrieved August 2, 2017, from https://www2.census.gov/programs-surveys/acs/summary_file/2015/data/5_year_by_state/
Please tell us so we may provide you the most accurate data possible. You may reach us at: research_development@goldenoakresearch.com
For any questions, you can reach me at 585-626-2965.
Please note: it is my personal number, and email is preferred.
Check our data's accuracy: Census Fact Checker
Don't settle. Go big and win big. Optimize your potential. Overcome limitation and outperform expectation. Access all household income records on a scale roughly equivalent to a neighborhood, see link below:
Website: Golden Oak Research. Kaggle deals: all databases $1.29, limited time only.
A small startup with big dreams, giving the everyday, up-and-coming data scientist professional-grade data at affordable prices. It's what we do.
This dataset and map service provides information on the U.S. Department of Housing and Urban Development's (HUD) low to moderate income areas. The term Low to Moderate Income, often referred to as low-mod, has a specific programmatic context within the Community Development Block Grant (CDBG) program. Over a 1-, 2-, or 3-year period, as selected by the grantee, not less than 70 percent of CDBG funds must be used for activities that benefit low- and moderate-income persons. HUD uses special tabulations of Census data to determine areas where at least 51% of households have incomes at or below 80% of the area median income (AMI). This dataset and map service contains the following layer.
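The 51% test described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not HUD's tabulation method: the function and parameter names are hypothetical, and the counts are assumed to come from a source that already identifies which households are at or below 80% of AMI.

```python
LOW_MOD_PERCENT = 51  # "at least 51%" from the description above


def is_low_mod_area(low_mod_count, total_count):
    """Return True when the low- and moderate-income share is at least 51%.

    Integer cross-multiplication avoids floating-point edge cases
    exactly at the 51% boundary.
    """
    if total_count <= 0:
        return False  # no basis for a share; treated as not qualifying in this sketch
    return low_mod_count * 100 >= LOW_MOD_PERCENT * total_count
```

For example, an area with 51 qualifying households out of 100 passes the test, while one with 50 out of 100 does not.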
A. SUMMARY The Department of Public Health and the Mayor’s Office of Housing and Community Development, with support from the Planning Department, created these 41 neighborhoods by grouping 2010 Census tracts, using common real estate and residents’ definitions for the purpose of providing consistency in the analysis and reporting of socio-economic, demographic, and environmental data, and data on City-funded programs and services. These neighborhoods are not codified in Planning Code nor Administrative Code, although this map is referenced in Planning Code Section 415 as the “American Community Survey Neighborhood Profile Boundaries Map.” Note: These are NOT statistical boundaries, as they are not controlled for population size. This is also NOT an official map of neighborhood boundaries in SF but an aggregation of Census tracts, and it should be used in conjunction with other spatial boundaries for decision making.
B. HOW THE DATASET IS CREATED This dataset is produced by assigning Census tracts to neighborhoods based on existing neighborhood definitions used by Planning and MOHCD. A qualitative assessment is made to identify the appropriate neighborhood for a given tract based on understanding of population distribution and significant landmarks. Once all tracts have been assigned a neighborhood, the tracts are dissolved to produce this dataset, Analysis Neighborhoods.
C. UPDATE PROCESS This dataset is static. Changes to the analysis neighborhood boundaries will be evaluated as needed by the Analysis Neighborhood working group, which is led by DataSF and the Planning department and includes staff from various other city departments. Contact us for any questions.
D. HOW TO USE THIS DATASET Downloading this dataset and opening it in Excel may cause some of the data values to be lost or not display properly (particularly the Analysis Neighborhood column). For a simple list of Analysis Neighborhoods without geographic coordinates, click here: https://data.sfgov.org/resource/xfcw-9evu.csv?$select=nhood
E. RELATED DATASETS 2020 Census tracts assigned a neighborhood; 2010 Census tracts assigned a neighborhood
Upvote! The database contains more than 40,000 records on US Gross Rent & Geo Locations. The field description of the database is documented in the attached pdf file. To access all 325,272 records at a scale roughly equivalent to a neighborhood (census tract), see the link below and make sure to upvote. Enjoy!
Get the full database free with coupon code FreeDatabase. See directions at the bottom of the description, and make sure to upvote. The coupon ends at 2:00 pm on 8-23-2017.
The data set was originally developed for real estate and business investment research. Income is a vital element when determining both quality and socioeconomic features of a given geographic location. The following data was derived from over 36,000 files and covers 348,893 location records.
Only proper citing is required; please see the documentation for details. Have fun!
Golden Oak Research Group, LLC. “U.S. Income Database Kaggle”. Publication: 5, August 2017. Accessed, day, month year.
For any questions, you may reach us at research_development@goldenoakresearch.com. For immediate assistance, you may reach me at 585-626-2965.
Please note: it is my personal number, and email is preferred.
Check our data's accuracy: Census Fact Checker
Don't settle. Go big and win big. Optimize your potential. Access all gross rent records and more on a scale roughly equivalent to a neighborhood, see link below:
A small startup with big dreams, giving the everyday, up-and-coming data scientist professional-grade data at affordable prices. It's what we do.
A. SUMMARY This dataset maps 2020 census tracts to Analysis Neighborhoods. The Department of Public Health and the Mayor’s Office of Housing and Community Development, with support from the Planning Department, originally created the 41 Analysis Neighborhoods by grouping 2010 Census tracts, using common real estate and residents’ definitions for the purpose of providing consistency in the analysis and reporting of socio-economic, demographic, and environmental data, and data on City-funded programs and services. They are not codified in Planning Code nor Administrative Code.
B. HOW THE DATASET IS CREATED This dataset is produced by mapping the 2020 Census tracts to Analysis Neighborhoods.
C. UPDATE PROCESS This dataset is static. Changes to the census tract boundaries are tracked in multiple datasets. See here for the 2010 census tracts assigned to neighborhoods.
D. HOW TO USE THIS DATASET This boundary file can be joined to other census datasets on GEOID, which is the primary key for census tracts in the dataset.
E. RELATED DATASET 2020 census tract boundaries for San Francisco can be found here.
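The GEOID join described above can be sketched as follows. This is a minimal Python illustration, assuming the tract-to-neighborhood mapping and the tract-level values have already been loaded into dictionaries keyed by GEOID; the function name, the GEOID values, and the neighborhood labels are placeholders, not rows from the dataset.

```python
from collections import defaultdict


def sum_by_neighborhood(tract_values, geoid_to_neighborhood):
    """Aggregate tract-level values to neighborhoods via a GEOID lookup."""
    totals = defaultdict(int)
    for geoid, value in tract_values.items():
        neighborhood = geoid_to_neighborhood.get(geoid)
        if neighborhood is None:
            continue  # tract missing from the mapping; skipped in this sketch
        totals[neighborhood] += value
    return dict(totals)


# Placeholder GEOIDs and neighborhood labels, for illustration only.
geoid_to_neighborhood = {
    "GEOID_A": "Neighborhood 1",
    "GEOID_B": "Neighborhood 1",
    "GEOID_C": "Neighborhood 2",
}
tract_population = {"GEOID_A": 3200, "GEOID_B": 4100, "GEOID_C": 2800}

neighborhood_population = sum_by_neighborhood(tract_population, geoid_to_neighborhood)
```

Keeping GEOID as a string rather than a number preserves any leading zeros, which matters when matching keys across census files.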
2020 Census data for the city of Boston, Boston neighborhoods, census tracts, block groups, and voting districts. In the 2020 Census, the U.S. Census Bureau divided Boston into 207 census tracts (~4,000 residents) made up of 581 smaller block groups. The Boston Planning and Development Agency uses the 2020 tracts to approximate Boston neighborhoods. The 2020 Census Redistricting data also identify Boston’s voting districts.
For analysis of Boston’s 2020 Census data including graphs and maps by the BPDA Research Division and Office of Digital Cartography and GIS, see 2020 Census Research Publications
For a complete official data dictionary, please go to 2020 Census State Redistricting Data (Public Law 94-171) Summary File, Chapter 6. Data Dictionary.
2020 Census Block Groups In Boston
Boston Neighborhood Boundaries Approximated By 2020 Census Tracts
Our Realtor.com (Multiple Listing Service) dataset represents one of the most exhaustive collections of real estate data available to the industry. It consolidates data from over 500 MLS aggregators across various regions, providing an unparalleled view of the property market.
Features:
Property Listings: Each listing provides comprehensive details about a property. This includes its physical address, number of bedrooms and bathrooms, square footage, lot size, type of property (e.g., single-family home, condo, townhome), and more.
Photographs and Virtual Tours: Visuals are crucial in the property market. Most listings are accompanied by high-quality photographs and, in many cases, virtual or 3D tours that allow potential buyers to explore properties remotely.
Pricing Information: Listings provide asking prices, and the dataset frequently updates to reflect price changes. Historical price data, which includes initial listing prices and any subsequent reductions or increases, is also available.
Transaction Histories: For sold properties, the dataset provides information about the date of sale, the sale price, and any discrepancies between the listing and sale prices.
Agent and Broker Information: Each listing typically has associated details about the property's real estate professional. This might include their name, contact details, and affiliated brokerage.
Open House Schedules: Open house dates and times are listed for properties that are actively being shown to potential buyers.
Market Trends: By analyzing the dataset over time, one can glean insights into market dynamics, such as the rate of price appreciation or depreciation in certain areas, the average time properties stay on the market, and seasonality effects.
Neighborhood Data: With comprehensive geographical data, it becomes possible to understand neighborhood-specific trends. This is invaluable for potential buyers or real estate investors looking to identify burgeoning markets.
Price Comparisons: Realtors and potential buyers can benchmark properties against similar listings in the same area to determine if a property is priced appropriately.
For Industry Professionals and Analysts: Beyond buyers and sellers, the dataset is a trove of information for real estate agents, brokers, analysts, and investors. They can harness this data to craft strategies, predict market movements, and serve their clients better.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Analysis Neighborhoods’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/75b22a4b-6038-4d4e-b203-88c3e3fe61f5 on 12 February 2022.
--- Dataset description provided by original source is as follows ---
The Department of Public Health and the Mayor’s Office of Housing and Community Development, with support from the Planning Department, created these 41 neighborhoods by grouping 2010 Census tracts, using common real estate and residents’ definitions for the purpose of providing consistency in the analysis and reporting of socio-economic, demographic, and environmental data, and data on City-funded programs and services. These neighborhoods are not codified in Planning Code nor Administrative Code, although this map is referenced in Planning Code Section 415 as the “American Community Survey Neighborhood Profile Boundaries Map.”
This dataset is produced by assigning Census tracts to neighborhoods based on existing neighborhood definitions used by Planning and MOHCD. A qualitative assessment is made to identify the appropriate neighborhood for a given tract based on understanding of population distribution and significant landmarks. Once all tracts have been assigned a neighborhood, the tracts are dissolved to produce this dataset, Analysis Neighborhoods. Its companion dataset of all Census tracts assigned a neighborhood is available here: https://data.sfgov.org/d/bwbp-wk3r
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the House population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of House. The dataset can be used to understand the population distribution of House by age. For example, using this dataset, we can identify the largest age group in House.
Key observations
The largest age group in House, NM was 60 to 64 years, with a population of 16 (34.04%), according to the ACS 2019-2023 5-Year Estimates. At the same time, the smallest age group in House, NM was Under 5 years, with a population of 0 (0%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates
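The shares quoted above can be reproduced with a short calculation. The Python sketch below assumes a total population of 47 for House, NM; that figure is inferred here from 16 residents being 34.04% of the total and is not read from the dataset. The function name is illustrative.

```python
def percent_of_total(count, total):
    """Return a group's share of the total population, as a percentage rounded to 2 places."""
    if total == 0:
        return 0.0
    return round(100 * count / total, 2)


# Total of 47 is inferred from the quoted figures, not taken from the dataset.
INFERRED_TOTAL = 47

largest_share = percent_of_total(16, INFERRED_TOTAL)   # 60 to 64 years
smallest_share = percent_of_total(0, INFERRED_TOTAL)   # Under 5 years
```

With populations this small, a single resident moves a group's share by more than two percentage points, which is one reason the margin of error matters for these estimates.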
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for House Population by Age. You can refer to it here
A. SUMMARY This dataset includes COVID-19 tests by resident neighborhood and specimen collection date (the day the test was collected). Specifically, this dataset includes tests of San Francisco residents who listed a San Francisco home address at the time of testing. These resident addresses were then geo-located and mapped to neighborhoods. The resident address associated with each test is hand-entered and susceptible to errors; therefore, neighborhood data should be interpreted as an approximation, not a precise or comprehensive total.
In recent months, about 5% of tests are missing addresses and therefore cannot be included in any neighborhood totals. In earlier months, more tests were missing address data. Because of this high percentage of tests missing resident address data, the neighborhood testing data for March, April, and May should be interpreted with caution (see below).
Percentage of tests missing address information, by month in 2020:
Mar - 33.6%
Apr - 25.9%
May - 11.1%
Jun - 7.2%
Jul - 5.8%
Aug - 5.4%
Sep - 5.1%
Oct (Oct 1-12) - 5.1%
To protect the privacy of residents, the City does not disclose the number of tests in neighborhoods with resident populations of fewer than 1,000 people. These neighborhoods are omitted from the data (they include Golden Gate Park, John McLaren Park, and Lands End).
Tests for residents that listed a Skilled Nursing Facility as their home address are not included in this neighborhood-level testing data. Skilled Nursing Facilities have required and repeated testing of residents, which would change neighborhood trends and not reflect the broader neighborhood's testing data.
This data was de-duplicated by individual and date, so if a person gets tested multiple times on different dates, all tests will be included in this dataset (on the day each test was collected).
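De-duplication by individual and date, as described above, might look like the following Python sketch. It is an illustration only, not the City's pipeline: `person_id` is a hypothetical field name, and keeping the first record seen for each pair is an assumption.

```python
def deduplicate_tests(tests):
    """Keep one record per (person, specimen collection date) pair.

    Tests for the same person on different dates are all retained.
    """
    seen = set()
    unique = []
    for test in tests:
        key = (test["person_id"], test["specimen_collection_date"])
        if key not in seen:
            seen.add(key)
            unique.append(test)  # first record for this pair is kept
    return unique


# Made-up records for illustration.
sample_tests = [
    {"person_id": "p1", "specimen_collection_date": "2020-06-01"},
    {"person_id": "p1", "specimen_collection_date": "2020-06-01"},  # same day: dropped
    {"person_id": "p1", "specimen_collection_date": "2020-06-15"},  # new date: kept
    {"person_id": "p2", "specimen_collection_date": "2020-06-01"},
]

unique_tests = deduplicate_tests(sample_tests)
```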
The total number of positive test results is not equal to the total number of COVID-19 cases in San Francisco. During this investigation, some test results are found to be for persons living outside of San Francisco and some people in San Francisco may be tested multiple times (which is common). To see the number of new confirmed cases by neighborhood, reference this map: https://sf.gov/data/covid-19-case-maps#new-cases-maps
B. HOW THE DATASET IS CREATED COVID-19 laboratory test data is based on electronic laboratory test reports. Deduplication, quality assurance measures and other data verification processes maximize accuracy of laboratory test information. All testing data is then geo-coded by resident address. Then data is aggregated by analysis neighborhood and specimen collection date.
Data are prepared by close of business Monday through Saturday for public display.
C. UPDATE PROCESS Updates automatically at 05:00 Pacific Time each day. Redundant runs are scheduled at 07:00 and 09:00 in case of pipeline failure.
D. HOW TO USE THIS DATASET San Francisco population estimates for geographic regions can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).
Due to the high degree of variation in the time different labs need to complete tests, there is a delay in this reporting. On March 24 the Health Officer ordered all labs in the City to report complete COVID-19 testing information to the local and state health departments.
In order to track trends over time, a data user can analyze this data by "specimen_collection_date".
Calculating Percent Positivity: The positivity rate is the percentage of tests that return a positive result for COVID-19 (positive tests divided by the sum of positive and negative tests). Indeterminate results, which could not conclusively determine whether COVID-19 virus was present, are not included in the calculation of percent positivity.
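The positivity formula stated above (positive tests divided by the sum of positive and negative tests, with indeterminate results left out) can be written as a small Python function. The function and parameter names are illustrative, and returning None when there are no conclusive tests is a choice made for this sketch.

```python
def percent_positivity(positive, negative):
    """Percent of conclusive tests that were positive.

    Indeterminate results are excluded simply by not being passed in.
    """
    conclusive = positive + negative
    if conclusive == 0:
        return None  # rate is undefined when there are no conclusive tests
    return 100 * positive / conclusive
```

For example, 5 positive and 95 negative tests give a positivity of 5.0 percent, regardless of how many indeterminate results were recorded.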
The Decennial Census provides population estimates and demographic information on residents of the United States.
The Census Summary Files contain detailed tables on responses to the decennial census. Data tables in Summary File 1 provide information on population and housing characteristics, including cross-tabulations of age, sex, households, families, relationship to householder, housing units, detailed race and Hispanic or Latino origin groups, and group quarters for the total population. Summary File 2 contains data tables on population and housing characteristics as reported by housing unit.
Researchers at NYU Langone Health can find guidance for the use and analysis of Census Bureau data on the Population Health Data Hub (listed under "Other Resources"), which is accessible only through the intranet portal with a valid Kerberos ID (KID).
https://crawlfeeds.com/privacy_policy
Our dataset features comprehensive housing market data, extracted from 250,000 records sourced directly from Redfin USA. Our Crawl Feeds team utilized proprietary in-house tools to meticulously scrape and compile this valuable data.
Key Benefits of Our Housing Market Data:
Unlock the Power of Redfin Data for Real Estate Professionals
Leveraging our Redfin properties dataset allows real estate professionals to make data-driven decisions. With detailed insights into property listings, sales history, and pricing trends, agents and investors can identify opportunities in the market more effectively. The data is particularly useful for comparing neighborhood trends, understanding market demand, and making informed investment decisions.
Enhance Your Real Estate Research with Custom Filters and Analysis
Our Redfin dataset is not only extensive but also customizable, allowing users to apply filters based on specific criteria such as property type, listing status, and geographic location. This flexibility enables researchers and analysts to drill down into the data, uncovering patterns and insights that can guide strategic planning and market entry decisions. Whether you're tracking the performance of single-family homes or exploring multi-family property trends, this dataset offers the depth and accuracy needed for thorough analysis.
This data collection contains 132 Public Use Microdata Samples (PUMS) files from the 1970 Census of Population and Housing. Information is provided in these files on the housing unit, such as occupancy and vacancy status of house, tenure, value of property, commercial use, year structure was built, number of rooms, availability of plumbing facilities, sewage disposal, bathtub or shower, complete kitchen facilities, flush toilet, water, telephone, and air conditioning. Data are also provided on household characteristics such as the number of persons aged 18 years and younger in the household, the presence of roomers, boarders, or lodgers, the presence of other nonrelative and of relative other than wife or child of head of household, the number of persons per room, the rent paid for unit, and the number of persons with Spanish surnames. Other demographic variables provide information on age, race, marital status, place of birth, state of birth, Puerto Rican heritage, citizenship, education, occupation, employment status, size of family, farm earnings, and family income. This hierarchical data collection contains approximately 214 variables for the 15-percent sample, 227 variables for the 5-percent sample, and 117 variables for the neighborhood characteristics sample. (Source: downloaded from ICPSR 7/13/10)
Please Note: This dataset is part of the historical CISER Data Archive Collection and is also available at ICPSR at https://doi.org/10.3886/ICPSR00018.v1. We highly recommend using the ICPSR version as they may make this dataset available in multiple data formats in the future.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Red House town population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of Red House town. The dataset can be used to understand the population distribution of Red House town by age. For example, using this dataset, we can identify the largest age group in Red House town.
Key observations
The largest age group in Red House, New York was 40-44 years, with a population of 8 (27.59%), according to the 2021 American Community Survey. At the same time, the smallest age group in Red House, New York was 5-9 years, with a population of 0 (0.00%). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Red House town Population by Age. You can refer to it here
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data on relationship to householder were derived from answers to Question 2 in the 2015 American Community Survey (ACS), which was asked of all people in housing units. The question on relationship is essential for classifying the population information on families and other groups. Information about changes in the composition of the American family, from the number of people living alone to the number of children living with only one parent, is essential for planning and carrying out a number of federal programs.
The responses to this question were used to determine the relationships of all persons to the householder, as well as household type (married couple family, nonfamily, etc.). From responses to this question, we were able to determine numbers of related children, own children, unmarried partner households, and multi-generational households. We calculated average household and family size. When relationship was not reported, it was imputed using the age difference between the householder and the person, sex, and marital status.
Household – A household includes all the people who occupy a housing unit. (People not living in households are classified as living in group quarters.) A housing unit is a house, an apartment, a mobile home, a group of rooms, or a single room that is occupied (or if vacant, is intended for occupancy) as separate living quarters. Separate living quarters are those in which the occupants live separately from any other people in the building and which have direct access from the outside of the building or through a common hall. The occupants may be a single family, one person living alone, two or more families living together, or any other group of related or unrelated people who share living arrangements.
Average Household Size – A measure obtained by dividing the number of people in households by the number of households. In cases where people in households are cross-classified by race or Hispanic origin, people in the household are classified by the race or Hispanic origin of the householder rather than the race or Hispanic origin of each individual.
Average household size is rounded to the nearest hundredth.
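The definition above translates directly into a short calculation. The Python sketch below is illustrative only: the function name is hypothetical, and it uses Python's built-in round, whose handling of exact ties may differ from the Census Bureau's rounding procedure.

```python
def average_household_size(people_in_households, households):
    """People in households divided by households, rounded to the nearest hundredth."""
    if households == 0:
        return None  # undefined when there are no households
    return round(people_in_households / households, 2)
```

Note that the numerator counts only people living in households, so people in group quarters are excluded before dividing.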
Comparability – The relationship categories for the most part can be compared to previous ACS years and to similar data collected in the decennial census, CPS, and SIPP. With the change in 2008 from “In-law” to the two categories of “Parent-in-law” and “Son-in-law or daughter-in-law,” caution should be exercised when comparing data on in-laws from previous years. “In-law” encompassed any type of in-law such as sister-in-law. Combining “Parent-in-law” and “son-in-law or daughter-in-law” does not represent all “in-laws” in 2008.
The same can be said of comparing the three categories of “biological,” “step,” and “adopted” child in 2008 to “Child” in previous years. Before 2008, respondents may have considered anyone under 18 as “child” and chosen that category. The ACS includes “foster child” as a category. However, the 2010 Census did not contain this category, and “foster children” were included in the “Other nonrelative” category. Therefore, “foster child” cannot be compared to the 2010 Census. Beginning in 2013, the “spouse” category includes same-sex spouses.
The U.S. Department of Housing and Urban Development (HUD) periodically receives "custom tabulations" of Census data from the U.S. Census Bureau that are largely not available through standard Census products. These datasets, known as "CHAS" (Comprehensive Housing Affordability Strategy) data, demonstrate the extent of housing problems and housing needs, particularly for low income households. The primary purpose of CHAS data is to demonstrate the number of households in need of housing assistance. This is estimated by the number of households that have certain housing problems and have income low enough to qualify for HUD’s programs (primarily 30, 50, and 80 percent of median income). CHAS data provides counts of the numbers of households that fit these HUD-specified characteristics in a variety of geographic areas. In addition to estimating low-income housing needs, CHAS data contributes to a more comprehensive market analysis by documenting issues like lead paint risks, "affordability mismatch," and the interaction of affordability with variables like age of homes, number of bedrooms, and type of building.
This dataset is a special tabulation of the 2016-2020 American Community Survey (ACS) and reflects conditions over that time period. The dataset uses custom HUD Area Median Family Income (HAMFI) figures calculated by HUD PDR staff based on 2016-2020 ACS income data. CHAS datasets are used by Federal, State, and Local governments to plan how to spend and distribute HUD program funds.
To learn more about the Comprehensive Housing Affordability Strategy (CHAS), visit: https://www.huduser.gov/portal/datasets/cp.html. For questions about the spatial attribution of this dataset, please reach out to us at GISHelpdesk@hud.gov. To learn more about the American Community Survey (ACS) and associated datasets, visit: https://www.census.gov/programs-surveys/acs
Data Dictionary: DD_ACS 5-Year CHAS Estimate Data by State
Date of Coverage: 2016-2020
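To illustrate how the 30, 50, and 80 percent thresholds mentioned above partition households, the following Python sketch assigns a household to an income band relative to HAMFI. It is a simplified illustration, not HUD's tabulation logic: the function name and band labels are hypothetical, treating each threshold as inclusive is an assumption, and real CHAS limits also depend on factors such as household size that are not modeled here.

```python
THRESHOLDS_PERCENT = (30, 50, 80)  # thresholds named in the description above


def income_band(household_income, hamfi):
    """Classify income relative to HAMFI using the 30/50/80 percent cutoffs.

    Integer cross-multiplication keeps boundary cases exact.
    Cutoffs are treated as inclusive, which is an assumption of this sketch.
    """
    if hamfi <= 0:
        raise ValueError("hamfi must be positive")
    for limit in THRESHOLDS_PERCENT:
        if household_income * 100 <= limit * hamfi:
            return f"at or below {limit}% of HAMFI"
    return "above 80% of HAMFI"
```

Counting households in each band, cross-classified by whether they report a housing problem, would approximate the kind of table the description refers to.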
This dataset is a snapshot from October 2022 of all 48 homes in a section of a neighborhood near a large university in Central Florida. All of the homes are single-family homes featuring a garage, a driveway, and a fenced-in backyard. Data was gathered by hand (keyboard) from a collection of sites, including Zillow, Realtor, Redfin, Trulia, and Orange County Property Appraiser. All homes were built in the same year in the early 2000s and feature central air and all other utilities typical of contemporary suburban homes in the United States. Because the area is close to a university, a large portion of renters are college students and young professionals, alongside families and older adults.
There are 30 columns:
Note that while the dataset is exhaustive in that it has all of the houses, some homes are missing some columns, typically because a site did not provide an estimate for a home or, in one case, because the home was not found on the property appraiser's site. The dataset is also not a random sample, so the only population of homes it supports inference about is the one within this specific portion of the neighborhood. Personally, I am going to use the dataset to practice a couple of aspects of real-world data: Cleaning, Imputing, and Exploratory Data Analysis. Mainly, I want to compare different approaches to filling in the missing values of the dataset, then do some Model Building with some additional Dimensionality Reduction.
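To make the idea of comparing imputation approaches concrete, here is a minimal sketch that fills missing values in one numeric column with either the mean or the median of the observed values. The function name and the sample values are hypothetical and are not drawn from the dataset; real work would likely use a data-frame library and more careful methods.

```python
from statistics import mean, median

# Supported fill strategies; both operate only on the observed values.
STRATEGIES = {"mean": mean, "median": median}


def impute(values, strategy="mean"):
    """Replace None entries with the mean or median of the observed entries."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = STRATEGIES[strategy](observed)
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    # Made-up estimates with one gap and one unusually high value.
    estimates = [310_000.0, None, 325_000.0, 480_000.0]
    print(impute(estimates, "mean"))
    print(impute(estimates, "median"))
```

The median fill is less sensitive to a single unusually high value, which is one reason the two strategies can lead to different downstream results on a small dataset.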
The American Community Survey (ACS) is an ongoing survey that provides vital information on a yearly basis about our nation and its people by contacting over 3.5 million households across the country. The resulting data provides detailed demographic information across the US, aggregated at various geographic levels, which helps determine how more than $675 billion in federal and state funding is distributed each year. Businesses use ACS data to inform strategic decision-making. ACS data can be used as a component of market research, to locate concentrations of potential employees with a specific education or occupation, and to identify communities that could be good places to build offices or facilities. For example, someone scouting a new location for an assisted-living center might look for an area with a large proportion of seniors and a large proportion of people employed in nursing occupations. Through the ACS, we know more about jobs and occupations, educational attainment, veterans, whether people own or rent their homes, and other topics. Public officials, planners, and entrepreneurs use this information to assess the past and plan the future. For more information, see the Census Bureau's ACS Information Guide. This public dataset is hosted in Google BigQuery as part of the Google Cloud Public Datasets Program, with Carto providing cleaning and onboarding support. It is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. What is BigQuery.
The U.S. Department of Housing and Urban Development (HUD) periodically receives "custom tabulations" of Census data from the U.S. Census Bureau that are largely not available through standard Census products. These datasets, known as "CHAS" (Comprehensive Housing Affordability Strategy) data, demonstrate the extent of housing problems and housing needs, particularly for low-income households. The primary purpose of CHAS data is to demonstrate the number of households in need of housing assistance. This is estimated by the number of households that have certain housing problems and have income low enough to qualify for HUD’s programs (primarily 30, 50, and 80 percent of median income). CHAS data provides counts of the households that fit these HUD-specified characteristics in a variety of geographic areas. In addition to estimating low-income housing needs, CHAS data contributes to a more comprehensive market analysis by documenting issues like lead paint risks, "affordability mismatch," and the interaction of affordability with variables like age of homes, number of bedrooms, and type of building. This dataset is a special tabulation of the 2016-2020 American Community Survey (ACS) and reflects conditions over that time period. The dataset uses custom HUD Area Median Family Income (HAMFI) figures calculated by HUD PDR staff based on 2016-2020 ACS income data. CHAS datasets are used by Federal, State, and Local governments to plan how to spend and distribute HUD program funds. To learn more about the Comprehensive Housing Affordability Strategy (CHAS), visit: https://www.huduser.gov/portal/datasets/cp.html. For questions about the spatial attribution of this dataset, please reach out to us at GISHelpdesk@hud.gov. To learn more about the American Community Survey (ACS) and associated datasets, visit: https://www.census.gov/programs-surveys/acs. Data Dictionary: DD_ACS 5-Year CHAS Estimate Data by Tract. Date of Coverage: 2016-2020.
The Bureau of the Census has released Census 2000 Summary File 1 (SF1) 100-Percent data. The file includes the following population items: sex, age, race, Hispanic or Latino origin, household relationship, and household and family characteristics. Housing items include occupancy status and tenure (whether the unit is owner or renter occupied). SF1 does not include information on incomes, poverty status, overcrowded housing, or age of housing. These topics will be covered in Summary File 3. Data are available for states, counties, county subdivisions, places, census tracts, block groups, and, where applicable, American Indian and Alaskan Native Areas and Hawaiian Home Lands. The SF1 data are available on the Bureau's web site and may be retrieved from American FactFinder as tables, lists, or maps. Users may also download a set of compressed ASCII files for each state via the Bureau's FTP server. There are over 8,000 data items available for each geographic area. The full listing of these data items is available here as a downloadable compressed data base file named TABLES.ZIP. The uncompressed file is in FoxPro data base file (dbf) format and may be imported into ACCESS, EXCEL, and other software formats. While all of this information is useful, the Office of Community Planning and Development has downloaded selected information for all states and areas and is making this information available on the CPD web pages. The tables and data items selected are those used in the CDBG and HOME allocation formulas, plus topics most pertinent to the Comprehensive Housing Affordability Strategy (CHAS), the Consolidated Plan, and similar overall economic and community development plans. The information is contained in five compressed (zipped) dbf tables for each state. When uncompressed, the tables are ready for use with FoxPro, and they can be imported into ACCESS, EXCEL, and other spreadsheet, GIS, and database software. The data are at the block group summary level.
The first two characters of the file name are the state abbreviation. The next two letters are BG for block group. The last part of the file name describes the contents. Each record is labeled with the code and name of the city and county in which it is located so that the data can be summarized to higher-level geography. The GEO file contains standard Census Bureau geographic identifiers for each block group, such as the metropolitan area code and congressional district code. The only data included in this table are total population and total housing units. POP1 and POP2 contain selected population variables, and selected housing items are in the HU file. The MA05 table data is only for use by State CDBG grantees for the reporting of the racial composition of beneficiaries of Area Benefit activities. The complete package for a state consists of the dictionary file named TABLES and the five data files for the state. The logical record number (LOGRECNO) links the records across tables.
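The naming convention and the LOGRECNO link described above can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes the three parts of the file name are concatenated with no separator (for example, a hypothetical "ALBGPOP1.dbf"), it assumes records have already been read into dictionaries, and every column name other than LOGRECNO is invented for the example.

```python
from pathlib import PurePath

# Content suffixes named in the description above.
CONTENTS = {"GEO", "POP1", "POP2", "HU", "MA05"}


def parse_file_name(file_name):
    """Split a name such as 'ALBGPOP1.dbf' into state, level, and contents."""
    stem = PurePath(file_name).stem.upper()
    state, level, contents = stem[:2], stem[2:4], stem[4:]
    if level != "BG" or contents not in CONTENTS:
        raise ValueError(f"unexpected file name: {file_name!r}")
    return {"state": state, "level": level, "contents": contents}


def join_on_logrecno(*tables):
    """Merge rows from several tables that share the same LOGRECNO."""
    merged = {}
    for table in tables:
        for row in table:
            merged.setdefault(row["LOGRECNO"], {}).update(row)
    return [merged[key] for key in sorted(merged)]


if __name__ == "__main__":
    print(parse_file_name("ALBGPOP1.dbf"))
    # Hypothetical rows; TOTPOP and OWNER_OCC are made-up column names.
    geo = [{"LOGRECNO": "0000001", "TOTPOP": 100}]
    hu = [{"LOGRECNO": "0000001", "OWNER_OCC": 30}]
    print(join_on_logrecno(geo, hu))
```

Once the tables are joined this way, the city and county labels on each record could serve as grouping keys for summarizing block groups to higher-level geography.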