100+ datasets found
  1. California Housing Prices Dataset

    • paperswithcode.com
    Updated Sep 19, 2024
    Cite
    (2024). California Housing Prices Dataset [Dataset]. https://paperswithcode.com/dataset/california-housing-prices
    Explore at:
    Dataset updated
    Sep 19, 2024
    Area covered
    California
    Description

    Median house prices for California districts derived from the 1990 census.

    About Dataset

    Context

    This is the dataset used in the second chapter of Aurélien Géron's recent book 'Hands-On Machine Learning with Scikit-Learn and TensorFlow'. It serves as an excellent introduction to implementing machine learning algorithms because it requires rudimentary data cleaning, has an easily understandable list of variables, and sits at an optimal size between being too toyish and too cumbersome.

    The data contains information from the 1990 California census. So although it may not help you with predicting current housing prices like the Zillow Zestimate dataset, it does provide an accessible introductory dataset for teaching people about the basics of machine learning.

    Content

    The data pertains to the houses found in a given California district and some summary stats about them based on the 1990 census data. Be warned the data aren't cleaned so there are some preprocessing steps required! The columns are as follows, their names are pretty self-explanatory:

    • longitude
    • latitude
    • housing_median_age
    • total_rooms
    • total_bedrooms
    • population
    • households
    • median_income
    • median_house_value
    • ocean_proximity

    Acknowledgements

    This data was initially featured in the following paper: Pace, R. Kelley, and Ronald Barry. "Sparse spatial autoregressions." Statistics & Probability Letters 33.3 (1997): 291-297.

    I encountered it in 'Hands-On Machine Learning with Scikit-Learn and TensorFlow' by Aurélien Géron, who wrote: This dataset is a modified version of the California Housing dataset available from Luís Torgo's page (University of Porto).

    Inspiration

    See my kernel on machine learning basics in R using this dataset, or venture over to the following link for a Python-based introductory tutorial: https://github.com/ageron/handson-ml/tree/master/datasets/housing

  2. QuickFacts: California City city, California

    • census.gov
    • shutdown.census.gov
    csv
    Updated Jul 1, 2024
    + more versions
    Cite
    United States Census Bureau > Communications Directorate - Center for New Media and Promotion (2024). QuickFacts: California City city, California [Dataset]. https://www.census.gov/quickfacts/fact/map/californiacitycitycalifornia/RHI225222
    Explore at:
    Available download formats: csv
    Dataset updated
    Jul 1, 2024
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    United States Census Bureau > Communications Directorate - Center for New Media and Promotion
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    California City, California
    Description

    U.S. Census Bureau QuickFacts statistics for California City city, California. QuickFacts data are derived from: Population Estimates, American Community Survey, Census of Population and Housing, Current Population Survey, Small Area Health Insurance Estimates, Small Area Income and Poverty Estimates, State and County Housing Unit Estimates, County Business Patterns, Nonemployer Statistics, Economic Census, Survey of Business Owners, Building Permits.

  3. Census of Population and Housing, 1970 [California]: Summary Statistic File...

    • icpsr.umich.edu
    ascii
    Updated Jan 18, 2006
    Cite
    United States. Bureau of the Census (2006). Census of Population and Housing, 1970 [California]: Summary Statistic File 4A: Population and Housing [Fourth Count] [Dataset]. http://doi.org/10.3886/ICPSR06712.v1
    Explore at:
    Available download formats: ascii
    Dataset updated
    Jan 18, 2006
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    United States. Bureau of the Census
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/6712/terms

    Time period covered
    1970
    Area covered
    United States, California
    Description

    This collection comprises census tract-level data for California from the 1970 Census. The data contain 20-, 15-, and 5-percent sample population and housing characteristics including education, occupation, income, citizenship, vocational training, and household equipment and facilities.

  4. Census of Population and Housing 1970, [San Diego, California]: Summary...

    • dataverse.harvard.edu
    bin, html, pdf, xls +1
    Updated Feb 11, 2014
    + more versions
    Cite
    Harvard Dataverse (2014). Census of Population and Housing 1970, [San Diego, California]: Summary Statistic File 4A: Housing [Dataset]. http://doi.org/10.7910/DVN/NQ0X7A
    Explore at:
    Available download formats: bin(948097), bin(966459), bin(1322836), xls(18520576), bin(289496), bin(774418), bin(837149), zip(94428351), bin(168062), bin(345741), bin(240675), bin(134229), pdf(10733824), bin(134655), html(737122), bin(780843)
    Dataset updated
    Feb 11, 2014
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    1970
    Area covered
    San Diego, California, United States
    Description

    This data collection includes housing data by census tract for San Diego County in 1970. Sample counts (5%, 15% and 20%) are derived from the Census of Population and Housing, 1970: Summary Tape File 4A: Housing. Housing characteristics include housing value, number of housing units in structure, number of rooms in housing unit, year structure was built, occupancy/vacancy status, tenure, rent, type of heating fuel, source of water, and presence of an air conditioner and other home appliances. Counts are available for the total, Negro, and Spanish American populations. Negros are defined as a racial category by the Census Bureau. In California, Spanish Americans include "Persons of Spanish language or Spanish surname". The California state 4A data file was processed with PERL and SPSS by the Social Science Data Collection (SSDC) staff of the University of California, San Diego Library from Census Bureau data processed by DUALabs, Inc. and archived at the Odum Institute. PERL script concatenated record type codes and output six data files that match the record types described in the codebook: tables 001-040, tables 041-107, tables 108-119, tables 120-130, tables 131-152 and Spanish American tables 153-200. These six files were subsequently processed with SPSS to extract tracts for San Diego County and recompute some aggregate housing value data content. Users may browse a list of data variables (tables and cells) included in these six data files. The Census Bureau produced printed reports for 1970 Summary Tape File 4A. The UCSD Geisel Library maintains printed reports and census tract maps in 1970 Census of Population and Housing Census Tracts, San Diego, Calif. (SSH Docs US Stacks C 3.223/11:970/188).

  5. ‘California Housing Data (1990)’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 12, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘California Housing Data (1990)’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-california-housing-data-1990-a0c5/b7389540/?iid=007-628&v=presentation
    Explore at:
    Dataset updated
    Nov 12, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    California
    Description

    Analysis of ‘California Housing Data (1990)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harrywang/housing on 12 November 2021.

    --- Dataset description provided by original source is as follows ---

    Source

    This is the dataset used in this book: https://github.com/ageron/handson-ml/tree/master/datasets/housing to illustrate a sample end-to-end ML project workflow (pipeline). This is a great book - I highly recommend!

    The data is based on California Census in 1990.

    About the Data (from the book):

    "This dataset is a modified version of the California Housing dataset available from Luís Torgo's page (University of Porto). Luís Torgo obtained it from the StatLib repository (which is closed now). The dataset may also be downloaded from StatLib mirrors.

    The following is the description from the book author:

    This dataset appeared in a 1997 paper titled Sparse Spatial Autoregressions by Pace, R. Kelley and Ronald Barry, published in the Statistics and Probability Letters journal. They built it using the 1990 California census data. It contains one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people).

    The dataset in this directory is almost identical to the original, with two differences: 207 values were randomly removed from the total_bedrooms column, so we can discuss what to do with missing data. An additional categorical attribute called ocean_proximity was added, indicating (very roughly) whether each block group is near the ocean, near the Bay area, inland or on an island. This allows discussing what to do with categorical data. Note that the block groups are called "districts" in the Jupyter notebooks, simply because in some contexts the name "block group" was confusing."
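
    The two modifications described above (missing total_bedrooms values and the added ocean_proximity category) map onto two common preprocessing steps. The following is a minimal sketch using only the Python standard library; the rows are made-up illustrative values, and the helper names impute_median and one_hot are hypothetical, not taken from the book or its notebooks.

```python
from statistics import median

# Made-up rows that mimic two of the dataset's columns (not real records).
rows = [
    {"total_bedrooms": 129.0, "ocean_proximity": "NEAR BAY"},
    {"total_bedrooms": None, "ocean_proximity": "INLAND"},
    {"total_bedrooms": 190.0, "ocean_proximity": "NEAR OCEAN"},
    {"total_bedrooms": 235.0, "ocean_proximity": "INLAND"},
]


def impute_median(rows, column):
    """Return copies of rows with missing values in `column` set to the median."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if r[column] == category else 0
        encoded.append(new_row)
    return encoded


cleaned = one_hot(impute_median(rows, "total_bedrooms"), "ocean_proximity")
```

    Median imputation is only one of several options for the missing values; dropping the affected rows or the whole column are alternatives.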

    About the Data (From Luís Torgo page):

    http://www.dcc.fc.up.pt/%7Eltorgo/Regression/cal_housing.html

    This is a dataset obtained from the StatLib repository. Here is the included description:

    "We collected information on the variables using all the block groups in California from the 1990 Census. In this sample a block group on average includes 1425.5 individuals living in a geographically compact area. Naturally, the geographical area included varies inversely with the population density. We computed distances among the centroids of each block group as measured in latitude and longitude. We excluded all the block groups reporting zero entries for the independent and dependent variables. The final data contained 20,640 observations on 9 variables. The dependent variable is ln(median house value)."

    End-to-End ML Project Steps (Chapter 2 of the book)

    1. Look at the big picture
    2. Get the data
    3. Discover and visualize the data to gain insights
    4. Prepare the data for Machine Learning algorithms
    5. Select a model and train it
    6. Fine-tune your model
    7. Present your solution
    8. Launch, monitor, and maintain your system

    The 10-Step Machine Learning Project Workflow (My Version)

    1. Define the business objective
    2. Make sense of the data from a high level
      • data types (number, text, object, etc.)
      • continuous/discrete
      • basic stats (min, max, std, median, etc.) using boxplot
      • frequency via histogram
      • scales and distributions of different features
    3. Create the training and test sets using proper sampling methods, e.g., random vs. stratified
    4. Correlation analysis (pair-wise and attribute combinations)
    5. Data cleaning (missing data, outliers, data errors)
    6. Data transformation via pipelines (categorical text to number using one hot encoding, feature scaling via normalization/standardization, feature combinations)
    7. Train and cross validate different models and select the most promising one (Linear Regression, Decision Tree, and Random Forest were tried in this tutorial)
    8. Fine-tune the model by trying different combinations of hyperparameters
    9. Evaluate the model with the best estimators on the test set
    10. Launch, monitor, and refresh the model and system
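
    Step 3 of this workflow contrasts random and stratified sampling. As a rough illustration, the sketch below splits rows so that each stratum contributes the same share to the test set. The function stratified_split, the income_band grouping, and the synthetic median_income values are all assumptions made for this example, not part of the original tutorial.

```python
import random
from collections import defaultdict


def stratified_split(rows, key, test_ratio=0.2, seed=42):
    """Split rows into (train, test), sampling test_ratio from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)

    train, test = [], []
    for members in strata.values():
        members = members[:]  # copy so the caller's data is not reordered
        rng.shuffle(members)
        n_test = round(len(members) * test_ratio)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


def income_band(row):
    """Hypothetical banding: bins of width 1.5, with everything above capped at 4."""
    return min(int(row["median_income"] // 1.5), 4)


# Synthetic rows for demonstration only.
rows = [{"median_income": 0.5 + 0.1 * i} for i in range(100)]
train, test = stratified_split(rows, income_band)
```

    A purely random split can under-represent small strata by chance; sampling within each stratum keeps the test set's composition close to that of the full data.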

    --- Original source retains full ownership of the source dataset ---

  6. Housing Cost Burden

    • data.ca.gov
    • data.chhs.ca.gov
    • +4more
    pdf, xlsx, zip
    Updated Aug 28, 2024
    + more versions
    Cite
    California Department of Public Health (2024). Housing Cost Burden [Dataset]. https://data.ca.gov/dataset/housing-cost-burden
    Explore at:
    Available download formats: xlsx, pdf, zip
    Dataset updated
    Aug 28, 2024
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This table contains data on the percent of households paying more than 30% (or 50%) of monthly household income towards housing costs for California, its regions, counties, cities/towns, and census tracts. Data is from the U.S. Department of Housing and Urban Development (HUD), Consolidated Planning Comprehensive Housing Affordability Strategy (CHAS) and the U.S. Census Bureau, American Community Survey (ACS). The table is part of a series of indicators in the Healthy Communities Data and Indicators Project of the Office of Health Equity.

    Affordable, quality housing is central to health, conferring protection from the environment and supporting family life. Housing costs, typically the largest single expense in a family's budget, also impact decisions that affect health. As housing consumes larger proportions of household income, families have less income for nutrition, health care, transportation, education, etc. Severe cost burdens may induce poverty, which is associated with developmental and behavioral problems in children and accelerated cognitive and physical decline in adults. Low-income families and minority communities are disproportionately affected by the lack of affordable, quality housing. More information about the data table and a data dictionary can be found in the Attachments.
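
    The indicator itself is a simple share: the percentage of households whose housing costs exceed 30% (or 50%) of income. The sketch below shows that arithmetic on made-up household records; the function name cost_burden_shares and the sample figures are assumptions for illustration, and the published table may treat edge cases such as zero income differently.

```python
def cost_burden_shares(households):
    """Return (% paying over 30% of income, % paying over 50%) on housing.

    households is a list of (monthly_income, monthly_housing_cost) pairs.
    """
    total = len(households)
    over_30 = sum(1 for income, cost in households if cost > 0.30 * income)
    over_50 = sum(1 for income, cost in households if cost > 0.50 * income)
    return 100 * over_30 / total, 100 * over_50 / total


# Made-up households: cost-to-income ratios of 0.25, 0.40, 0.55 and 0.28.
sample = [(4000, 1000), (3000, 1200), (2000, 1100), (5000, 1400)]
pct_over_30, pct_over_50 = cost_burden_shares(sample)
```

    Households above the 50% threshold are also counted in the 30% figure, so the second share can never exceed the first.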

  7. Vital Signs: Housing Production – by city

    • data.bayareametro.gov
    application/rdfxml +5
    Updated Feb 3, 2023
    + more versions
    Cite
    California Department of Finance (2023). Vital Signs: Housing Production – by city [Dataset]. https://data.bayareametro.gov/dataset/Vital-Signs-Housing-Production-by-city/f2uk-mtng
    Explore at:
    Available download formats: csv, tsv, xml, application/rssxml, application/rdfxml, json
    Dataset updated
    Feb 3, 2023
    Dataset authored and provided by
    California Department of Finance
    Description

    VITAL SIGNS INDICATOR Housing Production (LU4)

    FULL MEASURE NAME Produced housing units by unit type

    LAST UPDATED October 2019

    DESCRIPTION Housing production is measured in terms of the number of units that local jurisdictions produce throughout a given year. The annual production count captures housing units added by new construction and annexations, subtracts demolitions and destruction from natural disasters, and adjusts for units lost or gained by conversions.

    DATA SOURCE California Department of Finance Form E-8 1990-2010 http://www.dof.ca.gov/Forecasting/Demographics/Estimates/E-8/

    California Department of Finance Form E-5 2011-2018 http://www.dof.ca.gov/Forecasting/Demographics/Estimates/E-5/

    U.S. Census Bureau Population Estimates 2000-2018 https://www.census.gov/programs-surveys/popest.html

    CONTACT INFORMATION vitalsigns.info@bayareametro.gov

    METHODOLOGY NOTES (across all datasets for this indicator) Single-family housing units include single detached units and single attached units. Multi-family housing includes two to four units and five plus or apartment units.

    Housing production data for metropolitan areas for each year is the difference of annual housing unit estimates from the Census Bureau’s Population Estimates Program. Housing production data for the region, counties, and cities for each year is the difference of annual housing unit estimates from the California Department of Finance. Department of Finance data uses an annual cycle between January 1 and December 31, whereas U.S. Census Bureau data uses an annual cycle from April 1 to March 31 of the following year.
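
    Since production is described as the difference between consecutive annual housing unit estimates, the calculation can be sketched as below. The estimates are invented numbers, the function name annual_production is hypothetical, and keying each difference by the later estimate year is an assumption of this example rather than something the methodology notes specify.

```python
def annual_production(unit_estimates):
    """Map each later year to the change in housing units since the prior year."""
    years = sorted(unit_estimates)
    return {
        later: unit_estimates[later] - unit_estimates[earlier]
        for earlier, later in zip(years, years[1:])
    }


# Invented housing unit estimates for one jurisdiction.
estimates = {2015: 100_000, 2016: 101_200, 2017: 102_050, 2018: 101_900}
production = annual_production(estimates)
# production == {2016: 1200, 2017: 850, 2018: -150}
```

    A negative value, as in the last year here, would correspond to a net loss of units, for example when demolitions outnumber new construction.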

    Housing production data shows how many housing units have been produced over time. Like housing permit statistics, housing production numbers are an indicator of where the region is growing. However, since permitted units are sometimes not constructed or there can be a long lag time between permit approval and the start of construction, production data also reflects the effects of barriers to housing production. These range from a lack of builder confidence to high construction costs and limited financing. Data also differentiates the trends in multi-family, single-family and mobile home production.

  8. New Private Housing Units Authorized by Building Permits: 1-Unit Structures...

    • fred.stlouisfed.org
    json
    Updated Jun 25, 2025
    + more versions
    Cite
    (2025). New Private Housing Units Authorized by Building Permits: 1-Unit Structures for California [Dataset]. https://fred.stlouisfed.org/series/CABP1FH
    Explore at:
    Available download formats: json
    Dataset updated
    Jun 25, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Area covered
    California
    Description

    Graph and download economic data for New Private Housing Units Authorized by Building Permits: 1-Unit Structures for California (CABP1FH) from Jan 1988 to May 2025 about privately owned, 1-unit structures, permits, family, buildings, CA, housing, and USA.

  9. California Housing price prediction

    • kaggle.com
    Updated Mar 3, 2023
    Cite
    Siddartha Gandi (2023). California Housing price prediction [Dataset]. https://www.kaggle.com/datasets/siddarthagandi/california-housing-price-prediction
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 3, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Siddartha Gandi
    Area covered
    California
    Description

    The California Housing dataset is based on the 1990 US census and is widely used for machine learning and statistics. It was published in 1997 by R. Kelley Pace and Ronald Barry, and can be found in the UCI Machine Learning Repository. The dataset provides economic and geographic information about houses in California, as well as the economic status of the people living there.

  10. TIGER/Line Shapefile, 2010, 2010 state, California, 2010 Census Block...

    • catalog.data.gov
    Updated Jan 15, 2021
    + more versions
    Cite
    (2021). TIGER/Line Shapefile, 2010, 2010 state, California, 2010 Census Block State-based Shapefile with Housing and Population Data [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-2010-2010-state-california-2010-census-block-state-based-shapefile-with-ho
    Explore at:
    Dataset updated
    Jan 15, 2021
    Area covered
    California
    Description

    The TIGER/Line Files are shapefiles and related database files (.dbf) that are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The purpose of this file is to provide the geography for the 2010 Census Blocks along with their 2010 housing unit count and population. Census Blocks are statistical areas bounded on all sides by visible features, such as streets, roads, streams, and railroad tracks, and/or by nonvisible boundaries such as city, town, township, and county limits, and short line-of-sight extensions of streets and roads. Blocks are the smallest geographic areas for which the Census Bureau publishes data from the decennial census. A block may consist of one or more faces.

  11. QuickFacts: Bakersfield city, California

    • census.gov
    • shutdown.census.gov
    csv
    Updated Feb 25, 2022
    + more versions
    Cite
    United States Census Bureau > Communications Directorate - Center for New Media and Promotion (2022). QuickFacts: Bakersfield city, California [Dataset]. https://www.census.gov/quickfacts/fact/faq/bakersfieldcitycalifornia/AFN120222
    Explore at:
    Available download formats: csv
    Dataset updated
    Feb 25, 2022
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    United States Census Bureau > Communications Directorate - Center for New Media and Promotion
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Bakersfield, California
    Description

    U.S. Census Bureau QuickFacts statistics for Bakersfield city, California. QuickFacts data are derived from: Population Estimates, American Community Survey, Census of Population and Housing, Current Population Survey, Small Area Health Insurance Estimates, Small Area Income and Poverty Estimates, State and County Housing Unit Estimates, County Business Patterns, Nonemployer Statistics, Economic Census, Survey of Business Owners, Building Permits.

  12. Income Distribution by Quintile: Mean Household Income in Orange County, CA...

    • neilsberg.com
    csv, json
    Updated Mar 3, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Income Distribution by Quintile: Mean Household Income in Orange County, CA // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/4838eca2-f81d-11ef-a994-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Mar 3, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Orange County, California
    Variables measured
    Income Level, Mean Household Income
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It delineates income distributions across income quintiles (mentioned above) following an initial analysis and categorization. Subsequently, we adjusted these figures for inflation using the Consumer Price Index retroactive series via current methods (R-CPI-U-RS). For additional information about these estimations, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents the mean household income for each of the five quintiles in Orange County, CA, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.

    Key observations

    • Income disparities: The mean income of the lowest quintile (20% of households with the lowest income) is 23,564, while the mean income for the highest quintile (20% of households with the highest income) is 380,227. This indicates that the top earners earn about 16 times as much as the lowest earners.
    • Top 5%: The mean household income for the wealthiest population (top 5%) is 663,324, which is 174.45% of the mean for the highest quintile and 2814.99% of the mean for the lowest quintile.
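
    The figures in these observations can be checked from the three quoted means. The sketch below reproduces the arithmetic and separates "percent of" from "percent higher than", which differ by exactly 100 percentage points; the variable names are chosen for this example only.

```python
# Mean household incomes quoted in the key observations.
lowest_quintile = 23_564
highest_quintile = 380_227
top_5_percent = 663_324

# How many times the highest quintile's mean exceeds the lowest quintile's.
ratio_high_to_low = highest_quintile / lowest_quintile

# Top 5% mean expressed as a percentage OF each quintile's mean.
top5_pct_of_highest = 100 * top_5_percent / highest_quintile
top5_pct_of_lowest = 100 * top_5_percent / lowest_quintile

# "Percent higher than" subtracts the 100% baseline.
top5_pct_above_highest = top5_pct_of_highest - 100
top5_pct_above_lowest = top5_pct_of_lowest - 100

print(f"highest/lowest ratio: {ratio_high_to_low:.2f}")
print(f"top 5% as % of highest quintile: {top5_pct_of_highest:.2f}")
print(f"top 5% as % of lowest quintile: {top5_pct_of_lowest:.2f}")
```

    Under this reading, 174.45% and 2814.99% are the "percent of" values, so the top 5% mean is roughly 74% higher than the highest quintile's mean.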
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Income Levels:

    • Lowest Quintile
    • Second Quintile
    • Third Quintile
    • Fourth Quintile
    • Highest Quintile
    • Top 5 Percent

    Variables / Data Columns

    • Income Level: This column showcases the income levels (As mentioned above).
    • Mean Household Income: Mean household income, in 2023 inflation-adjusted dollars for the specific income level.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Orange County median household income.

  13. QuickFacts: Lafayette city, California

    • census.gov
    • shutdown.census.gov
    csv
    Updated Jul 1, 2024
    + more versions
    Cite
    United States Census Bureau > Communications Directorate - Center for New Media and Promotion (2024). QuickFacts: Lafayette city, California [Dataset]. https://www.census.gov/quickfacts/fact/table/lafayettecitycalifornia/INC110223
    Explore at:
    Available download formats: csv
    Dataset updated
    Jul 1, 2024
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    United States Census Bureau > Communications Directorate - Center for New Media and Promotion
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    California, Lafayette city
    Description

    U.S. Census Bureau QuickFacts statistics for Lafayette city, California. QuickFacts data are derived from: Population Estimates, American Community Survey, Census of Population and Housing, Current Population Survey, Small Area Health Insurance Estimates, Small Area Income and Poverty Estimates, State and County Housing Unit Estimates, County Business Patterns, Nonemployer Statistics, Economic Census, Survey of Business Owners, Building Permits.

  14. New Private Housing Structures Authorized by Building Permits for Humboldt...

    • fred.stlouisfed.org
    json
    Updated May 23, 2025
    + more versions
    Cite
    (2025). New Private Housing Structures Authorized by Building Permits for Humboldt County, CA [Dataset]. https://fred.stlouisfed.org/series/BPPRIV006023
    Explore at:
    Available download formats: json
    Dataset updated
    May 23, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Area covered
    Humboldt County, California
    Description

    Graph and download economic data for New Private Housing Structures Authorized by Building Permits for Humboldt County, CA (BPPRIV006023) from 1990 to 2024 about Humboldt County, CA; permits; buildings; CA; private; housing; and USA.

  15. Home Vacancy Rate for California

    • fred.stlouisfed.org
    json
    Updated Mar 18, 2025
    Cite
    (2025). Home Vacancy Rate for California [Dataset]. https://fred.stlouisfed.org/series/CAHVAC
    Explore at:
    Available download formats: json
    Dataset updated
    Mar 18, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Area covered
    California
    Description

    Graph and download economic data for Home Vacancy Rate for California (CAHVAC) from 1986 to 2024 about vacancy, CA, housing, rate, and USA.

  16. Vital Signs: Housing Permits - Bay Area (2022)

    • data.bayareametro.gov
    application/rdfxml +5
    Updated Feb 23, 2023
    + more versions
    Cite
    (2023). Vital Signs: Housing Permits - Bay Area (2022) [Dataset]. https://data.bayareametro.gov/dataset/Vital-Signs-Housing-Permits-Bay-Area-2022-/wmxm-3pzn
    Explore at:
    Available download formats: json, csv, xml, application/rdfxml, application/rssxml, tsv
    Dataset updated
    Feb 23, 2023
    Area covered
    San Francisco Bay Area
    Description

    VITAL SIGNS INDICATOR
    Housing Permits (LU3)

    FULL MEASURE NAME
    Permitted housing units

    LAST UPDATED
    February 2023

    DESCRIPTION
    Housing growth is measured in terms of the number of units that local jurisdictions permit throughout a given year. A permitted unit is a unit that a city or county has authorized for construction.

    DATA SOURCE
    California Housing Foundation/Construction Industry Research Board (CIRB) - https://www.cirbreport.org/
    Construction Review report (1967-2022)

    Association of Bay Area Governments (ABAG) – Metropolitan Transportation Commission (MTC) - https://data.bayareametro.gov/Development/HCD-Annual-Progress-Report-Jurisdiction-Summary/nxbj-gfv7
    Housing Permits Database (2014-2021)

    Census Bureau Building Permit Survey - https://www2.census.gov/econ/bps/County/
    Building permits by county (annual, monthly)

    CONTACT INFORMATION
    vitalsigns.info@bayareametro.gov

    METHODOLOGY NOTES (across all datasets for this indicator)
    Bay Area housing permits data by single/multi family come from the California Housing Foundation/Construction Industry Research Board (CIRB). Affordability breakdowns from 2014 to 2021 come from the Association of Bay Area Governments (ABAG) – Metropolitan Transportation Commission (MTC) Housing Permits Database.

    Single-family housing units include detached, semi-detached, row house and town house units. Row houses and town houses are included as single-family units when each unit is separated from the adjacent unit by an unbroken ground-to-roof party or fire wall. Condominiums are included as single-family units when they are of zero-lot-line or zero-property-line construction; when units are separated by an air space; or, when units are separated by an unbroken ground-to-roof party or fire wall. Multi-family housing includes duplexes, three-to-four-unit structures and apartment-type structures with five units or more. Multi-family also includes condominium units in structures of more than one living unit that do not meet the single-family housing definition.

    Each multi-family unit is counted separately even though they may be in the same building. Total units is the sum of single-family and multi-family units. County data is available from 1967 whereas city data is available from 1990. City data is only available for incorporated cities and towns. All permits in unincorporated cities and towns are included under their respective county’s unincorporated total. Permit data is not available for years when the city or town was not incorporated.

    Affordable housing is the total number of permitted units affordable to low and very low income households. Very low income households are households making below 50% of the area median income. Low income households are households making between 50% and 80% of the area median income. Moderate income households are households making between 80% and 120% of the area median income. Above moderate income households are households making above 120% of the area median income.
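    The income tiers above can be sketched as a small classifier. This is only an illustration: the function names are hypothetical, and the treatment of households sitting exactly on the 50% or 80% boundary is an assumption, since the methodology notes do not say which tier those cases fall into.

```python
def income_tier(household_income: float, area_median_income: float) -> str:
    """Return the affordability tier for a household relative to area median income (AMI)."""
    if area_median_income <= 0:
        raise ValueError("area_median_income must be positive")
    ratio = household_income / area_median_income
    # Assumption: a household exactly at 50% counts as "low" and exactly at 80%
    # counts as "moderate"; the notes do not specify boundary handling.
    if ratio < 0.50:
        return "very low"
    if ratio < 0.80:
        return "low"
    if ratio <= 1.20:
        return "moderate"
    return "above moderate"


def counts_as_affordable(tier: str) -> bool:
    """Affordable housing, per the notes, covers the low and very low tiers only."""
    return tier in {"very low", "low"}
```

    For example, with an area median income of 100,000, a household making 60,000 falls in the "low" tier and its unit would count toward the affordable total.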

    Permit data is missing for the following cities and years:
    Clayton, 1990-2007
    Lafayette, 1990-2007
    Moraga, 1990-2007
    Orinda, 1990-2007
    San Ramon, 1990

    Building permit data for metropolitan areas for each year is the sum of non-seasonally adjusted monthly estimates from the Census Building Permit Survey. The Bay Area values are the sum of the San Francisco-Oakland-Hayward MSA and the San Jose-Sunnyvale-Santa Clara MSA. The counties included in these areas are: San Francisco, Marin, Contra Costa, Alameda, San Mateo, Santa Clara, and San Benito.

    Permit values reflect the number of units permitted in each respective year. Note that the data columns come from different sources. The columns (SFunits, MFunits, TOTALunits, SF_Share and MF_Share) are sourced from CIRB. The columns (VeryLowunits, Lowunits, Moderateunits, AboveModerateunits, VeryLow_Share, Low_Share, Moderate_Share, AboveModerate_Share, Affordableunits and Affordableunits_Share) are sourced from the ABAG Housing Permits Database. Due to the slightly different methodologies that exist within each of those datasets, the total units from each of the two sources might not be consistent with each other.
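    A minimal sketch of how the CIRB-sourced columns relate to each other, assuming that the share columns are simple fractions of the total (the notes state only that total units is the sum of single-family and multi-family units; the share formula and the function name are assumptions):

```python
from typing import Optional


def permit_summary(sf_units: int, mf_units: int) -> dict:
    """Derive TOTALunits, SF_Share and MF_Share from single- and multi-family unit counts."""
    total = sf_units + mf_units  # stated in the notes: total = single-family + multi-family
    sf_share: Optional[float] = None
    mf_share: Optional[float] = None
    if total > 0:
        # Assumption: shares are fractions of the total, not percentages.
        sf_share = sf_units / total
        mf_share = mf_units / total
    return {
        "SFunits": sf_units,
        "MFunits": mf_units,
        "TOTALunits": total,
        "SF_Share": sf_share,
        "MF_Share": mf_share,
    }


summary = permit_summary(300, 700)
```

    Because the affordability columns come from a different source, a total rebuilt from them should not be expected to match TOTALunits exactly.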

    As shown, three different data sources are used for this analysis of housing permits issued in the Bay Area. Data from the Construction Industry Research Board (CIRB) represents the best available data source for examining housing permits issued over time in cities and counties across the Bay Area, dating back to 1967. In recent years, Annual Progress Report (APR) data collected by the California Department of Housing and Community Development has been available for analyzing housing permits issued by affordability levels. Since CIRB data is only available for California jurisdictions, the U.S. Census Bureau provides the best data source for comparing housing permits issued across different metropolitan areas. Notably, annual permit totals for the Bay Area differ across these three data sources, reflecting the limitations of needing to use different data sources for different purposes.

  17. Hands on Machine Learning Book - Housing Dataset

    • kaggle.com
    Updated Mar 13, 2019
    Cite
    Walace Oliveira (2019). Hands on Machine Learning Book - Housing Dataset [Dataset]. https://www.kaggle.com/walacedatasci/hands-on-machine-learning-housing-dataset/metadata
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 13, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Walace Oliveira
    Description

    Source

    This dataset is a modified version of the California Housing dataset available from Luís Torgo's page (University of Porto). Luís Torgo obtained it from the StatLib repository (which is closed now). The dataset may also be downloaded from StatLib mirrors.

    This dataset appeared in a 1997 paper titled Sparse Spatial Autoregressions by Pace, R. Kelley and Ronald Barry, published in the Statistics and Probability Letters journal. They built it using the 1990 California census data. It contains one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people)

    Tweaks

    The dataset in this directory is almost identical to the original, with two differences:

    • 207 values were randomly removed from the total_bedrooms column, so we can discuss what to do with missing data.
    • An additional categorical attribute called ocean_proximity was added, indicating (very roughly) whether each block group is near the ocean, near the Bay area, inland or on an island. This allows discussing what to do with categorical data.

    Note that the block groups are called "districts" in the Jupyter notebooks, simply because in some contexts the name "block group" was confusing.
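    The two tweaks invite two common preprocessing steps. The sketch below uses median imputation for the missing total_bedrooms values and one-hot encoding for ocean_proximity; both are just one reasonable choice, not necessarily what the book does. The sample rows, the category labels, and the helper names are illustrative assumptions, not values taken from the dataset.

```python
from statistics import median

# Illustrative rows only; real data would be read from the dataset's CSV file.
rows = [
    {"total_bedrooms": 129.0, "ocean_proximity": "NEAR BAY"},
    {"total_bedrooms": None, "ocean_proximity": "INLAND"},  # a removed value
    {"total_bedrooms": 235.0, "ocean_proximity": "NEAR OCEAN"},
    {"total_bedrooms": 190.0, "ocean_proximity": "INLAND"},
]


def impute_median(records, column):
    """Fill missing values in `column` with the median of the known values."""
    known = [r[column] for r in records if r[column] is not None]
    fill = median(known)
    return [{**r, column: fill if r[column] is None else r[column]} for r in records]


def one_hot(records, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        out = {k: v for k, v in r.items() if k != column}
        for category in categories:
            out[f"{column}={category}"] = 1 if r[column] == category else 0
        encoded.append(out)
    return encoded


cleaned = one_hot(impute_median(rows, "total_bedrooms"), "ocean_proximity")
```

    Other options for the missing values include dropping the affected rows or dropping the column entirely; the median is used here because it is robust to the skew typical of count data.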

  18. Low-Income or Disadvantaged Communities Designated by California

    • data.ca.gov
    • data.cnra.ca.gov
    • +4 more
    Updated Jun 11, 2025
    Cite
    California Energy Commission (2025). Low-Income or Disadvantaged Communities Designated by California [Dataset]. https://data.ca.gov/dataset/low-income-or-disadvantaged-communities-designated-by-california
    Explore at:
    Available download formats: zip, geojson, kml, csv, arcgis geoservices rest api, html
    Dataset updated
    Jun 11, 2025
    Dataset authored and provided by
    California Energy Commission (http://www.energy.ca.gov/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    California
    Description

    This layer shows census tracts that meet one or more of the following definitions:

    • Census tracts with median household incomes at or below 80 percent of the statewide median income, or with median household incomes at or below the threshold designated as low income by the Department of Housing and Community Development’s list of state income limits adopted under Health and Safety Code section 50093.
    • Census tracts receiving the highest 25 percent of overall scores in CalEnviroScreen 4.0.
    • Census tracts lacking overall scores in CalEnviroScreen 4.0 due to data gaps, but receiving the highest 5 percent of CalEnviroScreen 4.0 cumulative population burden scores.
    • Census tracts identified in the 2017 DAC designation as disadvantaged, regardless of their scores in CalEnviroScreen 4.0.
    • Lands under the control of federally recognized Tribes.


    Data downloaded in May 2022 from https://webmaps.arb.ca.gov/PriorityPopulations/.
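    The designation criteria in this description combine several independent tests. A rough sketch of that logic follows. Every field and function name here is hypothetical, and expressing "highest 25 percent" and "highest 5 percent" as percentile cutoffs of 75 and 95 is an assumption about how the scores would be represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tract:
    # All field names are hypothetical; they are not taken from the published layer.
    median_household_income: float
    ces_overall_percentile: Optional[float]  # None when no overall score exists (data gaps)
    ces_population_burden_percentile: Optional[float]
    in_2017_dac_designation: bool
    on_federally_recognized_tribal_land: bool


def is_designated(tract: Tract, statewide_median_income: float,
                  hcd_low_income_threshold: float) -> bool:
    """Return True if the tract meets at least one of the listed definitions."""
    low_income = (
        tract.median_household_income <= 0.80 * statewide_median_income
        or tract.median_household_income <= hcd_low_income_threshold
    )
    # Assumption: "highest 25 percent" is read as percentile >= 75.
    top_quartile = (
        tract.ces_overall_percentile is not None
        and tract.ces_overall_percentile >= 75
    )
    # Assumption: "highest 5 percent" is read as percentile >= 95.
    data_gap_high_burden = (
        tract.ces_overall_percentile is None
        and tract.ces_population_burden_percentile is not None
        and tract.ces_population_burden_percentile >= 95
    )
    return (
        low_income
        or top_quartile
        or data_gap_high_burden
        or tract.in_2017_dac_designation
        or tract.on_federally_recognized_tribal_land
    )
```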

  19. New Private Housing Structures Authorized by Building Permits for Sierra...

    • fred.stlouisfed.org
    json
    Updated May 23, 2025
    Cite
    (2025). New Private Housing Structures Authorized by Building Permits for Sierra County, CA [Dataset]. https://fred.stlouisfed.org/series/BPPRIV006091
    Explore at:
    Available download formats: json
    Dataset updated
    May 23, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Area covered
    Sierra County, California
    Description

    Graph and download economic data for New Private Housing Structures Authorized by Building Permits for Sierra County, CA (BPPRIV006091) from 1990 to 2024 about Sierra County, CA; permits; buildings; CA; private; housing; and USA.

  20. Age-wise distribution of California, MO household incomes: Comparative...

    • neilsberg.com
    csv, json
    Updated Jan 9, 2024
    Cite
    Neilsberg Research (2024). Age-wise distribution of California, MO household incomes: Comparative analysis across 16 income brackets [Dataset]. https://www.neilsberg.com/research/datasets/8567153c-8dec-11ee-9302-3860777c1fe6/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Jan 9, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    California, Missouri
    Variables measured
    Number of households with income $200,000 or more, Number of households with income less than $10,000, Number of households with income between $15,000 - $19,999, Number of households with income between $20,000 - $24,999, Number of households with income between $25,000 - $29,999, Number of households with income between $30,000 - $34,999, Number of households with income between $35,000 - $39,999, Number of households with income between $40,000 - $44,999, Number of households with income between $45,000 - $49,999, Number of households with income between $50,000 - $59,999, and 6 more
    Measurement technique
    The data presented in this dataset is derived from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. It delineates income distributions across 16 income brackets (mentioned above) following an initial analysis and categorization. Using this dataset, you can find the total number of households within a specific income bracket, along with the number of households in that bracket for each of the 4 age cohorts (Under 25 years, 25-44 years, 45-64 years and 65 years and over). For additional information about these estimates, please contact us via email at research@neilsberg.com
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset presents the household distribution across 16 income brackets among four distinct age groups in California: Under 25 years, 25-44 years, 45-64 years, and over 65 years. The dataset highlights the variation in household income, offering valuable insights into economic trends and disparities within different age categories, aiding in data analysis and decision-making.

    Key observations

    • A closer look at the distribution of households among age brackets shows 86 (4.88%) households where the householder is under 25 years old, 593 (33.67%) households with a householder aged between 25 and 44 years, 507 (28.79%) households with a householder aged between 45 and 64 years, and 575 (32.65%) households where the householder is over 65 years old.
    • In California, the age group of 25 to 44 years stands out with both the highest median income and the maximum share of households. This alignment suggests a financially stable demographic, indicating an established community with stable careers and higher incomes.
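    The percentages in the first observation can be reproduced from the household counts it quotes. In this short check the variable names are illustrative, and each share is taken to be the count divided by the total across the four age groups:

```python
# Household counts by age of householder, as quoted in the key observations.
counts = {
    "Under 25 years": 86,
    "25 to 44 years": 593,
    "45 to 64 years": 507,
    "65 years and over": 575,
}

total_households = sum(counts.values())

# Share of each age group as a percentage of all households, rounded to two decimals.
shares = {
    group: round(100 * count / total_households, 2)
    for group, count in counts.items()
}

for group, share in shares.items():
    print(f"{group}: {counts[group]} households ({share}%)")
```

    The four counts sum to 1,761 households, and the rounded shares match the figures quoted above.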
    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Income brackets:

    • Less than $10,000
    • $10,000 to $14,999
    • $15,000 to $19,999
    • $20,000 to $24,999
    • $25,000 to $29,999
    • $30,000 to $34,999
    • $35,000 to $39,999
    • $40,000 to $44,999
    • $45,000 to $49,999
    • $50,000 to $59,999
    • $60,000 to $74,999
    • $75,000 to $99,999
    • $100,000 to $124,999
    • $125,000 to $149,999
    • $150,000 to $199,999
    • $200,000 or more

    Variables / Data Columns

    • Household Income: This column showcases 16 income brackets ranging from Under $10,000 to $200,000+ (as mentioned above).
    • Under 25 years: The count of households led by a head of household under 25 years old with income within a specified income bracket.
    • 25 to 44 years: The count of households led by a head of household 25 to 44 years old with income within a specified income bracket.
    • 45 to 64 years: The count of households led by a head of household 45 to 64 years old with income within a specified income bracket.
    • 65 years and over: The count of households led by a head of household 65 years old and over with income within a specified income bracket.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyzes and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for California median household income by age. You can refer to the same here

Cite
(2024). California Housing Prices Dataset [Dataset]. https://paperswithcode.com/dataset/california-housing-prices

California Housing Prices Dataset

Explore at:
Dataset updated
Sep 19, 2024
Area covered
California
Description

Median house prices for California districts derived from the 1990 census.

About Dataset

Context

This is the dataset used in the second chapter of Aurélien Géron's recent book 'Hands-On Machine learning with Scikit-Learn and TensorFlow'. It serves as an excellent introduction to implementing machine learning algorithms because it requires rudimentary data cleaning, has an easily understandable list of variables and sits at an optimal size between being too toyish and too cumbersome.

The data contains information from the 1990 California census. So although it may not help you with predicting current housing prices like the Zillow Zestimate dataset, it does provide an accessible introductory dataset for teaching people about the basics of machine learning.

Content

The data pertains to the houses found in a given California district and some summary stats about them based on the 1990 census data. Be warned the data aren't cleaned so there are some preprocessing steps required! The columns are as follows, their names are pretty self-explanatory:

• longitude
• latitude
• housing_median_age
• total_rooms
• total_bedrooms
• population
• households
• median_income
• median_house_value
• ocean_proximity

Acknowledgements

This data was initially featured in the following paper: Pace, R. Kelley, and Ronald Barry. "Sparse spatial autoregressions." Statistics & Probability Letters 33.3 (1997): 291-297.

and I encountered it in 'Hands-On Machine learning with Scikit-Learn and TensorFlow' by Aurélien Géron. Aurélien Géron wrote: This dataset is a modified version of the California Housing dataset available from: Luís Torgo's page (University of Porto)

Inspiration

See my kernel on machine learning basics in R using this dataset, or venture over to the following link for a Python-based introductory tutorial: https://github.com/ageron/handson-ml/tree/master/datasets/housing
