100+ datasets found

Utrecht housing dataset
kaggle.com
Updated Jan 27, 2025
Cite
ICT Institute (2025). Utrecht housing dataset [Dataset]. https://www.kaggle.com/datasets/ictinstitute/utrecht-housing-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jan 27, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
ICT Institute
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Area covered
Utrecht
Description
The Utrecht housing dataset is a freely available dataset that can be used by students to learn about data science and machine learning. The older versions are synthetic datasets. The latest version is an actual dataset based on data collected from a house offering website (Funda) and official land registry (Kadaster).

This dataset is described in the accompanying paper: Van Otterloo, S. and Burda, P. 2025. The Utrecht Housing dataset: A housing appraisal dataset. Computers and Society Research Journal (2025), 1. The paper can be downloaded here: https://ictinstitute.nl/utrecht-housing-dataset-2025/.

History

In July 2022, Stefan Leijnen and Sieuwert van Otterloo taught a one-week summer school, ‘AI and machine learning’, at the Utrecht University of Applied Sciences. The goal of this summer school is to make AI and machine learning accessible to as many people as possible. Using AI without properly understanding it comes with risks. We want to reduce these risks by giving students from all backgrounds the tools and knowledge to understand AI. Luckily, AI has become more accessible thanks to the many free and open tools and libraries that now exist. Any student can train and test algorithms with only a few days of training.

The Utrecht Housing Dataset was designed for use during day 1, day 2 and day 3. The dataset has multiple different input variables that are interesting to explore. The size is such that it is well suited for visualisations. The dataset represents one of the core tenets of responsible AI: AI should be made accessible to a wide group of people, so that anyone with some university experience can test and evaluate algorithms.

When developing the summer school, we could not find a dataset that was both interesting to analyse and easy to use. Existing datasets often have data quality issues that distract from the learning goals, or are only suited to illustrating one phenomenon. Many classical machine learning datasets also lack meaningful tasks: the problems one can tackle with them are either too basic or too theoretical. The Utrecht Housing Dataset thus offers a new combination that we found useful in our classroom.

The dataset is released under a Creative Commons license and can be used freely for any purpose. If you use it, please refer to it as “The Utrecht housing dataset – example dataset for prediction” by Sieuwert van Otterloo, www.ictinstitute.nl, or refer to Sieuwert van Otterloo as the author/source.

The dataset is provided as a CSV file. Each line contains data for one house. The values are separated by commas.
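
As a rough illustration of the one-house-per-line, comma-separated layout described above, the following sketch parses a small inline sample with Python's standard csv module. The column names and values are hypothetical placeholders, not the dataset's real header; to read the actual file, the same reader can be pointed at the downloaded CSV.

```python
import csv
import io

# Hypothetical sample: these column names and values are illustrative
# assumptions, not the real header of the Utrecht housing dataset.
SAMPLE_CSV = """id,house_area,bedrooms,price
1,120,3,450000
2,85,2,310000
3,150,4,610000
"""


def load_houses(text):
    """Parse comma-separated text into a list of dicts, one per house."""
    reader = csv.DictReader(io.StringIO(text))
    return list(reader)


houses = load_houses(SAMPLE_CSV)

# csv returns every value as a string, so convert before doing arithmetic.
average_price = sum(float(h["price"]) for h in houses) / len(houses)
print(len(houses), "houses, average price", round(average_price))
```

Using DictReader keeps each value attached to its column name, which tends to be easier for beginners than indexing by position.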
Existing own homes; average purchase prices, region
data.overheid.nl
staging.dexes.eu
atom, json
Updated Feb 17, 2025
Cite
Centraal Bureau voor de Statistiek (Rijk) (2025). Existing own homes; average purchase prices, region [Dataset]. https://data.overheid.nl/dataset/4146-existing-own-homes--average-purchase-prices--region
Explore at:
Available download formats: json (KB), atom (KB)
Dataset updated
Feb 17, 2025
Dataset provided by
Statistics Netherlands
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This table shows the average purchase price paid in the reporting period for existing own homes purchased by private individuals. The average purchase price of existing own homes may differ from the price index of existing own homes. The average purchase price is not an indicator of price developments of owner-occupied residential property. It reflects the average price of dwellings sold in a particular period; the fact that the dwellings sold differ from one period to another is not taken into account. The following example illustrates the problems caused by the continually changing quality of the dwellings sold. Suppose that in February of a particular year mainly big houses with extensive gardens, beautifully situated alongside canals, are sold, whereas in March many small terraced houses are sold. In that case the average purchase price in February will be higher than in March, but this does not mean that house prices have increased. See note 3 for a link to the article 'Why the average purchase price is not an indicator'.
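
The February/March example can be made concrete with a small calculation. The sketch below uses invented prices and sales counts (assumptions for illustration only, not figures from the table): each house type costs exactly the same in both months, yet the average purchase price changes because the mix of dwellings sold changes.

```python
def average(prices):
    """Plain arithmetic mean of a list of sale prices."""
    return sum(prices) / len(prices)


# Assumed, illustrative prices: identical in both months.
BIG_CANAL_HOUSE = 900_000
SMALL_TERRACED_HOUSE = 300_000

# Only the mix of dwellings sold differs between the two months.
february_sales = [BIG_CANAL_HOUSE] * 8 + [SMALL_TERRACED_HOUSE] * 2
march_sales = [BIG_CANAL_HOUSE] * 2 + [SMALL_TERRACED_HOUSE] * 8

february_average = average(february_sales)
march_average = average(march_sales)

print("February average:", february_average)
print("March average:", march_average)
```

The two averages differ widely although no individual house price moved, which is why a price index that controls for the composition of sales is needed to measure price developments.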

Data available from: 1995

Status of the figures: The figures in this table are immediately definitive. The calculation of these figures is based on the number of notary transactions that are registered every month by the Dutch Land Registry Office (Kadaster). A revision of the figures is exceptional and occurs specifically if an error significantly exceeds the acceptable statistical margins. The average purchasing prices of existing owner-occupied sold homes can be calculated by Kadaster at a later date. These figures are usually the same as the publication on Statline, but in some periods they differ. Kadaster calculates the average purchasing prices based on the most recent data. These may have changed since the first publication. Statistics Netherlands uses figures from the first publication in accordance with the revision policy described above.

Changes as of 17 February 2025: Added average purchase prices of the municipalities for the year 2024.

When will new figures be published? New figures are published approximately one to three months after the period under review.
United States Existing Home Sales
tradingeconomics.com
ar.tradingeconomics.com
csv, excel, json, xml
Updated Jul 23, 2025
Cite
TRADING ECONOMICS (2025). United States Existing Home Sales [Dataset]. https://tradingeconomics.com/united-states/existing-home-sales
Explore at:
Available download formats: csv, json, xml, excel
Dataset updated
Jul 23, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 31, 1968 - Jun 30, 2025
Area covered
United States
Description
Existing Home Sales in the United States decreased to 3930 Thousand in June from 4040 Thousand in May of 2025. This dataset provides the latest reported value for - United States Existing Home Sales - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
United States Housing Starts
tradingeconomics.com
zh.tradingeconomics.com
csv, excel, json, xml
Updated Jul 18, 2025
Cite
TRADING ECONOMICS (2025). United States Housing Starts [Dataset]. https://tradingeconomics.com/united-states/housing-starts
Explore at:
Available download formats: json, excel, csv, xml
Dataset updated
Jul 18, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 31, 1959 - Jun 30, 2025
Area covered
United States
Description
Housing Starts in the United States increased to 1321 Thousand units in June from 1263 Thousand units in May of 2025. This dataset provides the latest reported value for - United States Housing Starts - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
United States Total Housing Inventory
tradingeconomics.com
zh.tradingeconomics.com
csv, excel, json, xml
Updated Jul 23, 2025
Cite
TRADING ECONOMICS (2025). United States Total Housing Inventory [Dataset]. https://tradingeconomics.com/united-states/total-housing-inventory
Explore at:
Available download formats: excel, json, xml, csv
Dataset updated
Jul 23, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jun 30, 1982 - Jun 30, 2025
Area covered
United States
Description
Total Housing Inventory in the United States decreased to 1530 Thousands in June from 1540 Thousands in May of 2025. This dataset includes a chart with historical data for the United States Total Housing Inventory.
House Price Prediction Dataset : InsuranceHub- USA
kaggle.com
Updated Aug 2, 2020
Cite
Bs004 (2020). House Price Prediction Dataset : InsuranceHub- USA [Dataset]. https://www.kaggle.com/datasets/bharatsahu/house-price-prediction-dataset-insurancehub-usa
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Aug 2, 2020
Dataset provided by
Kaggle
Authors
Bs004
Area covered
United States
Description
Context

Insurance companies collect multiple features of a house to decide which houses can be insured and what premium to charge. Here I have collected data from multiple insurance companies in the USA, in which house features are given together with house prices.

Content

This data set has many property details, from address to location coordinates and many other features. Use them to predict the house price.

Inspiration

Multiple regression datasets have been published, each unique in its own way. The use of location coordinates and some other coordinates is new here.
Average Second Hand House Price - Dataset - data.gov.ie
data.gov.ie
Updated Sep 12, 2016
Cite
data.gov.ie (2016). Average Second Hand House Price - Dataset - data.gov.ie [Dataset]. https://data.gov.ie/dataset/average-second-hand-house-price
Explore at:
Dataset updated
Sep 12, 2016
Dataset provided by
data.gov.ie
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Average house prices are derived from data supplied by the mortgage lending agencies on loans approved by them rather than loans paid. In comparing house price figures from one period to another, account should be taken of the fact that changes in the mix of houses (incl. apartments) will affect the average figures. The most current data is published on these sheets. Previously published data may be subject to revision. Any change from the originally published data will be highlighted by a comment on the cell in question. These comments will be maintained for at least a year after the date of the value change. Excluding apartments, measured in €. Figure changed on 27/6/16 as revised data was received from the local authority.
Housing Cost Burden by Race
catalog.data.gov
data.seattle.gov
Updated Jan 31, 2025
Cite
City of Seattle ArcGIS Online (2025). Housing Cost Burden by Race [Dataset]. https://catalog.data.gov/dataset/housing-cost-burden-by-race-cea20
Explore at:
Dataset updated
Jan 31, 2025
Dataset provided by
City of Seattle ArcGIS Online
Description
Displacement risk indicator showing how many households within the specified groups are facing either housing cost burden (contributing more than 30% of monthly income toward housing costs) or severe housing cost burden (contributing more than 50% of monthly income toward housing costs).
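
The two thresholds in this definition can be expressed as a small classification rule. The sketch below is only an illustration of the 30% and 50% cut-offs stated above; the function name and the sample incomes are hypothetical and do not come from the dataset.

```python
def classify_cost_burden(monthly_income, monthly_housing_cost):
    """Label a household using the 30% and 50% thresholds from the description."""
    if monthly_income <= 0:
        raise ValueError("monthly_income must be positive")
    share = monthly_housing_cost / monthly_income
    if share > 0.50:
        return "severe housing cost burden"
    if share > 0.30:
        return "housing cost burden"
    return "not cost burdened"


# Hypothetical households: (monthly income, monthly housing cost).
households = [(4000, 1000), (4000, 1500), (4000, 2400)]
labels = [classify_cost_burden(income, cost) for income, cost in households]
print(labels)
```

Note that the severe category is checked first, since any household spending more than 50% also spends more than 30%.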
Real Estate Dataset
brightdata.com
.json, .csv, .xlsx
Updated Sep 11, 2022
Cite
Bright Data (2022). Real Estate Dataset [Dataset]. https://brightdata.com/products/datasets/real-estate
Explore at:
Available download formats: .json, .csv, .xlsx
Dataset updated
Sep 11, 2022
Dataset authored and provided by
Bright Data
License
https://brightdata.com/license
Area covered
Worldwide
Description
Real estate datasets from various websites cover all major real estate data points including: property type, size, location, price, bedrooms, baths, address, history, images, and much more. Popular use cases include: forecast housing demand, analyze price fluctuations, improve customer satisfaction, see past prices to monitor market trends, and more.
Housing Database
catalog.data.gov
data.cityofnewyork.us
Updated Jan 10, 2025
Cite
data.cityofnewyork.us (2025). Housing Database [Dataset]. https://catalog.data.gov/dataset/housing-database
Explore at:
Dataset updated
Jan 10, 2025
Dataset provided by
data.cityofnewyork.us
Description
The NYC Department of City Planning’s (DCP) Housing Database contains all NYC Department of Buildings (DOB) approved housing construction and demolition jobs filed or completed in NYC since January 1, 2010. It includes the three primary construction job types that add or remove residential units: new buildings, major alterations, and demolitions, and can be used to determine the change in legal housing units across time and space. Records in the Housing Database Project-Level Files are geocoded to the greatest level of precision possible, subject to numerous quality assurance and control checks, recoded for usability, and joined to other housing data sources relevant to city planners and analysts. Data are updated semiannually, at the end of the second and fourth quarters of each year. Please see DCP’s annual Housing Production Snapshot summarizing findings from the 21Q4 data release here. Additional Housing and Economic analyses are also available.

The NYC Department of City Planning’s (DCP) Housing Database Unit Change Summary Files provide the net change in Class A housing units since 2010, and the count of units pending completion, for commonly used political and statistical boundaries (Census Block, Census Tract, City Council district, Community District, Community District Tabulation Area (CDTA), Neighborhood Tabulation Area (NTA)). These tables are aggregated from the DCP Housing Database Project-Level Files, which are derived from Department of Buildings (DOB) approved housing construction and demolition jobs filed or completed in NYC since January 1, 2010. Net housing unit change is calculated as the sum of all three construction job types that add or remove residential units: new buildings, major alterations, and demolitions. These files can be used to determine the change in legal housing units across time and space.
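
The net-change calculation described above amounts to summing signed unit changes per geography. The following sketch shows that aggregation on invented records; the district labels, field layout, and unit counts are assumptions for illustration and do not reflect the actual file schema.

```python
from collections import defaultdict

# Hypothetical project-level records: (district, job type, change in units).
# New buildings add units, demolitions remove them, and alterations can do either.
jobs = [
    ("District A", "new building", 120),
    ("District A", "major alteration", -4),
    ("District A", "demolition", -30),
    ("District B", "new building", 15),
    ("District B", "demolition", -20),
]


def net_unit_change(records):
    """Sum the signed unit change of all job types within each district."""
    totals = defaultdict(int)
    for district, _job_type, unit_change in records:
        totals[district] += unit_change
    return dict(totals)


summary = net_unit_change(jobs)
print(summary)
```

The same grouping could be applied to any of the boundaries listed above by swapping the district key for a census tract or community district identifier.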
Socio-economic, physical, housing, eviction, and risk dataset (SEPHER) ***
redivis.com
application/jsonl +7
Updated Jan 16, 2023
Cite
Environmental Impact Data Collaborative (2023). Socio-economic, physical, housing, eviction, and risk dataset (SEPHER) *** [Dataset]. https://redivis.com/datasets/7mkv-4r0gdseef
Explore at:
Available download formats: parquet, spss, arrow, csv, avro, sas, stata, application/jsonl
Dataset updated
Jan 16, 2023
Dataset provided by
Redivis Inc.
Authors
Environmental Impact Data Collaborative
Time period covered
Jan 1, 2000 - Dec 31, 2018
Description
Abstract

The purpose of the SEPHER data set is to allow for testing, assessing and generating new analysis and metrics that can address inequalities and climate injustice. The data set was created by Tedesco, M., C. Hultquist, S. E. Char, C. Constantinides, T. Galjanic, and A. D. Sinha.

Methodology

SEPHER draws upon four major source datasets: CDC Social Vulnerability Index, FEMA National Risk Index, Home Mortgage Disclosure Act, and Evictions datasets. The data from these source datasets have been merged, cleaned, and standardized and all of the variables documented in the data dictionary.

CDC Social Vulnerability Index

CDC Social Vulnerability Index (SVI) dataset is a dataset prepared for the Centers for Disease Control and Prevention for the purpose of assessing the degree of social vulnerability of American communities to natural hazards and anthropogenic events. It contains data on 15 social factors taken or derived from Census reports as well as rankings of each tract based on these individual factors, groups of factors corresponding to four related themes (Socioeconomic, Household Composition & Disability, Minority Status & Language, and Housing Type & Transportation) and overall. The data is available for the years 2000, 2010, 2014, 2016, and 2018.

FEMA National Risk Index

The National Risk Index (NRI) dataset compiled by the Federal Emergency Management Agency (FEMA) consists of historic natural disaster data from across the United States at a tract-level. The dataset includes information about 18 natural disasters including earthquakes, tsunamis, wildfires, volcanic activity and many others. Each disaster is detailed out in terms of its frequency, historic impact, potential exposure, expected annual loss and associated risk. The dataset also includes some summary variables for each tract including the total expected loss in terms of building loss, human loss and agricultural loss, the population of the tract, and the area covered by the tract. It finally includes a few more features to characterize the population such as social vulnerability rating and community resilience.

Home Mortgage Disclosure Act

The Home Mortgage Disclosure Act (HMDA) dataset contains loan-level data for home mortgages including information on applications, denials, approvals, and institution purchases. It is managed and expanded annually by the Consumer Financial Protection Bureau based on the data collected from financial institutions. The dataset is used by public officials to make decisions and policies, uncover lending patterns and discrimination among mortgage applicants, and investigate if lenders are serving the housing needs of the communities. It covers the period from 2007 to 2017.

Evictions

The Evictions dataset is compiled and managed by the Eviction Lab at Princeton University and consists of court records related to eviction cases in the United States between 2000 and 2016. Its purpose is to estimate the prevalence of court-ordered evictions and compare eviction rates among states, counties, cities, and neighborhoods. Besides information on eviction filings and judgments, the dataset includes socioeconomic and real estate data for each tract including race/ethnic origin, household income, poverty rate, property value, median gross rent, rent burden, and others.
United States New Home Sales
tradingeconomics.com
it.tradingeconomics.com
csv, excel, json, xml
Updated May 23, 2025
Cite
TRADING ECONOMICS (2025). United States New Home Sales [Dataset]. https://tradingeconomics.com/united-states/new-home-sales
Explore at:
Available download formats: csv, json, excel, xml
Dataset updated
May 23, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 31, 1963 - Jun 30, 2025
Area covered
United States
Description
New Home Sales in the United States increased to 627 Thousand units in June from 623 Thousand units in May of 2025. This dataset provides the latest reported value for - United States New Home Sales - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Second homes
data.europa.eu
cloud.csiss.gmu.edu
csv, excel xls
Cite
Cambridgeshire Insight, Second homes [Dataset]. https://data.europa.eu/data/datasets/second-homes1
Explore at:
Available download formats: csv, excel xls
Dataset authored and provided by
Cambridgeshire Insight
Description
Are there many properties used as second homes in our local area?
How many people live locally and own a second home elsewhere in England and Wales?

You can use this summary of Census 2011 data, produced by the Office for National Statistics (ONS), to highlight some key facts about second home ownership across Cambridgeshire, Peterborough and West Suffolk.
Orlando Neighborhood
kaggle.com
Updated Oct 7, 2022
Cite
Sebastian Giovannini (2022). Orlando Neighborhood [Dataset]. https://www.kaggle.com/datasets/sgiov95/orlando-neighborhood
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Oct 7, 2022
Dataset provided by
Kaggle
Authors
Sebastian Giovannini
Area covered
Orlando
Description
This dataset is a snapshot from October 2022 of all 48 homes in a section of a neighborhood nearby a large university in Central Florida. All of the homes are single family homes featuring a garage, a driveway, and a fenced-in backyard. Data was gathered by hand (keyboard) via a collection of sites, including Zillow, Realtor, Redfin, Trulia, and Orange County Property Appraiser. All homes were built in the same year in the early 2000's and feature central air and all other utilities typical of contemporary suburban homes in the United States. The area is close to a university and a large portion of renters are college students and young professionals, as well as families and older adults.

There are 30 columns:

- HID: House ID, a unique identifier for each house (int from 1 to 48, not the actual address number)
- Sqft: The Square Footage of the Interior of the house (int)
- LandSqft: The Total Square Footage of the land (int)
- Neighbors: The number of homes directly adjacent to each house (int)
- Stories: The number of stories in each house (int)
- Pool: Does the house have a pool (int, 0 for 'No', 1 for 'Yes')
- Bedrooms: The number of bedrooms in each house (int)
- Bathrooms: The number of bathrooms (full or half) in each house (int)
- DateLastSold: The date on which the house was last sold (datetime)
- PropertyTaxes2022: The annual property taxes for 2022 (float)
- OwnedByBank: Is the house owned by a bank (int, 0 for 'No', 1 for 'Yes')
- OuterPortion: Is the house on the Outer Portion of the Neighborhood (int, 0 for 'No', 1 for 'Yes')
- NextToLoudRoad: Is the house directly adjacent to a loud road (int, 0 for 'No', 1 for 'Yes')
- PriceLastSold: Price that the house was last sold for (float)
- Zestimate: Zillow's Price Estimate for the house (float)
- RentZestimate: Zillow's Estimate for the Monthly Price of rent for the house (float)
- RealtorcomEstimate: Realtor dot com's Estimate for the house (float)
- RedfinEstimate: Redfin's Estimate for the house (float)
- TruliaEstimate: Trulia's Estimate for the house (float)
- OCPALandValue2022: The Land Value on the county's 2022 records (float)
- OCPABuildingValue2022: The Building Value on the county's 2022 records (float)
- OCPAFeaturesValue2022: The Features Value on the county's 2022 records (float)
- OCPAMarketValue2022: The Market Value on the county's 2022 records (float)
- OCPAAssessedValue2022: The Assessed Value on the county's 2022 records (float), AKA what homeowners are taxed on
- OCPALandValue2021: The Land Value on the county's 2021 records (float)
- OCPABuildingValue2021: The Building Value on the county's 2021 records (float)
- OCPAFeaturesValue2021: The Features Value on the county's 2021 records (float)
- OCPAMarketValue2021: The Market Value on the county's 2021 records (float)
- OCPAAssessedValue2021: The Assessed Value on the county's 2021 records (float), AKA what homeowners are taxed on
- Notes: any notes on any of the homes (str)

Note that while the dataset is exhaustive in that it has all of the houses, some homes are missing some columns, typically because a home did not have an estimate on a site, or because it is the one home not found on the property appraiser's site. This is therefore not a randomized dataset, so the only population of homes about which it can be used to make inferences is the one within this specific portion of the neighborhood. Personally, I am going to use the dataset to practice a couple of aspects of real-world data: Cleaning, Imputing, and Exploratory Data Analysis. Mainly, I want to compare different approaches to filling in the missing values of the dataset, then do some Model Building with some additional Dimensionality Reduction.
USA Housing Dataset
kaggle.com
Updated Feb 5, 2025
Cite
ArnavGupta (2025). USA Housing Dataset [Dataset]. https://www.kaggle.com/datasets/arnavgupta1205/usa-housing-dataset/code
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Feb 5, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
ArnavGupta
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Area covered
United States
Description
This USA Housing Market Dataset (Synthetic) contains 300 rows and 10 columns of real estate-related data designed for housing price prediction, trend analysis, and investment insights. It includes key property details such as price, number of bedrooms and bathrooms, square footage, year built, garage spaces, lot size, zip code, crime rate, and school ratings.

This dataset is ideal for:
✅ Machine Learning Models for predicting housing prices
✅ Market Research & Investment Analysis
✅ Exploring Property Trends in the USA
✅ Educational Purposes for Data Science and Analytics

This dataset provides a realistic yet synthetic view of the real estate market, making it useful for data-driven decision-making in the housing industry.

Live tables on housing supply: indicators of new supply
gov.uk
s3.amazonaws.com
Updated Jun 20, 2025
Cite
Ministry of Housing, Communities and Local Government (2025). Live tables on housing supply: indicators of new supply [Dataset]. https://www.gov.uk/government/statistical-data-sets/live-tables-on-house-building
Explore at:
Dataset updated
Jun 20, 2025
Dataset provided by
GOV.UK (http://gov.uk/)
Authors
Ministry of Housing, Communities and Local Government
Description
Local authorities compiling this data or other interested parties may wish to see notes and definitions for house building which includes P2 full guidance notes.

Live tables

Data from live tables 253 and 253a is also published as Open Data (linked data format): http://opendatacommunities.org/def/concept/folders/themes/house-building

Table 213: permanent dwellings started and completed, by tenure, England (quarterly)
https://assets.publishing.service.gov.uk/media/68541eb5a3a282804858153b/LiveTable213.ods
ODS (OpenDocument Spreadsheet), 26.7 KB. This file is in an OpenDocument format (https://www.gov.uk/guidance/using-open-document-formats-odf-in-your-organisation).

Table 217: permanent dwellings started and completed by tenure and region (quarterly)
https://assets.publishing.service.gov.uk/media/68541ee7a3a282804858153c/LiveTable217.ods
ODS (OpenDocument Spreadsheet), 113 KB. This file is in an OpenDocument format (https://www.gov.uk/guidance/using-open-document-formats-odf-in-your-organisation).

https://assets.publishing.service.gov.uk/media/68541ef6a3a282804858153d/LiveTable222.ods
Hands on Machine Learning Book - Housing Dataset
kaggle.com
Updated Mar 13, 2019
Cite
Walace Oliveira (2019). Hands on Machine Learning Book - Housing Dataset [Dataset]. https://www.kaggle.com/walacedatasci/hands-on-machine-learning-housing-dataset/activity
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Mar 13, 2019
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Walace Oliveira
Description
Source

This dataset is a modified version of the California Housing dataset available from Luís Torgo's page (University of Porto). Luís Torgo obtained it from the StatLib repository (which is closed now). The dataset may also be downloaded from StatLib mirrors.

This dataset appeared in a 1997 paper titled Sparse Spatial Autoregressions by Pace, R. Kelley and Ronald Barry, published in the Statistics and Probability Letters journal. They built it using the 1990 California census data. It contains one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people)

Tweaks

The dataset in this directory is almost identical to the original, with two differences:

- 207 values were randomly removed from the total_bedrooms column, so we can discuss what to do with missing data.
- An additional categorical attribute called ocean_proximity was added, indicating (very roughly) whether each block group is near the ocean, near the Bay area, inland or on an island. This allows discussing what to do with categorical data.

Note that the block groups are called "districts" in the Jupyter notebooks, simply because in some contexts the name "block group" was confusing.
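
The two tweaks invite two standard preprocessing steps: filling in missing numeric values and encoding a categorical column. The sketch below shows one common choice for each (median imputation and one-hot encoding) using only the standard library. It is not necessarily the approach taken in the book's notebooks, and the rows and category labels are invented for illustration.

```python
from statistics import median

# Invented rows; None marks a value removed from total_bedrooms.
# The category labels are illustrative stand-ins for ocean_proximity values.
rows = [
    {"total_bedrooms": 300.0, "ocean_proximity": "INLAND"},
    {"total_bedrooms": None, "ocean_proximity": "NEAR BAY"},
    {"total_bedrooms": 500.0, "ocean_proximity": "INLAND"},
    {"total_bedrooms": 400.0, "ocean_proximity": "ISLAND"},
]


def impute_median(records, column):
    """Replace missing values in a column with the median of the known values."""
    known = [r[column] for r in records if r[column] is not None]
    fill = median(known)
    return [{**r, column: fill if r[column] is None else r[column]} for r in records]


def one_hot(records, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in records})
    encoded = []
    for r in records:
        new_row = {k: v for k, v in r.items() if k != column}
        for category in categories:
            new_row[f"{column}={category}"] = 1 if r[column] == category else 0
        encoded.append(new_row)
    return encoded


prepared = one_hot(impute_median(rows, "total_bedrooms"), "ocean_proximity")
for row in prepared:
    print(row)
```

Dropping the affected rows or the whole column are alternative ways of handling the missing values; the median is used here only because it is robust to outliers.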
Median Sales Price of Houses Sold for the United States
fred.stlouisfed.org
json
Updated Jul 24, 2025
Cite
(2025). Median Sales Price of Houses Sold for the United States [Dataset]. https://fred.stlouisfed.org/series/MSPUS
Explore at:
Available download formats: json
Dataset updated
Jul 24, 2025
License
https://fred.stlouisfed.org/legal/#copyright-public-domain
Area covered
United States
Description
Graph and download economic data for Median Sales Price of Houses Sold for the United States (MSPUS) from Q1 1963 to Q2 2025 about sales, median, housing, and USA.
Wildfire Risk to Communities Housing Unit Density (Image Service)
catalog.data.gov
resilience.climate.gov
Updated Apr 21, 2025
Cite
U.S. Forest Service (2025). Wildfire Risk to Communities Housing Unit Density (Image Service) [Dataset]. https://catalog.data.gov/dataset/wildfire-risk-to-communities-housing-unit-density-image-service-fac22
Explore at:
Dataset updated
Apr 21, 2025
Dataset provided by
U.S. Department of Agriculture Forest Service (http://fs.fed.us/)
Description
The data included in this publication depict components of wildfire risk specifically for populated areas in the United States. These datasets represent where people live in the United States and the in situ risk from wildfire, i.e., the risk at the location where the adverse effects take place.
National wildfire hazard datasets of annual burn probability and fire intensity, generated by the USDA Forest Service, Rocky Mountain Research Station and Pyrologix LLC, form the foundation of the Wildfire Risk to Communities data. Vegetation and wildland fuels data from LANDFIRE 2020 (version 2.2.0) were used as input to two different but related geospatial fire simulation systems. Annual burn probability was produced with the USFS geospatial fire simulator (FSim) at a relatively coarse cell size of 270 meters (m). To bring the burn probability raster data down to a finer resolution more useful for assessing hazard and risk to communities, we upsampled them to the native 30 m resolution of the LANDFIRE fuel and vegetation data. In this upsampling process, we also spread values of modeled burn probability into developed areas represented in LANDFIRE fuels data as non-burnable. Burn probability rasters represent landscape conditions as of the end of 2020.
Fire intensity characteristics were modeled at 30 m resolution using a process that performs a comprehensive set of FlamMap runs spanning the full range of weather-related characteristics that occur during a fire season and then integrates those runs into a variety of results based on the likelihood of those weather types occurring. Before the fire intensity modeling, the LANDFIRE 2020 data were updated to reflect fuels disturbances occurring in 2021 and 2022. As such, the fire intensity datasets represent landscape conditions as of the end of 2022.
The data products in this publication that represent where people live, reflect 2021 estimates of housing unit and population counts from the U.S. Census Bureau, combined with building footprint data from Onegeo and USA Structures, both reflecting 2022 conditions.
The specific raster datasets included in this publication include:
Building Count: Building Count is a 30-m raster representing the count of buildings in the building footprint dataset located within each 30-m pixel.
Building Density: Building Density is a 30-m raster representing the density of buildings in the building footprint dataset (buildings per square kilometer [km²]).
Building Coverage: Building Coverage is a 30-m raster depicting the percentage of habitable land area covered by building footprints.
Population Count (PopCount): PopCount is a 30-m raster with pixel values representing residential population count (persons) in each pixel.
Population Density (PopDen): PopDen is a 30-m raster of residential population density (people/km²).
Housing Unit Count (HUCount): HUCount is a 30-m raster representing the number of housing units in each pixel.
Housing Unit Density (HUDen): HUDen is a 30-m raster of housing-unit density (housing units/km²).
Housing Unit Exposure (HUExposure): HUExposure is a 30-m raster that represents the expected number of housing units within a pixel potentially exposed to wildfire in a year. This is a long-term annual average and not intended to represent the actual number of housing units exposed in any specific year.
Housing Unit Impact (HUImpact): HUImpact is a 30-m raster that represents the relative potential impact of fire to housing units at any pixel, if a fire were to occur. It is an index that incorporates the general consequences of fire on a home as a function of fire intensity and uses flame length probabilities from wildfire modeling to capture likely intensity of fire.
Housing Unit Risk (HURisk): HURisk is a 30-m raster that integrates all four primary elements of wildfire risk - likelihood, intensity, susceptibility, and exposure - on pixels where housing unit density is greater than zero.
Additional methodology documentation is provided with the data publication download. Metadata and Downloads.
Note: Pixel values in this image service have been altered from the original raster dataset due to data requirements in web services. The service is intended primarily for data visualization. Relative values and spatial patterns have been largely preserved in the service, but users are encouraged to download the source data for quantitative analysis.
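The relationships between the rasters described above can be illustrated with a small sketch. It is not the publication's method. The description does not give formulas, so the following are assumptions made only for illustration: nearest-neighbour replication as the upsampling rule, a simple product of housing-unit count and annual burn probability as the "expected number of housing units exposed", and a naive per-pixel count-to-density conversion (the published density rasters may be smoothed over a neighbourhood). The function names are hypothetical, and the spreading of burn probability into developed areas is not modelled.

```python
PIXEL_SIZE_M = 30.0  # pixel edge length stated in the description
PIXEL_AREA_KM2 = (PIXEL_SIZE_M / 1000.0) ** 2  # roughly 0.0009 km² per pixel


def per_pixel_density(count):
    """Naive density (units per km²) implied by a count inside one 30 m pixel."""
    return count / PIXEL_AREA_KM2


def expected_exposure(housing_units, burn_probability):
    """Expected housing units exposed per year, ASSUMING a simple product form."""
    if not 0.0 <= burn_probability <= 1.0:
        raise ValueError("burn_probability must be between 0 and 1")
    return housing_units * burn_probability


def upsample_nearest(grid, factor=9):
    """Replicate each coarse cell into factor x factor fine cells.

    A factor of 9 maps a 270 m cell onto 30 m cells. Nearest-neighbour
    replication is an assumption; the source does not name the method.
    """
    fine = []
    for row in grid:
        fine_row = [value for value in row for _ in range(factor)]
        fine.extend(list(fine_row) for _ in range(factor))
    return fine


# One coarse row with two 270 m cells becomes a 9 x 18 block of 30 m cells.
fine_bp = upsample_nearest([[0.01, 0.02]])
exposure = expected_exposure(housing_units=4, burn_probability=fine_bp[0][0])
```

Under these assumptions, a pixel holding 4 housing units with an annual burn probability of 0.01 contributes 0.04 expected exposed housing units per year, which reads as a long-term average and not as a count for any particular year.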
HOUSING STARTS by Country Dataset
tradingeconomics.com
csv, excel, json, xml
Updated Sep 28, 2013
Cite
TRADING ECONOMICS (2013). HOUSING STARTS by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/housing-starts
Explore at:
Available download formats: csv, json, excel, xml
Dataset updated
Sep 28, 2013
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
World
Description
This dataset provides values for HOUSING STARTS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

Cite

ICT Institute (2025). Utrecht housing dataset [Dataset]. https://www.kaggle.com/datasets/ictinstitute/utrecht-housing-dataset

Utrecht housing dataset

Predict house prices based on location, size and other factors

Explore at:

Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated

Jan 27, 2025

Dataset provided by

Kaggle http://kaggle.com/

Authors

ICT Institute

License

Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically

Area covered

Utrecht

Description

The Utrecht housing dataset is a freely available dataset that can be used by students to learn about data science and machine learning. The older versions are synthetic datasets. The latest version is an actual dataset based on data collected from a house offering website (Funda) and official land registry (Kadaster).

This dataset is described in the following accompanying paper: Van Otterloo, S. and Burda, P. 2025. The Utrecht Housing dataset: A housing appraisal dataset. Computers and Society Research Journal (2025), 1. The paper can be downloaded here: https://ictinstitute.nl/utrecht-housing-dataset-2025/

History

In July 2022, Stefan Leijnen and Sieuwert van Otterloo taught a one-week summer school, ‘AI and machine learning’, at the Utrecht University of Applied Sciences. The goal of this summer school is to make AI and machine learning accessible to as many people as possible. Using AI without properly understanding it comes with risks. We want to reduce these risks by giving students from all backgrounds the tools and knowledge to understand AI. Luckily, AI has become more accessible thanks to the many free and open tools and libraries now available. Any student can train and test algorithms with only a few days of training.

The Utrecht Housing Dataset was designed for use during day 1, day 2 and day 3. The dataset has several input variables that are interesting to explore, and its size makes it well suited to visualisations. The dataset represents one of the core tenets of responsible AI: AI should be made accessible to a wide group of people, so that anyone with some university experience can test and evaluate algorithms.

When developing the summer school, we could not find a dataset that was both interesting to analyse and easy to use. Existing datasets often have data quality issues that distract from the learning goals, or are only suited for illustrating one phenomenon. Many classical machine learning datasets also lack meaningful tasks: the problems one can solve with them are either too basic or too theoretical. The Utrecht Housing Dataset thus offers a new combination that we found useful in our classroom.

The dataset is released under a Creative Commons license and can be used freely for any purpose. If you use it, please refer to it as “The Utrecht housing dataset – example dataset for prediction” by Sieuwert van Otterloo, www.ictinstitute.nl, or refer to Sieuwert van Otterloo as the author/source.

The dataset is provided as a CSV file. Each line contains data for one house. The values are separated by commas.
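As a minimal sketch of reading a file laid out this way (one house per line, comma-separated values), the example below uses Python's standard `csv` module. The column names `id`, `house_area` and `price` and the two sample rows are hypothetical and chosen only for illustration; the real file's header and values may differ.

```python
import csv
import io

# Hypothetical sample; the real file's column names and values may differ.
SAMPLE_CSV = """id,house_area,price
1,120,350000
2,85,275000
"""


def load_houses(text_stream):
    """Parse a comma-separated text stream into one dict per house."""
    reader = csv.DictReader(text_stream)
    return [dict(row) for row in reader]


houses = load_houses(io.StringIO(SAMPLE_CSV))

# csv yields strings, so numeric columns are converted explicitly.
mean_price = sum(float(house["price"]) for house in houses) / len(houses)
print(len(houses), mean_price)
```

For a downloaded file, the same function could be called on `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.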

