https://creativecommons.org/publicdomain/zero/1.0/
A simple yet challenging project: predict housing prices from factors such as house area, number of bedrooms, furnishing status, and proximity to the main road. The dataset is small, yet its complexity arises from strong multicollinearity. Can you overcome these obstacles and build a decent predictive model?
Harrison, D. and Rubinfeld, D.L. (1978) Hedonic prices and the demand for clean air. J. Environ. Economics and Management 5, 81–102. Belsley D.A., Kuh, E. and Welsch, R.E. (1980) Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. New York: Wiley.
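One common way to quantify the multicollinearity mentioned in the description is the variance inflation factor (VIF). The following is a minimal sketch on synthetic data, not on the dataset's actual values; the feature names `area`, `bedrooms`, and `parking` and the helper `variance_inflation_factors` are illustrative assumptions.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows are observations)."""
    X = np.asarray(X, dtype=float)
    # The VIF of feature j equals the j-th diagonal entry of the
    # inverse of the feature correlation matrix.
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


# Synthetic features (assumption): bedrooms is built to track area,
# while parking is drawn independently of both.
rng = np.random.default_rng(0)
area = rng.uniform(500, 5000, size=200)
bedrooms = np.round(area / 1000 + rng.normal(0, 0.3, size=200))
parking = rng.integers(0, 3, size=200).astype(float)

vifs = variance_inflation_factors(np.column_stack([area, bedrooms, parking]))
for name, vif in zip(["area", "bedrooms", "parking"], vifs):
    print(f"{name}: VIF = {vif:.2f}")
```

A VIF near 1 suggests a feature is nearly uncorrelated with the others; values well above 5 to 10 are often read as a sign of problematic collinearity, although that threshold is a rule of thumb.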
House prices grew year-on-year in most states in the U.S. in the first quarter of 2025. Hawaii was the only exception, with a decline of **** percent. The annual appreciation for single-family housing in the U.S. was **** percent, while in Rhode Island, the state where homes appreciated the most, the increase was ****** percent.
How have home prices developed in recent years?
House price growth in the U.S. has been going strong for years. In 2025, the median sales price of a single-family home exceeded ******* U.S. dollars, up from ******* U.S. dollars five years ago. One of the factors driving house prices was the cost of credit. The record-low federal funds effective rate allowed mortgage lenders to set mortgage interest rates as low as *** percent. With interest rates on the rise, home buying has also slowed, causing fluctuations in house prices.
Why are house prices growing?
Many markets in the U.S. are overheated because supply has not been able to keep up with demand. How many new homes enter the housing market depends on construction output, whereas the availability of existing homes for purchase depends on many other factors, such as the willingness of owners to sell. Furthermore, growing investor appetite in the housing sector means that prospective homebuyers have some extra competition to worry about. In certain metros, for example, the share of homes bought by investors exceeded ** percent in 2025.
https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains 2000 rows of house-related data, representing various features that could influence house prices. Below, we discuss key aspects of the dataset, which include its structure, the choice of features, and potential use cases for analysis.
The dataset is designed to capture essential attributes for predicting house prices, including:
Area: Square footage of the house, which is generally one of the most important predictors of price.
Bedrooms & Bathrooms: The number of rooms in a house significantly affects its value. Homes with more rooms tend to be priced higher.
Floors: The number of floors in a house could indicate a larger, more luxurious home, potentially raising its price.
Year Built: The age of the house can affect its condition and value. Newly built houses are generally more expensive than older ones.
Location: Houses in desirable locations such as downtown or urban areas tend to be priced higher than those in suburban or rural areas.
Condition: The current condition of the house is critical, as well-maintained houses (in 'Excellent' or 'Good' condition) will attract higher prices compared to houses in 'Fair' or 'Poor' condition.
Garage: Availability of a garage can increase the price due to added convenience and space.
Price: The target variable, representing the sale price of the house, used to train machine learning models to predict house prices based on the other features.
Area Distribution: The area of the houses in the dataset ranges from 500 to 5000 square feet, which allows analysis across different types of homes, from smaller apartments to larger luxury houses.
Bedrooms and Bathrooms: The number of bedrooms varies from 1 to 5, and bathrooms from 1 to 4. This variance enables analysis of homes with different sizes and layouts.
Floors: Houses in the dataset have between 1 and 3 floors. This feature could be useful for identifying the influence of multi-level homes on house prices.
Year Built: The dataset contains houses built from 1900 to 2023, giving a wide range of house ages to analyze the effects of new vs. older construction.
Location: There is a mix of urban, suburban, downtown, and rural locations. Urban and downtown homes may command higher prices due to proximity to amenities.
Condition: Houses are labeled as 'Excellent', 'Good', 'Fair', or 'Poor'. This feature helps model the price differences based on the current state of the house.
Price Distribution: Prices range between $50,000 and $1,000,000, offering a broad spectrum of property values. This range makes the dataset appropriate for predicting a wide variety of housing prices, from affordable homes to luxury properties.
3. Correlation Between Features
A key area of interest is the relationship between various features and house price:
Area and Price: Typically, a strong positive correlation is expected between the size of the house (Area) and its price. Larger homes are likely to be more expensive.
Location and Price: Location is another major factor. Houses in urban or downtown areas may show a higher price on average compared to suburban and rural locations.
Condition and Price: The condition of the house should show a positive correlation with price. Houses in better condition should be priced higher, as they require less maintenance and repair.
Year Built and Price: Newer houses might command a higher price due to better construction standards, modern amenities, and less wear-and-tear, but some older homes in good condition may retain historical value.
Garage and Price: A house with a garage may be more expensive than one without, as it provides extra storage or parking space.
The dataset is well-suited for various machine learning and data analysis applications, including:
House Price Prediction: Using regression techniques, this dataset can be used to build a model to predict house prices based on the available features.
Feature Importance Analysis: By using techniques such as feature importance ranking, data scientists can determine which features (e.g., location, area, or condition) have the greatest impact on house prices.
Clustering: Clustering techniques like k-means could help identify patterns in the data, such as grouping houses into segments based on their characteristics (e.g., luxury homes, affordable homes).
Market Segmentation: The dataset can be used to perform segmentation by location, price range, or house type to analyze trends in specific sub-markets, like luxury vs. affordable housing.
Time-Based Analysis: By studying how house prices vary with the year built or the age of the house, analysts can derive insights into the trends of older vs. newer homes.
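The house price prediction use case can be sketched with an ordinary least-squares fit. This is only an illustration under stated assumptions: the data below is synthetic and merely drawn from the ranges quoted in the description, the price-generating rule and its `premium` values are hypothetical, and the location labels are written as assumed here rather than taken from the actual file.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Synthetic features drawn from the ranges quoted in the description.
area = rng.uniform(500, 5000, n)              # square feet
bedrooms = rng.integers(1, 6, n)              # 1 to 5 bedrooms
locations = np.array(["Downtown", "Urban", "Suburban", "Rural"])
location = rng.choice(locations, n)

# Hypothetical price-generating rule, used only so there is a signal to fit.
premium = {"Downtown": 80_000, "Urban": 50_000, "Suburban": 20_000, "Rural": 0}
price = (150 * area
         + 10_000 * bedrooms
         + np.array([premium[loc] for loc in location])
         + rng.normal(0, 20_000, n))

# One-hot encode location, dropping "Rural" as the baseline category so the
# dummy columns are not collinear with the intercept.
dummies = np.column_stack(
    [(location == loc).astype(float) for loc in locations[:3]]
)
X = np.column_stack([np.ones(n), area, bedrooms, dummies])

# Ordinary least squares: coef = [intercept, area, bedrooms, 3 location dummies].
coef, *_ = np.linalg.lstsq(X, price, rcond=None)
pred = X @ coef
r2 = 1 - np.sum((price - pred) ** 2) / np.sum((price - price.mean()) ** 2)

print("price per extra square foot:", round(coef[1], 1))
print("R^2 on the training data:", round(r2, 3))
```

On real data, a held-out test split would be needed before reading anything into the fit quality; the in-sample R^2 here only confirms that the sketch recovers the rule it was generated from.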
The U.S. housing market has slowed, after ** consecutive years of rising home prices. In 2021, house prices surged by an unprecedented ** percent, marking the highest increase on record. However, the market has since cooled, with the Freddie Mac House Price Index showing more modest growth between 2022 and 2024. In 2024, home prices increased by *** percent. That was lower than the long-term average of *** percent since 1990.
Impact of mortgage rates on homebuying
The recent cooling in the housing market can be partly attributed to rising mortgage rates. After reaching a record low of **** percent in 2021, the average annual rate on a 30-year fixed-rate mortgage more than doubled in 2023. This significant increase has made homeownership less affordable for many potential buyers, contributing to a substantial decline in home sales. Despite these challenges, forecasts suggest a potential recovery in the coming years.
How much does it cost to buy a house in the U.S.?
In 2023, the median sales price of an existing single-family home reached a record high of over ******* U.S. dollars. Newly built homes were even pricier, despite a slight decline in the median sales price in 2023. Naturally, home prices continue to vary significantly across the country, with West Virginia being the most affordable state for homebuyers.
Home prices in the U.S. reach new heights
The American housing market continues to show remarkable resilience, with the S&P/Case Shiller U.S. National Home Price Index reaching an all-time high of 331.69 in June 2025. This figure represents a significant increase from the index value of 166.23 recorded in January 2015, highlighting the substantial growth in home prices over the past decade. The S&P Case Shiller National Home Price Index is based on the prices of single-family homes and is the leading indicator of the American housing market and one of the indicators of the state of the broader economy. The S&P Case Shiller National Home Price Index series also includes the S&P/Case Shiller 20-City Composite Home Price Index and the S&P/Case Shiller 10-City Composite Home Price Index, which measure home price changes in the major U.S. metropolitan areas, as well as twenty composite indices for the leading U.S. cities.
Market fluctuations and recovery
Despite the overall upward trend, the housing market has experienced some fluctuations in recent years. During the housing boom in 2021, the number of existing home sales reached the highest level since 2006. However, transaction volumes quickly plummeted, as soaring interest rates and out-of-reach prices led to deteriorating housing sentiment.
Factors influencing home prices
Several factors have contributed to the rise in home prices, including a chronic supply shortage, the gradual decline in interest rates, and the spike in demand during the COVID-19 pandemic. During the subprime mortgage crisis (2007-2010), the construction of new homes declined dramatically. Although it has gradually increased since then, the numbers of new building permits, home starts, and completions still fall short of the levels before the crisis. With demand outweighing supply, competition for homes can be fierce, leading to bidding wars and soaring prices. The supply of existing homes is further constrained, as homeowners are less likely to sell and move homes due to the worsened lending conditions.
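To put the two index values quoted above in perspective, the total and annualized growth between January 2015 and June 2025 can be worked out directly. This is a small sketch using only the two figures from the text; treating the span as 10 years and 5 months is an assumption about how the period is counted.

```python
# Index values quoted in the text.
start_value = 166.23   # January 2015
end_value = 331.69     # June 2025

# Assumption: January 2015 to June 2025 is counted as 10 years and 5 months.
years = 10 + 5 / 12

# Total growth over the whole period.
total_growth = end_value / start_value - 1

# Compound annual growth rate implied by the same two endpoints.
annualized_growth = (end_value / start_value) ** (1 / years) - 1

print(f"Total growth: {total_growth:.1%}")
print(f"Annualized growth: {annualized_growth:.2%}")
```

The index roughly doubled over the period, which corresponds to compound growth of somewhat under 7 percent per year.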
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides insights into the global housing market, covering various economic factors from 2015 to 2024. It includes details about property prices, rental yields, interest rates, and household income across multiple countries. This dataset is ideal for real estate analysis, financial forecasting, and market trend visualization.
| Column Name | Description |
|---|---|
| Country | The country where the housing market data is recorded 🌍 |
| Year | The year of observation 📅 |
| Average House Price ($) | The average price of houses in USD 💰 |
| Median Rental Price ($) | The median monthly rent for properties in USD 🏠 |
| Mortgage Interest Rate (%) | The average mortgage interest rate percentage 📉 |
| Household Income ($) | The average annual household income in USD 🏡 |
| Population Growth (%) | The percentage increase in population over the year 👥 |
| Urbanization Rate (%) | Percentage of the population living in urban areas 🏙️ |
| Homeownership Rate (%) | The percentage of people who own their homes 🔑 |
| GDP Growth Rate (%) | The annual GDP growth percentage 📈 |
| Unemployment Rate (%) | The percentage of unemployed individuals in the labor force 💼 |
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
House Price Index YoY in the United States decreased to 1.70 percent in September from 2.40 percent in August of 2025. This dataset includes a chart with historical data for the United States FHFA House Price Index YoY.
Global house prices experienced a significant shift in 2022, with advanced economies seeing a notable decline after a prolonged period of growth. The real house price index (adjusted for inflation) for advanced economies peaked at nearly *** index points in early 2022 before falling to around ***** points by the second quarter of 2023. In the second quarter of 2025, the index reached ***** points. This represents a reversal of the upward trend that had characterized the housing market for roughly a decade. Likewise, real house prices in emerging economies declined after reaching a high of ***** points in the third quarter of 2021.
What is behind the slowdown?
Inflation and slow economic growth have been the primary drivers of the cooling of the housing market. Second, the growing gap between incomes and house prices since 2012 has decreased the affordability of homeownership. Last but not least, homebuyers in 2024 faced dramatically higher mortgage interest rates, further contributing to worsening sentiment and declining transactions.
Some markets continue to grow
While many countries witnessed a deceleration in house price growth in 2022, some markets continued to see substantial increases. Turkey, in particular, stood out with a nominal increase in house prices of over ** percent in the first quarter of 2025. Other countries that recorded double-digit growth include North Macedonia and Russia. When accounting for inflation, the three countries with the fastest-growing residential prices in early 2025 were North Macedonia, Portugal, and Bulgaria.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for All-Transactions House Price Index for the United States (USSTHPI) from Q1 1975 to Q3 2025 about appraisers, HPI, housing, price index, indexes, price, and USA.
The average price per square foot of floor space in new single-family housing in the United States decreased after the great financial crisis, followed by several years of stagnation. Since 2012, the price has continuously risen, hitting ****** U.S. dollars per square foot in 2024. In 2024, the average sales price of a new home exceeded ******* U.S. dollars.
Development of house sales in the U.S.
One of the reasons for rising property prices is the gradual growth of house sales between 2011 and 2020. This period was marked by the gradual recovery following the subprime mortgage crisis and improving housing sentiment. Another significant factor in housing demand was the growing number of new household formations each year. Despite this trend, housing transactions plummeted in 2021, amid soaring prices and borrowing costs. In 2021, the average construction cost for single-family housing rose by nearly ** percent year-on-year, and in 2022, the increase was even higher, at close to ** percent.
Financing a house purchase
Mortgage interest rates in the U.S. rose dramatically in 2022 and remained elevated until 2024. In 2020, a homebuyer could lock in a 30-year fixed interest rate of under ***** percent, whereas in 2024, the average rate for the same mortgage type was more than twice as high. That has led to a decline in homebuyer sentiment and an increasing share of the population that is pessimistic about buying a home in the current market.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Interest Rates and Price Indexes; Multi-Family Real Estate Apartment Price Index, Level (BOGZ1FL075035403A) from 1985 to 2024 about multifamily, real estate, family, interest rate, interest, rate, price index, indexes, price, and USA.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains a comprehensive collection of indicators which dictate the housing prices in the United States.
In the United States, interest rates for all mortgage types started to increase in 2021. This was due to the Federal Reserve introducing a series of hikes in the federal funds rate to contain rising inflation. In the second quarter of 2025, the 30-year fixed rate dropped slightly, to **** percent. The rate remained below the peak of **** percent in the fourth quarter of 2023.
Why have U.S. home sales decreased?
Cheaper mortgages normally encourage consumers to buy homes, while higher borrowing costs have the opposite effect. As interest rates increased in 2022, the number of existing homes sold plummeted. Soaring house prices over the past 10 years have further affected housing affordability. Between 2014 and 2024, the median price of an existing single-family home rose by about ** percent. On the other hand, median weekly earnings have risen much more slowly.
Comparing mortgage terms and rates
Between 2008 and 2024, the average rate on a 15-year fixed-rate mortgage in the United States stood between **** and **** percent. Over the same period, a 30-year mortgage term averaged a fixed rate of between **** and **** percent. Rates on 15-year loan terms are lower to encourage quicker repayment, which helps to improve a homeowner's equity.
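The effect of borrowing costs on affordability can be illustrated with the standard fixed-rate amortization formula. This is a rough sketch: the rates in the text above are masked, so the 3 and 7 percent rates and the 300,000 dollar loan below are hypothetical values chosen purely for illustration, and `monthly_payment` is an illustrative helper name.

```python
def monthly_payment(principal, annual_rate_pct, years=30):
    """Monthly payment on a fully amortizing fixed-rate loan."""
    n = years * 12                      # number of monthly payments
    r = annual_rate_pct / 100 / 12      # monthly interest rate
    if r == 0:
        return principal / n            # no interest: repay principal evenly
    # Standard annuity formula: P * r / (1 - (1 + r)^-n)
    return principal * r / (1 - (1 + r) ** -n)


# Hypothetical loan amount and rates (not taken from the text).
loan = 300_000
low = monthly_payment(loan, 3.0)
high = monthly_payment(loan, 7.0)

print(f"30-year payment at 3%: {low:,.2f}")
print(f"30-year payment at 7%: {high:,.2f}")
print(f"Increase: {high / low - 1:.0%}")
```

With these assumed inputs, more than doubling the rate raises the monthly payment by well over half, which is one way to see why sales volumes react so strongly to rate changes.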
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Average Sales Price of Houses Sold for the United States (ASPUS) from Q1 1963 to Q2 2025 about sales, housing, and USA.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Average House Prices in Canada increased to 688800 CAD in October from 687600 CAD in September of 2025. This dataset includes a chart with historical data for Canada Average House Prices.
Mortgage interest rates in Czechia have experienced significant fluctuations over the past few years, reaching a peak of nearly *** percent in December 2022 before gradually declining. As of March 2025, the interest rate on new mortgages in the country amounted to **** percent, showing a slight decrease from the previous month. This trend in mortgage rates has occurred alongside substantial increases in housing prices.
Housing market dynamics
The changes in mortgage rates have gone hand in hand with notable shifts in the Czech housing market. Despite the high interest rates, new mortgage lending reached over 18 million Czech koruna in December 2024, marking a significant increase from the same month in the previous year. This growth in lending has continued despite the steady rise in housing prices, with the house price index reaching ***** in the third quarter of 2024. This marks a significant increase from the 2015 baseline, reflecting the ongoing upward trend. The average purchase price per square meter for family houses has been increasing across the country. In 2023, Prague recorded the highest average price at ******* Czech koruna per square meter.
Construction sector trends
The construction sector in Czechia has responded to these market conditions. The index of multi-dwelling building construction fluctuated recently, with 2024 showing a slight decrease to **** index points compared to the previous year. However, construction of non-residential buildings has been growing continuously since 2018, with hotels and industrial buildings accounting for the majority of new non-residential construction.
House prices in the UK rose dramatically during the coronavirus pandemic, with growth slowing down in 2022 and turning negative in 2023. The year-on-year annual house price change peaked at 14 percent in July 2022. In April 2025, house prices increased by 3.5 percent. As of late 2024, the average house price was close to 290,000 British pounds.
Correction in housing prices: a European phenomenon
The trend of a growing residential real estate market was not exclusive to the UK during the pandemic. Likewise, many European countries experienced falling prices in 2023. When comparing the residential property RHPI (the price index in real terms, i.e., corrected for inflation), countries such as Germany, France, Italy, and Spain also saw prices decline. Sweden, one of the countries with the fastest-growing residential markets, saw one of the largest declines in prices.
How has demand for UK housing changed since the outbreak of the coronavirus?
The easing of the lockdown was followed by a dramatic increase in home sales. In November 2020, the number of mortgage approvals reached an all-time high of over 107,000. One of the reasons for the housing boom was the low mortgage rates, which allowed home buyers to take out a loan with an interest rate as low as 2.5 percent. That changed as the Bank of England started to raise the base lending rate, resulting in higher borrowing costs and a decline in homebuyer sentiment.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Compare UK interest rates and mortgage rates alongside house prices. Interactive charts showing the Bank of England base rate versus 2-year, 5-year, and SVR mortgage rates, with historical HPI trends.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Interest Rates and Price Indexes; Multi-Family Real Estate Apartment Price Index, Level (BOGZ1FL075035403Q) from Q4 1985 to Q2 2025 about multifamily, real estate, family, interest rate, interest, rate, price index, indexes, price, and USA.
About the dataset (cleaned data)
The dataset (parquet file) contains approximately 1.5 million residential household sales from Denmark during the period from 1992 to 2024. All cleaned data is merged into one parquet file here on Kaggle. Note that some cleaning might still be necessary; see the notebook under code.
Also, added a random sample (100k) of the dataset as a csv file.
Done in Python version: 2.6.3.
Raw data
Raw data and more info are available in the GitHub repository: https://github.com/MartinSamFred/Danish-residential-housingPrices-1992-2024.git
The dataset has been scraped and cleaned (to some extent). Cleaned files are located in: \Housing_data_cleaned \ named DKHousingprices_1 and 2. Saved in parquet format (and saved as two files due to size).
Cleaning from raw files to above cleaned files is outlined in BoligsalgConcatCleanigGit.ipynb. (done in Python version: 2.6.3)
Webscraping script: Webscrape_script.ipynb (done in Python version: 2.6.3)
Provided you want to clean raw files from scratch yourself:
Uncleaned scraped files (81 in total) are located in \Housing_data_raw \ Housing_data_batch1 and 2. Saved in .csv format and compressed as 7-zip files.
Additional files added/appended to the Cleaned files are located in \Addtional_data and named DK_inflation_rates, DK_interest_rates, DK_morgage_rates and DK_regions_zip_codes. Saved in .xlsx format.
Content
Each row in the dataset contains a residential household sale during the period 1992 - 2024.
“Cleaned files” columns:
0 'date': is the transaction date
1 'quarter': is the quarter based on a standard calendar year
2 'house_id': unique house id (could be dropped)
3 'house_type': can be 'Villa', 'Farm', 'Summerhouse', 'Apartment', 'Townhouse'
4 'sales_type': can be 'regular_sale', 'family_sale', 'other_sale', 'auction', '-' (“-“ could be dropped)
5 'year_build': range 1000 to 2024 (could be narrowed more)
6 'purchase_price': is purchase price in DKK
7 '%_change_between_offer_and_purchase': could differ negatively, be zero or positive
8 'no_rooms': number of rooms
9 'sqm': number of square meters
10 'sqm_price': 'purchase_price' divided by 'sqm'
11 'address': is the address
12 'zip_code': is the zip code
13 'city': is the city
14 'area': 'East & mid jutland', 'North jutland', 'Other islands', 'Capital, Copenhagen', 'South jutland', 'North Zealand', 'Fyn & islands', 'Bornholm'
15 'region': 'Jutland', 'Zealand', 'Fyn & islands', 'Bornholm'
16 'nom_interest_rate%': Danish nominal interest rate, shown per quarter; however, the rate itself is not converted from annualized to quarterly
17 'dk_ann_infl_rate%': Danish annual inflation rate, shown per quarter; however, the rate itself is not converted from annualized to quarterly
18 'yield_on_mortgage_credit_bonds%': 30 year mortgage bond rate (without spread)
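Two of the columns above lend themselves to a short worked example: recomputing a price per square meter from 'purchase_price' and 'sqm', and converting an annualized rate to a quarterly one. This is a minimal sketch on made-up rows, not on the actual parquet file; the values and the new column name `nom_interest_rate_quarterly%` are assumptions, and compound conversion is only one convention (dividing by four is a common approximation).

```python
import pandas as pd

# Made-up rows using column names from the list above (values are hypothetical).
df = pd.DataFrame({
    "purchase_price": [2_500_000, 1_200_000],   # DKK
    "sqm": [125, 80],
    "nom_interest_rate%": [4.0, 1.5],           # annualized, in percent
})

# Price per square meter: purchase price divided by floor area.
df["sqm_price"] = df["purchase_price"] / df["sqm"]

# Compound conversion from an annualized rate to a quarterly rate:
# (1 + annual)^(1/4) - 1, expressed in percent.
df["nom_interest_rate_quarterly%"] = (
    (1 + df["nom_interest_rate%"] / 100) ** 0.25 - 1
) * 100

print(df)
```

The same conversion could be applied to 'dk_ann_infl_rate%' if quarterly inflation figures are needed.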
Uses
Various (statistical) analysis, visualisation and I assume machine learning as well.
Practice exercises etc.
Uncleaned scraped files are great for practicing cleaning, especially string cleaning. I’m not an expert, as seen in the coding ;-).
Disclaimer
The data and information in the data set provided here are intended to be used primarily for educational purposes. I do not own any data, and all rights are reserved to the respective owners as outlined in “Acknowledgements/sources”. The accuracy of the dataset is not guaranteed; accordingly, any analysis and/or conclusions are solely the user's own responsibility and accountability.
Acknowledgements/sources
All data is publicly available on:
Boliga: https://www.boliga.dk/
Finans Danmark: https://finansdanmark.dk/
Danmarks Statistik: https://www.dst.dk/da
Statistikbanken: https://statistikbanken.dk/statbank5a/default.asp?w=2560
Macrotrends: https://www.macrotrends.net/
PostNord: https://www.postnord.dk/
World Data: https://www.worlddata.info/
Dataset picture / cover photo: Nick Karvounis (https://unsplash.com/)
Have fun… :-)