55 datasets found
  1. House Price Prediction Dataset

    • kaggle.com
    zip
    Updated Sep 21, 2024
    Cite
    Zafar (2024). House Price Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/zafarali27/house-price-prediction-dataset
    Available download formats: zip (29372 bytes)
    Dataset updated
    Sep 21, 2024
    Authors
    Zafar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    House Price Prediction Dataset.

    The dataset contains 2000 rows of house-related data, representing various features that could influence house prices. Below, we discuss key aspects of the dataset, which include its structure, the choice of features, and potential use cases for analysis.

    1. Dataset Features

    The dataset is designed to capture essential attributes for predicting house prices, including:

    • Area: Square footage of the house, which is generally one of the most important predictors of price.
    • Bedrooms & Bathrooms: The number of rooms in a house significantly affects its value. Homes with more rooms tend to be priced higher.
    • Floors: The number of floors in a house could indicate a larger, more luxurious home, potentially raising its price.
    • Year Built: The age of the house can affect its condition and value. Newly built houses are generally more expensive than older ones.
    • Location: Houses in desirable locations such as downtown or urban areas tend to be priced higher than those in suburban or rural areas.
    • Condition: The current condition of the house is critical, as well-maintained houses (in 'Excellent' or 'Good' condition) will attract higher prices compared to houses in 'Fair' or 'Poor' condition.
    • Garage: Availability of a garage can increase the price due to added convenience and space.
    • Price: The target variable, representing the sale price of the house, used to train machine learning models to predict house prices based on the other features.

    2. Feature Distributions

    • Area Distribution: The area of the houses in the dataset ranges from 500 to 5000 square feet, which allows analysis across different types of homes, from smaller apartments to larger luxury houses.
    • Bedrooms and Bathrooms: The number of bedrooms varies from 1 to 5, and bathrooms from 1 to 4. This variance enables analysis of homes with different sizes and layouts.
    • Floors: Houses in the dataset have between 1 and 3 floors. This feature could be useful for identifying the influence of multi-level homes on house prices.
    • Year Built: The dataset contains houses built from 1900 to 2023, giving a wide range of house ages to analyze the effects of new vs. older construction.
    • Location: There is a mix of urban, suburban, downtown, and rural locations. Urban and downtown homes may command higher prices due to proximity to amenities.
    • Condition: Houses are labeled as 'Excellent', 'Good', 'Fair', or 'Poor'. This feature helps model the price differences based on the current state of the house.
    • Price Distribution: Prices range between $50,000 and $1,000,000, offering a broad spectrum of property values. This range makes the dataset appropriate for predicting a wide variety of housing prices, from affordable homes to luxury properties.

    3. Correlation Between Features

    A key area of interest is the relationship between various features and house price:

    • Area and Price: Typically, a strong positive correlation is expected between the size of the house (Area) and its price. Larger homes are likely to be more expensive.
    • Location and Price: Location is another major factor. Houses in urban or downtown areas may show a higher price on average compared to suburban and rural locations.
    • Condition and Price: The condition of the house should show a positive correlation with price. Houses in better condition should be priced higher, as they require less maintenance and repair.
    • Year Built and Price: Newer houses might command a higher price due to better construction standards, modern amenities, and less wear-and-tear, but some older homes in good condition may retain historical value.
    • Garage and Price: A house with a garage may be more expensive than one without, as it provides extra storage or parking space.
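
    The expected Area/Price relationship can be checked with a Pearson correlation. The sketch below is a minimal illustration only: it uses synthetic stand-in data (not the Kaggle file), and the names `pearson`, `area` and `price` are assumptions introduced here. Whether the real dataset shows the same pattern has to be verified against the data itself.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic stand-in data (an assumption, NOT the real dataset):
# price grows linearly with area, plus Gaussian noise.
rng = random.Random(0)
area = [rng.uniform(500, 5000) for _ in range(200)]
price = [150 * a + rng.gauss(0, 50_000) for a in area]

r_area_price = pearson(area, price)
print(f"corr(Area, Price) = {r_area_price:.2f}")
```

    A value close to +1 would support the "larger homes cost more" expectation; a value near 0 would suggest that area carries little price information in that particular file.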

    4. Potential Use Cases

    The dataset is well-suited for various machine learning and data analysis applications, including:

    • House Price Prediction: Using regression techniques, this dataset can be used to build a model to predict house prices based on the available features.
    • Feature Importance Analysis: By using techniques such as feature importance ranking, data scientists can determine which features (e.g., location, area, or condition) have the greatest impact on house prices.
    • Clustering: Clustering techniques like k-means could help identify patterns in the data, such as grouping houses into segments based on their characteristics (e.g., luxury homes, affordable homes).
    • Market Segmentation: The dataset can be used to perform segmentation by location, price range, or house type to analyze trends in specific sub-markets, like luxury vs. affordable housing.
    • Time-Based Analysis: By studying how house prices vary with the year built or the age of the house, analysts can derive insights into the trends of older vs. newer homes.
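
    As a rough illustration of the regression use case, the sketch below fits a one-variable least-squares line (price against area). It is a sketch under stated assumptions: the data are synthetic, the helper name `fit_line` is hypothetical, and a realistic model would use all features and a held-out test set.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic stand-in data (an assumption, NOT the real dataset).
rng = random.Random(42)
area = [rng.uniform(500, 5000) for _ in range(300)]
price = [50_000 + 180 * a + rng.gauss(0, 40_000) for a in area]

intercept, slope = fit_line(area, price)

# Predicted price for a hypothetical 2000 sq ft house.
predicted = intercept + slope * 2000
print(f"price ~ {intercept:,.0f} + {slope:.1f} * area; 2000 sq ft -> {predicted:,.0f}")
```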

    5. Limitations and ...

  2. United States FHFA House Price Index

    • tradingeconomics.com
    • ko.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Sep 15, 2025
    Cite
    TRADING ECONOMICS (2025). United States FHFA House Price Index [Dataset]. https://tradingeconomics.com/united-states/housing-index
    Available download formats: xml, excel, json, csv
    Dataset updated
    Sep 15, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1991 - Sep 30, 2025
    Area covered
    United States
    Description

    The Housing Index in the United States decreased to 435.40 points in September 2025 from 435.60 points in August 2025. This dataset provides the latest reported value for the United States House Price Index MoM Change, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
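
    The month-over-month (MoM) change implied by the two index values quoted above can be worked out directly. This is a small arithmetic sketch; the helper name `pct_change` is an assumption, and the result is derived only from the two figures in the description, not from the published MoM series.

```python
def pct_change(previous, current):
    """Percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0


august, september = 435.60, 435.40  # index points, from the description
mom = pct_change(august, september)
print(f"MoM change: {mom:.3f}%")  # a decline of roughly 0.05%
```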

  3. Metro median house sales

    • researchdata.edu.au
    • data.wu.ac.at
    Updated Jul 3, 2015
    + more versions
    Cite
    Department for Housing and Urban Development (2015). Metro median house sales [Dataset]. https://researchdata.edu.au/metro-median-house-sales/1953650
    Dataset updated
    Jul 3, 2015
    Dataset provided by
    data.sa.gov.au
    Authors
    Department for Housing and Urban Development
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Quarterly median house prices for metropolitan Adelaide by suburb

  4. Average resale house prices Canada 2011-2024, with a forecast until 2026, by...

    • statista.com
    Updated Nov 29, 2025
    Cite
    Statista (2025). Average resale house prices Canada 2011-2024, with a forecast until 2026, by province [Dataset]. https://www.statista.com/statistics/587661/average-house-prices-canada-by-province/
    Dataset updated
    Nov 29, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Canada
    Description

    The average resale house price in Canada was forecast to reach nearly ******* Canadian dollars in 2026, according to a January forecast. In 2024, house prices increased after falling for the first time since 2019. One of the reasons for the price correction was the notable drop in transaction activity. Housing transactions picked up in 2024 and are expected to continue to grow until 2026. British Columbia, which is the most expensive province for housing, is projected to see the average house price reach *** million Canadian dollars in 2026.

    Affordability in Vancouver

    Vancouver is the most populous city in British Columbia and is also infamously expensive for housing. In 2023, the city topped the ranking for least affordable housing market in Canada, with the average homeownership cost outweighing the average household income. There are a multitude of reasons for this, but most residents believe that foreigners investing in the market cause the high housing prices.

    Victoria housing market

    The capital of British Columbia is Victoria, where housing prices are also very high. The price of a single family home in Victoria's most expensive suburb, Oak Bay, was *** million Canadian dollars in 2024.

  5. Housing Price Prediction using DT and RF in R

    • kaggle.com
    zip
    Updated Aug 31, 2023
    Cite
    vikram amin (2023). Housing Price Prediction using DT and RF in R [Dataset]. https://www.kaggle.com/datasets/vikramamin/housing-price-prediction-using-dt-and-rf-in-r
    Available download formats: zip (629100 bytes)
    Dataset updated
    Aug 31, 2023
    Authors
    vikram amin
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description
    • Objective: To predict the prices of houses in the City of Melbourne
    • Approach: Using Decision Tree and Random Forest
    • Data Cleaning:
    • The ‘Date’ column is read as a character vector and is converted into a date vector using the library ‘lubridate’
    • We create a new column called age, since the age of the house can be a factor in its price. We extract the year from the column ‘Date’ and subtract ‘Year Built’ from it
    • We remove 11566 records which have missing values
    • We drop columns which are not significant such as ‘X’, ‘suburb’, ‘address’, (we have kept zipcode as it serves the purpose in place of suburb and address), ‘type’, ‘method’, ‘SellerG’, ‘date’, ‘Car’, ‘year built’, ‘Council Area’, ‘Region Name’
    • We split the data into ‘train’ and ‘test’ in 80/20 ratio using the sample function
    • Load libraries ‘rpart’, ‘rpart.plot’, ‘rattle’, ‘RColorBrewer’
    • Run the decision tree using the rpart function. ‘Price’ is the dependent variable
    • Average price for 5464 houses is $1084349
    • Where building area is less than 200.5, the average price for 4582 houses is $931445. Where building area is less than 200.5 & age of the building is less than 67.5 years, the avg price for 3385 houses is $799299.6.
    • $4801538 is the highest average price, for 13 houses where distance is lower than 5.35 & building area is >280.5
    • We use the caret package to tune the complexity parameter; the optimal value found is 0.01, with RMSE 445197.9
    • We use library (Metrics) to find the RMSE ($392107), MAE ($272015.4) and MAPE (0.297). A MAPE of 0.297 is an average absolute error of about 29.7%, not an accuracy of 99.70%
    • ‘postcode’, longitude and building area are the most important variables
    • test$Price is the actual price and test$predicted is the predicted price for 6 particular houses.
    • We use the default parameters of random forest on the train data
    • The variable importance ranking indicates that ‘Building Area’, ‘Age of the house’ and ‘Distance’ are the most important variables that affect the price of the house.
    • Based on the default parameters, RMSE is $250426.2, MAPE is 0.147 (an average absolute error of about 14.7%, not an accuracy of 99.853%) and MAE is $151657.7
    • The error levels off between 100 and 200 trees, with minimal reduction thereafter. We can choose ntree = 200.
    • We tune the model and find mtry = 3 has the lowest out of bag error
    • We use the caret package and use 5 fold cross validation technique
    • RMSE is $252216.10, MAPE is 0.146 (an average absolute error of about 14.6%, not an accuracy of 99.854%), MAE is $151669.4
    • We can conclude that Random Forest gives us more accurate results than Decision Tree
    • In Random Forest, the default parameters (ntree = 500) give a lower RMSE than ntree = 200 ($250426.2 vs. $252216.10), with MAPE essentially unchanged (0.147 vs. 0.146). So we can proceed with those parameters.
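
    The three error metrics reported above can be sketched in a few lines. The original analysis was done in R; the Python below is only an illustrative re-statement of the standard definitions, with hypothetical toy values, and it assumes MAPE is reported as a fraction (which the values 0.297 and 0.147 suggest). Under that convention a MAPE of 0.297 means predictions are off by about 29.7% on average.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.10 == 10%)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical toy prices, not taken from the Melbourne data.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(f"RMSE = {rmse(actual, predicted):.2f}")
print(f"MAE  = {mae(actual, predicted):.2f}")
print(f"MAPE = {mape(actual, predicted):.4f} ({mape(actual, predicted) * 100:.1f}% average error)")
```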
  6. Single family house prices in Victoria BC 2025, by suburb

    • statista.com
    Updated Nov 29, 2025
    Cite
    Statista (2025). Single family house prices in Victoria BC 2025, by suburb [Dataset]. https://www.statista.com/statistics/647969/single-family-house-prices-in-victoria-bc-by-suburb/
    Dataset updated
    Nov 29, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jun 2025
    Area covered
    Canada
    Description

    In June 2025, a single-family house in Oak Bay cost **** million Canadian dollars. Oak Bay was the most expensive suburb in Victoria, British Columbia, followed by Highlands and North Saanich.

    Victoria: an overview

    Victoria is the capital city of the province of British Columbia. The city is located south of Vancouver, and across the U.S. border from Seattle. In 2020, the average home price in Victoria was **** million Canadian dollars, which placed the city as the sixth most expensive Canadian city for residential real estate.

    Home affordability in Canada

    Housing affordability is, undoubtedly, one of the biggest barriers to homeownership in Canada. In 2025, the ratio of homeownership costs to income was **** percent. Nevertheless, more expensive locations in the country had a higher ratio, with Vancouver exceeding ** percent, suggesting that on average, mortgage payments were slightly lower than the average income.

  7. housing-planning

    • researchdata.edu.au
    • acquire.cqu.edu.au
    Updated Feb 29, 2024
    Cite
    Md Zillur Rahman (2024). housing-planning [Dataset]. http://doi.org/10.25946/25018466.V1
    Dataset updated
    Feb 29, 2024
    Dataset provided by
    Central Queensland University
    Authors
    Md Zillur Rahman
    Description

    Urban housing location and locational amenities play an important role in median house price distribution and growth among the suburbs of many metropolitan cities in developed countries, such as Australia. In particular, distance from the central business district (CBD) and access to the transport network play a vital role in house price distribution and growth across the suburbs of a city. Australian metropolitan cities have experienced increases in housing prices of up to 120% over the last 20 years, but the growth pattern has differed across the suburbs within a city, such as Melbourne. Therefore, this study examines the impacts of locational amenities on house price changes across various suburbs in Melbourne over the three census periods of 2006, 2011, and 2016, and suggests some strategic guidelines to improve the availability and accessibility of locational amenities in the suburbs with less concentrated amenities.

    This study chose three Local Government Areas (LGAs) in Melbourne: Maribyrnong, Brimbank and Wyndham. Each LGA was selected as a case study because many low-income people live there. Further, some suburbs of these LGAs have maintained similar housing prices for an extended time, while others have not.

    The study applied a quantitative spatial methodology to examine the housing price distribution and growth patterns by evaluating the concentration and accessibility of locational urban amenities using GIS-based techniques and a spatial data set. The spatial data analyses used spatial statistics methods: measures of central tendency, Local Moran’s I (LISA) clustering, Kernel Density Estimation (KDE), and Kernel Density Smoothing (KDS). These tests were used to find the patterns of house price distribution and growth. The study also identified the accessibility of amenities in relation to median house price distribution and growth. Spatial Autoregressive Regression (SAR), Spatial Lag, and Spatial Error models were used to identify spatial dependencies and to test the statistical significance of the relationship between the median house price and the concentration and accessibility of local urban amenities over the three census years.
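
    To make the Local Moran's I (LISA) clustering concrete, here is a minimal sketch of the statistic with row-standardized neighbour weights. It is not the thesis's implementation: the five "suburbs", their prices, the adjacency list and the function name `local_morans_i` are all hypothetical, and the permutation-based significance test that a real LISA analysis needs is omitted.

```python
def local_morans_i(values, neighbors):
    """Local Moran's I per unit, with a High/Low quadrant label.

    values:    one observation per spatial unit (e.g. median house price)
    neighbors: neighbors[i] lists the indices adjacent to unit i
               (every unit is assumed to have at least one neighbour)
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]          # deviations from the mean
    m2 = sum(d * d for d in z) / n          # second moment of the deviations
    results = []
    for i, nbrs in enumerate(neighbors):
        lag = sum(z[j] for j in nbrs) / len(nbrs)  # row-standardized spatial lag
        stat = z[i] / m2 * lag
        label = ("H" if z[i] > 0 else "L") + ("H" if lag > 0 else "L")
        results.append((stat, label))
    return results


# Hypothetical median prices (in $1000s) for five suburbs on a line,
# ordered from the CBD outwards; each suburb neighbours the adjacent ones.
prices = [900, 850, 500, 300, 280]
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]

results = local_morans_i(prices, neighbors)
for price, (stat, label) in zip(prices, results):
    print(f"price={price:4d}  I_i={stat:+.3f}  cluster={label}")
```

    In this toy layout the units nearest the "CBD" come out as High-High and the outermost ones as Low-Low, which mirrors the kind of HH/LL labelling the study describes.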

    This study found three median house price distribution and growth patterns among the suburbs in the three selected LGAs. There are growth differences in the median house price for different census years between 2006 and 2011, 2011 and 2016, and 2006 and 2016. The Low-High (LH) median house price distribution clusters between 2006 and 2011 became High-High (HH) clusters between the census years 2011 and 2016, and 2006 and 2016. The median house price growth rate increased significantly in the census years between 2006 and 2011. Most of the HH median house price distribution and growth clusters’ tendencies were closer to the Melbourne CBD. On the other hand, the Low-Low (LL) distribution and growth clusters were closer to Melbourne’s periphery. The suburbs located further away had low access to amenities. The HH median house price clusters are located closer to stations and educational institutes. Better access to locational amenities led to more significant HH median house price clusters, as the median house price increased at an increasing rate between 2011 and 2016. The HH median house price clusters recorded more growth between 2006 and 2016. The suburbs with train stations had better access to most other locational amenities. Almost all HH median house price clusters had train stations with higher access to amenities.

    There was a consistent relationship between median house price distribution, growth patterns, and locational urban amenities. The spatial lag and spatial error model tests showed that between 2006 and 2011, and 2006 and 2016, there were differences in the amenities. Still, these did not affect the outcomes in observations, and were related only to immeasurable factors. The models also indicate that a higher house price in a neighbouring suburb could increase the price in a given suburb. The regression analysis further found that two highly significant variables, travel time to the CBD by bus and distance to the CBD, were negatively related to house price growth in all three census years. This negative relationship indicates that house price growth is lower when the distance is longer. As a result, travel to the CBD by bus is not a popular option for households. Train stations are essential for high house price growth: growth is low when homes are further away from train stations and workplaces.

    This thesis makes three contributions. First, it uses Rational Choice Theory (RCT) to provide a theoretical basis for analysing households’ mutually interdependent preferences for urban amenities, which are found to regulate house price growth clusters. Second, its methodological contribution is the use of GIS-defined cluster mapping and spatial statistics in queries and reasoning, measurements, transformations, descriptive summaries, optimisation, and hypothesis testing models between house price distribution and growth, and access to urban locational amenities. Third, this research contributes to designing practical guidelines to identify local urban amenities for planning local area development.

    Overall, this thesis demonstrates that the median house price distribution and growth patterns are highly correlated with the concentration and accessibility of locational urban amenities among the suburbs in three selected LGAs in Melbourne over the three census years (i.e., 2006, 2011, and 2016). The findings bring to the fore the need for research at the local and state levels to identify specific amenities relevant to the middle-class house distribution strategy, which can be helpful for investors, estate agents, town planners, and builders as partners for effective local development. Future studies might use social, psychological, and macroeconomic variables not considered in this research.

  8. Estonia - Median of the housing cost burden distribution: Towns and suburbs

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 27, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). Estonia - Median of the housing cost burden distribution: Towns and suburbs [Dataset]. https://tradingeconomics.com/estonia/median-of-the-housing-cost-burden-distribution-towns-suburbs-eurostat-data.html
    Available download formats: csv, xml, json, excel
    Dataset updated
    Jul 27, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Estonia
    Description

    In Estonia, the median of the housing cost burden distribution for towns and suburbs was 15.60% in December 2024, according to Eurostat. Trading Economics provides the current actual value, a historical data chart and related indicators for this series, last updated from Eurostat in December 2025. Historically, the series reached a record high of 16.00% in December 2010 and a record low of 9.00% in December 2020.

  9. Data from: The Housing 🏡 Dataset

    • kaggle.com
    zip
    Updated May 15, 2024
    Cite
    Sakshi Satre (2024). The Housing 🏡 Dataset [Dataset]. https://www.kaggle.com/datasets/sakshisatre/the-boston-housing-dataset
    Available download formats: zip (12299 bytes)
    Dataset updated
    May 15, 2024
    Authors
    Sakshi Satre
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The Boston Housing dataset is often used for regression analysis and predictive modeling tasks. It has no official subtitle, but it is commonly referred to as the "Boston Housing dataset" or the "Boston Housing Price dataset" because of its focus on housing-related features and its target variable, the median value of owner-occupied homes in Boston suburbs.

    Column Description

    Columns:
    1. CRIM: per capita crime rate by town (numeric)
    2. ZN: proportion of residential land zoned for lots over 25,000 sq.ft. (numeric)
    3. INDUS: proportion of non-retail business acres per town (numeric)
    4. CHAS: Charles River dummy variable (1 if tract bounds river; 0 otherwise) (categorical)
    5. NOX: nitric oxides concentration (parts per 10 million) (numeric)
    6. RM: average number of rooms per dwelling (numeric)
    7. AGE: proportion of owner-occupied units built prior to 1940 (numeric)
    8. DIS: weighted distances to five Boston employment centres (numeric)
    9. RAD: index of accessibility to radial highways (numeric)
    10. TAX: full-value property-tax rate per $10,000 (numeric)
    11. PTRATIO: pupil-teacher ratio by town (numeric)
    12. B: 1000(Bk - 0.63)^2 where Bk is the proportion of [people of African American descent] by town (numeric)
    13. LSTAT: % lower status of the population (numeric)
    14. MEDV: Median value of owner-occupied homes in $1000s (target variable) (numeric)

  10. Germany Commercial Property Market Index: WG: 49 Cities: Average Retail...

    • ceicdata.com
    Cite
    CEICdata.com, Germany Commercial Property Market Index: WG: 49 Cities: Average Retail Rent: Suburban [Dataset]. https://www.ceicdata.com/en/germany/property-market-index/commercial-property-market-index-wg-49-cities-average-retail-rent-suburban
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2008 - Dec 1, 2019
    Area covered
    Germany
    Description

    Germany Commercial Property Market Index: WG: 49 Cities: Average Retail Rent: Suburban data was reported at 166.760 (1975=100) in 2019, an increase from the previous figure of 166.010 (1975=100) for 2018. The data is updated yearly, averaging 143.580 (1975=100, median) from Dec 1975 to 2019, with 45 observations. The data reached an all-time high of 192.090 (1975=100) in 1993 and a record low of 100.000 (1975=100) in 1975. The series remains in active status in CEIC and is reported by Bulwiengesa AG. The data is categorized under Global Database’s Germany – Table DE.EB004: Property Market Index.

  11. Slovakia - Median of the housing cost burden distribution: Towns and suburbs...

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 27, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). Slovakia - Median of the housing cost burden distribution: Towns and suburbs [Dataset]. https://tradingeconomics.com/slovakia/median-of-the-housing-cost-burden-distribution-towns-suburbs-eurostat-data.html
    Available download formats: csv, excel, xml, json
    Dataset updated
    Jul 27, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Slovakia
    Description

    In Slovakia, the median of the housing cost burden distribution for towns and suburbs was 16.40% in December 2024, according to Eurostat. Trading Economics provides the current actual value, a historical data chart and related indicators for this series, last updated from Eurostat in December 2025. Historically, the series reached a record high of 18.90% in December 2009 and a record low of 13.90% in December 2022.

  12. Euro Area - Median of the housing cost burden distribution: Towns and...

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 27, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). Euro Area - Median of the housing cost burden distribution: Towns and suburbs [Dataset]. https://tradingeconomics.com/euro-area/median-of-the-housing-cost-burden-distribution-towns-suburbs-eurostat-data.html
    Available download formats: xml, json, excel, csv
    Dataset updated
    Jul 27, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Euro Area
    Description

    In the Euro Area, the median of the housing cost burden distribution for towns and suburbs was 13.50% in December 2024, according to Eurostat. Trading Economics provides the current actual value, a historical data chart and related indicators for this series, last updated from Eurostat in November 2025. Historically, the series reached a record high of 16.50% in December 2013 and a record low of 12.90% in December 2020.

  13. Housing data with correlated variables

    • kaggle.com
    zip
    Updated Nov 30, 2023
    Cite
    sororos (2023). Housing data with correlated variables [Dataset]. https://www.kaggle.com/datasets/sororos/housing-data-with-correlated-variables
    Available download formats: zip (192021 bytes)
    Dataset updated
    Nov 30, 2023
    Authors
    sororos
    Description

    This is the Boston Housing Dataset, copied from: https://www.kaggle.com/datasets/vikrishnan/boston-house-prices

    Each record in the database describes a Boston suburb or town. The data was drawn from the Boston Standard Metropolitan Statistical Area (SMSA) in 1970. The attributes are defined as follows (taken from the UCI Machine Learning Repository):

    • CRIM: per capita crime rate by town
    • ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
    • INDUS: proportion of non-retail business acres per town
    • CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
    • NOX: nitric oxides concentration (parts per 10 million)
    • RM: average number of rooms per dwelling
    • AGE: proportion of owner-occupied units built prior to 1940
    • DIS: weighted distances to five Boston employment centres
    • RAD: index of accessibility to radial highways
    • TAX: full-value property-tax rate per 10 000 USD
    • PTRATIO: pupil-teacher ratio by town
    • B: 1000 (Bk - 0.63)^2 where Bk is the proportion of black people by town
    • LSTAT: % lower status of the population
    • MEDV: Median value of owner-occupied homes in $1000's

    Missing values: None

    Duplicate entries: None

    This is a copy of UCI ML housing dataset. https://archive.ics.uci.edu/ml/machine-learning-databases/housing/

    It has then been amended to include multiple different correlations:

    Directly Derived Features - New features created by applying direct transformations to existing features. For example a scaled version of another (e.g., CRIM_dup_2 = CRIM * 2), or adding some noise to an existing feature (e.g., RM_noisy = RM + random_noise).

    Linear Combinations - Combining existing features linearly. For instance, a feature that is a weighted sum of several other features (e.g., weighted_feature = 0.5 * CRIM + 0.3 * NOX + 0.2 * RM).

    Polynomial Features - Creating polynomial transformations of existing features. For example, square or cube a feature (e.g., AGE_squared = AGE^2). These will have a predictable correlation with their original feature.

    Interaction Terms - Generating features that are the product of two existing features, which reveals interactions between variables (e.g., TAX_RAD_interaction = TAX * RAD).

    Duplicate Features with Variations - Duplicating some existing features and adding small variations. For example, copying a feature and adding a small random value to each entry (e.g., LSTAT_varied = LSTAT + small_random_value).

    These features were generated by loading the dataset in Python and transforming it, for example:

    ```python
    import random

    import numpy as np
    import pandas as pd

    # Load the data. The file name is an assumption: the original snippet
    # uses `df` without showing how it was created.
    df = pd.read_csv("housing.csv")

    # List of original column names
    original_columns = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE", "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

    # Generate features
    for col_name in original_columns:
        # Linear Combinations: weighted sum with two other randomly chosen columns
        other_cols = random.sample([c for c in original_columns if c != col_name], 2)
        df[f"{col_name}_linear_combo"] = 0.5 * df[col_name] + 0.3 * df[other_cols[0]] + 0.2 * df[other_cols[1]]

        # Polynomial Features
        df[f"{col_name}_squared"] = df[col_name] ** 2

        # Interaction Terms: product with one other randomly chosen column
        other_col = random.choice([c for c in original_columns if c != col_name])
        df[f"{col_name}_{other_col}_interaction"] = df[col_name] * df[other_col]

        # Duplicate Features with Variations: add uniform noise in [0, 0.05)
        df[f"{col_name}_varied"] = df[col_name] + (np.random.rand(df.shape[0]) * 0.05)

    # Display the DataFrame
    print(df)
    ```
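
    One way to see the "predictable correlation" the description mentions is to compute Pearson's r between a feature and its derived versions. The sketch below is not part of the dataset author's code: the AGE values are made up for illustration, the helper name `pearson` is hypothetical, and only the standard library is used so that it runs without the real file.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


random.seed(0)  # fixed seed so the added noise is reproducible

# Hypothetical AGE values (illustrative only, not rows from the dataset).
age = [12.0, 25.5, 38.1, 47.2, 56.0, 64.3, 71.9, 80.4, 91.7, 100.0]

age_dup_2 = [a * 2 for a in age]                         # scaled copy
age_squared = [a ** 2 for a in age]                      # polynomial feature
age_varied = [a + random.random() * 0.05 for a in age]   # small random variation

r_dup = pearson(age, age_dup_2)
r_squared = pearson(age, age_squared)
r_varied = pearson(age, age_varied)

print(f"AGE vs AGE * 2:      {r_dup:.4f}")
print(f"AGE vs AGE squared:  {r_squared:.4f}")
print(f"AGE vs AGE + noise:  {r_varied:.4f}")
```

    A scaled copy is perfectly correlated with its source, while the squared and noisy versions are strongly but not perfectly correlated, which is what makes such columns useful for studying multicollinearity.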

  14. Poland - Median of the housing cost burden distribution: Towns and suburbs

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 27, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). Poland - Median of the housing cost burden distribution: Towns and suburbs [Dataset]. https://tradingeconomics.com/poland/median-of-the-housing-cost-burden-distribution-towns-suburbs-eurostat-data.html
    Available download formats: csv, xml, json, excel
    Dataset updated
    Jul 27, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Poland
    Description

    Poland - Median of the housing cost burden distribution: Towns and suburbs was 12.70% in December 2024, according to Eurostat. Trading Economics provides the current value, a historical data chart, and related indicators for this series, last updated from Eurostat in November 2025. Historically, the series reached a record high of 18.30% in December 2014 and a record low of 12.70% in December 2024.

  15. Clean Boston Housing Dataset

    • kaggle.com
    zip
    Updated Aug 6, 2025
    Cite
    Barkat Ali Arbab (2025). Clean Boston Housing Dataset [Dataset]. https://www.kaggle.com/datasets/barkataliarbab/boston-housing-dataset-for-regression-modeling/code
    Available download formats: zip (10425 bytes)
    Dataset updated
    Aug 6, 2025
    Authors
    Barkat Ali Arbab
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Overview

    This dataset is a cleaned and updated version of the classic Boston Housing Dataset, originally made available by the U.S. Census and later popularized in machine learning communities. It contains detailed information about housing prices in Boston suburbs, along with environmental, structural, and socio-economic indicators for each neighborhood.

    The dataset is widely used as a benchmark for regression tasks and offers an excellent opportunity to explore linear modeling, feature engineering, multicollinearity analysis, bias mitigation, and more.

    📚 Context

    Originally published by Harrison and Rubinfeld in 1978, this dataset has been widely adopted in the machine learning and statistics communities. It contains 506 observations, each representing a town or neighborhood in the Boston metropolitan area.

    However, some features in the dataset, particularly the B column which encodes race-based information, have become the subject of ethical scrutiny in recent years. Therefore, this version may have undergone data cleaning, feature selection, or modification to ensure it is more appropriate for modern and ethical ML applications.

    📊 Features

    CRIM: Per capita crime rate by town
    ZN: Proportion of residential land zoned for lots over 25,000 sq. ft.
    INDUS: Proportion of non-retail business acres per town
    CHAS: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
    NOX: Nitric oxides concentration (parts per 10 million)
    RM: Average number of rooms per dwelling
    AGE: Proportion of owner-occupied units built before 1940
    DIS: Weighted distance to five Boston employment centers
    RAD: Index of accessibility to radial highways
    TAX: Property tax rate per $10,000
    PTRATIO: Pupil-teacher ratio by town
    B: 1000(Bk - 0.63)^2 where Bk is the proportion of Black residents
    LSTAT: Percentage of lower-status population
    MEDV: Median value of owner-occupied homes in $1000s (Target Variable)

    🟡 Note: Some features (e.g., CHAS, B, or RAD) may have been removed or modified in this version depending on your ethical preprocessing or cleaning steps.
    

    🎯 Target Variable

    MEDV: Median value of owner-occupied homes (in $1000s). This is the value we aim to predict in regression tasks.
    

    ✅ Use Cases

    This dataset is ideal for:

    Predictive modeling using linear regression or advanced ML techniques
    
    Feature engineering and feature selection
    
    Studying the effects of urban and environmental variables on real estate prices
    
    Analyzing multicollinearity and variable importance
    
    Exploring ethical considerations in machine learning
    

    ⚖️ Ethical Considerations

    The original dataset includes the feature B, which encodes racial information. While historically included for statistical analysis, modern ML best practices recommend caution when using such data to avoid unintended bias or discrimination.
    
    In this version, you may choose to remove or retain the column depending on the intended use and audience.
    
    Always consider the fairness, accountability, and transparency of your ML models.
    

    📁 File Information

    Filename: boston_housing_cleaned.csv
    
    Records: 506 rows (observations)
    
    Columns: 13 features + 1 target variable (depending on cleaning)
    
    Missing Values: None (in original); NA if introduced during preprocessing
    
    Source: Based on U.S. Census data (original), sourced from Kaggle and cleaned
    

    📌 Tags

    housing-prices · regression · real-estate · data-cleaning · ethical-ml · boston · exploratory-data-analysis · feature-engineering

  16. Victorian Property Sales Report - Median House by Suburb Quarterly |...

    • gimi9.com
    Updated Jul 1, 2025
    + more versions
    Cite
    (2025). Victorian Property Sales Report - Median House by Suburb Quarterly | gimi9.com [Dataset]. https://gimi9.com/dataset/au_victorian-property-sales-report-median-house-by-suburb/
    Dataset updated
    Jul 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The report lists the percentage shift in median prices between quarters as well as the change over a 12-month period. An overall Melbourne metropolitan median sale price and country Victoria median sale price are also included for each property type.

  17. Hungary - Median of the housing cost burden distribution: Towns and suburbs

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 27, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). Hungary - Median of the housing cost burden distribution: Towns and suburbs [Dataset]. https://tradingeconomics.com/hungary/median-of-the-housing-cost-burden-distribution-towns-suburbs-eurostat-data.html
    Available download formats: xml, excel, csv, json
    Dataset updated
    Jul 27, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Hungary
    Description

    Hungary - Median of the housing cost burden distribution: Towns and suburbs was 14.30% in December 2024, according to Eurostat. Trading Economics provides the current value, a historical data chart, and related indicators for this series, last updated from Eurostat in December 2025. Historically, the series reached a record high of 22.80% in December 2012 and a record low of 9.80% in December 2020.

  18. North America Luxury Residential Real Estate Market Size By Type...

    • verifiedmarketresearch.com
    Updated Mar 26, 2025
    Cite
    VERIFIED MARKET RESEARCH (2025). North America Luxury Residential Real Estate Market Size By Type (Single-Family Homes, Condominiums, Penthouses, Townhouses), By Location (Urban Centers, Suburban Areas, Resort Destinations), By Geographic Scope And Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/north-america-luxury-residential-real-estate-market/
    Dataset updated
    Mar 26, 2025
    Dataset provided by
    Verified Market Research (https://www.verifiedmarketresearch.com/)
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2026 - 2032
    Area covered
    North America
    Description

    North America Luxury Residential Real Estate Market size was valued at USD 150 Billion in 2024 and is expected to reach USD 239.08 Billion by 2032, growing at a CAGR of 6% from 2026 to 2032.
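
    The headline figures can be sanity-checked with the compound growth formula, end = start * (1 + rate)^years. The sketch below assumes annual compounding at exactly 6%; the function name `project_value` is illustrative and does not come from the report. Under that assumption, the stated USD 239.08 Billion matches eight years of growth from the 2024 base (2024 to 2032) rather than the six years spanned by 2026 to 2032.

```python
def project_value(start, rate, years):
    """Compound `start` at an annual `rate` for `years` periods."""
    return start * (1 + rate) ** years


start_2024 = 150.0  # USD billion, from the report summary
cagr = 0.06         # 6% per year, from the report summary

value_8_years = project_value(start_2024, cagr, 8)  # 2024 -> 2032
value_6_years = project_value(start_2024, cagr, 6)  # six periods, for comparison

print(f"8 years of 6% growth: USD {value_8_years:.2f} billion")
print(f"6 years of 6% growth: USD {value_6_years:.2f} billion")
```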

    North America Luxury Residential Real Estate Market Dynamics

    The key market dynamics that are shaping the North America luxury residential real estate market include:

    Key Market Drivers

    Rising HNWI Population and Wealth Accumulation: The North American HNWI population has shown significant growth, with studies indicating that 78% of HNWIs consider luxury real estate a primary investment vehicle. Research reveals that 65% of HNWIs plan to increase their real estate portfolio allocation by 2025, with an average investment size of $4.2 million per property acquisition. This trend is particularly pronounced in gateway cities where luxury property appreciation has outpaced general market gains by 2.3 times.

  19. Average price of newly built residential properties in Shanghai 2023, by...

    • statista.com
    Updated Nov 29, 2025
    Cite
    Statista (2025). Average price of newly built residential properties in Shanghai 2023, by location [Dataset]. https://www.statista.com/statistics/993524/china-average-price-of-new-residential-property-in-shanghai-by-location/
    Dataset updated
    Nov 29, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    China
    Description

    In 2022, the price for new residential property in Shanghai's inner ring dropped by more than ***** yuan per square meter, to ******* yuan per square meter. Although the local authorities introduced policies to stabilize the market, the real estate market in Shanghai’s central districts remained under downward pressure, similar to those experienced by other major cities in China.

    The most competitive real estate market in the country

    Home prices in Shanghai are among the most expensive globally. The area within the city's inner ring road is certainly one of the most competitive real estate markets in all of China, with property prices nearly *********** higher than those outside the outer ring road. Rising prices are far beyond the reach of ordinary residents, and the few who can afford to buy often have to take out substantial mortgages for their homes, resulting in a high proportion of real estate in their personal assets.

    Challenges facing China’s real estate sector

    The high level of indebtedness of the Chinese people and the bubbles in the country's real estate sector have become one of the major risks to China's economy. While developers expanded through continuous borrowing and the sale of off-plan properties to homebuyers, the market saw a significant excess of housing supply in most regions. There have also been instances in recent years where developers have had difficulties in completing construction projects or in repaying their loans or bonds. Addressing the risks in China's real estate sector, particularly in companies such as the Evergrande Group and Country Garden, has become an urgent task to ensure China's economic stability and prosperity.

  20. Croatia - Median of the housing cost burden distribution: Towns and suburbs

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 27, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). Croatia - Median of the housing cost burden distribution: Towns and suburbs [Dataset]. https://tradingeconomics.com/croatia/median-of-the-housing-cost-burden-distribution-towns-suburbs-eurostat-data.html
    Available download formats: excel, csv, json, xml
    Dataset updated
    Jul 27, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    Croatia
    Description

    Croatia - Median of the housing cost burden distribution: Towns and suburbs was 9.80% in December 2024, according to Eurostat. Trading Economics provides the current value, a historical data chart, and related indicators for this series, last updated from Eurostat in December 2025. Historically, the series reached a record high of 19.20% in December 2010 and a record low of 9.80% in December 2024.

Cite
Zafar (2024). House Price Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/zafarali27/house-price-prediction-dataset

House Price Prediction Dataset

Available download formats: zip (29372 bytes)
Dataset updated
Sep 21, 2024
Authors
Zafar
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

House Price Prediction Dataset.

The dataset contains 2000 rows of house-related data, representing various features that could influence house prices. Below, we discuss key aspects of the dataset, which include its structure, the choice of features, and potential use cases for analysis.

1. Dataset Features

The dataset is designed to capture essential attributes for predicting house prices, including:

Area: Square footage of the house, which is generally one of the most important predictors of price.
Bedrooms & Bathrooms: The number of rooms in a house significantly affects its value. Homes with more rooms tend to be priced higher.
Floors: The number of floors in a house could indicate a larger, more luxurious home, potentially raising its price.
Year Built: The age of the house can affect its condition and value. Newly built houses are generally more expensive than older ones.
Location: Houses in desirable locations such as downtown or urban areas tend to be priced higher than those in suburban or rural areas.
Condition: The current condition of the house is critical, as well-maintained houses (in 'Excellent' or 'Good' condition) will attract higher prices compared to houses in 'Fair' or 'Poor' condition.
Garage: Availability of a garage can increase the price due to added convenience and space.
Price: The target variable, representing the sale price of the house, used to train machine learning models to predict house prices based on the other features.

2. Feature Distributions

Area Distribution: The area of the houses in the dataset ranges from 500 to 5000 square feet, which allows analysis across different types of homes, from smaller apartments to larger luxury houses.
Bedrooms and Bathrooms: The number of bedrooms varies from 1 to 5, and bathrooms from 1 to 4. This variance enables analysis of homes with different sizes and layouts.
Floors: Houses in the dataset have between 1 and 3 floors. This feature could be useful for identifying the influence of multi-level homes on house prices.
Year Built: The dataset contains houses built from 1900 to 2023, giving a wide range of house ages to analyze the effects of new vs. older construction.
Location: There is a mix of urban, suburban, downtown, and rural locations. Urban and downtown homes may command higher prices due to proximity to amenities.
Condition: Houses are labeled as 'Excellent', 'Good', 'Fair', or 'Poor'. This feature helps model the price differences based on the current state of the house.
Price Distribution: Prices range between $50,000 and $1,000,000, offering a broad spectrum of property values. This range makes the dataset appropriate for predicting a wide variety of housing prices, from affordable homes to luxury properties.

3. Correlation Between Features

A key area of interest is the relationship between various features and house price:

Area and Price: Typically, a strong positive correlation is expected between the size of the house (Area) and its price. Larger homes are likely to be more expensive.
Location and Price: Location is another major factor. Houses in urban or downtown areas may show a higher price on average compared to suburban and rural locations.
Condition and Price: The condition of the house should show a positive correlation with price. Houses in better condition should be priced higher, as they require less maintenance and repair.
Year Built and Price: Newer houses might command a higher price due to better construction standards, modern amenities, and less wear-and-tear, but some older homes in good condition may retain historical value.
Garage and Price: A house with a garage may be more expensive than one without, as it provides extra storage or parking space.

4. Potential Use Cases

The dataset is well-suited for various machine learning and data analysis applications, including:

House Price Prediction: Using regression techniques, this dataset can be used to build a model to predict house prices based on the available features.
Feature Importance Analysis: By using techniques such as feature importance ranking, data scientists can determine which features (e.g., location, area, or condition) have the greatest impact on house prices.
Clustering: Clustering techniques like k-means could help identify patterns in the data, such as grouping houses into segments based on their characteristics (e.g., luxury homes, affordable homes).
Market Segmentation: The dataset can be used to perform segmentation by location, price range, or house type to analyze trends in specific sub-markets, like luxury vs. affordable housing.
Time-Based Analysis: By studying how house prices vary with the year built or the age of the house, analysts can derive insights into the trends of older vs. newer homes.
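
As a minimal sketch of the house price prediction use case, the example below fits a one-feature ordinary least squares line, Price = intercept + slope * Area. The rows are hypothetical values chosen inside the ranges the description gives (they are not taken from the dataset), the helper name `fit_line` is illustrative, and a real model would use all features rather than Area alone.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (Area in sq ft, Price in USD) rows, for illustration only.
areas = [800, 1200, 1500, 2100, 2600, 3200, 4000, 4800]
prices = [120_000, 180_000, 210_000, 320_000, 390_000, 470_000, 610_000, 720_000]

slope, intercept = fit_line(areas, prices)
predicted = intercept + slope * 3000  # predicted price for a 3000 sq ft house

print(f"price = {intercept:.0f} + {slope:.2f} * area")
print(f"predicted price for 3000 sq ft: {predicted:.0f}")
```

Whether Area is actually a strong predictor in this dataset would need to be checked against the real rows; the description only states that a positive relationship is expected.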

5. Limitations and ...
