Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Task Description: Real Estate Price Prediction
This task involves predicting the price of real estate properties based on various features that influence the value of a property. The dataset contains several attributes of real estate properties such as square footage, the number of bedrooms, bathrooms, floors, the year the property was built, whether the property has a garden or pool, the size of the garage, the location score, and the distance from the city center.
The goal is to build a regression model that can predict the Price of a property based on the provided features.
Dataset Columns:
ID: A unique identifier for each property.
Square_Feet: The area of the property in square meters.
Num_Bedrooms: The number of bedrooms in the property.
Num_Bathrooms: The number of bathrooms in the property.
Num_Floors: The number of floors in the property.
Year_Built: The year the property was built.
Has_Garden: Indicates whether the property has a garden (1 for yes, 0 for no).
Has_Pool: Indicates whether the property has a pool (1 for yes, 0 for no).
Garage_Size: The size of the garage in square meters.
Location_Score: A score from 0 to 10 indicating the quality of the neighborhood (higher scores indicate better neighborhoods).
Distance_to_Center: The distance from the property to the city center in kilometers.
Price: The target variable that represents the price of the property. This is the value we aim to predict.
Objective: The goal of this task is to develop a regression model that predicts the Price of a real estate property using the other features as inputs. The model should be able to learn the relationship between these features and the price, providing an accurate prediction for unseen data.
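As a rough illustration of the objective above, the sketch below fits an ordinary least squares model on three of the listed columns using only the Python standard library. The six rows are synthetic placeholders generated from an arbitrary linear formula, not values from the dataset, and the helper names (`solve_linear_system`, `fit_ols`, `predict`) are hypothetical. A real workflow would more likely load the full file and use a dedicated library.

```python
# Minimal sketch: ordinary least squares via the normal equations.
# The rows below are SYNTHETIC placeholders, not values from the dataset.

def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augmented matrix [a | b] so row operations apply to both sides.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: features are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, feature_names, target_name):
    """Return [intercept, coef_1, ...] minimising the squared error."""
    x = [[1.0] + [float(r[f]) for f in feature_names] for r in rows]
    y = [float(r[target_name]) for r in rows]
    k = len(x[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * t for row, t in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights, row, feature_names):
    """Apply the fitted weights to one property."""
    return weights[0] + sum(w * float(row[f]) for w, f in zip(weights[1:], feature_names))


features = ["Square_Feet", "Has_Pool", "Distance_to_Center"]
rows = [  # synthetic: Price = 50000 + 1000*Square_Feet + 20000*Has_Pool - 500*Distance_to_Center
    {"Square_Feet": 100, "Has_Pool": 0, "Distance_to_Center": 10, "Price": 145000},
    {"Square_Feet": 150, "Has_Pool": 1, "Distance_to_Center": 5, "Price": 217500},
    {"Square_Feet": 80, "Has_Pool": 0, "Distance_to_Center": 20, "Price": 120000},
    {"Square_Feet": 200, "Has_Pool": 1, "Distance_to_Center": 2, "Price": 269000},
    {"Square_Feet": 120, "Has_Pool": 0, "Distance_to_Center": 8, "Price": 166000},
    {"Square_Feet": 90, "Has_Pool": 1, "Distance_to_Center": 15, "Price": 152500},
]

weights = fit_ols(rows, features, "Price")
unseen = {"Square_Feet": 110, "Has_Pool": 1, "Distance_to_Center": 12}
estimate = predict(weights, unseen, features)
```

Because the synthetic prices are exactly linear in the features, the fit recovers the generating formula; on the real data the residuals would show how much of the price a linear model leaves unexplained.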
For a data scientist who is eager to experiment with, learn, and explore the techniques applied in this field, exploratory data analysis (EDA) on a variety of datasets is too important to overlook.
This housing dataset provides a thorough analysis of the current state of the housing market. It includes information on housing prices, availability, and key trends, allowing you to gain a better understanding of the market and make informed decisions. Whether you're a homebuyer, investor, or simply interested in the state of the housing market, this dataset has valuable insights to offer.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Inspired by the quintessential House Prices Starter Competition and the popular Melbourne Housing Dataset, this dataset captures 4K+ condominium unit listings on the Malaysian housing website mudah.my.
As with the datasets above, your job is to predict house prices given certain parameters.
The data was scraped directly from the website using this data collection notebook. I might adapt the code to include houses as well in the future, but scraping the data takes a while due to having to wait for the website to load and having to timeout to account for CloudFlare's protections.
Note: This data is a lot less clean and organized than the data in the two datasets mentioned above. However, this is a good opportunity to practice data cleaning techniques, as this is something that is often overlooked on Kaggle. That being said, I made a starter notebook that goes through the data cleaning steps and outputs a fairly cleaned version of the dataset.
description: The full (unfiltered) description for the unit listing.
Ad List: The ID of the listing on the website.
Category: The category of the listing. It will most likely be Apartment / Condominium.
Facilities: The facilities that the apartment has, in a comma-separated list.
Building Name: The name of the building.
Developer: The developer for the building.
Tenure Type: The type of tenure for the building.
Address: The address of the building. You can refer to this link for a description of what Malaysian addresses look like.
Completion Year: The completion year of the building. If the building is still under construction, this is listed as -.
# of Floors: The number of floors in the building.
Total Units: The total number of units in the building.
Property Type: The type of property.
Bedroom: The number of bedrooms in the unit.
Bathroom: The number of bathrooms in the unit.
Parking Lot: The number of parking lots assigned to the unit, if any.
Floor Range: The floor range for the building.
Property Size: The size of the unit.
Land Title: The title given to the land. This link explains what land titles are.
Firm Type: The type of firm who posted the listing.
Firm Number: The ID of the firm who posted the listing.
REN Number: The REN number of the firm who posted the listing. Refer to this link for what REN numbers are.
price: The price of the unit. This is what you are trying to predict.
Nearby School/School: If there is a nearby school to the unit, which school it is.
Park: If there is a nearby park to the unit, which park it is.
Nearby Railway Station: If there is a nearby railway station to the unit, which railway station it is.
Bus Stop: If there is a nearby bus stop to the unit, which station it is.
Nearby Mall/Mall: If there is a nearby mall to the unit, which mall it is.
Highway: If there is a nearby highway to the unit, which highway it is.
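Two of the cleaning steps implied by the column descriptions above can be sketched as follows: treating a Completion Year of - as missing, and splitting the comma-separated Facilities field. The sample row is made up for illustration, the function names are hypothetical, and the sketch assumes completed buildings store the year as a plain digit string, which the description does not confirm.

```python
# Hypothetical raw row; the values are made up for illustration only.
raw_row = {
    "Completion Year": "-",
    "Facilities": "Parking, Gymnasium, Swimming Pool",
}


def parse_completion_year(value):
    """Return the year as an int, or None when it is listed as "-"
    (building still under construction) or left empty."""
    value = value.strip()
    if value == "-" or not value:
        return None
    # Assumption: completed buildings store the year as digits, e.g. "2015".
    return int(value)


def parse_facilities(value):
    """Split the comma-separated facilities string into a clean list."""
    return [item.strip() for item in value.split(",") if item.strip()]


cleaned = {
    "Completion Year": parse_completion_year(raw_row["Completion Year"]),
    "Facilities": parse_facilities(raw_row["Facilities"]),
}
```

Keeping the missing year as `None` rather than 0 avoids feeding a fake year into a model; how to impute it is a separate modelling decision.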
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The NAHB Housing Market Index in the United States increased to 37 points in October from 32 points in September of 2025. This dataset provides the latest reported value for the United States NAHB Housing Market Index, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
The Housing Affordability Data System (HADS) is a set of files derived from the 1985 and later national American Housing Survey (AHS) and the 2002 and later Metro AHS. This system categorizes housing units by affordability and households by income, with respect to the Adjusted Median Income, Fair Market Rent (FMR), and poverty income. It also includes housing cost burden for owner and renter households. These files have been the basis for the worst case needs tables since 2001. The data files are available for public use, since they were derived from AHS public use files and the published income limits and FMRs. These datasets give the community of housing analysts the opportunity to use a consistent set of affordability measures. The most recent year for which HADS is available as a Public Use File (PUF) is 2013. For 2015 and beyond, HADS is only available as an IUF and can no longer be released as a PUF. Those seeking access to more recent data should reach out to the listed point of contact.
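To make the idea of a housing cost burden concrete, here is a minimal sketch that expresses housing cost as a share of income and buckets the result. The 30% and 50% cut-offs are the conventional thresholds for moderate and severe burden and are an assumption here; they are not taken from the HADS files, whose exact variable definitions should be checked in their documentation. The function names and the sample household are hypothetical.

```python
def cost_burden(monthly_housing_cost, monthly_income):
    """Housing cost as a share of income; None when income is not positive."""
    if monthly_income <= 0:
        return None
    return monthly_housing_cost / monthly_income


def burden_category(ratio, moderate=0.30, severe=0.50):
    """Bucket a ratio using ASSUMED 30% / 50% cut-offs (not read from HADS)."""
    if ratio is None:
        return "not computable"
    if ratio > severe:
        return "severe burden"
    if ratio > moderate:
        return "moderate burden"
    return "no burden"


# Hypothetical household: 1,200 in monthly housing costs on 3,000 of income.
ratio = cost_burden(1200, 3000)
category = burden_category(ratio)
```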
https://brightdata.com/license
Enrich your real estate strategies and market insights with our comprehensive New York Housing dataset. Analyzing this dataset can aid in understanding housing market dynamics and trends, empowering organizations to refine their investment strategies and business decisions. Access the entire dataset or tailor a subset to fit your requirements.
Popular use cases include optimizing investment strategies based on neighborhood engagement and property popularity, performing detailed user behavior analysis and segmentation by housing type, price range, and location to tailor marketing and engagement efforts, and identifying and forecasting emerging trends in the New York housing market to stay ahead in the competitive real estate industry.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Total Housing Inventory in the United States increased to 1550 thousand in September from 1530 thousand in August of 2025. This dataset includes a chart with historical data for the United States Total Housing Inventory.
https://crawlfeeds.com/privacy_policy
Looking to analyze the real estate market across the USA? Our Redfin real estate dataset provides a detailed sample of property listings, including prices, addresses, property features, and images. This dataset is perfect for analysts, developers, and real estate enthusiasts looking to gain insights into housing trends and market dynamics.
The dataset includes fields such as price, currency, address, property details, number of beds and baths, square footage, listing status, images, and more, giving you a robust foundation for analysis.
You can explore the full dataset and download the sample from Redfin real estate dataset. This makes it easy to integrate into your analytics pipelines, machine learning models, or market research projects.
Whether you're building a property analytics dashboard, testing real estate algorithms, or simply exploring housing trends, this dataset provides rich, up-to-date information directly from Redfin listings across the USA.
Start analyzing the USA housing market today with our Redfin dataset sample and make data-driven decisions with confidence.
Open Government Licence 2.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/2/
License information was derived automatically
Average House Price
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was generated for analyzing the economic impacts of subway networks on housing prices in metropolitan areas. The provision of transit networks and accompanying improvement in accessibility induce various impacts and we focused on the economic impacts realized through housing prices. As a proxy of housing price, we consider the price of condominiums, the dominant housing type in South Korea. Although our focus is transit accessibility and housing prices, the presented dataset is applicable to other studies. In particular, it provides a wide range of variables closely related to housing price, including housing properties, local amenities, local demographic characteristics, and control variables for the seasonality. Many of these variables were scientifically generated by our research team. Various distance variables were constructed in a geographic information system environment based on public data and they are useful not only for exploring environmental impacts on housing prices, but also for other statistical analyses in regard to real estate and social science research. The four metropolitan areas covered by the data—Busan, Daegu, Daejeon, and Gwangju—are independent of the transit systems of Greater Seoul, providing accurate information on the metropolitan structure separate from the capital city.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
A dataset of indicators of the state of the UK housing market. This is a collection of indicators from diverse sources on different aspects of the state of the UK housing market. Some indicators are updated monthly, others quarterly. Publication of this dataset began in August 2012. The choice of which indicators are included in this dataset may be subject to revision, but the intention is to update the dataset regularly as new data become available. Historical time series have been added for some (but not yet all) of the indicators.
The purpose of this kernel is to predict the price that a realtor can charge for a house, or that a customer can expect to pay for one, by considering multiple input factors. It also classifies houses into Good and Excellent categories based on the input variables, using efficient machine learning classification and regression algorithms.
This dataset contains house sale prices for King County, which includes Seattle. It includes homes sold between May 2014 and May 2015. The dataset is fairly unbalanced, covering a wide range of houses built and renovated between 1990 and 2015. It has 21 variables in total, including price, condition, number of bedrooms, number of bathrooms, and other features of the house.
I was inspired by the House sales dataset in King County, USA (https://www.kaggle.com/harlfoxem/housesalesprediction) and House Sales in Ontario (https://www.kaggle.com/mnabaee/ontarioproperties) datasets and the predictions and classifiers used.
House sale prices can rise and fall depending on the market and on multiple factors such as location, number of bedrooms, and year built. All these factors help in deriving the sale price and the grading of a house. Information on millions of houses, with all their details and factors, can be stored across historical timelines. Using machine learning techniques, we can analyze the data, predict the price of new houses, classify the houses, and set a price by weighing all the factors that directly or indirectly affect the overall sale of a house.
Dataset on housing prices in the Philippines, scraped from Lamudi in May 2023.
In 2017, the County Department of Economic Development, in conjunction with Reinvestment Fund, completed the 2016 Market Value Analysis (MVA) for Allegheny County. A similar MVA was completed with the Pittsburgh Urban Redevelopment Authority in 2016. The MVA offers an approach for community revitalization; it recommends applying interventions not only where there is a need for development but also in places where public investment can stimulate private market activity and capitalize on larger public investment activities. The MVA is a unique tool for characterizing markets because it creates an internally referenced index of a municipality’s residential real estate market. It identifies areas that are the highest demand markets as well as areas of greatest distress, and the various market types in between. The MVA offers insight into the variation in market strength and weakness within and between traditional community boundaries because it uses Census block groups as the unit of analysis. Where market types abut each other on the map becomes instructive about the potential direction of market change and, ultimately, the appropriateness of types of investment or intervention strategies.
The 2016 Allegheny County MVA does not include the City of Pittsburgh, which was characterized at the same time in the fourth update of the City of Pittsburgh’s MVA. All calculations herein therefore do not include the City of Pittsburgh. While the methodologies of the City and County MVAs are very similar, the classification of communities will differ, so the data from the two should not be used interchangeably.
Allegheny County's MVA utilized data that helps to define the local real estate market. Most of the data covers the 2013-2016 period, and the data used in the analysis includes:
• Residential Real Estate Sales
• Mortgage Foreclosures
• Residential Vacancy
• Parcel Year Built
• Parcel Condition
• Owner Occupancy
• Subsidized Housing Units
The MVA uses a statistical technique known as cluster analysis, forming groups of areas (i.e., block groups) that are similar along the MVA descriptors noted above. The goal is to form groups whose members share similar characteristics, while each group differs from the others. Using this technique, the MVA condenses vast amounts of data for the universe of all properties to a manageable, meaningful typology of market types that can inform area-appropriate programs and decisions regarding the allocation of resources.
During the research process, staff from the County and Reinvestment Fund spent an extensive amount of effort ensuring the data and analysis were accurate. In addition to testing the data, staff physically examined different areas to verify that the data sets being used were appropriate indicators and that the resulting MVA categories accurately reflect the market. Please refer to the report (included here as a pdf) for more information about the data, methodology, and findings.
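The description above does not say which clustering algorithm the MVA uses, so the sketch below substitutes plain k-means as a stand-in to show the general idea of grouping block groups by similar descriptors. The two descriptors (a median sale price in thousands and a vacancy rate in percent) and all six values are hypothetical, as are the function names.

```python
import math


def standardize(points):
    """Rescale each column to zero mean and unit variance so that no
    single descriptor dominates the distance calculation."""
    cols = list(zip(*points))
    means = [sum(c) / len(c) for c in cols]
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
        for c, m in zip(cols, means)
    ]
    return [[(v - m) / s for v, m, s in zip(p, means, stds)] for p in points]


def kmeans(points, centroids, iterations=20):
    """Plain k-means (Lloyd's algorithm) from the given starting centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# HYPOTHETICAL block groups: (median sale price in thousands, vacancy rate in %).
block_groups = [
    (320, 2.0), (300, 3.0), (340, 1.0),   # stronger-looking markets
    (60, 18.0), (75, 15.0), (50, 20.0),   # more distressed-looking markets
]
scaled = standardize(block_groups)
labels, _ = kmeans(scaled, centroids=[scaled[0], scaled[3]])
```

Standardizing first matters because raw prices and percentages sit on very different scales; without it, the price column alone would drive the grouping.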
ttd22/house-price dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Housing Starts in the United States decreased to 1307 Thousand units in August from 1429 Thousand units in July of 2025. This dataset provides the latest reported value for - United States Housing Starts - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Average house prices are derived from data supplied by the mortgage lending agencies on loans approved by them rather than loans paid. When comparing house price figures from one period to another, account should be taken of the fact that changes in the mix of houses (including apartments) will affect the average figures. The most current data is published on these sheets. Previously published data may be subject to revision. Any change from the originally published data will be highlighted by a comment on the cell in question. These comments will be maintained for at least a year after the date of the value change.
Excluding apartments, measured in EUR. Figure changed on 27/6/16 as revised data was received from the local authority.
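The caution above about changes in the mix of houses can be illustrated with a small worked example. In this sketch every property type sells at the same price in both periods, yet the average falls because the second period contains proportionally more apartments. All prices and counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def average_price(sales):
    """Mean price over (property_type, price) pairs."""
    return sum(price for _, price in sales) / len(sales)


# Hypothetical prices in EUR; each type costs the same in both periods.
HOUSE, APARTMENT = 300_000, 200_000

period_1 = [("house", HOUSE)] * 8 + [("apartment", APARTMENT)] * 2
period_2 = [("house", HOUSE)] * 4 + [("apartment", APARTMENT)] * 6

avg_1 = average_price(period_1)  # 8 houses, 2 apartments
avg_2 = average_price(period_2)  # 4 houses, 6 apartments
```

Here the average drops from 280,000 to 240,000 even though no individual price changed, which is why a period-to-period comparison of averages should account for the mix.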
Treasury and the U.S. Department of Housing and Urban Development (HUD) jointly produce a Monthly Housing Scorecard on the health of the nation’s housing market. The Scorecard is generally released during the first week of each month.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Muhammad Hanzla Tahir
Released under MIT
This page provides data for the Housing Inventory Ratio performance measure. This dataset includes both quantity and percentage information about housing stock in Tempe based on affordability categories (Affordable, Workforce, and Market Rate). The performance measure dashboard is available at 4.09 Housing Inventory Ratio.
Additional Information
Source: 3rd Party Report
Contact: Irma Hollamby Cain
Contact E-Mail: irma_hollambycain@tempe.gov
Data Source Type: Excel / CSV
Preparation Method: Manual extraction
Publish Frequency: Annual
Publish Method: Manual
Data Dictionary