Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview: This dataset was collected and curated to support research on predicting real estate prices using machine learning algorithms, specifically Support Vector Regression (SVR) and Gradient Boosting Machine (GBM). The dataset includes comprehensive information on residential properties, enabling the development and evaluation of predictive models for accurate and transparent real estate appraisals.
Data Source: The data was sourced from Department of Lands and Survey real estate listings.
Features: The dataset contains the following key attributes for each property:
Area (in square meters): The total living area of the property.
Floor Number: The floor on which the property is located.
Location: Geographic coordinates or city/region where the property is situated.
Type of Apartment: The classification of the property, such as studio, one-bedroom, two-bedroom, etc.
Number of Bathrooms: The total number of bathrooms in the property.
Number of Bedrooms: The total number of bedrooms in the property.
Property Age (in years): The number of years since the property was constructed.
Property Condition: A categorical variable indicating the condition of the property (e.g., new, good, fair, needs renovation).
Proximity to Amenities: The distance to nearby amenities such as schools, hospitals, shopping centers, and public transportation.
Market Price (target variable): The actual sale price or listed price of the property.
Data Preprocessing:
Normalization: Numeric features such as area and proximity to amenities were normalized to ensure consistency and improve model performance.
Categorical Encoding: Categorical features like property condition and type of apartment were encoded using one-hot encoding or label encoding, depending on the specific model requirements.
Missing Values: Missing data points were handled using appropriate imputation techniques or by excluding records with significant missing information.
Usage: This dataset was utilized to train and test machine learning models, aiming to predict the market price of residential properties based on the provided attributes. The models developed using this dataset demonstrated improved accuracy and transparency over traditional appraisal methods.
Dataset Availability: The dataset is available for public use under the CC BY 4.0 license. Users are encouraged to cite the related publication when using the data in their research or applications.
Citation: If you use this dataset in your research, please cite the following publication: "Real Estate Decision-Making: Precision in Price Prediction through Advanced Machine Learning Algorithms".
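The preprocessing steps described for this dataset (normalization of numeric features and one-hot encoding of categorical ones) can be sketched as follows. This is a minimal illustration only: the description does not say which normalization was used, so min-max scaling is an assumption, and the function names and sample values are hypothetical rather than taken from the dataset.

```python
def min_max_normalize(values):
    """Rescale numeric values to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(labels):
    """Return (categories, rows) with one 0/1 indicator per sorted category."""
    categories = sorted(set(labels))
    rows = [[1 if label == cat else 0 for cat in categories] for label in labels]
    return categories, rows


# Hypothetical sample records (not taken from the dataset).
areas_m2 = [45.0, 80.0, 120.0]
conditions = ["new", "good", "needs renovation"]

normalized_area = min_max_normalize(areas_m2)
categories, encoded_condition = one_hot_encode(conditions)
print(normalized_area)
print(categories, encoded_condition)
```

One-hot encoding avoids implying an order between categories, which is why it is often preferred over label encoding for models such as SVR that are sensitive to numeric distances.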
Problem Statement
Investors and buyers in the real estate market faced challenges in accurately assessing property values and market trends. Traditional valuation methods were time-consuming and lacked precision, making it difficult to make informed investment decisions. A real estate firm sought a predictive analytics solution to provide accurate property price forecasts and market insights.
Challenge
Developing a real estate price prediction system involved addressing the following challenges:
Collecting and processing vast amounts of data, including historical property prices, economic indicators, and location-specific factors.
Accounting for diverse variables such as neighborhood quality, proximity to amenities, and market demand.
Ensuring the model’s adaptability to changing market conditions and economic fluctuations.
Solution Provided
A real estate price prediction system was developed using machine learning regression models and big data analytics. The solution was designed to:
Analyze historical and real-time data to predict property prices accurately.
Provide actionable insights on market trends, enabling better investment strategies.
Identify undervalued properties and potential growth areas for investors.
Development Steps
Data Collection
Collected extensive datasets, including property listings, sales records, demographic data, and economic indicators.
Preprocessing
Cleaned and structured data, removing inconsistencies and normalizing variables such as location, property type, and size.
Model Development
Built regression models using techniques such as linear regression, decision trees, and gradient boosting to predict property prices. Integrated feature engineering to account for location-specific factors, amenities, and market trends.
Validation
Tested the models using historical data and cross-validation to ensure high prediction accuracy and robustness.
Deployment
Implemented the prediction system as a web-based platform, allowing users to input property details and receive price estimates and market insights.
Continuous Monitoring & Improvement
Established a feedback loop to update models with new data and refine predictions as market conditions evolved.
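The validation step above mentions cross-validation. As a rough sketch of how a k-fold split partitions the records, assuming contiguous, unshuffled folds (the case study does not describe its actual procedure, and the function name is hypothetical):

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


folds = list(k_fold_indices(10, 3))
for train_idx, test_idx in folds:
    print(len(train_idx), test_idx)
```

Each record appears in exactly one test fold, so every observation is used for evaluation once and for training k - 1 times.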
Results
Increased Prediction Accuracy
The system delivered highly accurate property price forecasts, improving investor confidence and decision-making.
Informed Investment Decisions
Investors and buyers gained valuable insights into market trends and property values, enabling better strategies and reduced risks.
Enhanced Market Insights
The platform provided detailed analytics on neighborhood trends, demand patterns, and growth potential, helping users identify opportunities.
Scalable Solution
The system scaled seamlessly to include new locations, property types, and market dynamics.
Improved User Experience
The intuitive platform design made it easy for users to access predictions and insights, boosting engagement and satisfaction.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Housing Index in the United States decreased to 436.60 points in March from 436.80 points in February of 2025. This dataset provides the latest reported value for - United States House Price Index MoM Change - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Just as in many other countries, the housing market in the UK grew substantially during the coronavirus pandemic, fueled by robust demand and low borrowing costs. Nevertheless, high inflation and the increase in mortgage rates have slowed house price growth. According to the forecast, house prices are expected to decrease by three percent in 2024. Between 2024 and 2028, the average house price growth is projected at 2.7 percent.
A contraction after a period of continuous growth
In June 2022, the UK's house price index exceeded 150 index points, meaning that house prices had increased by 50 percent since 2015, the base year for the index. In just two years, between 2020 and 2022, the index surged by 30 index points. In December 2023, the average price for a home stood at approximately 284,691 British pounds.
Rents are expected to continue to grow
According to another forecast, the prime residential market is also expected to see rental prices grow in the coming years. Growth is forecast to be stronger in 2024 and to slow down in the period between 2025 and 2028. The rental market in London is expected to follow a similar trend, with Central London slightly outperforming Greater London.
Insurance companies collect multiple features of a house to decide which houses can be insured and what premium to charge. Here I have collected data from multiple insurance companies in the USA, in which house features are given together with house prices.
This dataset has many property details, from the address to location coordinates and many other features; use them to predict the house price.
Many regression datasets have been published, each unique in its own way. The use of location coordinates and some other coordinates is new here.
Overview
Welcome to the House Price Prediction Challenge, where you will test your regression skills by designing an algorithm to accurately predict house prices in India. Accurately predicting house prices can be a daunting task. Buyers are not concerned only with the size (square feet) of the house; various other factors play a key role in deciding the price of a house or property. It can be extremely difficult to figure out the right set of attributes that explain buyer behavior. This dataset has been collected from various property aggregators across India. In this competition, given the 12 influencing factors, your role as a data scientist is to predict the prices as accurately as possible.
Also, in this competition, you will get a lot of room for feature engineering and for mastering advanced regression techniques such as Random Forest, Deep Neural Nets, and various other ensembling techniques.
Train.csv - 29451 rows x 12 columns
Test.csv - 68720 rows x 11 columns
Sample Submission - Acceptable submission format (.csv/.xlsx file with 68720 rows)
POSTED_BY - Category marking who has listed the property
UNDER_CONSTRUCTION - Under Construction or Not
RERA - Rera approved or Not
BHK_NO - Number of Rooms
BHK_OR_RK - Type of property
SQUARE_FT - Total area of the house in square feet
READY_TO_MOVE - Category marking Ready to move or Not
RESALE - Category marking Resale or not
ADDRESS - Address of the property
LONGITUDE - Longitude of the property
LATITUDE - Latitude of the property
What is the metric in this competition? How is the leaderboard calculated?
The submission will be evaluated using the RMSLE (Root Mean Squared Logarithmic Error) metric. One can use np.sqrt(mean_squared_log_error(actual, predicted)).
This hackathon supports private and public leaderboards.
The public leaderboard is evaluated on 30% of the test data.
The private leaderboard, which will be made available at the end of the hackathon, is evaluated on 100% of the test data.
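To make the RMSLE metric concrete, here is a dependency-free sketch that should agree with the np.sqrt(mean_squared_log_error(actual, predicted)) expression quoted above, since scikit-learn's mean_squared_log_error is defined on log(1 + y). The function name and the sample prices are hypothetical.

```python
import math


def rmsle(actual, predicted):
    """Root Mean Squared Logarithmic Error, computed on log(1 + value)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(v < 0 for v in actual) or any(v < 0 for v in predicted):
        raise ValueError("RMSLE is undefined for negative values")
    squared_log_errors = [
        (math.log1p(p) - math.log1p(a)) ** 2 for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_errors) / len(actual))


# Hypothetical prices, for illustration only.
actual_prices = [55.0, 51.0, 43.0]
predicted_prices = [50.0, 55.0, 40.0]
print(rmsle(actual_prices, predicted_prices))
```

Because the error is taken on a log scale, RMSLE penalizes relative rather than absolute differences, so an error on an expensive property does not dominate the score.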
This data is shared by Machine Hack; you can participate in the hackathon and submit your own entries. Link to the Machine Hack hackathon: https://www.machinehack.com/hackathons/house_price_prediction_beat_the_benchmark/overview
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Single Family Home Prices in the United States increased to 414000 USD in April from 403700 USD in March of 2025. This dataset provides - United States Existing Single Family Home Prices- actual values, historical data, forecast, chart, statistics, economic calendar and news.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for HOUSE PRICE INDEX YOY reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
The UK House Price Index is a National Statistic.
Download the full UK House Price Index data below, or use our tool to create your own bespoke reports: https://landregistry.data.gov.uk/app/ukhpi?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=tool&utm_term=9.30_18_08_21
Datasets are available as CSV files. Find out about republishing and making use of the data.
Google Chrome is blocking downloads of our UK HPI data files (Chrome 88 onwards). Please use another internet browser while we resolve this issue. We apologise for any inconvenience caused.
This file includes a derived back series for the new UK HPI. Under the UK HPI, data is available from 1995 for England and Wales, 2004 for Scotland and 2005 for Northern Ireland. A longer back series has been derived by using the historic path of the Office for National Statistics HPI to construct a series back to 1968.
Download the full UK HPI background file:
If you are interested in a specific attribute, we have separated them into these CSV files:
Average price (CSV, 9.2MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Average-prices-2021-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=average_price&utm_term=9.30_18_08_21
Average price by property type (CSV, 28MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Average-prices-Property-Type-2021-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=average_price_property_price&utm_term=9.30_18_08_21
Sales (CSV, 4.7MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Sales-2021-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=sales&utm_term=9.30_18_08_21
Cash mortgage sales (CSV, 6.1MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Cash-mortgage-sales-2021-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=cash_mortgage-sales&utm_term=9.30_18_08_21
First time buyer and former owner occupier (CSV, 5.9MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/First-Time-Buyer-Former-Owner-Occupied-2021-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=FTNFOO&utm_term=9.30_18_08_21
New build and existing resold property (CSV, 17MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/New-and-Old-2021-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=new_build&utm_term=9.30_18_08_21
Index (CSV, 6MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Indices-2021-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=index&utm_term=9.30_18_08_21
Index seasonally adjusted (CSV, 192KB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Indices-seasonally-adjusted-2021-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=index_season_adjusted&utm_term=9.30_18_08_21
Average price seasonally adjusted: http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Average-price-seasonally-adjusted-2021-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=average-price_season_adjusted&utm_term=9.30_18_08_21
The U.S. housing market has slowed, after ** consecutive years of rising home prices. In 2021, house prices surged by an unprecedented ** percent, marking the highest increase on record. However, the market has since cooled, with the Freddie Mac House Price Index showing more modest growth between 2022 and 2024. In 2024, home prices increased by *** percent. That was lower than the long-term average of *** percent since 1990.
Impact of mortgage rates on homebuying
The recent cooling in the housing market can be partly attributed to rising mortgage rates. After reaching a record low of **** percent in 2021, the average annual rate on a 30-year fixed-rate mortgage more than doubled in 2023. This significant increase has made homeownership less affordable for many potential buyers, contributing to a substantial decline in home sales. Despite these challenges, forecasts suggest a potential recovery in the coming years.
How much does it cost to buy a house in the U.S.?
In 2023, the median sales price of an existing single-family home reached a record high of over ******* U.S. dollars. Newly built homes were even pricier, despite a slight decline in the median sales price in 2023. Naturally, home prices continue to vary significantly across the country, with West Virginia being the most affordable state for homebuyers.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We construct daily house price indices for 10 major US metropolitan areas. Our calculations are based on a comprehensive database of several million residential property transactions and a standard repeat-sales method that closely mimics the methodology of the popular monthly Case-Shiller house price indices. Our new daily house price indices exhibit dynamic features similar to those of other daily asset prices, with mild autocorrelation and strong conditional heteroskedasticity of the corresponding daily returns. A relatively simple multivariate time series model for the daily house price index returns, explicitly allowing for commonalities across cities and GARCH effects, produces forecasts of longer-run monthly house price changes that are superior to various alternative forecast procedures based on lower-frequency data.
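The conditional heteroskedasticity mentioned in this abstract is commonly captured with a GARCH(1,1) variance recursion. The following is a minimal univariate sketch of that recursion, not the authors' multivariate model; the function name, the parameter values, and the sample returns are all hypothetical.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variances from sigma2[t] = omega + alpha * r[t-1]**2 + beta * sigma2[t-1].

    The first value is set to the unconditional variance omega / (1 - alpha - beta).
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0 and alpha + beta < 1")
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2


# Hypothetical daily index returns in percent: one large move on the second day.
daily_returns = [0.0, 3.0, 0.0, 0.0]
variances = garch11_variance(daily_returns, omega=0.1, alpha=0.1, beta=0.8)
print(variances)
```

After the large return, the conditional variance jumps and then decays gradually, which is the volatility clustering that the abstract describes for daily house price returns.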
A US-based housing company named Surprise Housing has decided to enter the Australian market. The company uses data analytics to purchase houses at a price below their actual values and flip them at a higher price. The company is looking at prospective properties to buy to enter the market. You are required to build a regression model using regularization in order to predict the actual value of the prospective properties and decide whether to invest in them or not. The company wants to know the following things about the prospective properties: 1) Which variables are significant in predicting the price of a house, and 2) How well those variables describe the price of a house.
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
We will predict home prices using linear regression. We use training data that has home areas in square feet and corresponding prices, and train a linear regression model using sklearn linear regression.
This whole dataset, along with what I learned, comes from the Codebasics machine learning classes; I learned the whole concept there.
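For a single feature such as area, a linear regression has a closed-form least-squares solution. The sketch below computes it without any dependencies; it is not the sklearn-based code of the original notebook, and the function names and the area/price pairs are made up for illustration.

```python
def fit_simple_linear_regression(x, y):
    """Return (slope, intercept) of the ordinary least-squares line y = slope * x + intercept."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired observations")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(area, slope, intercept):
    """Predicted price for a given area."""
    return slope * area + intercept


# Hypothetical training data: area in square feet and price.
areas = [1000.0, 1500.0, 2000.0, 2500.0]
prices = [150000.0, 200000.0, 250000.0, 300000.0]

slope, intercept = fit_simple_linear_regression(areas, prices)
print(predict(3000.0, slope, intercept))
```

The slope can be read as the estimated price change per additional square foot, and the intercept as the fitted price at zero area.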
https://www.datainsightsmarket.com/privacy-policy
The US residential real estate market, a cornerstone of the American economy, is projected to experience steady growth over the next decade. While the provided CAGR of 2.04% is a modest figure, it reflects a market maturing after a period of significant expansion. This sustained growth is driven by several key factors. Firstly, population growth and urbanization continue to fuel demand for housing, particularly in densely populated areas and emerging suburban markets. Secondly, low interest rates (historically, though this can fluctuate) have made mortgages more accessible, stimulating buyer activity. Thirdly, a robust construction sector, though facing challenges in material costs and labor shortages, is gradually increasing the housing supply, mitigating some of the upward pressure on prices. However, challenges remain. Rising inflation and potential interest rate hikes pose a risk to affordability, potentially dampening demand. Furthermore, the ongoing evolution of remote work is reshaping residential preferences, with a shift toward larger homes in suburban or exurban locations. This trend impacts the relative demand for various property types, potentially increasing the appeal of landed houses and villas compared to apartments and condominiums in certain regions. The segmentation of the market into apartments/condominiums and landed houses/villas provides crucial insights into consumer preferences and investment strategies. High-density urban areas will continue to see strong demand for apartments and condos, while suburban and rural areas are likely to experience a greater increase in landed property sales. Major players like Simon Property Group, Mill Creek Residential, and others are strategically adapting to these trends, focusing on both development and management across various property types and geographic locations. 
Analyzing regional data within the US (e.g., comparing growth in the Northeast versus the Southwest) will highlight market nuances and potential investment opportunities. While the global data provided is valuable for understanding broader market forces, focusing the analysis on the US market allows for a more granular understanding of the specific drivers, trends, and challenges within this significant segment of the real estate sector. The forecast period (2025-2033) suggests continued, albeit measured, expansion.
Recent developments include:
May 2022: Resource REIT Inc. completed the sale of all of its outstanding shares of common stock to Blackstone Real Estate Income Trust Inc. for USD 14.75 per share in an all-cash deal valued at USD 3.7 billion, including the assumption of the REIT's debt.
February 2022: The largest owner of commercial real estate in the world and private equity company Blackstone is growing its portfolio of residential rentals and commercial properties in the United States. The company revealed that it would shell out about USD 6 billion to buy Preferred Apartment Communities, an Atlanta-based real estate investment trust that owns 44 multifamily communities and roughly 12,000 homes in the Southeast, mostly in Atlanta, Nashville, Charlotte, North Carolina, and the Florida cities of Jacksonville, Orlando, and Tampa.
Key drivers for this market are: Investment Plan Towards Urban Rail Development.
Potential restraints include: Italy’s Fragmented Approach to Tenders.
Notable trends are: Existing Home Sales Witnessing Strong Growth.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mapping spatial processes at a small scale is a challenge when observed data are not abundant. The article examines the residential housing market in Fort Worth, Texas, and builds price indices at the inter- and intra-neighborhood levels. To accomplish our objectives, we initially model price variability in the joint space–time continuum. We then use geostatistics to predict and map monthly housing prices across the area of interest over a period of 4 years. For this analysis, we introduce the Bayesian maximum entropy (BME) method into real estate research. We use BME because it rigorously integrates uncertain or secondary soft data, which are needed to build the price indices. The soft data in our analysis are property tax values, which are plentiful, publicly available, and highly correlated with transaction prices. The results demonstrate how the use of the soft data provides the ability to map house prices within a small areal unit such as a subdivision or neighborhood.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Housing Index in Spain increased to 1972.10 EUR/SQ. METRE in the fourth quarter of 2024 from 1921 EUR/SQ. METRE in the third quarter of 2024. This dataset provides the latest reported value for - Spain House Prices - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This project combines data extraction, predictive modeling, and geospatial mapping to analyze housing trends in Mercer County, New Jersey. It consists of three core components:
Census Data Extraction: Gathers U.S. Census data (2012–2022) on median house value, household income, and racial demographics for all census tracts in the county. It accounts for changes in census tract boundaries between 2010 and 2020 by approximating values for newly defined tracts.
House Value Prediction: Uses an LSTM model with k-fold cross-validation to forecast median house values through 2025. Multiple feature combinations and sequence lengths are tested to optimize prediction accuracy, with the final model selected based on MSE and MAE scores.
Data Mapping: Visualizes historical and predicted housing data using GeoJSON files from the TIGERWeb API. It generates interactive maps showing raw values, changes over time, and percent differences, with customization options to handle outliers and improve interpretability.
This modular workflow can be adapted to other regions by changing the input FIPS codes and feature selections.
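An LSTM is trained on fixed-length windows of past values, and the "sequence length" tested in the prediction component is the width of that window. A minimal sketch of how such windows could be built from a yearly series follows; the function name and the house values are hypothetical, and the project's actual windowing code may differ.

```python
def make_sequences(series, seq_len):
    """Split a series into (inputs, targets): each input holds seq_len
    consecutive values and its target is the value that follows."""
    if seq_len < 1 or seq_len >= len(series):
        raise ValueError("seq_len must be at least 1 and shorter than the series")
    inputs, targets = [], []
    for start in range(len(series) - seq_len):
        inputs.append(series[start:start + seq_len])
        targets.append(series[start + seq_len])
    return inputs, targets


# Hypothetical median house values for one tract, one value per year (2012 to 2022).
median_values = [250, 252, 255, 260, 268, 275, 281, 290, 305, 330, 352]

for seq_len in (2, 3, 4):
    windows, next_values = make_sequences(median_values, seq_len)
    print(seq_len, len(windows), windows[0], next_values[0])
```

Longer windows give the model more history per sample but leave fewer training samples, which is one reason several sequence lengths would be compared on a series of only eleven yearly values.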
The average Canadian house price declined slightly in 2023, after four years of consecutive growth. The average house price stood at 678,282 Canadian dollars in 2023 and was forecast to reach 746,379 Canadian dollars by 2026.
Home sales on the rise
The number of housing units sold is also set to increase over the two-year period. From 443,511 units sold, the annual number of home sales in the country is expected to rise to 453,704 in 2025. British Columbia and Ontario have traditionally been housing markets with prices above the Canadian average, and both are set to witness an increase in sales in 2025.
How did Canadians feel about the future development of house prices?
When it comes to consumer confidence in the performance of the real estate market in the next six months, Canadian consumers in 2024 mostly expected that the market would go up. A slightly lower share of the respondents believed real estate prices would remain the same.
Residential Real Estate Market Size 2024-2028
The residential real estate market size is forecast to increase by USD 482.1 billion, at a CAGR of 4.6% between 2023 and 2028.
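The relationship between an absolute increase and a compound annual growth rate (CAGR) can be checked with a little arithmetic. The sketch below works out the base-year market size implied by the two quoted figures; treating 2023 as the base year with five compounding periods is an assumption, the function names are hypothetical, and the resulting sizes are derived values, not figures stated in the report.

```python
def compound_growth(base, rate, years):
    """Value after compounding `base` at `rate` per year for `years` years."""
    return base * (1.0 + rate) ** years


def implied_base(increase, rate, years):
    """Starting value implied by an absolute increase achieved at a given CAGR."""
    growth_factor = (1.0 + rate) ** years
    return increase / (growth_factor - 1.0)


# Quoted figures: USD 482.1 billion increase at a 4.6% CAGR between 2023 and 2028.
# Assumption: 2023 is the base year, giving five compounding periods.
base_2023 = implied_base(482.1, 0.046, 5)
size_2028 = compound_growth(base_2023, 0.046, 5)
print(f"implied 2023 size: {base_2023:.1f}, implied 2028 size: {size_2028:.1f} (USD billion)")
```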
The market is experiencing robust growth, fueled by increasing marketing initiatives and a burgeoning population driving demand for housing solutions. Notably, innovative smart home technologies, such as voice-activated assistants and energy-efficient appliances, are gaining traction, offering enhanced convenience and sustainability for homeowners. However, regulatory uncertainty looms large as governments grapple with implementing policies that balance affordability, safety, and environmental concerns. Meanwhile, the rise of co-living spaces and serviced apartments caters to the growing trend of flexible living arrangements, particularly among millennials and urban professionals. Additionally, the increasing popularity of real estate crowdfunding platforms enables smaller investors to participate in the market, broadening its reach and democratizing access to real estate investments.
Despite these opportunities, challenges persist, including escalating property prices in major metropolitan areas and the potential for economic downturns, which can dampen demand and negatively impact investor confidence. Moreover, the ongoing digital transformation of the industry necessitates continuous adaptation and innovation to remain competitive and meet evolving consumer expectations. Companies must navigate these challenges and capitalize on market opportunities by offering differentiated products and services, ensuring regulatory compliance, and embracing technology to streamline operations and enhance the customer experience.
What will be the Size of the Residential Real Estate Market during the forecast period?
Explore in-depth regional segment analysis with market size data - historical 2018-2022 and forecasts 2024-2028 - in the full report.
The market continues to evolve, shaped by various interconnected factors. Property data and valuation are crucial elements, influencing rental rates, property surveys, and home appraisals. Open houses and virtual tours cater to diverse buyer preferences, while office buildings and industrial properties adapt to the changing business landscape. Sustainable development and green building gain traction, impacting property taxes and zoning regulations. Real estate agents and brokers leverage property technology and analytics to optimize property listings and investments. Building codes and real estate law ensure safety and compliance, while property management and retail spaces cater to evolving consumer needs.
Cap rates and investment returns shape real estate developers' strategies as they navigate market cycles and respond to shifting trends in single-family homes, multi-family dwellings, and luxury properties. Property taxes, escrow services, home staging, and title insurance add further complexity to this dynamic industry. Commercial real estate and property portfolio management require a nuanced understanding of these interconnected factors as they continue to unfold and reshape the market.
How is this Residential Real Estate Industry segmented?
The residential real estate industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.
Mode Of Booking
  Sales
  Rental/Lease
Type
  Single-Family Homes
  Apartments and Condominiums
  Landed Houses and Villas
  Townhouses
Budget
  Affordable Housing (USD 300,000 or less)
  Mid-Range (USD 300,001-USD 1,000,000)
  Luxury (USD 1,000,001 or more)
Size
  Small (Less than 80 square meters)
  Medium (81-200 square meters)
  Large (More than 200 square meters)
Geography
  North America
    US
    Canada
  Europe
    France
    Germany
    Italy
    UK
  Middle East and Africa
    Egypt
    KSA
    Oman
    UAE
  APAC
    China
    India
    Japan
  South America
    Argentina
    Brazil
  Rest of World (ROW)
By Mode Of Booking Insights
The sales segment is estimated to witness significant growth during the forecast period.
The Sales segment was valued at USD 896.60 billion in 2018 and showed a gradual increase during the forecast period.
Regional Analysis
APAC is estimated to contribute 54% to the growth of the global market during the forecast period. Technavio's analysts have elaborately explained the regional trends and drivers that shape the market during the forecast period.
The market is experiencing significant activity and trends in various sectors. In APAC, this region held the largest share in 2023 and is projected to continue leading the market. Factors such as ra
Attribution-NoDerivs 4.0 (CC BY-ND 4.0)https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
General file description
This xlsx document contains the literature list that forms the basis of the paper 'A Survey of Methods and Input Data Types for House Price Prediction' by Geerts, M., vanden Broucke, S. and De Weerdt, J. The Excel document contains seven sheets, relating to the phases described in the survey.

Phase3
This sheet contains the literature list for the end of Phase 2 and the start of Phase 3. It has 590 rows and 19 columns. Each row contains the citation information of one article. The columns describe the ID, Authors, Title, Year, Source title, Volume, Issue, DOI, ISSN, ISBN, PubMed, Publisher, Document Type, Language, Keywords, Link, Book DOI, Algorithmic (Title) and Algorithmic (Abstract). The latter two columns indicate whether the article describes an algorithmic approach to predicting house prices, judged from the title and the abstract respectively. These two columns take the values 'Yes', 'No', and 'Maybe', and were completed during Phase 3.

Phase4
This sheet contains the literature list for the end of Phase 3 and the start of Phase 4. It has 116 rows and 20 columns. Each row contains the citation information of one article. The columns describe the ID, Authors, Title, Year, Source title, Volume, Issue, DOI, ISSN, ISBN, PubMed, Publisher, Document Type, Language, Keywords, Link, Book DOI, Algorithmic (Title), Algorithmic (Abstract) and Reading. All columns are the same as in the first sheet, except for the last three. The columns Algorithmic (Title) and Algorithmic (Abstract) now only contain the value 'Yes', because only the articles that describe an algorithm are retained in Phase 3. The column Reading describes the outcome of Phase 4. This column is empty if the article is retained in this phase and gives the reason if it is not retained.

Phase4(end)
This sheet contains the literature list for the end of Phase 4. It has 94 rows and 20 columns. Each row contains the citation information of one article. The columns describe the ID, Authors, Title, Year, Source title, Volume, Issue, DOI, ISSN, ISBN, PubMed, Publisher, Document Type, Language, Keywords, Link, Book DOI, Algorithmic (Title), Algorithmic (Abstract) and Reading. All columns are the same as in the second sheet. The column Reading is now empty because the articles that were not retained in Phase 4 have been removed from the list.

Data table
This sheet contains a table of the literature at the end of Phase 4 with indications of the input data types used in the articles, the data novelty score and the cluster that the articles belong to. It has 95 rows, where each row contains the information of one article, except the last 'Total' row. It contains 21 columns:
ID: This is the same identifier as in the previous sheets.
Column1: This is a new identifier, based on an ordering on year and author.
Authors: Same as before.
Title: Same as before.
Year: Same as before.
Structural, Temporal data, Socioeconomic, Environmental, POI, Basic spatial, Location, Eucl Distances, Adv Spatial, Network Distance, Topographical data, Graphs, Images, Text: These are the different input data types. The cell is filled with 'X' if the corresponding article uses the input data type described in the column name.
Score: This column indicates the data novelty score, calculated as explained in the paper based on the sheet 'Rules Data novelty score'.
Cluster: This column indicates the cluster number as explained in the Discussion section of the paper.

Rules Data novelty score
This sheet contains 15 rows, of which the first contains the titles, and two columns. The first column contains the input data types as in the previous sheet and the second column contains the respective novelty scores.

Model table
This sheet contains a table of the literature at the end of Phase 4 with indications of the model types used in the articles, the model novelty score and the cluster that the articles belong to. It has 95 rows, where each row contains the information of one article, except the last 'Total' row. It contains 21 columns:
ID: Same as before.
Column1: Same as before.
Authors: Same as before.
Title: Same as before.
Year: Same as before.
MRA, Kriging, SEM, SVC, Time Series, FL, NN, DT, RF, GBT, SVM, ANN, (Other) Ensembles, DL: These are the different model types. The cell is filled with 'X' if the corresponding article uses the model type described in the column name.
Score: This column indicates the model novelty score, calculated as explained in the paper based on the sheet 'Rules Model novelty score'.
Cluster: This column indicates the cluster number as explained in the Discussion section of the paper.

Rules Model novelty score
This sheet contains 15 rows, of which the first contains the titles, and two columns. The first column contains the model types as in the previous sheet and the second column contains the respective novelty scores.
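
As an illustration of how the 'X' marks in the 'Data table' and 'Model table' sheets might be tallied, the following minimal Python sketch counts how many articles use each type. The sample rows are invented for the example, the helper name count_usage is hypothetical, and the assumption that the 'Total' row carries the label 'Total' in its ID column is not confirmed by the file description. Real rows would first have to be read from the xlsx file.

```python
# Column names of the input data types, as listed for the 'Data table' sheet.
INPUT_TYPES = [
    "Structural", "Temporal data", "Socioeconomic", "Environmental", "POI",
    "Basic spatial", "Location", "Eucl Distances", "Adv Spatial",
    "Network Distance", "Topographical data", "Graphs", "Images", "Text",
]


def count_usage(rows, type_columns):
    """Count, per column, how many article rows are marked with 'X'."""
    counts = {col: 0 for col in type_columns}
    for row in rows:
        # Assumption: the trailing summary row is labelled 'Total' in the ID column.
        if str(row.get("ID", "")).strip().lower() == "total":
            continue
        for col in type_columns:
            # Empty cells may be missing or None; treat both as "not used".
            if str(row.get(col) or "").strip().upper() == "X":
                counts[col] += 1
    return counts


# Hypothetical rows standing in for the sheet contents (not taken from the file).
rows = [
    {"ID": 1, "Structural": "X", "Location": "X", "Images": None},
    {"ID": 2, "Structural": "X", "Text": "X"},
    {"ID": "Total", "Structural": 2, "Location": 1, "Text": 1},
]

usage = count_usage(rows, INPUT_TYPES)
for name, n in usage.items():
    if n:
        print(f"{name}: {n}")
```

The same function would apply to the 'Model table' sheet by passing the model type column names (MRA, Kriging, SEM, and so on) instead of INPUT_TYPES.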
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview: This dataset was collected and curated to support research on predicting real estate prices using machine learning algorithms, specifically Support Vector Regression (SVR) and Gradient Boosting Machine (GBM). The dataset includes comprehensive information on residential properties, enabling the development and evaluation of predictive models for accurate and transparent real estate appraisals.

Data Source: The data was sourced from Department of Lands and Survey real estate listings.

Features: The dataset contains the following key attributes for each property:
Area (in square meters): The total living area of the property.
Floor Number: The floor on which the property is located.
Location: Geographic coordinates or city/region where the property is situated.
Type of Apartment: The classification of the property, such as studio, one-bedroom, two-bedroom, etc.
Number of Bathrooms: The total number of bathrooms in the property.
Number of Bedrooms: The total number of bedrooms in the property.
Property Age (in years): The number of years since the property was constructed.
Property Condition: A categorical variable indicating the condition of the property (e.g., new, good, fair, needs renovation).
Proximity to Amenities: The distance to nearby amenities such as schools, hospitals, shopping centers, and public transportation.
Market Price (target variable): The actual sale price or listed price of the property.

Data Preprocessing:
Normalization: Numeric features such as area and proximity to amenities were normalized to ensure consistency and improve model performance.
Categorical Encoding: Categorical features like property condition and type of apartment were encoded using one-hot encoding or label encoding, depending on the specific model requirements.
Missing Values: Missing data points were handled using appropriate imputation techniques or by excluding records with significant missing information.

Usage: This dataset was utilized to train and test machine learning models, aiming to predict the market price of residential properties based on the provided attributes. The models developed using this dataset demonstrated improved accuracy and transparency over traditional appraisal methods.

Dataset Availability: The dataset is available for public use under the CC BY 4.0 license. Users are encouraged to cite the related publication when using the data in their research or applications.

Citation: If you use this dataset in your research, please cite the following publication: Real Estate Decision-Making: Precision in Price Prediction through Advanced Machine Learning Algorithms.
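
The preprocessing steps described above can be sketched in plain Python. This is only an illustration under stated assumptions: the description does not say which normalization or imputation method was used, so min-max scaling and mean imputation are assumed here, and the field names area_m2 and condition as well as the sample records are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max(values):
    """Scale values linearly to the range [0, 1] (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Return the sorted category list and one indicator vector per value."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, encoded


# Hypothetical records; field names and values are not taken from the dataset.
records = [
    {"area_m2": 50.0, "condition": "new"},
    {"area_m2": None, "condition": "good"},
    {"area_m2": 100.0, "condition": "needs renovation"},
]

area = min_max(impute_mean([r["area_m2"] for r in records]))
categories, condition = one_hot([r["condition"] for r in records])

# One feature row per property: normalized area followed by the indicators.
features = [[a] + c for a, c in zip(area, condition)]
print(categories)
print(features)
```

In practice, library equivalents such as scikit-learn's SimpleImputer, MinMaxScaler and OneHotEncoder cover the same steps and could feed the resulting feature rows into SVR or gradient boosting regressors.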