https://creativecommons.org/publicdomain/zero/1.0/
A simple yet challenging project: predict the housing price from factors such as house area, number of bedrooms, furnishing status, and nearness to a main road. The dataset is small, yet its complexity arises from strong multicollinearity among the features. Can you overcome these obstacles and build a decent predictive model?
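One common way to quantify the multicollinearity mentioned above is the variance inflation factor (VIF). The following is a minimal sketch, assuming a model with exactly two predictors, in which case the VIF reduces to 1 / (1 - r²) with r the Pearson correlation between them. The `area` and `bedrooms` values are made up for illustration and are not taken from the dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pairwise_vif(xs, ys):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r * r)


# Hypothetical values, for illustration only (not from the dataset).
area = [1000, 1500, 2000, 2500, 3000]
bedrooms = [2, 3, 3, 4, 5]

r = pearson_r(area, bedrooms)
vif = pairwise_vif(area, bedrooms)
print(f"r = {r:.3f}, VIF = {vif:.2f}")
```

A frequently quoted rule of thumb treats a VIF above roughly 5 to 10 as a sign that two features carry largely the same information; with more than two predictors, each VIF would instead come from regressing one feature on all the others.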
Harrison, D. and Rubinfeld, D.L. (1978) Hedonic prices and the demand for clean air. J. Environ. Economics and Management 5, 81–102. Belsley D.A., Kuh, E. and Welsch, R.E. (1980) Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. New York: Wiley.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset comprises detailed information on apartment rentals, ideal for various machine learning tasks including clustering, classification, and regression. It features a comprehensive set of attributes that capture essential aspects of rental listings, such as:
- Identifiers & Location: Includes unique identifiers (id), geographic details (address, cityname, state, latitude, longitude), and the source of the classified listing.
- Property Details: Provides information on the apartment's category, title, body, amenities, number of bathrooms, bedrooms, and square_feet (size of the apartment).
- Pricing Information: Contains multiple features related to pricing, including price (rental price), price_display (displayed price), price_type (price in USD), and fee.
- Additional Features: Indicates whether the apartment has a photo (has_photo), whether pets are allowed (pets_allowed), and other relevant details such as currency and time of listing creation.
The dataset is well-cleaned, ensuring that critical columns like price and square_feet are never empty. This makes it a robust resource for developing predictive models and performing in-depth analyses on rental trends and property characteristics.
A list of job applications filed for a particular day and associated data. Prior weekly and monthly reports are archived at DOB and are not available on NYC Open Data.
This dataset provides a comprehensive collection of features related to houses in California, with the primary aim of facilitating the prediction of house rent prices. It includes 80 columns and 1460 rows, offering a rich set of information for model training and evaluation.
Target Variable: The dataset aims to predict the house rent prices, making it suitable for regression models. The 'SalePrice' column can be used as the target variable for training and evaluating predictive models.
Columns:
Use Case: Ideal for exploring and implementing regression models, particularly Linear Regression, to predict house rent prices based on various features associated with the properties.
Dataset Size: 80 columns, 1460 rows
Source: This dataset is based on houses in California, making it relevant for studying the factors influencing house rent prices in this region.
Note: Please refer to the dataset documentation for details on each column and additional information regarding the data. Feel free to use this dataset for your machine learning projects, research, or educational purposes. Happy coding!
Displacement risk indicator showing how many households within the specified groups are facing either housing cost burden (contributing more than 30% of monthly income toward housing costs) or severe housing cost burden (contributing more than 50% of monthly income toward housing costs).
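The two thresholds above can be expressed as a small classifier. This is a sketch only: the function name and the returned labels are hypothetical, and raising an error for non-positive income is an assumption of this example rather than something the indicator's documentation specifies.

```python
def classify_cost_burden(monthly_income, monthly_housing_cost):
    """Classify a household by the share of income spent on housing.

    Returns "severe" above 50%, "burdened" above 30%, otherwise "none".
    """
    if monthly_income <= 0:
        # Assumption of this sketch: such households are not classified here.
        raise ValueError("monthly_income must be positive")
    # Compare cost * 100 against income * threshold to avoid dividing.
    if monthly_housing_cost * 100 > monthly_income * 50:
        return "severe"
    if monthly_housing_cost * 100 > monthly_income * 30:
        return "burdened"
    return "none"


for cost in (1500, 1501, 2500, 2600):
    print(cost, classify_cost_burden(5000, cost))
```

Both comparisons are strict, matching the "more than 30%" and "more than 50%" wording, so a household spending exactly 30% of its income is not counted as burdened.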
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Median price paid for residential property in England and Wales, by property type and administrative geographies. Annual data.
https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains 2000 rows of house-related data, representing various features that could influence house prices. Below, we discuss key aspects of the dataset, which include its structure, the choice of features, and potential use cases for analysis.
The dataset is designed to capture essential attributes for predicting house prices, including:
- Area: Square footage of the house, which is generally one of the most important predictors of price.
- Bedrooms & Bathrooms: The number of rooms in a house significantly affects its value. Homes with more rooms tend to be priced higher.
- Floors: The number of floors in a house could indicate a larger, more luxurious home, potentially raising its price.
- Year Built: The age of the house can affect its condition and value. Newly built houses are generally more expensive than older ones.
- Location: Houses in desirable locations such as downtown or urban areas tend to be priced higher than those in suburban or rural areas.
- Condition: The current condition of the house is critical, as well-maintained houses (in 'Excellent' or 'Good' condition) will attract higher prices compared to houses in 'Fair' or 'Poor' condition.
- Garage: Availability of a garage can increase the price due to added convenience and space.
- Price: The target variable, representing the sale price of the house, used to train machine learning models to predict house prices based on the other features.
- Area Distribution: The area of the houses in the dataset ranges from 500 to 5000 square feet, which allows analysis across different types of homes, from smaller apartments to larger luxury houses.
- Bedrooms and Bathrooms: The number of bedrooms varies from 1 to 5, and bathrooms from 1 to 4. This variance enables analysis of homes with different sizes and layouts.
- Floors: Houses in the dataset have between 1 and 3 floors. This feature could be useful for identifying the influence of multi-level homes on house prices.
- Year Built: The dataset contains houses built from 1900 to 2023, giving a wide range of house ages to analyze the effects of new vs. older construction.
- Location: There is a mix of urban, suburban, downtown, and rural locations. Urban and downtown homes may command higher prices due to proximity to amenities.
- Condition: Houses are labeled as 'Excellent', 'Good', 'Fair', or 'Poor'. This feature helps model the price differences based on the current state of the house.
- Price Distribution: Prices range between $50,000 and $1,000,000, offering a broad spectrum of property values. This range makes the dataset appropriate for predicting a wide variety of housing prices, from affordable homes to luxury properties.
3. Correlation Between Features
A key area of interest is the relationship between various features and house price:
- Area and Price: Typically, a strong positive correlation is expected between the size of the house (Area) and its price. Larger homes are likely to be more expensive.
- Location and Price: Location is another major factor. Houses in urban or downtown areas may show a higher price on average compared to suburban and rural locations.
- Condition and Price: The condition of the house should show a positive correlation with price. Houses in better condition should be priced higher, as they require less maintenance and repair.
- Year Built and Price: Newer houses might command a higher price due to better construction standards, modern amenities, and less wear-and-tear, but some older homes in good condition may retain historical value.
- Garage and Price: A house with a garage may be more expensive than one without, as it provides extra storage or parking space.
The dataset is well-suited for various machine learning and data analysis applications, including:
- House Price Prediction: Using regression techniques, this dataset can be used to build a model to predict house prices based on the available features.
- Feature Importance Analysis: By using techniques such as feature importance ranking, data scientists can determine which features (e.g., location, area, or condition) have the greatest impact on house prices.
- Clustering: Clustering techniques like k-means could help identify patterns in the data, such as grouping houses into segments based on their characteristics (e.g., luxury homes, affordable homes).
- Market Segmentation: The dataset can be used to perform segmentation by location, price range, or house type to analyze trends in specific sub-markets, like luxury vs. affordable housing.
- Time-Based Analysis: By studying how house prices vary with the year built or the age of the house, analysts can derive insights into the trends of older vs. newer homes.
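As an illustration of the house price prediction use case, the sketch below fits a one-feature least-squares line of price against area. It is deliberately simplified: a realistic model would use all the features listed above. The `areas` and `prices` values are invented for the example and do not come from the dataset.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data that follows price = 100 * area + 50,000 exactly.
areas = [800, 1200, 1500, 2000, 2600]
prices = [130_000, 170_000, 200_000, 250_000, 310_000]

slope, intercept = fit_line(areas, prices)
predicted = slope * 1800 + intercept  # estimate for an 1,800 sq ft house
print(f"slope={slope:.2f}, intercept={intercept:.2f}, predicted={predicted:.2f}")
```

Because the invented data lies exactly on a line, the fit recovers that line; real listings would scatter around it, and the residuals would show how much price variation area alone leaves unexplained.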
This table contains data on the percent of households paying more than 30% (or 50%) of monthly household income towards housing costs for California, its regions, counties, cities/towns, and census tracts. Data is from the U.S. Department of Housing and Urban Development (HUD), Consolidated Planning Comprehensive Housing Affordability Strategy (CHAS) and the U.S. Census Bureau, American Community Survey (ACS). The table is part of a series of indicators in the Healthy Communities Data and Indicators Project of the Office of Health Equity. Affordable, quality housing is central to health, conferring protection from the environment and supporting family life. Housing costs, typically the largest single expense in a family's budget, also impact decisions that affect health. As housing consumes larger proportions of household income, families have less income for nutrition, health care, transportation, education, etc. Severe cost burdens may induce poverty, which is associated with developmental and behavioral problems in children and accelerated cognitive and physical decline in adults. Low-income families and minority communities are disproportionately affected by the lack of affordable, quality housing. More information about the data table and a data dictionary can be found in the Attachments.
Displacement risk indicator showing how many households within the specified groups are facing severe housing cost burden (contributing more than 50% of monthly income toward housing costs).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Housing Index in Germany increased to 220.43 points in October from 219.91 points in September of 2025. This dataset provides the latest reported value for - Germany House Price Index - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Median price paid for residential property in England and Wales by property type and electoral ward. Annual data.
Virginia (VA) has the 19th highest rent in the country out of 56 states and territories. The Fair Market Rent in Virginia ranges from $701 for a 2-bedroom apartment in Grayson County, VA to $1,765 for a 2-bedroom unit in Washington-Arlington-Alexandria, DC-VA-MD HUD Metro FMR Area.
For FY 2024, the Washington-Arlington-Alexandria, DC-VA-MD HUD Metro FMR Area (Arlington County) rent for a studio or efficiency is $1,772 per month and $3,015 per month to rent a house or an apartment with 4 bedrooms. The average Fair Market Rent for a 2-bedroom home in Virginia is $1,056 per month.
Approximately 15% of Americans qualify for some level of housing assistance. The population in Virginia is around 2,038,847 people. So, there are around 305,827 people in Virginia who could be receiving housing benefits from the HUD. For FY 2025, the Washington-Arlington-Alexandria, DC-VA-MD HUD Metro FMR Area (Arlington County) rent for a studio or efficiency is $2,012 per month and $3,413 per month to rent a house or an apartment with 4 bedrooms. The average Fair Market Rent for a 2-bedroom home in Virginia is $1,059 per month.
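The eligibility estimate and the year-over-year change in the studio rent can be reproduced from the figures quoted above. The sketch below takes the population figure and the 15% share exactly as stated in the text, without checking them against another source; the variable names are illustrative.

```python
# Figures as quoted in the text above.
population = 2_038_847
eligible_share_pct = 15  # "approximately 15%"

# Integer arithmetic: 15% of the population, rounded down.
eligible = population * eligible_share_pct // 100
print(f"Estimated eligible people: {eligible:,}")

# Studio/efficiency Fair Market Rent for the Arlington County area.
studio_fy2024 = 1772
studio_fy2025 = 2012
studio_change_pct = (studio_fy2025 - studio_fy2024) / studio_fy2024 * 100
print(f"Studio FMR change FY 2024 to FY 2025: {studio_change_pct:.1f}%")
```

The first result matches the 305,827 figure in the text, which confirms that the estimate is simply 15% of the quoted population.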
Displacement risk indicator showing how many households within the specified groups are facing housing cost burden (contributing more than 30% of monthly income toward housing costs).
This dataset contains prices of New York houses, providing valuable insights into the real estate market in the region. It includes information such as broker titles, house types, prices, number of bedrooms and bathrooms, property square footage, addresses, state, administrative and local areas, street names, and geographical coordinates.
- BROKERTITLE: Title of the broker
- TYPE: Type of the house
- PRICE: Price of the house
- BEDS: Number of bedrooms
- BATH: Number of bathrooms
- PROPERTYSQFT: Square footage of the property
- ADDRESS: Full address of the house
- STATE: State of the house
- MAIN_ADDRESS: Main address information
- ADMINISTRATIVE_AREA_LEVEL_2: Administrative area level 2 information
- LOCALITY: Locality information
- SUBLOCALITY: Sublocality information
- STREET_NAME: Street name
- LONG_NAME: Long name
- FORMATTED_ADDRESS: Formatted address
- LATITUDE: Latitude coordinate of the house
- LONGITUDE: Longitude coordinate of the house
- Price analysis: Analyze the distribution of house prices to understand market trends and identify potential investment opportunities.
- Property size analysis: Explore the relationship between property square footage and prices to assess the value of different-sized houses.
- Location-based analysis: Investigate geographical patterns to identify areas with higher or lower property prices.
- Bedroom and bathroom trends: Analyze the impact of the number of bedrooms and bathrooms on house prices.
- Broker performance analysis: Evaluate the influence of different brokers on the pricing of houses.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Quarterly summary of median private rent in South Australia by: suburb, postcode, State Government regions and Local Government Areas. The information relates to bonds lodged with Consumer and Business Services for private rental properties in South Australia.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
This dataset contains various features of residential properties along with their corresponding prices. It is suitable for exploring and analyzing factors influencing housing prices and for building predictive models to estimate the price of a property based on its attributes.
| Feature | Description |
|---|---|
| price | The price of the property. |
| area | The total area of the property in square feet. |
| bedrooms | The number of bedrooms in the property. |
| bathrooms | The number of bathrooms in the property. |
| stories | The number of stories (floors) in the property. |
| mainroad | Indicates whether the property is located on a main road (binary: yes/no). |
| guestroom | Indicates whether the property has a guest room (binary: yes/no). |
| basement | Indicates whether the property has a basement (binary: yes/no). |
| hotwaterheating | Indicates whether the property has hot water heating (binary: yes/no). |
| airconditioning | Indicates whether the property has air conditioning (binary: yes/no). |
| parking | The number of parking spaces available with the property. |
| prefarea | Indicates whether the property is in a preferred area (binary: yes/no). |
| furnishingstatus | The furnishing status of the property (e.g., furnished, semi-furnished, unfurnished). |
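Most regression libraries need the yes/no and categorical columns in the table above converted to numbers first. The following is a minimal sketch of one way to do that with plain dictionaries. The sample row is invented, and the output column names such as `furnishingstatus_furnished` are a naming choice of this example, not part of the dataset.

```python
BINARY_COLUMNS = {
    "mainroad", "guestroom", "basement",
    "hotwaterheating", "airconditioning", "prefarea",
}
FURNISHING_LEVELS = ["furnished", "semi-furnished", "unfurnished"]


def encode_row(row):
    """Map yes/no to 1/0, one-hot encode furnishingstatus, cast the rest to float."""
    encoded = {}
    for name, value in row.items():
        if name in BINARY_COLUMNS:
            encoded[name] = 1 if value == "yes" else 0
        elif name == "furnishingstatus":
            for level in FURNISHING_LEVELS:
                encoded[f"furnishingstatus_{level}"] = 1 if value == level else 0
        else:
            encoded[name] = float(value)
    return encoded


# Invented example row, for illustration only.
sample_row = {
    "price": "4500000", "area": "6000", "bedrooms": "3", "bathrooms": "2",
    "stories": "2", "mainroad": "yes", "guestroom": "no", "basement": "no",
    "hotwaterheating": "no", "airconditioning": "yes", "parking": "1",
    "prefarea": "no", "furnishingstatus": "semi-furnished",
}
encoded_row = encode_row(sample_row)
print(encoded_row)
```

For a linear model with an intercept, one of the three furnishing indicators is usually dropped, since the three always sum to 1 and would otherwise be perfectly collinear with the intercept.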
License: This dataset is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The spectrum of housing options in India is incredibly diverse, spanning from the opulent palaces once inhabited by maharajas of yore, to the contemporary high-rise apartment complexes in bustling metropolitan areas, and even to the humble abodes in remote villages, consisting of modest huts. This wide-ranging tapestry of residential choices reflects the significant expansion witnessed in India's housing sector, which has paralleled the upward trajectory of income levels in the country. According to the findings of the Human Rights Measurement Initiative, India currently achieves 60.9% of what is theoretically attainable, considering its current income levels, in ensuring the fundamental right to housing for its citizens. In the realm of housing arrangements, renting, known interchangeably as hiring or letting, constitutes an agreement wherein compensation is provided for the temporary utilization of a resource, service, or property owned by another party. Within this arrangement, a gross lease is one where the tenant is obligated to pay a fixed rental amount, and the landlord assumes responsibility for covering all ongoing property-related expenses. The concept of renting also aligns with the principles of the sharing economy, as it fosters the utilization of assets and resources among individuals or entities, promoting efficiency and access to housing solutions for a broad spectrum of individuals.
Within this dataset, you will find a comprehensive collection of data on more than 4,700 available residential properties, encompassing houses, apartments, and flats offered for rent. This dataset is rich with various attributes, including the number of bedrooms (BHK), rental rates, property size, number of floors, area type, locality, city, furnishing status, tenant preferences, bathroom count, and contact information for the respective point of contact.
This dataset was created from https://www.magicbricks.com/. Visit the website to learn more.
Cover Photo by: Alexander Andrews on Unsplash
Redfin is a real estate brokerage and publishes US housing market data on a regular basis. Using this dataset, you can analyze and visualize housing market data for US cities.
Timeline: Starting from February 2012 until the present time (data is refreshed and updated on a monthly basis)
The dataset has the following columns:
- period_begin
- period_end
- period_duration
- region_type
- region_type_id
- table_id
- is_seasonally_adjusted (indicates if prices are seasonally adjusted; f represents False)
- region
- city
- state
- state_code
- property_type
- property_type_id
- median_sale_price
- median_sale_price_mom (median sale price changes month over month)
- median_sale_price_yoy (median sale price changes year over year)
- median_list_price
- median_list_price_mom (median list price changes month over month)
- median_list_price_yoy (median list price changes year over year)
- median_ppsf (median sale price per square foot)
- median_ppsf_mom (median sale price per square foot changes month over month)
- median_ppsf_yoy (median sale price per square foot changes year over year)
- median_list_ppsf (median list price per square foot)
- median_list_ppsf_mom (median list price per square foot changes month over month)
- median_list_ppsf_yoy (median list price per square foot changes year over year)
- homes_sold (number of homes sold)
- homes_sold_mom (number of homes sold month over month)
- homes_sold_yoy (number of homes sold year over year)
- pending_sales
- pending_sales_mom
- pending_sales_yoy
- new_listings
- new_listings_mom
- new_listings_yoy
- inventory
- inventory_mom
- inventory_yoy
- months_of_supply
- months_of_supply_mom
- months_of_supply_yoy
- median_dom (median days on market until property is sold)
- median_dom_mom (median days on market changes month over month)
- median_dom_yoy (median days on market changes year over year)
- avg_sale_to_list (average sale price to list price ratio)
- avg_sale_to_list_mom (average sale price to list price ratio changes month over month)
- avg_sale_to_list_yoy (average sale price to list price ratio changes year over year)
- sold_above_list
- sold_above_list_mom
- sold_above_list_yoy
- price_drops
- price_drops_mom
- price_drops_yoy
- off_market_in_two_weeks (number of properties that will be taken off the market within 2 weeks)
- off_market_in_two_weeks_mom (changes in number of properties that will be taken off the market within 2 weeks, month over month)
- off_market_in_two_weeks_yoy (changes in number of properties that will be taken off the market within 2 weeks, year over year)
- parent_metro_region
- parent_metro_region_metro_code
- last_updated
Filetype: gzip (gz). Support for gzip files in Python: https://docs.python.org/3/library/gzip.html
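A gzip-compressed table can be read with the standard library alone. The sketch below is illustrative: the tab delimiter and the file name `sample.tsv.gz` are assumptions, since the listing does not state how the file is delimited or named, and the one-row sample file is generated on the fly so the example runs without the real download.

```python
import csv
import gzip
import os
import tempfile


def read_gzipped_table(path, delimiter="\t"):
    """Read a gzip-compressed delimited text file into a list of dicts."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


# Stand-in data using two of the column names listed above (values invented).
sample = "region\tmedian_sale_price\nExample City\t350000\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.tsv.gz")  # hypothetical name
    with gzip.open(sample_path, mode="wt", encoding="utf-8", newline="") as handle:
        handle.write(sample)
    rows = read_gzipped_table(sample_path)

print(rows[0]["median_sale_price"])
```

`csv.DictReader` returns every value as a string, so numeric columns such as `median_sale_price` would still need converting before analysis.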
Data Source & Credit: Redfin.com
http://opendatacommons.org/licenses/dbcl/1.0/
The purpose of this dataset is to provide updated data on the Zillow Observed Rent Index (ZORI). Most of the Zillow datasets on Kaggle have not been updated in four years, and no other dataset except one contains information related to rent. Providing updated data on this will also allow the community to analyze the effects of COVID-19 on rent prices, which could not be done with previously available datasets.
Zillow Observed Rent Index (ZORI): A smoothed measure of the typical observed market rate rent across a given region. ZORI is a repeat-rent index that is weighted to the rental housing stock to ensure representativeness across the entire market, not just those homes currently listed for-rent. The index is dollar-denominated by computing the mean of listed rents that fall into the 40th to 60th percentile range for all homes and apartments in a given region, which is once again weighted to reflect the rental housing stock. Details available in ZORI methodology. https://www.zillow.com/research/methodology-zori-repeat-rent-27092/
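The "mean of listed rents that fall into the 40th to 60th percentile range" step can be illustrated in isolation. The sketch below is not Zillow's implementation: it omits the repeat-rent construction, the housing-stock weighting, and the smoothing, and the choice of the "inclusive" percentile method is an assumption. The rent values are invented.

```python
import statistics


def middle_band_mean(rents):
    """Mean of the rents lying between the 40th and 60th percentiles."""
    # n=5 yields cut points at the 20th, 40th, 60th, and 80th percentiles.
    cuts = statistics.quantiles(rents, n=5, method="inclusive")
    low, high = cuts[1], cuts[2]
    band = [rent for rent in rents if low <= rent <= high]
    return statistics.fmean(band)


# Invented listings: ten ordinary rents plus one extreme outlier.
rents = [1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 9000]

print("plain mean:      ", round(statistics.fmean(rents), 2))
print("middle-band mean:", middle_band_mean(rents))
```

Restricting the average to the middle band keeps a single extreme listing from pulling the figure upward, which is one reason a trimmed measure like this is more stable than a plain mean.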
This dataset contains two files. The Metro dataset looks at the median rent prices for large US cities. The ZIP code dataset breaks the US cities down by their ZIP codes. Note that the region IDs in both datasets are only used for tracking purposes. Also, some of the ZIP codes under the Region Name are shorter than the standard five digits and are unreliable, even if you pad them with zeros to account for possible formatting mistakes. It is recommended to remove these entries, since there is no way to identify which ZIP code the entry actually represents. These entries are left in here in case some analyst can solve the issue.
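The recommended cleanup, dropping entries whose ZIP code is not a full five digits, might look like the sketch below. The key name `RegionName`, the `rent` field, and the sample records are hypothetical; the real column headers should be checked against the file.

```python
def is_valid_zip(region_name):
    """True only for strings made of exactly five ASCII digits."""
    text = str(region_name).strip()
    return len(text) == 5 and text.isascii() and text.isdigit()


# Invented records, for illustration only.
records = [
    {"RegionName": "10001", "rent": 3200},
    {"RegionName": "501", "rent": 1500},    # too short: dropped
    {"RegionName": "2134", "rent": 2400},   # too short: dropped
    {"RegionName": "60614", "rent": 2100},
]

kept = [record for record in records if is_valid_zip(record["RegionName"])]
print([record["RegionName"] for record in kept])
```

The short codes are removed rather than zero-padded, following the recommendation above that padding does not reliably recover the original ZIP code.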
Zillow provides many useful open source datasets that relate to housing, which can be found at Zillow Research Data. https://www.zillow.com/research/data/ This dataset was also prompted by an older dataset I came across that only lacked updated data. https://www.kaggle.com/zillow/rent-index Thumbnail and banner picture is from this pixabay artist https://pixabay.com/users/pexels-2286921/
https://creativecommons.org/publicdomain/zero/1.0/
### The Dataset Description
Pune is the IT capital of India. Every software engineer in India wants to work in this city, so many apartments are rented out. I wanted to predict rent for both: 1. owners who want to rent out their home or apartment, and 2. customers who want to find a home to rent.
My aim is to predict the home rent price from the given data.
These are a few of the columns in my dataset:
I wanted answers to the following questions: 1. What is a proper rent price? 2. Which area has the maximum influence on the data?