100+ datasets found
  1. Melbourne Housing Dataset

    • kaggle.com
    Updated Feb 4, 2023
    Cite
    Ronik Malhotra (2023). Melbourne Housing Dataset [Dataset]. https://www.kaggle.com/datasets/ronikmalhotra/melbourne-housing-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 4, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ronik Malhotra
    Area covered
    Melbourne
    Description

    For a data scientist who wants to experiment with, learn, and explore the techniques of the field, applying Exploratory Data Analysis to a variety of datasets is essential.

    This housing dataset provides a thorough analysis of the current state of the housing market. It includes information on housing prices, availability, and key trends, allowing you to gain a better understanding of the market and make informed decisions. Whether you're a homebuyer, investor, or simply interested in the state of the housing market, this dataset has valuable insights to offer.

  2. Housing Database

    • data.cityofnewyork.us
    • catalog.data.gov
    application/rdfxml +5
    Updated Mar 19, 2021
    + more versions
    Cite
    Department of City Planning (DCP) (2021). Housing Database [Dataset]. https://data.cityofnewyork.us/Housing-Development/Housing-Database/6umk-irkx
    Explore at:
    Available download formats: application/rssxml, application/rdfxml, tsv, csv, xml, json
    Dataset updated
    Mar 19, 2021
    Dataset authored and provided by
    Department of City Planning (DCP)
    Description
    The NYC Department of City Planning’s (DCP) Housing Database contains all NYC Department of Buildings (DOB) approved housing construction and demolition jobs filed or completed in NYC since January 1, 2010. It includes the three primary construction job types that add or remove residential units: new buildings, major alterations, and demolitions, and can be used to determine the change in legal housing units across time and space. Records in the Housing Database Project-Level Files are geocoded to the greatest level of precision possible, subject to numerous quality assurance and control checks, recoded for usability, and joined to other housing data sources relevant to city planners and analysts.

    Data are updated semiannually, at the end of the second and fourth quarters of each year.

    Please see DCP’s annual Housing Production Snapshot summarizing findings from the 21Q4 data release here. Additional Housing and Economic analyses are also available.

    The NYC Department of City Planning’s (DCP) Housing Database Unit Change Summary Files provide the net change in Class A housing units since 2010, and the count of units pending completion, for commonly used political and statistical boundaries (Census Block, Census Tract, City Council district, Community District, Community District Tabulation Area (CDTA), Neighborhood Tabulation Area (NTA)). These tables are aggregated from the DCP Housing Database Project-Level Files, which are derived from Department of Buildings (DOB) approved housing construction and demolition jobs filed or completed in NYC since January 1, 2010. Net housing unit change is calculated as the sum of all three construction job types that add or remove residential units: new buildings, major alterations, and demolitions. These files can be used to determine the change in legal housing units across time and space.
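
    The aggregation described above can be sketched in a few lines of Python. This is only an illustration: the boundary ids, job-type labels, and unit counts below are hypothetical and are not taken from the DCP files, whose actual column names may differ.

```python
from collections import defaultdict

# Hypothetical project-level records: (boundary id, job type, signed change in units).
# Names and values are illustrative assumptions, not real DCP data.
jobs = [
    ("CD-101", "New Building", 24),
    ("CD-101", "Alteration", -2),
    ("CD-101", "Demolition", -6),
    ("CD-102", "New Building", 10),
    ("CD-102", "Demolition", -10),
]

def net_unit_change(records):
    """Sum signed unit changes across all job types for each boundary."""
    totals = defaultdict(int)
    for boundary, _job_type, units in records:
        totals[boundary] += units
    return dict(totals)

summary = net_unit_change(jobs)
print(summary)
```

    Storing demolitions and unit-reducing alterations as negative numbers lets a single sum give the net change per boundary.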

  3. Public Housing Agency

    • catalog.data.gov
    Updated Mar 1, 2024
    + more versions
    Cite
    U.S. Department of Housing and Urban Development (2024). Public Housing Agency [Dataset]. https://catalog.data.gov/dataset/public-housing-agency-pha-inventory
    Explore at:
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    United States Department of Housing and Urban Development (http://www.hud.gov/)
    Description

    The dataset contains current data on low rent and Section 8 units in PHAs administered by HUD. The Section 8 Rental Voucher Program increases affordable housing choices for very low-income households by allowing families to choose privately owned rental housing. Through the Section 8 Rental Voucher Program, the administering housing authority issues a voucher to an income-qualified household, which then finds a unit to rent. If the unit meets the Section 8 quality standards, the PHA then pays the landlord the amount equal to the difference between 30 percent of the tenant's adjusted income (or 10 percent of the gross income or the portion of welfare assistance designated for housing) and the PHA-determined payment standard for the area. The rent must be reasonable compared with similar unassisted units.
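
    The payment rule above can be written out as a small calculation. The sketch below rests on two assumptions that the description does not state: incomes are monthly figures, and the tenant's share is the largest of the three alternatives listed. The function and parameter names are hypothetical.

```python
def housing_assistance_payment(adjusted_monthly_income, gross_monthly_income,
                               payment_standard, welfare_rent=0.0):
    """Return (tenant_share, pha_payment) under the simplified rule described above."""
    # Assumption: the tenant share is the largest of the three alternatives.
    tenant_share = max(0.30 * adjusted_monthly_income,
                       0.10 * gross_monthly_income,
                       welfare_rent)
    # The PHA covers the gap up to the payment standard, never a negative amount.
    pha_payment = max(0.0, payment_standard - tenant_share)
    return tenant_share, pha_payment

tenant_share, pha_payment = housing_assistance_payment(
    adjusted_monthly_income=1500.0,
    gross_monthly_income=1800.0,
    payment_standard=1200.0,
)
print(tenant_share, pha_payment)
```

    With these illustrative inputs the tenant share comes from the 30 percent rule, and the PHA payment is the remainder of the payment standard.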

  4. All-Transactions House Price Index for the United States

    • fred.stlouisfed.org
    json
    Updated Aug 26, 2025
    + more versions
    Cite
    (2025). All-Transactions House Price Index for the United States [Dataset]. https://fred.stlouisfed.org/series/USSTHPI
    Explore at:
    Available download formats: json
    Dataset updated
    Aug 26, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-public-domain

    Area covered
    United States
    Description

    Graph and download economic data for All-Transactions House Price Index for the United States (USSTHPI) from Q1 1975 to Q2 2025 about appraisers, HPI, housing, price index, indexes, price, and USA.

  5. Housing Inventory: Median Days on Market in the United States

    • fred.stlouisfed.org
    json
    Updated Oct 2, 2025
    + more versions
    Cite
    (2025). Housing Inventory: Median Days on Market in the United States [Dataset]. https://fred.stlouisfed.org/series/MEDDAYONMARUS
    Explore at:
    Available download formats: json
    Dataset updated
    Oct 2, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-citation-required

    Area covered
    United States
    Description

    Graph and download economic data for Housing Inventory: Median Days on Market in the United States (MEDDAYONMARUS) from Jul 2016 to Sep 2025 about median and USA.

  6. Real Estate Price Prediction Data

    • figshare.com
    txt
    Updated Aug 8, 2024
    Cite
    Mohammad Shbool; Rand Al-Dmour; Bashar Al-Shboul; Nibal Albashabsheh; Najat Almasarwah (2024). Real Estate Price Prediction Data [Dataset]. http://doi.org/10.6084/m9.figshare.26517325.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Aug 8, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mohammad Shbool; Rand Al-Dmour; Bashar Al-Shboul; Nibal Albashabsheh; Najat Almasarwah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview: This dataset was collected and curated to support research on predicting real estate prices using machine learning algorithms, specifically Support Vector Regression (SVR) and Gradient Boosting Machine (GBM). The dataset includes comprehensive information on residential properties, enabling the development and evaluation of predictive models for accurate and transparent real estate appraisals.

    Data Source: The data was sourced from Department of Lands and Survey real estate listings.

    Features: The dataset contains the following key attributes for each property:

    • Area (in square meters): The total living area of the property.
    • Floor Number: The floor on which the property is located.
    • Location: Geographic coordinates or city/region where the property is situated.
    • Type of Apartment: The classification of the property, such as studio, one-bedroom, two-bedroom, etc.
    • Number of Bathrooms: The total number of bathrooms in the property.
    • Number of Bedrooms: The total number of bedrooms in the property.
    • Property Age (in years): The number of years since the property was constructed.
    • Property Condition: A categorical variable indicating the condition of the property (e.g., new, good, fair, needs renovation).
    • Proximity to Amenities: The distance to nearby amenities such as schools, hospitals, shopping centers, and public transportation.
    • Market Price (target variable): The actual sale price or listed price of the property.

    Data Preprocessing:

    • Normalization: Numeric features such as area and proximity to amenities were normalized to ensure consistency and improve model performance.
    • Categorical Encoding: Categorical features like property condition and type of apartment were encoded using one-hot encoding or label encoding, depending on the specific model requirements.
    • Missing Values: Missing data points were handled using appropriate imputation techniques or by excluding records with significant missing information.

    Usage: This dataset was utilized to train and test machine learning models, aiming to predict the market price of residential properties based on the provided attributes. The models developed using this dataset demonstrated improved accuracy and transparency over traditional appraisal methods.

    Dataset Availability: The dataset is available for public use under the CC BY 4.0 license. Users are encouraged to cite the related publication when using the data in their research or applications.

    Citation: If you use this dataset in your research, please cite the following publication: "Real Estate Decision-Making: Precision in Price Prediction through Advanced Machine Learning Algorithms".
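
    Two of the preprocessing steps named above, normalization and one-hot encoding, can be illustrated with a minimal Python sketch. The description does not say which normalization the authors used, so min-max scaling is an assumption here, and the sample values and function names are hypothetical.

```python
def min_max_normalize(values):
    """Scale numbers linearly to the range [0, 1] (assumed normalization scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot_encode(labels):
    """Map each label to a 0/1 vector with one position per distinct category."""
    categories = sorted(set(labels))
    vectors = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, vectors

# Illustrative values only, not rows from the dataset.
areas = [80.0, 120.0, 160.0]
conditions = ["good", "new", "good"]

scaled_areas = min_max_normalize(areas)
categories, encoded = one_hot_encode(conditions)
print(scaled_areas, categories, encoded)
```

    In practice the scaling bounds and the category list would be learned from the training split only and then reused on the test split.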

  7. United States House Price Index YoY

    • tradingeconomics.com
    • fa.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Mar 15, 2025
    Cite
    TRADING ECONOMICS (2025). United States House Price Index YoY [Dataset]. https://tradingeconomics.com/united-states/house-price-index-yoy
    Explore at:
    Available download formats: json, excel, xml, csv
    Dataset updated
    Mar 15, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1992 - Jul 31, 2025
    Area covered
    United States
    Description

    House Price Index YoY in the United States decreased to 2.30 percent in July from 2.70 percent in June of 2025. This dataset includes a chart with historical data for the United States FHFA House Price Index YoY.

  8. Housing Price Dataset of Delhi(India)

    • kaggle.com
    Updated Nov 23, 2021
    Cite
    Yash Goel (2021). Housing Price Dataset of Delhi(India) [Dataset]. https://www.kaggle.com/datasets/goelyash/housing-price-dataset-of-delhiindia
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 23, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yash Goel
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    India, Delhi
    Description

    Context

    This dataset was collected for a college project: an Android app that calculates house prices.

    Content

    The data was scraped from the Magicbricks website between June 2021 and July 2021.

    Acknowledgements

    magicbricks.com

    Inspiration

    With the help of the data available one can make a regression model to predict house prices.

  9. United States Housing Starts

    • tradingeconomics.com
    • zh.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Sep 17, 2025
    Cite
    TRADING ECONOMICS (2025). United States Housing Starts [Dataset]. https://tradingeconomics.com/united-states/housing-starts
    Explore at:
    Available download formats: json, excel, csv, xml
    Dataset updated
    Sep 17, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1959 - Aug 31, 2025
    Area covered
    United States
    Description

    Housing Starts in the United States decreased to 1307 Thousand units in August from 1429 Thousand units in July of 2025. This dataset provides the latest reported value for - United States Housing Starts - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.

  10. Housing Market Value Analysis 2021

    • data.wprdc.org
    • gimi9.com
    • +1 more
    geojson, html, pdf +2
    Updated Jul 8, 2025
    Cite
    Allegheny County (2025). Housing Market Value Analysis 2021 [Dataset]. https://data.wprdc.org/dataset/market-value-analysis-2021
    Explore at:
    Available download formats: html, geojson(10301172), zip(2039140), pdf(881980), xlsx(22669), pdf(28782887), zip(1996574)
    Dataset updated
    Jul 8, 2025
    Dataset authored and provided by
    Allegheny County
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    In 2021, Allegheny County Economic Development (ACED), in partnership with the Urban Redevelopment Authority of Pittsburgh (URA), completed a Market Value Analysis (MVA) for Allegheny County. This analysis serves as an update to previous MVAs commissioned separately by ACED and the URA, and combines them into a single MVA for the whole of Allegheny County (inclusive of the City of Pittsburgh). The MVA is a unique tool for characterizing markets because it creates an internally referenced index of a municipality’s residential real estate market. It identifies areas that are the highest demand markets as well as areas of greatest distress, and the various market types in between. The MVA offers insight into the variation in market strength and weakness within and between traditional community boundaries because it uses Census block groups as the unit of analysis. Where market types abut each other on the map becomes instructive about the potential direction of market change, and ultimately, the appropriateness of types of investment or intervention strategies.

    This MVA utilized data that helps to define the local real estate market. The data used covers the 2017-2019 period, and data used in the analysis includes:

    • Residential Real Estate Sales
    • Mortgage Foreclosures
    • Residential Vacancy
    • Parcel Year Built
    • Parcel Condition
    • Building Violations
    • Owner Occupancy
    • Subsidized Housing Units

    The MVA uses a statistical technique known as cluster analysis, forming groups of areas (i.e., block groups) that are similar along the MVA descriptors noted above. The goal is to form groups whose members share similar characteristics, while each group differs from the others. Using this technique, the MVA condenses vast amounts of data for the universe of all properties to a manageable, meaningful typology of market types that can inform area-appropriate programs and decisions regarding the allocation of resources.

    Please refer to the presentation and executive summary for more information about the data, methodology, and findings.

  11. Social Housing Asset Data

    • data.europa.eu
    • data.wu.ac.at
    csv
    Updated Oct 11, 2021
    + more versions
    Cite
    Salford City Council (2021). Social Housing Asset Data [Dataset]. https://data.europa.eu/88u/dataset/social-housing-asset-data
    Explore at:
    Available download formats: csv
    Dataset updated
    Oct 11, 2021
    Dataset authored and provided by
    Salford City Council
    License

    http://reference.data.gov.uk/id/open-government-licence

    Description

    This dataset provides information on Social Housing Asset Data at Salford City Council. Details are provided to meet the required standards of the Local Government Transparency Code 2014.

  12. Real Residential Property Prices for United States

    • fred.stlouisfed.org
    json
    Updated Sep 25, 2025
    + more versions
    Cite
    (2025). Real Residential Property Prices for United States [Dataset]. https://fred.stlouisfed.org/series/QUSR628BIS
    Explore at:
    Available download formats: json
    Dataset updated
    Sep 25, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-citation-required

    Area covered
    United States
    Description

    Graph and download economic data for Real Residential Property Prices for United States (QUSR628BIS) from Q1 1970 to Q2 2025 about residential, HPI, housing, real, price index, indexes, price, and USA.

  13. Data from: Comprehensive Housing Affordability Strategy (CHAS)

    • catalog.data.gov
    Updated Mar 1, 2024
    + more versions
    Cite
    U.S. Department of Housing and Urban Development (2024). Comprehensive Housing Affordability Strategy (CHAS) [Dataset]. https://catalog.data.gov/dataset/comprehensive-housing-affordability-strategy-chas-2008-2010
    Explore at:
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    United States Department of Housing and Urban Development (http://www.hud.gov/)
    Description

    The U.S. Department of Housing and Urban Development (HUD) periodically receives custom tabulations of data from the U.S. Census Bureau that are largely not available through standard Census products. These data, known as the CHAS data (Comprehensive Housing Affordability Strategy), demonstrate the extent of housing problems and housing needs, particularly for low income households. The CHAS data are used by local governments to plan how to spend HUD funds, and may also be used by HUD to distribute grant funds.

  14. Phoenix, AZ Housing Data

    • phoenixopendata.com
    csv
    Updated Mar 24, 2023
    Cite
    External Data (2023). Phoenix, AZ Housing Data [Dataset]. https://www.phoenixopendata.com/dataset/phoenix-az-housing-data
    Explore at:
    Available download formats: csv(1797), csv(1391), csv(595), csv(581)
    Dataset updated
    Mar 24, 2023
    Dataset authored and provided by
    External Data
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Area covered
    Phoenix, Arizona
    Description

    Phoenix housing data from the American Community Survey (ACS) 1-year estimates

  15. Housing Affordability Data System (HADS), 2004

    • icpsr.umich.edu
    • search.datacite.org
    ascii, delimited, sas +2
    Updated Oct 29, 2009
    + more versions
    Cite
    Vandenbroucke, David A. (2009). Housing Affordability Data System (HADS), 2004 [Dataset]. http://doi.org/10.3886/ICPSR25204.v1
    Explore at:
    Available download formats: spss, delimited, ascii, sas, stata
    Dataset updated
    Oct 29, 2009
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Vandenbroucke, David A.
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/25204/terms

    Time period covered
    2004
    Area covered
    Washington, Connecticut, Hartford, Oklahoma, Pittsburgh, Ohio, United States, Cleveland, Missouri, Pennsylvania
    Description

    The Housing Affordability Data System (HADS) is a set of housing unit level datasets that measures the affordability of housing units and the housing cost burdens of households, relative to area median incomes, poverty level incomes, and Fair Market Rents. The purpose of these datasets is to provide housing analysts with consistent measures of affordability and burdens over a long period. The datasets are based on the American Housing Survey (AHS) national files from 1985 through 2005 and the metropolitan files for 2002 and 2004. Users can link records in HADS files to AHS records, allowing access to all of the AHS variables. Housing-level variables include information on the number of rooms in the housing unit, the year the unit was built, whether it was occupied or vacant, whether the unit was rented or owned, whether it was a single family or multiunit structure, the number of units in the building, the current market value of the unit, and measures of relative housing costs. The dataset also includes variables describing the number of people living in the household, household income, and the type of residential area (e.g., urban or suburban).

  16. Comprehensive Affordable Housing Directory

    • data.austintexas.gov
    • catalog.data.gov
    Updated Apr 8, 2025
    Cite
    City of Austin, Texas - data.austintexas.gov (2025). Comprehensive Affordable Housing Directory [Dataset]. https://data.austintexas.gov/widgets/4syj-z4ky?mobile_redirect=true
    Explore at:
    Available download formats: kmz, xml, csv, application/rdfxml, application/rssxml, kml, application/geo+json, tsv
    Dataset updated
    Apr 8, 2025
    Dataset authored and provided by
    City of Austin, Texas - data.austintexas.gov
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    NOTICE: This dataset is incomplete and in the process of being updated. Please contact david.cruz@austintexas.gov with any questions.

    This dataset contains all income-restricted housing within the Austin Full Purpose and into the 5-mile Extra Territorial Jurisdiction. This includes properties funded by the City of Austin along with the Housing Authority City of Austin, Housing Authority of Travis County, and Texas Department of Housing and Community Affairs. Some properties may be funded by more than one entity. The property attributes are intended to help Austin residents find income-restricted housing that best suits their needs.

    The dataset is connected to the affordable housing data hub which is consistently updated with the most current property information. A Feature Manipulation Engine Script pulls a new dataset to the Open Data Portal on a daily basis.

  17. New housing price index, monthly

    • www150.statcan.gc.ca
    • open.canada.ca
    • +1 more
    Updated Sep 23, 2025
    + more versions
    Cite
    Government of Canada, Statistics Canada (2025). New housing price index, monthly [Dataset]. http://doi.org/10.25318/1810020501-eng
    Explore at:
    Dataset updated
    Sep 23, 2025
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    New housing price index (NHPI). Monthly data are available from January 1981. The table presents data for the most recent reference period and the last four periods. The base period for the index is (201612=100).

  18. Housing Inventory: Price Reduced Count in the United States

    • fred.stlouisfed.org
    json
    Updated Oct 2, 2025
    + more versions
    Cite
    (2025). Housing Inventory: Price Reduced Count in the United States [Dataset]. https://fred.stlouisfed.org/series/PRIREDCOUUS
    Explore at:
    Available download formats: json
    Dataset updated
    Oct 2, 2025
    License

    https://fred.stlouisfed.org/legal/#copyright-citation-required

    Area covered
    United States
    Description

    Graph and download economic data for Housing Inventory: Price Reduced Count in the United States (PRIREDCOUUS) from Jul 2016 to Sep 2025 about reduced count, price, and USA.

  19. Housing Data Finder

    • ons.gov.uk
    • cy.ons.gov.uk
    xls
    Updated Aug 5, 2015
    Cite
    Office for National Statistics (2015). Housing Data Finder [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/housing/datasets/housingdatafinder
    Explore at:
    Available download formats: xls
    Dataset updated
    Aug 5, 2015
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    This tool is a searchable data catalogue containing links to a range of official statistics on housing. It forms a part of the ONS Housing Statistics Portal.

  20. California Housing Data (1990)

    • kaggle.com
    Updated May 10, 2018
    Cite
    Harry Wang (2018). California Housing Data (1990) [Dataset]. https://www.kaggle.com/harrywang/housing/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 10, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Harry Wang
    Area covered
    California
    Description

    Source

    This is the dataset used in this book: https://github.com/ageron/handson-ml/tree/master/datasets/housing to illustrate a sample end-to-end ML project workflow (pipeline). This is a great book - I highly recommend!

    The data is based on California Census in 1990.

    About the Data (from the book):

    "This dataset is a modified version of the California Housing dataset available from Luís Torgo's page (University of Porto). Luís Torgo obtained it from the StatLib repository (which is closed now). The dataset may also be downloaded from StatLib mirrors.

    The following is the description from the book author:

    This dataset appeared in a 1997 paper titled Sparse Spatial Autoregressions by Pace, R. Kelley and Ronald Barry, published in the Statistics and Probability Letters journal. They built it using the 1990 California census data. It contains one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people).

    The dataset in this directory is almost identical to the original, with two differences: 207 values were randomly removed from the total_bedrooms column, so we can discuss what to do with missing data. An additional categorical attribute called ocean_proximity was added, indicating (very roughly) whether each block group is near the ocean, near the Bay area, inland or on an island. This allows discussing what to do with categorical data. Note that the block groups are called "districts" in the Jupyter notebooks, simply because in some contexts the name "block group" was confusing."

    About the Data (From Luís Torgo page):

    http://www.dcc.fc.up.pt/%7Eltorgo/Regression/cal_housing.html

    This is a dataset obtained from the StatLib repository. Here is the included description:

    "We collected information on the variables using all the block groups in California from the 1990 Census. In this sample a block group on average includes 1425.5 individuals living in a geographically compact area. Naturally, the geographical area included varies inversely with the population density. We computed distances among the centroids of each block group as measured in latitude and longitude. We excluded all the block groups reporting zero entries for the independent and dependent variables. The final data contained 20,640 observations on 9 variables. The dependent variable is ln(median house value)."

    End-to-End ML Project Steps (Chapter 2 of the book)

    1. Look at the big picture
    2. Get the data
    3. Discover and visualize the data to gain insights
    4. Prepare the data for Machine Learning algorithms
    5. Select a model and train it
    6. Fine-tune your model
    7. Present your solution
    8. Launch, monitor, and maintain your system

    The 10-Step Machine Learning Project Workflow (My Version)

    1. Define business objective
    2. Make sense of the data from a high level
      • data types (number, text, object, etc.)
      • continuous/discrete
      • basic stats (min, max, std, median, etc.) using boxplot
      • frequency via histogram
      • scales and distributions of different features
    3. Create the training and test sets using proper sampling methods, e.g., random vs. stratified
    4. Correlation analysis (pair-wise and attribute combinations)
    5. Data cleaning (missing data, outliers, data errors)
    6. Data transformation via pipelines (categorical text to number using one hot encoding, feature scaling via normalization/standardization, feature combinations)
    7. Train and cross validate different models and select the most promising one (Linear Regression, Decision Tree, and Random Forest were tried in this tutorial)
    8. Fine-tune the model by trying different combinations of hyperparameters
    9. Evaluate the model with best estimators in the test set
    10. Launch, monitor, and refresh the model and system
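
    Step 3 of the workflow above contrasts random and stratified sampling. A minimal sketch of a stratified train/test split in plain Python follows; the `income_cat` field, the 80/20 class mix, and the function name are hypothetical choices for illustration and are not taken from the dataset or the book.

```python
import random
from collections import defaultdict

def stratified_split(rows, key, test_fraction=0.2, seed=42):
    """Split rows so each stratum contributes about test_fraction of its rows to the test set."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)
    train, test = [], []
    for group in strata.values():
        shuffled = group[:]          # copy so the caller's data is not reordered
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend(shuffled[:n_test])
        train.extend(shuffled[n_test:])
    return train, test

# Hypothetical rows: 80 "low" and 20 "high" income-category districts.
rows = [{"id": i, "income_cat": "low" if i < 80 else "high"} for i in range(100)]
train, test = stratified_split(rows, key=lambda r: r["income_cat"])
print(len(train), len(test))
```

    Unlike a purely random split, this keeps the share of each category in the test set close to its share in the full data, which matters when a category is rare.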

Melbourne Housing Dataset

Discover Insights and Trends from Housing Market

Explore at:
392 scholarly articles cite this dataset (View in Google Scholar)