The NYC Department of City Planning’s (DCP) Housing Database contains all NYC Department of Buildings (DOB) approved housing construction and demolition jobs filed or completed in NYC since January 1, 2010. It includes the three primary construction job types that add or remove residential units (new buildings, major alterations, and demolitions) and can be used to determine the change in legal housing units across time and space. Records in the Housing Database Project-Level Files are geocoded to the greatest level of precision possible, subjected to numerous quality assurance and control checks, recoded for usability, and joined to other housing data sources relevant to city planners and analysts. Data are updated semiannually, at the end of the second and fourth quarters of each year. DCP’s annual Housing Production Snapshot summarizes findings from the 21Q4 data release. Additional Housing and Economic analyses are also available.
The DCP Housing Database Unit Change Summary Files provide the net change in Class A housing units since 2010 and the count of units pending completion for commonly used political and statistical boundaries: Census Block, Census Tract, City Council District, Community District, Community District Tabulation Area (CDTA), and Neighborhood Tabulation Area (NTA). These tables are aggregated from the DCP Housing Database Project-Level Files, which are derived from DOB-approved housing construction and demolition jobs filed or completed in NYC since January 1, 2010. Net housing unit change is calculated as the sum across all three construction job types that add or remove residential units: new buildings, major alterations, and demolitions. These files can be used to determine the change in legal housing units across time and space.
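As a rough illustration of the aggregation described above, the sketch below sums net unit change per boundary across the three job types. The record layout, the field names (`job_type`, `boundary`, `units_net`), and all values are hypothetical; they are not the actual column names or contents of the Housing Database files.

```python
from collections import defaultdict

# Hypothetical project-level records; field names and values are illustrative only.
jobs = [
    {"job_type": "New Building", "boundary": "CD-101", "units_net": 40},
    {"job_type": "Alteration", "boundary": "CD-101", "units_net": -2},
    {"job_type": "Demolition", "boundary": "CD-101", "units_net": -6},
    {"job_type": "New Building", "boundary": "CD-102", "units_net": 12},
]


def net_unit_change(records):
    """Sum net unit change per boundary across all job types."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec["boundary"]] += rec["units_net"]
    return dict(totals)


summary = net_unit_change(jobs)
print(summary)
```

Demolitions and unit-reducing alterations carry negative values in this sketch, so a plain sum yields the net change for each boundary.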
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Housing Affordability Data System (HADS) is a set of files derived from the 1985 and later national American Housing Survey (AHS) and the 2002 and later Metro AHS. This system categorizes housing units by affordability and households by income, with respect to the Adjusted Median Income, Fair Market Rent (FMR), and poverty income. It also includes housing cost burden for owner and renter households. These files have been the basis for the worst case needs tables since 2001. The data files are available for public use, since they were derived from AHS public use files and the published income limits and FMRs. We are providing these files to give the community of housing analysts the opportunity to use a consistent set of affordability measures. This data set appears not to have been updated after 2013.
New housing price index (NHPI). Monthly data are available from January 1981. The table presents data for the most recent reference period and the last four periods. The base period for the index is (201612=100).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview: This dataset was collected and curated to support research on predicting real estate prices using machine learning algorithms, specifically Support Vector Regression (SVR) and Gradient Boosting Machine (GBM). The dataset includes comprehensive information on residential properties, enabling the development and evaluation of predictive models for accurate and transparent real estate appraisals.
Data Source: The data was sourced from Department of Lands and Survey real estate listings.
Features: The dataset contains the following key attributes for each property:
Area (in square meters): The total living area of the property.
Floor Number: The floor on which the property is located.
Location: Geographic coordinates or city/region where the property is situated.
Type of Apartment: The classification of the property, such as studio, one-bedroom, two-bedroom, etc.
Number of Bathrooms: The total number of bathrooms in the property.
Number of Bedrooms: The total number of bedrooms in the property.
Property Age (in years): The number of years since the property was constructed.
Property Condition: A categorical variable indicating the condition of the property (e.g., new, good, fair, needs renovation).
Proximity to Amenities: The distance to nearby amenities such as schools, hospitals, shopping centers, and public transportation.
Market Price (target variable): The actual sale price or listed price of the property.
Data Preprocessing:
Normalization: Numeric features such as area and proximity to amenities were normalized to ensure consistency and improve model performance.
Categorical Encoding: Categorical features like property condition and type of apartment were encoded using one-hot encoding or label encoding, depending on the specific model requirements.
Missing Values: Missing data points were handled using appropriate imputation techniques or by excluding records with significant missing information.
Usage: This dataset was used to train and test machine learning models that predict the market price of residential properties from the provided attributes. The models developed using this dataset demonstrated improved accuracy and transparency over traditional appraisal methods.
Dataset Availability: The dataset is available for public use under the CC BY 4.0 license. Users are encouraged to cite the related publication when using the data in their research or applications.
Citation: If you use this dataset in your research, please cite the following publication: Real Estate Decision-Making: Precision in Price Prediction through Advanced Machine Learning Algorithms.
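The preprocessing steps named above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the description does not say which normalization was used, so min-max scaling is an assumption, and the sample values are made up.

```python
def min_max_normalize(values):
    """Rescale numeric values to the [0, 1] range (assumed normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Return the sorted category list and one 0/1 vector per label."""
    categories = sorted(set(labels))
    index = {c: i for i, c in enumerate(categories)}
    vectors = []
    for label in labels:
        vec = [0] * len(categories)
        vec[index[label]] = 1
        vectors.append(vec)
    return categories, vectors


# Hypothetical sample values for two of the listed attributes.
areas = [50.0, 75.0, 100.0]
conditions = ["new", "good", "new"]

scaled = min_max_normalize(areas)
cats, encoded = one_hot(conditions)
print(scaled, cats, encoded)
```

One-hot encoding avoids implying an order among categories, whereas label encoding (one integer per category) is more compact and is often acceptable for tree-based models such as GBM.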
https://brightdata.com/license
Enrich your real estate strategies and market insights with our comprehensive New York Housing dataset. Analyzing this dataset can aid in understanding housing market dynamics and trends, empowering organizations to refine their investment strategies and business decisions. Access the entire dataset or tailor a subset to fit your requirements.
Popular use cases include optimizing investment strategies based on neighborhood engagement and property popularity, performing detailed user behavior analysis and segmentation by housing type, price range, and location to tailor marketing and engagement efforts, and identifying and forecasting emerging trends in the New York housing market to stay ahead in the competitive real estate industry.
The dataset contains current data on low rent and Section 8 units in public housing agencies (PHAs) administered by HUD. The Section 8 Rental Voucher Program increases affordable housing choices for very low-income households by allowing families to choose privately owned rental housing. Through the program, the administering housing authority issues a voucher to an income-qualified household, which then finds a unit to rent. If the unit meets the Section 8 quality standards, the PHA pays the landlord the difference between the PHA-determined payment standard for the area and the tenant's share, which is 30 percent of the tenant's adjusted income (or 10 percent of the gross income, or the portion of welfare assistance designated for housing). The rent must be reasonable compared with similar unassisted units.
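The payment arithmetic described above can be sketched as follows. The function names and the dollar amounts are hypothetical, and the sketch assumes that the tenant's share is the largest of the three amounts listed in the parenthetical, with all figures expressed per month; the description itself does not spell out how the three alternatives are combined.

```python
def tenant_share(adjusted_income, gross_income, welfare_housing_portion=0.0):
    """Tenant's monthly share, assumed to be the largest of the three amounts."""
    return max(
        0.30 * adjusted_income,
        0.10 * gross_income,
        welfare_housing_portion,
    )


def pha_payment(payment_standard, adjusted_income, gross_income,
                welfare_housing_portion=0.0):
    """Amount paid to the landlord: payment standard minus the tenant's share."""
    share = tenant_share(adjusted_income, gross_income, welfare_housing_portion)
    # The subsidy cannot be negative.
    return max(payment_standard - share, 0.0)


# Hypothetical monthly figures, in dollars.
payment = pha_payment(
    payment_standard=1200.0,
    adjusted_income=1500.0,
    gross_income=1800.0,
)
print(round(payment, 2))
```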
https://fred.stlouisfed.org/legal/#copyright-citation-required
Graph and download economic data for Housing Inventory: Median Days on Market in the United States (MEDDAYONMARUS) from Jul 2016 to Jun 2025 about median and USA.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for All-Transactions House Price Index for the United States (USSTHPI) from Q1 1975 to Q1 2025 about appraisers, HPI, housing, price index, indexes, price, and USA.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Housing Starts in the United States increased to 1,321 thousand units in June from 1,263 thousand units in May of 2025. This dataset provides the latest reported value for United States Housing Starts, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Indicators used in community profiles
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
House Price Index YoY in the United States decreased to 2.80 percent in May from 3.20 percent in April of 2025. This dataset includes a chart with historical data for the United States FHFA House Price Index YoY.
UCLA Housing dataset
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘California Housing Data (1990)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harrywang/housing on 12 November 2021.
--- Dataset description provided by original source is as follows ---
This is the dataset used in this book: https://github.com/ageron/handson-ml/tree/master/datasets/housing to illustrate a sample end-to-end ML project workflow (pipeline). This is a great book - I highly recommend!
The data is based on California Census in 1990.
"This dataset is a modified version of the California Housing dataset available from Luís Torgo's page (University of Porto). Luís Torgo obtained it from the StatLib repository (which is closed now). The dataset may also be downloaded from StatLib mirrors.
The following is the description from the book author:
This dataset appeared in a 1997 paper titled Sparse Spatial Autoregressions by Pace, R. Kelley and Ronald Barry, published in the Statistics and Probability Letters journal. They built it using the 1990 California census data. It contains one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people).
The dataset in this directory is almost identical to the original, with two differences: 207 values were randomly removed from the total_bedrooms column, so we can discuss what to do with missing data. An additional categorical attribute called ocean_proximity was added, indicating (very roughly) whether each block group is near the ocean, near the Bay area, inland or on an island. This allows discussing what to do with categorical data. Note that the block groups are called "districts" in the Jupyter notebooks, simply because in some contexts the name "block group" was confusing."
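One common way to handle the missing `total_bedrooms` values mentioned above is median imputation. The sketch below is only one option among several (dropping rows or dropping the column are others), and the sample values are hypothetical rather than taken from the dataset.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column with two missing entries.
total_bedrooms = [129.0, None, 190.0, 235.0, None, 280.0]
filled = impute_median(total_bedrooms)
print(filled)
```

The median is less sensitive to extreme block groups than the mean, which is why it is a frequent default for skewed count data.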
http://www.dcc.fc.up.pt/%7Eltorgo/Regression/cal_housing.html
This is a dataset obtained from the StatLib repository. Here is the included description:
"We collected information on the variables using all the block groups in California from the 1990 Census. In this sample a block group on average includes 1425.5 individuals living in a geographically compact area. Naturally, the geographical area included varies inversely with the population density. We computed distances among the centroids of each block group as measured in latitude and longitude. We excluded all the block groups reporting zero entries for the independent and dependent variables. The final data contained 20,640 observations on 9 variables. The dependent variable is ln(median house value)."
--- Original source retains full ownership of the source dataset ---
Analysis of ‘Boston-Housing-Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/simpleparadox/bostonhousingdataset on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This is a copy of the original Boston Housing Dataset. As of December 2021, the original link doesn't contain the dataset so I'm uploading it if anyone wants to use it. I'll implement a linear regression model to predict the output 'MEDV' variable using PyTorch (check the companion notebook).
I took the data given in this link and processed it to include the column names as well.
https://www.kaggle.com/prasadperera/the-boston-housing-dataset/data
Good luck on your data science career :)
--- Original source retains full ownership of the source dataset ---
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Birmingham housing data from the American Community Survey (ACS)
https://fred.stlouisfed.org/legal/#copyright-citation-required
Graph and download economic data for Real Residential Property Prices for United States (QUSR628BIS) from Q1 1970 to Q1 2025 about residential, HPI, housing, real, price index, indexes, price, and USA.
The Boston house-price data of Harrison, D. and Rubinfeld, D.L. 'Hedonic prices and the demand for clean air', J. Environ. Economics & Management, vol.5, 81-102, 1978.
Input features in order:
1) CRIM: per capita crime rate by town
2) ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
3) INDUS: proportion of non-retail business acres per town
4) CHAS: Charles River dummy variable (1 if tract bounds river; 0 otherwise)
5) NOX: nitric oxides concentration (parts per 10 million) [parts/10M]
6) RM: average number of rooms per dwelling
7) AGE: proportion of owner-occupied units built prior to 1940
8) DIS: weighted distances to five Boston employment centres
9) RAD: index of accessibility to radial highways
10) TAX: full-value property-tax rate per $10,000 [$/10k]
11) PTRATIO: pupil-teacher ratio by town
12) B: The result of the equation B=1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town
13) LSTAT: % lower status of the population
Output variable: 1) MEDV: Median value of owner-occupied homes in $1000's [k$]
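For readers unfamiliar with the derived columns, the following sketch evaluates the B equation given in the feature list and converts MEDV from thousands of dollars to dollars. The function names are hypothetical and the inputs are illustrative, not rows from the dataset.

```python
def b_feature(bk):
    """Evaluate B = 1000 * (Bk - 0.63)^2 for a town-level proportion Bk in [0, 1]."""
    return 1000.0 * (bk - 0.63) ** 2


def medv_to_dollars(medv):
    """Convert MEDV, reported in $1000's, to dollars."""
    return medv * 1000.0


# Illustrative inputs only.
print(b_feature(0.63))        # the quadratic is zero where Bk equals 0.63
print(medv_to_dollars(24.0))
```

Because the transform is quadratic around 0.63, two different values of Bk can map to the same B, so Bk cannot in general be recovered uniquely from B.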
StatLib - Carnegie Mellon University
Harrison, David & Rubinfeld, Daniel. (1978). Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management, 5, 81-102. doi:10.1016/0095-0696(78)90006-2.
Belsley, David A., Kuh, Edwin & Welsch, Roy E. (1980). Regression diagnostics: identifying influential data and sources of collinearity. New York: Wiley.
The U.S. Department of Housing and Urban Development (HUD) periodically receives custom tabulations of data from the U.S. Census Bureau that are largely not available through standard Census products. These data, known as the CHAS data (Comprehensive Housing Affordability Strategy), demonstrate the extent of housing problems and housing needs, particularly for low income households. The CHAS data are used by local governments to plan how to spend HUD funds, and may also be used by HUD to distribute grant funds.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for All-Transactions House Price Index for Idaho (IDSTHPI) from Q1 1975 to Q1 2025 about ID, appraisers, HPI, housing, price index, indexes, price, and USA.
These files are no longer being updated to include any late revisions local authorities may have reported to the department. Please use instead the Local authority housing statistics open data file for the latest data.