https://creativecommons.org/publicdomain/zero/1.0/
The US Family Budget Dataset provides insights into the cost of living in different US counties based on the Family Budget Calculator by the Economic Policy Institute (EPI).
This dataset offers community-specific estimates for ten family types, including one or two adults with zero to four children, in all 1877 counties and metro areas across the United States.
Employment-to-Population Ratio for USA
Productivity and Hourly Compensation
USA Unemployment Rates by Demographics & Race
Photo by Alev Takil on Unsplash
This map shows the average household income in the U.S. in 2022 in a multiscale map by country, state, county, ZIP Code, tract, and block group. Information for the average household income is an estimate of income for calendar year 2022. Income amounts are expressed in current dollars, including an adjustment for inflation or cost-of-living increases.
The pop-up is configured to include the following information for each geography level:
- Average household income
- Median household income
- Count of households by income group
- Average household income by householder age group
The data shown is from Esri's 2022 Updated Demographic estimates using Census 2020 geographies. The map adds increasing levels of detail as you zoom in, from state, to county, to ZIP Code, to tract, to block group data.
Esri's U.S. Updated Demographic (2022/2027) Data: Population, age, income, sex, race, home value, and marital status are among the variables included in the database. Each year, Esri's Data Development team employs its proven methodologies to update more than 2,000 demographic variables for a variety of U.S. geographies.
Additional Esri Resources:
- Esri Demographics
- U.S. 2022/2027 Esri Updated Demographics
- Essential demographic vocabulary
This item is for visualization purposes only and cannot be exported or used in analysis.
Permitted use of this data is covered in the DATA section of the Esri Master Agreement (E204CW) and these supplemental terms.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
A dataset comprising various variables around housing and demographics for the top 50 American cities by population.
Variables:
Zip Code: Zip code within which the listing is present.
Price: Listed price for the property.
Beds: Number of beds mentioned in the listing.
Baths: Number of baths mentioned in the listing.
Living Space: The total size of the living space, in square feet, mentioned in the listing.
Address: Street address of the listing.
City: City name where the listing is located.
State: State name where the listing is located.
Zip Code Population: The estimated number of individuals within the zip code. Data from Simplemaps.com.
Zip Code Density: The estimated number of individuals per square mile within the zip code. Data from Simplemaps.com.
County: County where the listing is located.
Median Household income: Estimated median household income. Data from the U.S. Census Bureau.
Latitude: Latitude of the zip code. Data from Simplemaps.com.
Longitude: Longitude of the zip code. Data from Simplemaps.com.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides a detailed time-series estimate of the monthly cost of living across 20 different areas in Nairobi, Kenya from 2019 to 2024. It covers essential expenses such as rent, food, transport, utilities, and miscellaneous costs, allowing for comprehensive cost-of-living analysis.
This dataset is useful for:
✅ Individuals planning to move to Nairobi
✅ Researchers analyzing long-term cost trends
✅ Businesses assessing salary benchmarks based on inflation
✅ Data scientists developing predictive models for cost forecasting
Area: The residential area in Nairobi
Rent: Estimated monthly rent (KES)
Food: Grocery and dining expenses (KES)
Transport: Public and private transport costs (KES)
Utilities: Water, electricity, and internet bills (KES)
Misc: Entertainment, personal care, and leisure expenses (KES)
Total: Sum of all expenses
Date: Monthly timestamp from January 2019 to December 2024
This dataset provides cost estimates for 20+ residential areas, including:
- High-End Areas 🏡: Kileleshwa, Westlands, Karen
- Mid-Range Areas 🏙️: South B, Langata, Ruaka
- Affordable Areas 🏠: Embakasi, Kasarani, Githurai, Ruiru, Umoja
- Satellite Towns 🌿: Ngong, Rongai, Thika, Kitengela, Kikuyu
This dataset was synthetically generated using Python, incorporating realistic market variations. The process includes:
✔ Inflation Modeling 📈 – A 2% annual increase in costs over time.
✔ Seasonal Effects 📅 – Higher food and transport costs in December & January (holiday season), rent spikes in June & July.
✔ Economic Shocks ⚠️ – A 5% chance per record of external economic effects (e.g., fuel price hikes, supply chain issues).
✔ Random Fluctuations 🔄 – Expenses vary slightly month-to-month to simulate real-world spending behavior.
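The generation steps listed above can be sketched in a few lines of Python. This is a minimal illustration, not the author's actual script: the function name `simulate_monthly_cost`, the 10% seasonal uplift, the 15% shock size, and the ±3% noise band are all assumptions chosen for the example. Only the 2% annual inflation, the affected months, and the 5% shock probability come from the description.

```python
import random


def simulate_monthly_cost(base_cost, year, month, category, rng,
                          start_year=2019, annual_inflation=0.02,
                          shock_probability=0.05):
    """Return one synthetic monthly cost (KES) for a single expense category."""
    # Inflation: compound 2% for each full year elapsed since the start year.
    cost = base_cost * (1 + annual_inflation) ** (year - start_year)

    # Seasonal effects: food and transport rise in December and January,
    # rent rises in June and July. The 10% uplift is an assumed magnitude.
    if category in ("Food", "Transport") and month in (12, 1):
        cost *= 1.10
    if category == "Rent" and month in (6, 7):
        cost *= 1.10

    # Economic shock: 5% chance per record. The 15% size is an assumption.
    if rng.random() < shock_probability:
        cost *= 1.15

    # Random fluctuation: small month-to-month noise (assumed +/-3%).
    cost *= rng.uniform(0.97, 1.03)
    return round(cost, 2)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
june_rent_2021 = simulate_monthly_cost(30000, 2021, 6, "Rent", rng)
```

Passing a seeded `random.Random` instance keeps the output reproducible, which makes it easier to compare a regenerated series with the published CSV.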
nairobi_cost_of_living_time_series.csv – 60,000 records in CSV format (time-series structured). This dataset was generated for research and educational purposes. If you find it useful, consider citing it in your work. 🚀
https://www.census.gov/data/developers/about/terms-of-service.html
Household income bucket counts by ZIP code, city, county, state, and the U.S. overall, based on American Community Survey (ACS) 5-year estimates.
https://brightdata.com/license
Gain a complete view of the real estate market with our Zillow datasets. Track price trends, rental/sale status, and price per square foot with the Zillow Price History dataset and explore detailed listings with prices, locations, and features using the Zillow Properties Listing dataset.
- Over 134M records available
- Price starts at $250/100K records
- Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet
- 100% ethical and compliant data collection
Included datapoints:
Zpid
City
State
Home Status
Street Address
Zipcode
Home Type
Living Area Value
Bedrooms
Bathrooms
Price
Property Type
Date Sold
Annual Homeowners Insurance
Price Per Square Foot
Rent Zestimate
Tax Assessed Value
Zestimate
Home Values
Lot Area
Lot Area Unit
Living Area
Living Area Units
Property Tax Rate
Page View Count
Favorite Count
Time On Zillow
Time Zone
Abbreviated Address
Brokerage Name
And much more
Retirement Notice: This item is in mature support as of June 2023 and will be retired in December 2025. A replacement item has not been identified at this time. Esri recommends updating your maps and apps to phase out use of this item.
This map shows the average household income in the U.S. in 2022 in a multiscale map by country, state, county, ZIP Code, tract, and block group. Information for the average household income is an estimate of income for calendar year 2022. Income amounts are expressed in current dollars, including an adjustment for inflation or cost-of-living increases.
The pop-up is configured to include the following information for each geography level:
- Average household income
- Median household income
- Count of households by income group
- Average household income by householder age group
Permitted use of this data is covered in the DATA section of the Esri Master Agreement (E204CW) and these supplemental terms.
This dataset contains real estate listings in the US, broken down by state and ZIP code.
kaggle API Command
!kaggle datasets download -d ahmedshahriarsakib/usa-real-estate-dataset
The dataset has 1 CSV file with 10 columns -
NB:
1. brokered by and street addresses were categorically encoded due to data privacy policy
2. acre_lot means the total land area, and house_size denotes the living space/building area
Data was collected from https://www.realtor.com/, a real estate listing website operated by the News Corp subsidiary Move, Inc. and based in Santa Clara, California. It is the second most visited real estate listing website in the United States as of 2024, with over 100 million monthly active users.
Image by Mohamed Hassan from Pixabay
The data and information in this dataset are intended for educational purposes only. I do not own any data, and all rights are reserved to the respective owners.
This map shows the median household income in the United States in 2012. Information for the 2012 Median Household Income is an estimate of income for calendar year 2012. Income amounts are expressed in current dollars, including an adjustment for inflation or cost-of-living increases. The median is the value that divides the distribution of household income into two equal parts. The median household income in the United States overall was $50,157 in 2012.
This map shows Esri's 2012 estimates using Census 2010 geographies. The data shown is from Esri's 2012 Updated Demographics. The map adds increasing levels of detail as you zoom in, from state, to county, to ZIP Code, to tract, to block group data.
The map is designed to be displayed in conjunction with the Canvas basemap with a transparency of 25%. To use it on other basemaps, try a transparency of 25-50%.
Information about the USA Median Household Income map service used in this map is here.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Detailed Real Estate Data for Predicting House Prices and Analyzing Market Trends
This dataset contains information on 21,613 properties, making it a comprehensive resource for exploring real estate market trends and building predictive models for house prices. The data includes various features capturing property details, location, and market conditions, providing ample opportunities for data exploration, visualization, and machine learning applications.
General Information:
id: Unique identifier for each property.
date: Date of sale.
Price Details:
price: Sale price of the house.
Property Features:
bedrooms: Number of bedrooms.
bathrooms: Number of bathrooms (including partials as fractions).
sqft_living: Living space area in square feet.
sqft_lot: Lot size in square feet.
floors: Number of floors.
waterfront: Whether the property has a waterfront view.
view: Quality of the view rating.
condition: Overall condition of the house.
grade: Grade of construction and design (scale of 1–13).
Additional Metrics:
sqft_above: Square footage of the property above ground.
sqft_basement: Basement area in square feet.
yr_built: Year the property was built.
yr_renovated: Year of last renovation.
Location Coordinates:
zipcode: ZIP code of the property.
lat and long: Latitude and longitude coordinates.
Neighbor Comparisons:
sqft_living15: Average living space of 15 nearest properties.
sqft_lot15: Average lot size of 15 nearest properties.
This dataset is a valuable resource for anyone interested in real estate analytics, machine learning, or geographic data visualization.
This dataset includes house sale prices for King County, USA. It covers homes sold between May 2014 and May 2015.
Columns:
- id: a notation for a house
- date: Date house was sold
- price: Price is prediction target
- bedrooms: Number of Bedrooms/House
- bathrooms: Number of bathrooms/House
- sqft_living: square footage of the home
- sqft_lot: square footage of the lot
- floors: Total floors (levels) in house
- waterfront: House which has a view to a waterfront
- view: Has been viewed
- condition: How good the condition is (Overall)
- grade: overall grade given to the housing unit, based on King County grading system
- sqft_above: square footage of house apart from basement
- sqft_basement: square footage of the basement
- yr_built: Built Year
- yr_renovated: Year when house was renovated
- zipcode: zip
- lat: Latitude coordinate
- long: Longitude coordinate
- sqft_living15: Living room area in 2015 (implies some renovations)
- sqft_lot15: lotSize area in 2015 (implies some renovations)
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Real estate markets are of great importance for both local and international investors. Sydney and Melbourne are two dynamic markets where economic and social factors have significant impacts on property prices. Below is a detailed description of each feature:
Context
This dataset shows real estate listings in the USA. It includes the price, ZIP codes, etc.
Sources
This is real estate data from a company called Realtor - https://www.realtor.com. I downloaded the dataset from Kaggle.
About Dataset
1 CSV file containing 10 columns - realtor-data.csv (100k+ entries)
- status (Housing status - a. ready for sale or b. ready to build)
- bed (# of beds)
- bath (# of bathrooms)
- acre_lot (Property / Land size in acres)
- city (city name)
- state (state name)
- zip_code (postal code of the area)
- house_size (house area/size/living space in square feet)
- prev_sold_date (Previously sold date)
- price (Housing price; either the current listing price or the recently sold price if the house was sold recently)
Cover Image
Downloaded from Google Stock images.
Disclaimer
The data and information in this dataset are intended for educational purposes only. I do not own any data, and all rights are reserved to the respective owners.
https://creativecommons.org/publicdomain/zero/1.0/
A simple yet challenging project: predict the housing price based on factors such as house area, bedrooms, furnishing, nearness to the main road, etc. The dataset is small, yet its complexity arises from strong multicollinearity. Can you overcome these obstacles and build a decent predictive model?
Harrison, D. and Rubinfeld, D.L. (1978) Hedonic prices and the demand for clean air. J. Environ. Economics and Management 5, 81–102. Belsley D.A., Kuh, E. and Welsch, R.E. (1980) Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. New York: Wiley.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset supports the research article “Predicting Residential Property Values Using XGBoost and Spatial–Temporal Encoding: Evidence from Southern California” by Michael Quintana (2025). The dataset contains a cleaned and anonymized subset of residential property transactions derived from Redfin’s publicly available data export (June 2025). Each observation represents a single-family home, condominium, or townhouse sold in Southern California. Variables include sale price, living area, lot size, year built, bedrooms, bathrooms, ZIP code, and days on market. The dataset was used to train and validate an XGBoost regression model designed to estimate home prices using both structural and spatial–temporal features. All personally identifiable or proprietary location data have been removed or aggregated at the ZIP-code level to maintain privacy while preserving statistical utility. This dataset and accompanying R scripts allow replication of the core results presented in the study, including model training, feature importance analysis, and predictive performance evaluation.
https://creativecommons.org/publicdomain/zero/1.0/
Same dataset as "House Sales in King County, USA", but with treated content and with a split version (train-test) allowing direct use in machine learning models.
We have 14 columns in the dataset, as follows:
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The allure of Swiss houses lies in their harmonious integration with the breathtaking natural surroundings, offering residents a unique blend of tranquility and cosmopolitan living.
Switzerland's renowned stability, both economically and politically, adds a layer of desirability to its real estate market. The country's commitment to quality living is reflected not only in its efficient infrastructure but also in its high standards of education, healthcare, and overall well-being.
However, such a high quality of life and the unparalleled Swiss experience come at a cost. Swiss house prices reflect the exclusivity and desirability of the real estate market, positioning it as an investment in both luxury and lifestyle. In this journey through Swiss real estate, I unravel the layers of this captivating narrative, helping you explore not just the numbers but the essence of what makes owning a house in Switzerland an aspirational dream for many, and probably for you too.
house_price_switzerland.csv - The complete dataset with 11 columns.
ID - A unique identifier for each object in the dataset.
HouseType - Describes the type of the house, such as "Villa".
Size - Represents a categorical size classification, for example, "L".
Price - Indicates the value of the house in Swiss Francs (CHF). If NaN the price is "Price on Request".
LotSize - Specifies the surrounding area of the property in square meters.
Balcony - Binary indicator of whether the object has a balcony (Yes/No).
LivingSpace - Denotes the living area of the house in square meters.
NumberRooms - Indicates the total number of rooms in the house.
YearBuilt - Represents the year in which the house was constructed.
Locality - Specifies the city or town where the object is situated.
PostalCode - Corresponds to the postal code of the locality where the object is located.
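Because a missing Price means "Price on Request" rather than an unknown value, it can help to separate those listings before computing price statistics. The sketch below uses only the Python standard library. The sample rows are invented for illustration and are not taken from house_price_switzerland.csv, and the helper names `parse_price` and `split_by_price` are hypothetical. It also assumes a missing price appears either as the text NaN or as an empty cell.

```python
import csv
import io
import math

# Invented sample rows for illustration only; not real records from the file.
SAMPLE_CSV = """ID,HouseType,Price,LivingSpace
1,Villa,1500000,200
2,Villa,NaN,180
3,Chalet,,120
"""


def parse_price(raw):
    """Return the price in CHF as a float, or None for "Price on Request"."""
    text = raw.strip()
    if text == "":
        return None
    value = float(text)  # float("NaN") parses to a NaN value
    return None if math.isnan(value) else value


def split_by_price(rows):
    """Separate listings with a numeric price from "Price on Request" ones."""
    priced, on_request = [], []
    for row in rows:
        price = parse_price(row["Price"])
        if price is None:
            on_request.append(row["ID"])
        else:
            priced.append((row["ID"], price))
    return priced, on_request


priced, on_request = split_by_price(csv.DictReader(io.StringIO(SAMPLE_CSV)))
```

Keeping the "Price on Request" IDs in their own list, rather than dropping them, preserves the option of analysing them separately.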
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset is an engineered version of the original Ames Housing dataset from the "House Prices: Advanced Regression Techniques" Kaggle competition. The goal of this engineering was to clean the data, handle missing values, encode categorical features, scale numeric features, manage outliers, reduce skewness, select useful features, and create new features to improve model performance for house price prediction.
The original dataset contains information on 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, with the target variable being SalePrice. This engineered version has undergone several preprocessing steps to make it ready for machine learning models.
… PoolQC) were filled with "None". Numeric columns were filled with the median, and other categorical columns with the mode.
… SalePrice were removed.
The final dataset has fewer columns than the original (reduced from 81 to approximately 250 after one-hot encoding, then further reduced by feature selection), with improved quality for modeling.
To add more predictive power, the following new features were created based on domain knowledge:
1. HouseAge: Age of the house at the time of sale. Calculated as YrSold - YearBuilt. This captures how old the house is, which can negatively affect price due to depreciation.
- Example: A house built in 2000 and sold in 2008 has HouseAge = 8.
2. Quality_x_Size: Interaction term between overall quality and living area. Calculated as OverallQual * GrLivArea. This combines quality and size to capture the value of high-quality large homes.
- Example: A house with OverallQual = 7 and GrLivArea = 1500 has Quality_x_Size = 10500.
3. TotalSF: Total square footage of the house. Calculated as GrLivArea + TotalBsmtSF + 1stFlrSF + 2ndFlrSF (if available). This aggregates area features into a single metric for better price prediction.
- Example: If GrLivArea = 1500 and TotalBsmtSF = 1000, TotalSF = 2500.
4. Log_LotArea: Log-transformed lot area to reduce skewness. Calculated as np.log1p(LotArea). This makes the distribution of lot sizes more normal, helping models handle extreme values.
- Example: A lot area of 10000 becomes Log_LotArea ≈ 9.21.
These new features were created using the original (unscaled) values to maintain interpretability, then scaled with RobustScaler to match the rest of the dataset.
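The four derived features can be reproduced from the raw columns with a short function. This is a sketch under stated assumptions rather than the dataset author's code: the function name `engineer_features` is hypothetical, and TotalSF follows the worked example above (GrLivArea + TotalBsmtSF), because the longer formula in the description would count the floor areas twice if GrLivArea already includes them. The RobustScaler step is not shown.

```python
import math


def engineer_features(house):
    """Add the four derived features to a dict of raw (unscaled) Ames fields."""
    features = dict(house)  # copy so the caller's record is left unchanged

    # Age of the house at the time of sale.
    features["HouseAge"] = house["YrSold"] - house["YearBuilt"]

    # Interaction between overall quality and above-ground living area.
    features["Quality_x_Size"] = house["OverallQual"] * house["GrLivArea"]

    # Assumption: living area plus basement area, matching the worked example.
    features["TotalSF"] = house["GrLivArea"] + house["TotalBsmtSF"]

    # log(1 + LotArea) reduces right skew and is defined for LotArea = 0.
    features["Log_LotArea"] = math.log1p(house["LotArea"])
    return features


# Values taken from the worked examples in the feature list above.
example = engineer_features({
    "YrSold": 2008, "YearBuilt": 2000,
    "OverallQual": 7, "GrLivArea": 1500,
    "TotalBsmtSF": 1000, "LotArea": 10000,
})
```

With these inputs the function yields HouseAge = 8, Quality_x_Size = 10500, TotalSF = 2500, and Log_LotArea ≈ 9.21, matching the examples in the list.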
SalePrice, such as:
OverallQual: Material and finish quality (scaled, 1-10).
GrLivArea: Above grade (ground) living area square feet (scaled).
GarageCars: Size of garage in car capacity (scaled).
TotalBsmtSF: Total square feet of basement area (scaled).
FullBath, YearBuilt, etc. (see the code for the full list).
ExterQual: Exterior material quality (encoded as 0=Po to 4=Ex).
BsmtQual: Basement quality (encoded as 0=None to 5=Ex).
MSZoning_RL: 1 if residential low density, 0 otherwise.
Neighborhood_NAmes: 1 if in NAmes neighborhood, 0 otherwise.
HouseAge: Age of the house (scaled).
Quality_x_Size: Overall quality times living area (scaled).
TotalSF: Total square footage (scaled).
Log_LotArea: Log-transformed lot area (scaled).
SalePrice - The property's sale price in dollars (not scaled, as it's the target).
Total columns: Approximately 200-250 (after one-hot encoding and feature selection).
This dataset is derived from the Ames Housing...
http://opendatacommons.org/licenses/dbcl/1.0/
The dataset consists of house prices in King County, Washington, from sales between May 2014 and May 2015. Along with the house price, it includes information on 18 house features, the date of sale, and the ID of the sale.
https://creativecommons.org/publicdomain/zero/1.0/
The dataset for this project originates from the UCI Machine Learning Repository. There are similar datasets on Kaggle but this is more comprehensive. It serves to show a basic trend in the house pricing in terms of its location, the area of construction, its interior, etc.
The dataset contains 21613x22 data fields. Column names are self-explanatory.
id - Unique ID for each home sold
date - Date of the home sale
price - Price of each home sold
bedrooms - Number of bedrooms
bathrooms - Number of bathrooms, where .5 accounts for a room with a toilet but no shower
sqft_living - Square footage of the apartment interior living space
sqft_lot - Square footage of the land space
floors - Number of floors
waterfront - A dummy variable for whether the apartment was overlooking the waterfront or not
view - An index from 0 to 4 of how good the view of the property was
condition - An index from 1 to 5 on the condition of the apartment
grade - An index from 1 to 13, where 1-3 falls short of building construction and design, 7 has an average level of construction and design, and 11-13 have a high-quality level of construction and design.
sqft_above - The square footage of the interior housing space that is above ground level
sqft_basement - The square footage of the interior housing space that is below ground level
yr_built - The year the house was initially built
yr_renovated - The year of the house’s last renovation
zipcode - What zipcode area the house is in
lat - Latitude
long - Longitude
sqft_living15 - The square footage of interior housing living space for the nearest 15 neighbors
sqft_lot15 - The square footage of the land lots of the nearest 15 neighbors