License: https://creativecommons.org/publicdomain/zero/1.0/
A simple yet challenging project: predict the housing price based on factors such as house area, bedrooms, furnishing, nearness to the main road, etc. The dataset is small, yet its complexity arises from strong multicollinearity. Can you overcome these obstacles and build a decent predictive model?
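One common way to gauge the multicollinearity mentioned above is the variance inflation factor (VIF). The sketch below is a minimal illustration in plain Python, assuming a model with exactly two predictors, in which case the VIF reduces to 1 / (1 - r^2), where r is the Pearson correlation between them. The variable names and values are made up for illustration and are not taken from the dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def vif_two_predictors(xs, ys):
    """VIF shared by both predictors when the model has exactly two of them."""
    r = pearson(xs, ys)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical values: house area (sq ft) and bedroom count.
area = [1200, 1500, 1800, 2400, 3000, 3600]
bedrooms = [2, 3, 3, 4, 4, 5]

r = pearson(area, bedrooms)
vif = vif_two_predictors(area, bedrooms)
print(f"correlation = {r:.3f}, VIF = {vif:.2f}")
```

With more than two predictors, each VIF would instead come from regressing one predictor on all the others; a VIF well above 5 to 10 is often read as a warning sign, though that threshold is only a rule of thumb.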
Harrison, D. and Rubinfeld, D.L. (1978) Hedonic prices and the demand for clean air. J. Environ. Economics and Management 5, 81–102. Belsley D.A., Kuh, E. and Welsch, R.E. (1980) Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. New York: Wiley.
This dataset is designed for beginners to practice regression problems, particularly in the context of predicting house prices. It contains 1000 rows, with each row representing a house and various attributes that influence its price. The dataset is well-suited for learning basic to intermediate-level regression modeling techniques.
Beginner Regression Projects: This dataset can be used to practice building regression models such as Linear Regression, Decision Trees, or Random Forests. The target variable (house price) is continuous, making this an ideal problem for supervised learning techniques.
Feature Engineering Practice: Learners can create new features by combining existing ones, such as the price per square foot or age of the house, providing an opportunity to experiment with feature transformations.
Exploratory Data Analysis (EDA): You can explore how different features (e.g., square footage, number of bedrooms) correlate with the target variable, making it a great dataset for learning about data visualization and summary statistics.
Model Evaluation: The dataset allows for various model evaluation techniques such as cross-validation, R-squared, and Mean Absolute Error (MAE). These metrics can be used to compare the effectiveness of different models.
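The evaluation techniques listed above can be sketched in plain Python. The example below is a minimal illustration, not the dataset author's method: it runs 5-fold cross-validation of a one-feature least-squares line and reports MAE and R-squared per fold. The data is synthetic, and the names `sqft` and `price` as well as the coefficients are assumptions made for the example.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def k_fold_scores(xs, ys, k=5, seed=0):
    """Return a list of (MAE, R^2) pairs, one per held-out fold."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        intercept, slope = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        y_true = [ys[i] for i in fold]
        y_pred = [intercept + slope * xs[i] for i in fold]
        scores.append((mae(y_true, y_pred), r_squared(y_true, y_pred)))
    return scores

# Synthetic data (made-up): price = 50,000 + 150 * sqft + Gaussian noise.
rng = random.Random(42)
sqft = [rng.uniform(800, 3500) for _ in range(100)]
price = [50_000 + 150 * s + rng.gauss(0, 20_000) for s in sqft]

scores = k_fold_scores(sqft, price, k=5)
for fold_no, (fold_mae, fold_r2) in enumerate(scores, start=1):
    print(f"fold {fold_no}: MAE = {fold_mae:,.0f}, R^2 = {fold_r2:.3f}")
```

Scoring only on held-out folds keeps each metric an estimate of performance on unseen houses rather than a measure of fit to the training rows.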
The dataset is highly versatile for a range of machine learning tasks. You can apply simple linear models to predict house prices based on one or two features, or use more complex models like Random Forest or Gradient Boosting Machines to understand interactions between variables.
It can also be used for dimensionality reduction techniques like PCA or to practice handling categorical variables (e.g., neighborhood quality) through encoding techniques like one-hot encoding.
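As a rough illustration of the one-hot encoding mentioned above, the following sketch expands one categorical column into 0/1 indicator columns using only plain Python. The column name `neighborhood_quality` and its values are hypothetical; the actual column names in the dataset may differ.

```python
def one_hot_encode(rows, column):
    """Replace `column` in each row with one 0/1 indicator per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded

# Hypothetical rows; the real dataset's columns may be named differently.
rows = [
    {"sqft": 1500, "neighborhood_quality": "high"},
    {"sqft": 900, "neighborhood_quality": "low"},
    {"sqft": 2000, "neighborhood_quality": "medium"},
]

encoded = one_hot_encode(rows, "neighborhood_quality")
for row in encoded:
    print(row)
```

For linear models it is common to drop one of the indicator columns, since the full set sums to 1 in every row and is perfectly collinear with the intercept.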
This dataset is ideal for anyone wanting to gain practical experience in building regression models while working with real-world features.
License: https://brightdata.com/license
Use our Stock prices dataset to access comprehensive financial and corporate data, including company profiles, stock prices, market capitalization, revenue, and key performance metrics. This dataset is tailored for financial analysts, investors, and researchers to analyze market trends and evaluate company performance.
Popular use cases include investment research, competitor benchmarking, and trend forecasting. Leverage this dataset to make informed financial decisions, identify growth opportunities, and gain a deeper understanding of the business landscape. The dataset includes all major data points: company name, company ID, summary, stock ticker, earnings date, closing price, previous close, opening price, and much more.
License: https://creativecommons.org/publicdomain/zero/1.0/
For a quick summary of the case study, please click "US Economy Powerpoint" and download the Powerpoint.
This dataset was inspired by rising prices for essential goods, the abnormally high inflation rate of 7.9 percent in March of this year, and the 30 trillion-dollar debt that we have. I was extremely curious to see how sustainable this is for the average American and whether wages are increasing at the same rate to help combat this inflation. This is not politically driven in the slightest, nor was it made to put the blame on Americans. All of the datasets were obtained from third-party websites such as https://dqydj.com/household-income-by-year/ and https://www.usinflationcalculator.com/inflation/historical-inflation-rates/, with the exception of https://fred.stlouisfed.org/series/ASPUS, which is first-party data.
I labeled all of the datasets so that their titles are self-explanatory. The US Economy Notebook has most of the code that I used as well as four of the six phases of data analysis. The last two phases are in the US Economy Powerpoint. The "US Historical Inflation Rates" dataset could also have been labeled "The Inflation Of The US Dollar Month By Month". Lastly, the Average Sales of Houses in Jan is just a filtered version of the "Average Sales of Houses in the US" dataset.
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Average house prices are derived from data supplied by the mortgage lending agencies on loans approved by them rather than loans paid. In comparing house price figures from one period to another, account should be taken of the fact that changes in the mix of houses (including apartments) will affect the average figures. The most current data is published on these sheets. Previously published data may be subject to revision. Any change from the originally published data will be highlighted by a comment on the cell in question. These comments will be maintained for at least a year after the date of the value change. Prices exclude apartments and are measured in €. Figure changed on 27/6/16 as revised data was received from the local authority.
License: Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Weekly Commodity Prices are made up of four Excel spreadsheets and graphs split into commodity groups.
Source agency: Environment, Food and Rural Affairs
Designation: National Statistics
Language: English
Alternative title: Commodity Price Movements
If you require datasets in a more accessible format, please contact prices@defra.gsi.gov.uk.
License: https://dataverse.harvard.edu/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.7910/DVN/6RQCRS
New data-gathering techniques, often referred to as “Big Data” have the potential to improve statistics and empirical research in economics. In this paper we describe our work with online data at the Billion Prices Project at MIT and discuss key lessons for both inflation measurement and some fundamental research questions in macro and international economics. In particular, we show how online prices can be used to construct daily price indexes in multiple countries and to avoid measurement biases that distort evidence of price stickiness and international relative prices. We emphasize how Big Data technologies are providing macro and international economists with opportunities to stop treating the data as “given” and to get directly involved with data collection.
License: https://crawlfeeds.com/privacy_policy
Looking for a free Walmart product dataset? The Walmart Products Free Dataset delivers a ready-to-use ecommerce product data CSV containing ~2,100 verified product records from Walmart.com. It includes vital details like product titles, prices, categories, brand info, availability, and descriptions — perfect for data analysis, price comparison, market research, or building machine-learning models.
Complete Product Metadata: Each entry includes URL, title, brand, SKU, price, currency, description, availability, delivery method, average rating, total ratings, image links, unique ID, and timestamp.
CSV Format, Ready to Use: Download instantly - no need for scraping, cleaning or formatting.
Good for E-commerce Research & ML: Ideal for product cataloging, price tracking, demand forecasting, recommendation systems, or data-driven projects.
Free & Easy Access: Priced at USD $0.0, making it a great starting point for developers, data analysts or students.
License: Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This article describes the new RPIJ measure of Consumer Price Inflation. RPIJ is a Retail Prices Index (RPI) based measure that will use a geometric (Jevons) formula in place of one type of arithmetic formula (Carli). It is being launched in response to the National Statistician's conclusion that the RPI does not meet international standards due to the use of the Carli formula in its calculation. The accompanying Excel file includes a back series for RPIJ from 1997 to 2012.
Source agency: Office for National Statistics
Designation: National Statistics
Language: English
Alternative title: New RPIJ measure of Consumer Price Inflation
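To make the difference between the Carli and Jevons formulas concrete, here is a small sketch using their textbook definitions: Carli is the arithmetic mean of price relatives and Jevons is their geometric mean. The prices are made up, and this is an illustration of the two elementary aggregates, not the Office for National Statistics' actual calculation.

```python
import math

def carli(base_prices, current_prices):
    """Carli index: arithmetic mean of the price relatives."""
    relatives = [c / b for b, c in zip(base_prices, current_prices)]
    return sum(relatives) / len(relatives)

def jevons(base_prices, current_prices):
    """Jevons index: geometric mean of the price relatives."""
    log_sum = sum(math.log(c / b) for b, c in zip(base_prices, current_prices))
    return math.exp(log_sum / len(base_prices))

# Made-up prices for three items in a base period and a current period.
base = [1.00, 2.00, 4.00]
current = [1.50, 2.00, 3.00]

carli_index = carli(base, current)
jevons_index = jevons(base, current)
print(f"Carli  = {carli_index:.4f}")
print(f"Jevons = {jevons_index:.4f}")
```

Because a geometric mean never exceeds the arithmetic mean of the same positive numbers, the Jevons figure is never higher than the Carli figure for the same prices, and the two differ whenever the price relatives are not all equal.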
License: https://creativecommons.org/publicdomain/zero/1.0/
I previously shared a classification dataset for classifying gender, which is popular with those who are new to machine learning because it gives pretty good accuracy. That encouraged me to create a regression dataset for predicting continuous values. I have tried many real-world datasets for regression problems, and they predict with low accuracy and a high error rate. As a beginner, I struggled and worried about why and how those datasets performed poorly. This is the other main reason I created this dataset. Although this is a made-up dataset, I considered all the features when deciding the price of each property. If you are a beginner, you will love trying this, as the results are stunning.
Since this is populated data, I will explain the features and the label straightaway.

FEATURES
1. land_size_sqm - The total size of the land in square meters.
2. house_size_sqm - The area the house occupies within the land, in square meters.
3. no_of_rooms - The number of rooms in the house.
4. no_of_bathrooms - The total number of bathrooms in the house.
5. large_living_room - Whether the house has a large living room. All houses are assumed to contain a living room; '1' means large and '0' means small. In the categorical dataset, 1 and 0 are represented as 'yes' and 'No' respectively.
6. parking_space - Whether there is a parking space. '1' means parking is available and '0' means it is not. In the categorical dataset, 1 and 0 are represented as 'yes' and 'No' respectively.
7. front_garden - Whether there is a garden in front of the house. '1' means a garden is available and '0' means it is not. In the categorical dataset, 1 and 0 are represented as 'yes' and 'No' respectively.
8. swimming_pool - Whether the house has a swimming pool. 1 means a pool is available and 0 means it is not. In the categorical dataset, 1 and 0 are represented as 'yes' and 'No' respectively.
9. distance_to_school_km - The distance from the house to the nearest school in kilometers.
10. wall_fence - Whether there is a wall fence. '1' means there is a wall fence and '0' means there is none. In the categorical dataset, 1 and 0 are represented as 'yes' and 'No' respectively.
11. house_age_or_renovated - Either the age of the house in years or the period since the date of renovation.
12. water_front - Whether the house is located in front of the water. 1 means waterfront and 0 means it is not located near the water. In the categorical dataset, 1 and 0 are represented as 'yes' and 'No' respectively.
13. distance_to_supermarket_km - The distance to the nearest supermarket in kilometers.

LABEL
property_value - The price of the property.

The following features are available only in the "house price dataset original v2 cleaned" and "house price dataset original v2 with categorical features" data.
14. crime_rate - A float between 0 and 7; lower is better.
15. room_size - The size of the rooms: 0 is 'small', 1 is 'medium', 2 is 'large', and 3 is 'Extra large'. In the categorical dataset, these values are categorical and self-explanatory.
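The relationship between the categorical and numeric versions described above can be sketched as a simple mapping. This is a minimal illustration that assumes the categorical files store exactly the strings 'yes', 'No', 'small', 'medium', 'large', and 'Extra large' (matched case-insensitively here); the sample row is made up.

```python
# Assumed string-to-number mappings, following the feature descriptions.
YES_NO = {"yes": 1, "no": 0}
ROOM_SIZE = {"small": 0, "medium": 1, "large": 2, "extra large": 3}

BINARY_COLUMNS = [
    "large_living_room", "parking_space", "front_garden",
    "swimming_pool", "wall_fence", "water_front",
]

def encode_row(row):
    """Return a copy of `row` with categorical values mapped to numbers."""
    encoded = dict(row)
    for column in BINARY_COLUMNS:
        if column in encoded:
            encoded[column] = YES_NO[str(encoded[column]).strip().lower()]
    if "room_size" in encoded:
        encoded["room_size"] = ROOM_SIZE[str(encoded["room_size"]).strip().lower()]
    return encoded

# Made-up sample row in the categorical format.
sample = {
    "land_size_sqm": 250,
    "parking_space": "yes",
    "swimming_pool": "No",
    "room_size": "Extra large",
}

encoded_sample = encode_row(sample)
print(encoded_sample)
```

An unexpected label raises a `KeyError` rather than being silently mapped, which makes spelling variants in the source file easy to spot.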
I spent around 3 hours creating this dataset. Enjoy..
Share your notebooks to see which algorithm predicts the house price precisely.
License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Average house prices are derived from data supplied by the mortgage lending agencies on loans approved by them rather than loans paid. In comparing house price figures from one period to another, account should be taken of the fact that changes in the mix of houses (including apartments) will affect the average figures. The most current data is published on these sheets. Previously published data may be subject to revision. Any change from the originally published data will be highlighted by a comment on the cell in question. These comments will be maintained for at least a year after the date of the value change. Figure changed on 27/6/16 as revised data was received from the local authority. Prices include houses and apartments and are measured in €.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for PRODUCER PRICES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the data for the Price County, WI population pyramid, which represents the Price County population distribution across age and gender, using estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It lists the male and female population for each age group, along with the total population for those age groups. Higher numbers at the bottom of the table suggest population growth, whereas higher numbers at the top indicate declining birth rates. Furthermore, the dataset can be utilized to understand the youth dependency ratio, old-age dependency ratio, total dependency ratio, and potential support ratio.
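The ratios named above can be computed from three population totals. The sketch below assumes the commonly used age brackets (0-14 as youth, 15-64 as working age, 65 and over as elderly); the brackets used by the dataset publisher may differ, and the population figures are made up.

```python
def dependency_ratios(youth, working_age, elderly):
    """Dependency ratios per 100 working-age people, plus the support ratio.

    Assumed brackets: youth = ages 0-14, working_age = 15-64, elderly = 65+.
    """
    return {
        "youth_dependency": 100 * youth / working_age,
        "old_age_dependency": 100 * elderly / working_age,
        "total_dependency": 100 * (youth + elderly) / working_age,
        "potential_support": working_age / elderly,
    }

# Made-up population totals, not taken from the Price County data.
ratios = dependency_ratios(youth=2_000, working_age=8_000, elderly=4_000)
for name, value in ratios.items():
    print(f"{name}: {value:.1f}")
```

In practice the three totals would be obtained by summing the dataset's five-year age groups that fall inside each bracket.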
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Price County Population by Age.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Price town by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Price town. The dataset can be utilized to understand the population distribution of Price town by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Price town. Additionally, it can be used to see how the gender ratio changes from birth to the most senior age group and to compare the male-to-female ratio across each age group for Price town.
Key observations
Largest age group (population): Male # 0-4 years (39) | Female # 55-59 years (29). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Age groups:
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Price town Population by Gender.
License: Open Government Licence 2.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/2/
License information was derived automatically
Housing affordability (lower quartile house prices to earnings ratio) *This indicator has been discontinued
License: https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Twitter Stock Prices Dataset contains stock price data for Twitter from November 2013 to October 2022. This dataset is a time series dataset that provides daily stock trading information.
• The key attributes include the stock's opening price (Open), highest price (High), lowest price (Low), closing price (Close), adjusted closing price (Adj Close), and volume (Volume).

2) Data Utilization
(1) Characteristics of the Twitter Stock Prices Data
• This dataset is a time series, offering daily stock price fluctuations and allowing price changes to be tracked over time.
• It includes 7 main attributes related to stock trading, allowing for analysis of price movements (open, high, low, close) and volume, to better understand Twitter’s stock price dynamics.
• This data helps analyze market trends and price volatility patterns, providing insights into the dynamics of the stock market.

(2) Applications of the Twitter Stock Prices Data
• Predictive Modeling: This dataset can be used to develop stock price prediction models, including predicting price increases/decreases or forecasting future stock prices using machine learning models.
• Business Insights: Investment experts can use this dataset to evaluate Twitter’s stock performance, and it provides useful information for optimizing investment strategies in response to market changes. This dataset can be used for trend forecasting and investor analysis.
• Trend Analysis: By analyzing upward and downward stock trends, this dataset can help evaluate the company's market performance and develop trend-based investment strategies.
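A first step toward the volatility analysis and increase/decrease prediction described above is turning closing prices into daily returns and up/down labels. The sketch below is a minimal illustration in plain Python; the closing prices are made up rather than read from the dataset, and using the sample standard deviation of returns as a volatility measure is one simple choice among many.

```python
import statistics

def daily_returns(closes):
    """Simple returns: (today's close - previous close) / previous close."""
    return [(today - prev) / prev for prev, today in zip(closes, closes[1:])]

def direction_labels(returns):
    """1 if the price rose that day, 0 otherwise (flat days count as 0)."""
    return [1 if r > 0 else 0 for r in returns]

# Made-up closing prices for five consecutive trading days.
closes = [40.0, 42.0, 41.16, 41.16, 43.218]

returns = daily_returns(closes)
labels = direction_labels(returns)
volatility = statistics.stdev(returns)  # sample std. dev. of daily returns

print("returns:", [round(r, 4) for r in returns])
print("labels: ", labels)
print(f"volatility: {volatility:.4f}")
```

The labels could then serve as the target of a classifier, with lagged returns or volume as features; that modeling step is left out here.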
License: https://creativecommons.org/publicdomain/zero/1.0/
There are numerous electronic products that we use on a daily basis like computers, TV, heaters, mobile devices, etc.
We buy various electronic products on a day-to-day basis online or offline on reliable prices. There could be various parameters which decide the price of an electronic product like brand, product name, the product condition (i.e. new or old), the merchant, category of the product, date and time of buying, etc.
Imagine that you have recently started a new electronic shop. You want to sell your products online and you need to come up with an appropriate selling price for each product. Being a data scientist you think of utilizing the power of data science on the available online data.
This dataset contains pricing information of electronic products. There are 25 columns including the target variable. Some of the variables are listed below:
prices.availability: whether the product is available at the given price
prices.condition: condition of the product
prices.currency: price currency
prices.isSale: whether the product is on sale at the given price
prices.merchant: the merchant
imageURLs: product image URL
manufacturer: manufacturer of the product
manufacturerNumber: manufacturer number
name: name of the product
primaryCategories: primary category of the product
weight: weight of the product
The dataset is taken from data.world.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about stocks per day. It has 838 rows and is filtered to the stock HSH.TO. It features 3 columns, including stock and closing price.
This dataset contains information about the world's crude oil prices for 1861-2020. Data from BP. Follow datasource.kapsarc.org for timely data to advance energy economics research.
Notes: 1861-1944 US Average; 1945-1983 Arabian Light posted at Ras Tanura; 1984-2016 Brent dated. $2020 (deflated using the Consumer Price Index for the US).
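Deflating a nominal price series with a consumer price index, as the notes above describe, amounts to rescaling each year's price by the ratio of the base-year CPI to that year's CPI. The sketch below illustrates the arithmetic with made-up prices and CPI values; they are not the BP or US CPI figures.

```python
def to_constant_dollars(nominal_price, cpi_year, cpi_base):
    """Express a nominal price in base-year dollars."""
    return nominal_price * cpi_base / cpi_year

# Made-up nominal prices (USD per barrel) and CPI index values.
nominal_prices = {1990: 20.0, 2005: 50.0, 2020: 40.0}
cpi = {1990: 50.0, 2005: 80.0, 2020: 100.0}
base_year = 2020

real_prices = {
    year: to_constant_dollars(price, cpi[year], cpi[base_year])
    for year, price in nominal_prices.items()
}

for year in sorted(real_prices):
    print(f"{year}: nominal {nominal_prices[year]:.2f} -> "
          f"{real_prices[year]:.2f} in {base_year} dollars")
```

The base year's own price is unchanged by construction, which is a quick sanity check when applying this to real data.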
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents median household incomes for various household sizes in Price County, WI, as reported by the U.S. Census Bureau. The dataset highlights the variation in median household income with the size of the family unit, offering valuable insights into economic trends and disparities within different household sizes, aiding in data analysis and decision-making.
Key observations
[Figure: Price County, WI median household income, by household size (in 2022 inflation-adjusted dollars)]
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Household Sizes:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Price County median household income.