100+ datasets found
  1. House Price Regression Dataset

    • kaggle.com
    zip
    Updated Sep 6, 2024
    Cite
    Prokshitha Polemoni (2024). House Price Regression Dataset [Dataset]. https://www.kaggle.com/datasets/prokshitha/home-value-insights
    Available download formats: zip (27045 bytes)
    Dataset updated
    Sep 6, 2024
    Authors
    Prokshitha Polemoni
    Description

    Home Value Insights: A Beginner's Regression Dataset

    This dataset is designed for beginners to practice regression problems, particularly in the context of predicting house prices. It contains 1000 rows, with each row representing a house and various attributes that influence its price. The dataset is well-suited for learning basic to intermediate-level regression modeling techniques.

    Features:

    1. Square_Footage: The size of the house in square feet. Larger homes typically have higher prices.
    2. Num_Bedrooms: The number of bedrooms in the house. More bedrooms generally increase the value of a home.
    3. Num_Bathrooms: The number of bathrooms in the house. Houses with more bathrooms are typically priced higher.
    4. Year_Built: The year the house was built. Older houses may be priced lower due to wear and tear.
    5. Lot_Size: The size of the lot the house is built on, measured in acres. Larger lots tend to add value to a property.
    6. Garage_Size: The number of cars that can fit in the garage. Houses with larger garages are usually more expensive.
    7. Neighborhood_Quality: A rating of the neighborhood’s quality on a scale of 1-10, where 10 indicates a high-quality neighborhood. Better neighborhoods usually command higher prices.
    8. House_Price (Target Variable): The price of the house, which is the dependent variable you aim to predict.

    Potential Uses:

    1. Beginner Regression Projects: This dataset can be used to practice building regression models such as Linear Regression, Decision Trees, or Random Forests. The target variable (house price) is continuous, making this an ideal problem for supervised learning techniques.

    2. Feature Engineering Practice: Learners can create new features by combining existing ones, such as the price per square foot or age of the house, providing an opportunity to experiment with feature transformations.

    3. Exploratory Data Analysis (EDA): You can explore how different features (e.g., square footage, number of bedrooms) correlate with the target variable, making it a great dataset for learning about data visualization and summary statistics.

    4. Model Evaluation: The dataset allows for various model evaluation techniques such as cross-validation, R-squared, and Mean Absolute Error (MAE). These metrics can be used to compare the effectiveness of different models.
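
    As a rough illustration of items 2 and 4 above, the sketch below derives two features (house age and price per square foot) and computes MAE and R-squared by hand. The sample rows, the reference year, the predictions, and the helper names are all invented for illustration; only the column names come from the feature list.

```python
from statistics import mean

# Hypothetical rows using the column names listed above; values are made up.
houses = [
    {"Square_Footage": 1500, "Year_Built": 1990, "House_Price": 300000.0},
    {"Square_Footage": 2500, "Year_Built": 2010, "House_Price": 550000.0},
]

REFERENCE_YEAR = 2024  # assumption: the year used to compute house age


def add_derived_features(row, reference_year=REFERENCE_YEAR):
    """Return a copy of row with House_Age and Price_Per_SqFt added."""
    out = dict(row)
    out["House_Age"] = reference_year - row["Year_Built"]
    # Price per square foot uses the target, so it suits EDA, not model input.
    out["Price_Per_SqFt"] = row["House_Price"] / row["Square_Footage"]
    return out


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """1 minus the ratio of residual to total sum of squares."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


enriched = [add_derived_features(h) for h in houses]
y_true = [h["House_Price"] for h in houses]
y_pred = [310000.0, 540000.0]  # hypothetical model predictions
mae = mean_absolute_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
```

    In practice a library such as scikit-learn would likely supply these metrics; writing them out simply makes the definitions visible.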

    Versatility:

    • The dataset is highly versatile for a range of machine learning tasks. You can apply simple linear models to predict house prices based on one or two features, or use more complex models like Random Forest or Gradient Boosting Machines to understand interactions between variables.

    • It can also be used for dimensionality reduction techniques like PCA or to practice handling categorical variables (e.g., neighborhood quality) through encoding techniques like one-hot encoding.

    • This dataset is ideal for anyone wanting to gain practical experience in building regression models while working with real-world features.
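
    The one-hot encoding mentioned above can be sketched in plain Python. This is a minimal illustration that treats Neighborhood_Quality as a categorical value on the 1-10 scale from the feature list; the function name and the example rating are hypothetical.

```python
def one_hot(value, categories):
    """Return a 0/1 list with a single 1 at the position of value."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


# Ratings 1..10, following the scale given in the feature description.
QUALITY_LEVELS = list(range(1, 11))

encoded = one_hot(7, QUALITY_LEVELS)  # hypothetical rating of 7
```

    Because the rating is ordered, keeping it as a single numeric column is also a reasonable choice; one-hot encoding is shown here only as practice.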

  2. House Price Prediction Dataset

    • kaggle.com
    zip
    Updated Sep 21, 2024
    Cite
    Zafar (2024). House Price Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/zafarali27/house-price-prediction-dataset
    Available download formats: zip (29372 bytes)
    Dataset updated
    Sep 21, 2024
    Authors
    Zafar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    House Price Prediction Dataset.

    The dataset contains 2000 rows of house-related data, representing various features that could influence house prices. Below, we discuss key aspects of the dataset, which include its structure, the choice of features, and potential use cases for analysis.

    1. Dataset Features

    The dataset is designed to capture essential attributes for predicting house prices, including:

    • Area: Square footage of the house, which is generally one of the most important predictors of price.
    • Bedrooms & Bathrooms: The number of rooms in a house significantly affects its value. Homes with more rooms tend to be priced higher.
    • Floors: The number of floors in a house could indicate a larger, more luxurious home, potentially raising its price.
    • Year Built: The age of the house can affect its condition and value. Newly built houses are generally more expensive than older ones.
    • Location: Houses in desirable locations such as downtown or urban areas tend to be priced higher than those in suburban or rural areas.
    • Condition: The current condition of the house is critical, as well-maintained houses (in 'Excellent' or 'Good' condition) will attract higher prices compared to houses in 'Fair' or 'Poor' condition.
    • Garage: Availability of a garage can increase the price due to added convenience and space.
    • Price: The target variable, representing the sale price of the house, used to train machine learning models to predict house prices based on the other features.

    2. Feature Distributions

    • Area Distribution: The area of the houses in the dataset ranges from 500 to 5000 square feet, which allows analysis across different types of homes, from smaller apartments to larger luxury houses.
    • Bedrooms and Bathrooms: The number of bedrooms varies from 1 to 5, and bathrooms from 1 to 4. This variance enables analysis of homes with different sizes and layouts.
    • Floors: Houses in the dataset have between 1 and 3 floors. This feature could be useful for identifying the influence of multi-level homes on house prices.
    • Year Built: The dataset contains houses built from 1900 to 2023, giving a wide range of house ages to analyze the effects of new vs. older construction.
    • Location: There is a mix of urban, suburban, downtown, and rural locations. Urban and downtown homes may command higher prices due to proximity to amenities.
    • Condition: Houses are labeled as 'Excellent', 'Good', 'Fair', or 'Poor'. This feature helps model the price differences based on the current state of the house.
    • Price Distribution: Prices range between $50,000 and $1,000,000, offering a broad spectrum of property values. This range makes the dataset appropriate for predicting a wide variety of housing prices, from affordable homes to luxury properties.

    3. Correlation Between Features

    A key area of interest is the relationship between various features and house price:

    • Area and Price: Typically, a strong positive correlation is expected between the size of the house (Area) and its price. Larger homes are likely to be more expensive.
    • Location and Price: Location is another major factor. Houses in urban or downtown areas may show a higher price on average compared to suburban and rural locations.
    • Condition and Price: The condition of the house should show a positive correlation with price. Houses in better condition should be priced higher, as they require less maintenance and repair.
    • Year Built and Price: Newer houses might command a higher price due to better construction standards, modern amenities, and less wear-and-tear, but some older homes in good condition may retain historical value.
    • Garage and Price: A house with a garage may be more expensive than one without, as it provides extra storage or parking space.

    4. Potential Use Cases

    The dataset is well-suited for various machine learning and data analysis applications, including:

    • House Price Prediction: Using regression techniques, this dataset can be used to build a model to predict house prices based on the available features.
    • Feature Importance Analysis: By using techniques such as feature importance ranking, data scientists can determine which features (e.g., location, area, or condition) have the greatest impact on house prices.
    • Clustering: Clustering techniques like k-means could help identify patterns in the data, such as grouping houses into segments based on their characteristics (e.g., luxury homes, affordable homes).
    • Market Segmentation: The dataset can be used to perform segmentation by location, price range, or house type to analyze trends in specific sub-markets, like luxury vs. affordable housing.
    • Time-Based Analysis: By studying how house prices vary with the year built or the age of the house, analysts can derive insights into the trends of older vs. newer homes.
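
    The segmentation-by-location use case can be sketched as a simple group-and-average step. The rows below are made up, and the exact column headers ("Location", "Price") are assumed from the feature names in the description rather than checked against the file.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; column names assumed from the feature description.
rows = [
    {"Location": "Downtown", "Price": 600000},
    {"Location": "Rural", "Price": 150000},
    {"Location": "Downtown", "Price": 800000},
    {"Location": "Rural", "Price": 250000},
]


def average_price_by(rows, key):
    """Group rows by the given column and return the mean Price per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["Price"])
    return {group: mean(prices) for group, prices in groups.items()}


avg_by_location = average_price_by(rows, "Location")
```

    The same helper could be pointed at another categorical column, such as Condition, to compare sub-markets.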

    5. Limitations and ...

  3. Average New House Price - Dataset - data.gov.ie

    • data.gov.ie
    Updated Sep 9, 2016
    Cite
    data.gov.ie (2016). Average New House Price - Dataset - data.gov.ie [Dataset]. https://data.gov.ie/dataset/average-new-house-price
    Dataset updated
    Sep 9, 2016
    Dataset provided by
    data.gov.ie
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Average house prices are derived from data supplied by the mortgage lending agencies on loans approved by them rather than loans paid. In comparing house price figures from one period to another, account should be taken of the fact that changes in the mix of houses (including apartments) will affect the average figures. The most current data is published on these sheets. Previously published data may be subject to revision. Any change from the originally published data will be highlighted by a comment on the cell in question. These comments will be maintained for at least a year after the date of the value change.

    Notes: Excluding apartments, measured in €. Figure changed on 27/6/16 as revised data was received from the local authority.

  4. HSQ06 - Average Price of Houses - Dataset - data.gov.ie

    • data.gov.ie
    Updated Jan 15, 2021
    Cite
    data.gov.ie (2021). HSQ06 - Average Price of Houses - Dataset - data.gov.ie [Dataset]. https://data.gov.ie/dataset/hsq06-average-price-of-houses
    Dataset updated
    Jan 15, 2021
    Dataset provided by
    data.gov.ie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Licensed under: Creative Commons Attribution 4.0

  5. Canada Open Government Working Group: High Value Datasets Criteria

    • open.canada.ca
    • data.wu.ac.at
    pdf
    Updated Nov 20, 2024
    Cite
    Treasury Board of Canada Secretariat (2024). Canada Open Government Working Group: High Value Datasets Criteria [Dataset]. https://open.canada.ca/data/en/dataset/e26db340-df16-4796-8b0b-55dacacfbcd5
    Available download formats: pdf
    Dataset updated
    Nov 20, 2024
    Dataset provided by
    Treasury Board of Canada Secretariat: http://www.tbs-sct.gc.ca/
    Treasury Board of Canada: https://www.canada.ca/en/treasury-board-secretariat/corporate/about-treasury-board.html
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Area covered
    Canada
    Description

    This report provides common criteria to help identify high value datasets and gives examples of common types of high value datasets. It was based on jurisdictional scans of high value dataset criteria, recent surveys, and international standards.

  6. US Cities Housing Market Data - Live Dataset

    • kaggle.com
    zip
    Updated Oct 12, 2025
    Cite
    Vincent Vaseghi (2025). US Cities Housing Market Data - Live Dataset [Dataset]. https://www.kaggle.com/datasets/vincentvaseghi/us-cities-housing-market-data
    Available download formats: zip (984945960 bytes)
    Dataset updated
    Oct 12, 2025
    Authors
    Vincent Vaseghi
    Area covered
    United States
    Description

    Redfin is a real estate brokerage and publishes the US housing market data on a regular basis. Using this dataset, you can analyze and visualize housing market data for US cities. Timeline: Starting from February 2012 until the present time (Data is refreshed and updated on a monthly basis)

    The dataset has the following columns:

    - period_begin
    - period_end
    - period_duration
    - region_type
    - region_type_id
    - table_id
    - is_seasonally_adjusted (indicates if prices are seasonally adjusted; f represents False)
    - region
    - city
    - state
    - state_code
    - property_type
    - property_type_id
    - median_sale_price
    - median_sale_price_mom (median sale price changes month over month)
    - median_sale_price_yoy (median sale price changes year over year)
    - median_list_price
    - median_list_price_mom (median list price changes month over month)
    - median_list_price_yoy (median list price changes year over year)
    - median_ppsf (median sale price per square foot)
    - median_ppsf_mom (median sale price per square foot changes month over month)
    - median_ppsf_yoy (median sale price per square foot changes year over year)
    - median_list_ppsf (median list price per square foot)
    - median_list_ppsf_mom (median list price per square foot changes month over month)
    - median_list_ppsf_yoy (median list price per square foot changes year over year)
    - homes_sold (number of homes sold)
    - homes_sold_mom (number of homes sold month over month)
    - homes_sold_yoy (number of homes sold year over year)
    - pending_sales
    - pending_sales_mom
    - pending_sales_yoy
    - new_listings
    - new_listings_mom
    - new_listings_yoy
    - inventory
    - inventory_mom
    - inventory_yoy
    - months_of_supply
    - months_of_supply_mom
    - months_of_supply_yoy
    - median_dom (median days on market until property is sold)
    - median_dom_mom (median days on market changes month over month)
    - median_dom_yoy (median days on market changes year over year)
    - avg_sale_to_list (average sale price to list price ratio)
    - avg_sale_to_list_mom (average sale price to list price ratio changes month over month)
    - avg_sale_to_list_yoy (average sale price to list price ratio changes year over year)
    - sold_above_list
    - sold_above_list_mom
    - sold_above_list_yoy
    - price_drops
    - price_drops_mom
    - price_drops_yoy
    - off_market_in_two_weeks (number of properties that will be taken off the market within 2 weeks)
    - off_market_in_two_weeks_mom (changes in number of properties that will be taken off the market within 2 weeks, month over month)
    - off_market_in_two_weeks_yoy (changes in number of properties that will be taken off the market within 2 weeks, year over year)
    - parent_metro_region
    - parent_metro_region_metro_code
    - last_updated

    Filetype: gzip (gz) Support for gzip files in Python: https://docs.python.org/3/library/gzip.html
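
    Reading a gzip-compressed delimited file with the standard library might look like the sketch below. The file name, the tab delimiter, and the two sample columns written by the demo are assumptions for illustration; the real file's name and delimiter should be checked after download.

```python
import csv
import gzip
import os
import tempfile


def read_gzip_table(path, delimiter="\t"):
    """Read a gzip-compressed delimited text file into a list of dicts."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


# Demo on a tiny made-up file so the sketch runs without the real download.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample.tsv.gz")  # hypothetical name
    with gzip.open(sample_path, mode="wt", encoding="utf-8", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["city", "median_sale_price"])
        writer.writerow(["Austin", "450000"])
    records = read_gzip_table(sample_path)
```

    Given the size of the full download (close to 1 GB compressed), iterating over the reader row by row instead of building a list would probably be the safer choice.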

    Data Source & Credit: Redfin.com

  7. Consumer Price Index, All items - United States

    • macro-rankings.com
    csv, excel
    Updated Jun 25, 2024
    Cite
    macro-rankings (2024). Consumer Price Index, All items - United States [Dataset]. https://www.macro-rankings.com/united-states/consumer-price-index-all-items
    Available download formats: csv, excel
    Dataset updated
    Jun 25, 2024
    Dataset authored and provided by
    macro-rankings
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Time series data for the statistic Consumer Price Index, All items and country United States. Indicator Definition: Consumer Price Index, All items. The indicator "Consumer Price Index, All items" stands at 148.58 as of 8/31/2025, the highest value at least since 2/28/1990, the period currently displayed. Regarding the one-year change of the series, the current value constitutes an increase of 2.92 percent compared to the value the year prior. The 1 year change in percent is 2.92. The 3 year change in percent is 9.39. The 5 year change in percent is 24.65. The 10 year change in percent is 35.94. The series' long term average value is 95.94. Its latest available value, on 8/31/2025, is 54.87 percent higher compared to its long term average value. The series' change in percent from its minimum value, on 1/31/1990, to its latest available value, on 8/31/2025, is +154.30%. The series' change in percent from its maximum value, on 8/31/2025, to its latest available value, on 8/31/2025, is 0.0%.
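
    The percentage figures in this description follow the usual relative-change formula. As a small check, the sketch below reproduces the "54.87 percent higher" figure from the two values quoted above (148.58 and 95.94); the function name is hypothetical.

```python
def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


latest = 148.58            # latest index value quoted in the description
long_term_average = 95.94  # long term average quoted in the description

above_average = round(percent_change(long_term_average, latest), 2)
```

    The same function applies to the other entries' one-, three-, five-, and ten-year changes, given the corresponding earlier values, which the listing does not show.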

  8. Average value of credit operations by size of the borrower - microenterprise...

    • opendata.bcb.gov.br
    Updated Jan 25, 2018
    + more versions
    Cite
    bcb.gov.br (2018). Average value of credit operations by size of the borrower - microenterprise - Dataset - Banco Central do Brasil Open Data Portal [Dataset]. https://opendata.bcb.gov.br/dataset/25715-average-value-of-credit-operations-by-size-of-the-borrower---microenterprise
    Dataset updated
    Jan 25, 2018
    Dataset provided by
    Central Bank of Brazil: http://www.bc.gov.br/
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    Concept: Average value of credit operations by size of the borrower - microenterprise
    Source: Credit Information System

  9. Women's_Shoes_Prices

    • kaggle.com
    zip
    Updated Feb 12, 2022
    Cite
    Nguyen Ngoc Phung (2022). Women's_Shoes_Prices [Dataset]. https://www.kaggle.com/datasets/nguyenngocphung/womens-shoe-prices
    Available download formats: zip (1867135 bytes)
    Dataset updated
    Feb 12, 2022
    Authors
    Nguyen Ngoc Phung
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    UPVOTE PLS !!!! 🐋

    About This Data 🐕

    This is a list of 10,000 women's shoes and their product information provided by Datafiniti's Product Database.

    The dataset includes shoe name, brand, price, and more. Each shoe will have an entry for each price found for it and some shoes may have multiple entries.

    Note that this is a sample of a large dataset. The full dataset is available through Datafiniti.

    What You Can Do with This Data 🦝

    You can use this data to determine brand markups, pricing strategies, and trends for luxury shoes. E.g.:

    • What is the average price of each distinct brand listed?
    • Which brands have the highest prices?
    • Which ones have the widest distribution of prices?
    • Is there a typical price distribution (e.g., normal) across brands or within specific brands?

    Further processing of the data would also let you:

    • Correlate specific product features with changes in price.
    • Cross-reference this data with a sample of our Men's Shoe Prices to see if there are any differences between women's brands and men's brands.

    Data Schema 🐆

    A full schema for the data is available here

  10. Average value of credit operations - individual microentrepreneur (MEI) -...

    • opendata.bcb.gov.br
    Updated Jan 25, 2018
    Cite
    bcb.gov.br (2018). Average value of credit operations - individual microentrepreneur (MEI) - Dataset - Banco Central do Brasil Open Data Portal [Dataset]. https://opendata.bcb.gov.br/dataset/27311-average-value-of-credit-operations---individual-microentrepreneur-mei
    Dataset updated
    Jan 25, 2018
    Dataset provided by
    Central Bank of Brazil: http://www.bc.gov.br/
    License

    Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    Concept: Average value of credit operations - individual microentrepreneur (MEI)
    Source: Central Bank of Brazil - Department of Financial Education

  11. Consumer Price Index, All items - Denmark

    • macro-rankings.com
    csv, excel
    Updated Nov 13, 2025
    Cite
    macro-rankings (2025). Consumer Price Index, All items - Denmark [Dataset]. https://www.macro-rankings.com/denmark/consumer-price-index-all-items
    Available download formats: excel, csv
    Dataset updated
    Nov 13, 2025
    Dataset authored and provided by
    macro-rankings
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Denmark
    Description

    Time series data for the statistic Consumer Price Index, All items and country Denmark. Indicator Definition: Consumer Price Index, All items. The indicator "Consumer Price Index, All items" stands at 121.70 as of 8/31/2025. Regarding the one-year change of the series, the current value constitutes an increase of 2.01 percent compared to the value the year prior. The 1 year change in percent is 2.01. The 3 year change in percent is 5.92. The 5 year change in percent is 17.47. The 10 year change in percent is 21.70. The series' long term average value is 88.15. Its latest available value, on 8/31/2025, is 38.05 percent higher compared to its long term average value. The series' change in percent from its minimum value, on 1/31/1990, to its latest available value, on 8/31/2025, is +100.08%. The series' change in percent from its maximum value, on 7/31/2025, to its latest available value, on 8/31/2025, is -0.653%.

  12. United States New Home Average Sales Price

    • tradingeconomics.com
    • it.tradingeconomics.com
    • +13 more
    csv, excel, json, xml
    Updated Oct 16, 2025
    Cite
    TRADING ECONOMICS (2025). United States New Home Average Sales Price [Dataset]. https://tradingeconomics.com/united-states/average-house-prices
    Available download formats: excel, csv, xml, json
    Dataset updated
    Oct 16, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1975 - Aug 31, 2025
    Area covered
    United States
    Description

    Average House Prices in the United States increased to 534100 USD in August from 478200 USD in July of 2025. This dataset includes a chart with historical data for the United States New Home Average Sales Price.

  13. Average House Price - Dataset - York Open Data

    • data.yorkopendata.org
    • ckan.york.staging.datopian.com
    Updated Feb 4, 2016
    Cite
    (2016). Average House Price - Dataset - York Open Data [Dataset]. https://data.yorkopendata.org/dataset/kpi-cjge121a
    Dataset updated
    Feb 4, 2016
    License

    Open Government Licence 2.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/2/
    License information was derived automatically

    Area covered
    York
    Description

    Average House Price

  14. Average Price Per Gram of Usable Cannabis

    • catalog.data.gov
    • data.ct.gov
    • +3 more
    Updated Oct 11, 2025
    Cite
    data.ct.gov (2025). Average Price Per Gram of Usable Cannabis [Dataset]. https://catalog.data.gov/dataset/average-price-per-gram-of-usable-cannabis
    Dataset updated
    Oct 11, 2025
    Dataset provided by
    data.ct.gov
    Description

    This data set contains preliminary monthly sales data for the average price per gram of usable cannabis in both the adult-use cannabis and medical marijuana markets. For the purposes of this dataset, "usable cannabis" includes raw flower in whole, ground, or pre-rolled form, without additional extracted materials. The data reported is compiled at specific points in time and only captures data current at the time the report is generated. Data values may be updated and change over time as updates occur.

  15. Consumer Price Index, All items - Trinidad and Tobago

    • macro-rankings.com
    csv, excel
    Updated Jan 31, 2000
    Cite
    macro-rankings (2000). Consumer Price Index, All items - Trinidad and Tobago [Dataset]. https://www.macro-rankings.com/trinidad-and-tobago/consumer-price-index-all-items
    Available download formats: excel, csv
    Dataset updated
    Jan 31, 2000
    Dataset authored and provided by
    macro-rankings
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Trinidad and Tobago
    Description

    Time series data for the statistic Consumer Price Index, All items and country Trinidad and Tobago. Indicator Definition: Consumer Price Index, All items. The indicator "Consumer Price Index, All items" stands at 125.70 as of 7/31/2025, the highest value at least since 2/29/2000, the period currently displayed. Regarding the one-year change of the series, the current value constitutes an increase of 1.45 percent compared to the value the year prior. The 1 year change in percent is 1.45. The 3 year change in percent is 6.53. The 5 year change in percent is 15.32. The 10 year change in percent is 23.57. The series' long term average value is 82.93. Its latest available value, on 7/31/2025, is 51.57 percent higher compared to its long term average value. The series' change in percent from its minimum value, on 3/31/2000, to its latest available value, on 7/31/2025, is +231.26%. The series' change in percent from its maximum value, on 7/31/2025, to its latest available value, on 7/31/2025, is 0.0%.

  16. Books Sales and Ratings

    • kaggle.com
    zip
    Updated Dec 6, 2023
    Cite
    The Devastator (2023). Books Sales and Ratings [Dataset]. https://www.kaggle.com/datasets/thedevastator/books-sales-and-ratings
    Available download formats: zip (54505 bytes)
    Dataset updated
    Dec 6, 2023
    Authors
    The Devastator
    Description

    Books Sales and Ratings

    Books Dataset: Analyzing Sales, Ratings, and Genres

    By Josh Murrey [source]

    About this dataset

    The Books Dataset: Sales, Ratings, and Publication provides comprehensive information on various aspects of books, including their publishing year, author details, ratings given by readers, sales performance data, and genre classification. The dataset consists of several key columns that capture important attributes related to each book.

    The Publishing Year column indicates the year in which each book was published. This information helps in understanding the chronological distribution of books in the dataset.

    The Book Name column contains the titles of the books. Each book has a unique name that distinguishes it from others in the dataset.

    The Author column specifies the name(s) of the author(s) responsible for creating each book. This information is crucial for understanding different authors' contributions and analyzing their impact on sales and ratings.

    The language_code column represents a specific code assigned to indicate the language in which each book is written. This code serves as a reference point for language-based analysis within the dataset.

    Each author's rating is captured in the Author_Rating column. This rating is based on their previous works and serves as an indicator of their reputation or acclaim among readers.

    The average rating given by readers for each book is recorded in the Book_average_rating column. This value reflects how well-received a particular book is by its audience.

    The number of ratings given to each book by readers can be found in the Book_ratings_count column. This metric helps gauge reader engagement and provides insights into popular or widely-discussed books within this dataset.

    Books are classified into different genres or categories which are mentioned under the genre column. Genre classification allows for analyzing trends across specific literary genres or identifying patterns related to certain types of books.

    Sales-related data includes both gross sales revenue (gross sales) generated by each book and publisher revenue (publisher revenue) earned from these sales transactions. These numeric values provide insights into financial performance aspects associated with the book market.

    The sale price column denotes the specific price at which each book is sold. This information helps evaluate pricing strategies and their potential impact on sales figures.

    Sales performance is further quantified through the sales rank column, which assigns a numerical rank to each book based on its sales performance. This ranking system aids in identifying high-performing books within the dataset.

    Lastly, the units sold column captures the number of units of each book that have been sold. This data highlights popular books based on reader demand and serves as a crucial measure of commercial success within the dataset.

    Overall, this expansive and comprehensive Books Dataset

    How to use the dataset

    Introduction:

    • Getting Familiar with the Columns: The dataset contains multiple columns that provide different kinds of information:

    • Book Name: The title of each book.

    • Author: The name of the author who wrote the book.

    • language_code: The code representing the language in which the book is written.

    • Author_Rating: The rating assigned to the author based on their previous works.

    • Book_average_rating: The average rating given to the book by readers.

    • Book_ratings_count: The number of ratings given to the book by readers.

    • genre: The genre or category to which the book belongs.

    • gross sales: The total sales revenue generated by each book.

    • publisher revenue: The revenue earned by publishers from selling each book.

    • sale price: The price at which each copy of a book is sold.

    • sales rank: A numeric value indicating a book's rank based on its sales performance in comparison to other books within its category (genre).

    • units sold: Total number of copies sold for each specific title.

    • Understanding Numeric and Textual Data: Numeric columns in this dataset include Publishing Year, Author_Rating, Book_average_rating, Book_ratings_count, gross sales, publisher revenue, sale price, sales rank, and units sold; these provide quantitative insights that can be used for statistical analysis and comparisons.

    Additionally, the columns 'Author', 'Book Name', and 'genre' contain textual data that provides descriptive elements such as authors' names and genre categories.

    • Exploring Relationships Between Data Points: By combining different co...
  17. Nominal Residential Property Price Index Quarterly - Finland

    • macro-rankings.com
    csv, excel
    Updated Jun 13, 2025
    Cite
    macro-rankings (2025). Nominal Residential Property Price Index Quarterly - Finland [Dataset]. https://www.macro-rankings.com/finland/nominal-residential-property-price-index-quarterly
    Explore at:
    Available download formats: csv, excel
    Dataset updated
    Jun 13, 2025
    Dataset authored and provided by
    macro-rankings
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Finland
    Description

    Time series data for the statistic Nominal Residential Property Price Index Quarterly and country Finland. Indicator definition: Nominal Residential Property Price Index Quarterly. The indicator "Nominal Residential Property Price Index Quarterly" stands at 107.03 as of 06/30/2025. Regarding the one-year change of the series, the current value constitutes a decrease of 1.37 percent compared to the value the year prior. The 1-year change in percent is -1.37. The 3-year change in percent is -11.45. The 5-year change in percent is -4.65. The 10-year change in percent is 0.0996. The series' long-term average value is 61.26. Its latest available value, on 06/30/2025, is 74.70 percent higher than its long-term average value. The series' change in percent from its minimum value, on 03/31/1970, to its latest available value, on 06/30/2025, is +1,486.88%. The series' change in percent from its maximum value, on 06/30/2022, to its latest available value, on 06/30/2025, is -11.45%.
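The "percent higher than the long-term average" figure quoted above is an ordinary percent change. The sketch below recomputes it from the two rounded values given in the description; the function name `percent_change` is hypothetical and not part of the dataset or the macro-rankings site. Because the inputs are already rounded, the result agrees with the published 74.70 only to within a few hundredths of a percent.

```python
def percent_change(old: float, new: float) -> float:
    """Return the change from `old` to `new` as a percentage of `old`."""
    if old == 0:
        raise ValueError("percent change is undefined when the base value is 0")
    return (new - old) / old * 100.0


# Figures quoted in the description (already rounded by the source).
latest_value = 107.03       # index level on 06/30/2025
long_term_average = 61.26   # long-term average of the series

above_average = percent_change(long_term_average, latest_value)
print(f"{above_average:.2f}% above the long-term average")
```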

  18. Consumer Price Index, All items - Poland

    • macro-rankings.com
    csv, excel
    Updated Nov 11, 2025
    Cite
    macro-rankings (2025). Consumer Price Index, All items - Poland [Dataset]. https://www.macro-rankings.com/poland/consumer-price-index-all-items
    Explore at:
    Available download formats: csv, excel
    Dataset updated
    Nov 11, 2025
    Dataset authored and provided by
    macro-rankings
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Poland
    Description

    Time series data for the statistic Consumer Price Index, All items and country Poland. Indicator definition: Consumer Price Index, All items. The indicator "Consumer Price Index, All items" stands at 194.10 as of 7/31/2025, the highest value at least since 2/28/1990, the period currently displayed. Regarding the one-year change of the series, the current value constitutes an increase of 3.19 percent compared to the value the year prior. The 1-year change in percent is 3.19. The 3-year change in percent is 19.37. The 5-year change in percent is 44.74. The 10-year change in percent is 57.42. The series' long-term average value is 99.84. Its latest available value, on 7/31/2025, is 94.41 percent higher than its long-term average value. The series' change in percent from its minimum value, on 1/31/1990, to its latest available value, on 7/31/2025, is +3,582.72%. The series' change in percent from its maximum value, on 7/31/2025, to its latest available value, on 7/31/2025, is 0.0%.
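The description reports percent changes but not the earlier index levels themselves. Assuming each change is measured relative to the earlier value, those levels can be backed out by inverting the percent-change formula. This is a rough sketch only: `implied_base_value` is a hypothetical helper, and the results are approximations because the published figures are rounded.

```python
def implied_base_value(latest: float, change_percent: float) -> float:
    """Invert latest = base * (1 + change_percent / 100) to recover base."""
    factor = 1.0 + change_percent / 100.0
    if factor <= 0:
        raise ValueError("a change of -100% or less has no positive base value")
    return latest / factor


latest_cpi = 194.10  # index level on 7/31/2025, as quoted in the description

# Approximate level one year earlier, from the +3.19% one-year change.
one_year_ago = implied_base_value(latest_cpi, 3.19)

# Approximate series minimum (1/31/1990), from the +3,582.72% change.
series_minimum = implied_base_value(latest_cpi, 3582.72)

print(f"one year ago: {one_year_ago:.2f}, series minimum: {series_minimum:.2f}")
```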

  19. Dataset - Dual-mode room temperature self-calibrating photodiodes...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Nov 28, 2022
    Cite
    Marit Ulset; Eivind Bardalen; Julian Gieseler (2022). Dataset - Dual-mode room temperature self-calibrating photodiodes approaching cryogenic radiometer uncertainty [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7360534
    Explore at:
    Dataset updated
    Nov 28, 2022
    Dataset provided by
    Justervesenet, Kjeller, Norway
    Physikalisch-Technische Bundesanstalt, Berlin, Germany
    University of South-Eastern Norway, Borre, Norway
    Authors
    Marit Ulset; Eivind Bardalen; Julian Gieseler
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This page contains selected data from the publication "Dual-mode room temperature self-calibrating photodiodes approaching cryogenic radiometer uncertainty", Marit S Ulset et al 2022 Metrologia 59 035008, DOI 10.1088/1681-7575/ac6a94.

    Description of files:

    Fig5.txt: Data for Fig 5a. Plotted values of non-equivalence as a function of beam position on the photodiode.

    Fig6.txt: Non-equivalence (gamma) in parts per million (ppm) and responsivity in mK/mW as a function of power level P in mW.

    Fig7.txt: Spectral directional emissivity determined under 10° with respect to the sample surface normal for Wafer P7 at 20°C. Uncertainty is given as standard uncertainty (k=1).

    Fig14.txt: Calculated time constants for the four different steps in a thermal heating cycle (electrical low, optical, electrical high, optical). The average value in the published paper contains an error, as one dataset was used twice. The file also shows the correct average value, obtained when each dataset is used only once. A corrigendum was submitted to Metrologia, but it was considered not necessary and hence was not published.

    Fig15.txt: Plotted values for apparent IQD in parts per million (ppm) for three different calculation algorithms. Uncertainty is given as propagated type A standard uncertainty.

    Fig16.txt: Plotted values for measured IQD in parts per million (ppm) as a function of absorbed optical power in µW, for two different measurement methods - OC and FB method.

  20. Median house prices for administrative geographies: HPSSA dataset 9

    • ons.gov.uk
    • cy.ons.gov.uk
    xls
    Updated Sep 20, 2023
    + more versions
    Cite
    Office for National Statistics (2023). Median house prices for administrative geographies: HPSSA dataset 9 [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/housing/datasets/medianhousepricefornationalandsubnationalgeographiesquarterlyrollingyearhpssadataset09
    Explore at:
    Available download formats: xls
    Dataset updated
    Sep 20, 2023
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Median price paid for residential property in England and Wales, by property type and administrative geographies. Annual data.

Cite
Prokshitha Polemoni (2024). House Price Regression Dataset [Dataset]. https://www.kaggle.com/datasets/prokshitha/home-value-insights

House Price Regression Dataset

Dataset Description: Home Value Insights

Explore at:
385 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (27045 bytes)
Dataset updated
Sep 6, 2024
Authors
Prokshitha Polemoni
Description

Home Value Insights: A Beginner's Regression Dataset

This dataset is designed for beginners to practice regression problems, particularly in the context of predicting house prices. It contains 1000 rows, with each row representing a house and various attributes that influence its price. The dataset is well-suited for learning basic to intermediate-level regression modeling techniques.

Features:

  1. Square_Footage: The size of the house in square feet. Larger homes typically have higher prices.
  2. Num_Bedrooms: The number of bedrooms in the house. More bedrooms generally increase the value of a home.
  3. Num_Bathrooms: The number of bathrooms in the house. Houses with more bathrooms are typically priced higher.
  4. Year_Built: The year the house was built. Older houses may be priced lower due to wear and tear.
  5. Lot_Size: The size of the lot the house is built on, measured in acres. Larger lots tend to add value to a property.
  6. Garage_Size: The number of cars that can fit in the garage. Houses with larger garages are usually more expensive.
  7. Neighborhood_Quality: A rating of the neighborhood’s quality on a scale of 1-10, where 10 indicates a high-quality neighborhood. Better neighborhoods usually command higher prices.
  8. House_Price (Target Variable): The price of the house, which is the dependent variable you aim to predict.

Potential Uses:

  1. Beginner Regression Projects: This dataset can be used to practice building regression models such as Linear Regression, Decision Trees, or Random Forests. The target variable (house price) is continuous, making this an ideal problem for supervised learning techniques.

  2. Feature Engineering Practice: Learners can create new features by combining existing ones, such as the price per square foot or age of the house, providing an opportunity to experiment with feature transformations.

  3. Exploratory Data Analysis (EDA): You can explore how different features (e.g., square footage, number of bedrooms) correlate with the target variable, making it a great dataset for learning about data visualization and summary statistics.

  4. Model Evaluation: The dataset allows for various model evaluation techniques such as cross-validation, R-squared, and Mean Absolute Error (MAE). These metrics can be used to compare the effectiveness of different models.
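The feature-engineering and evaluation ideas in the list above can be sketched in plain Python. Everything below is illustrative: the three rows and the predictions are made-up values, not taken from the dataset, and the helper names (`add_derived_features`, `mean_absolute_error`, `r_squared`) and derived column names are hypothetical. Note that price per square foot is computed from the target itself, so it suits exploration but should not be used as an input when predicting House_Price.

```python
from statistics import mean


def add_derived_features(house: dict, reference_year: int) -> dict:
    """Return a copy of `house` with two derived columns added."""
    enriched = dict(house)
    enriched["Price_Per_SqFt"] = house["House_Price"] / house["Square_Footage"]
    enriched["House_Age"] = reference_year - house["Year_Built"]
    return enriched


def mean_absolute_error(actual: list, predicted: list) -> float:
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def r_squared(actual: list, predicted: list) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up rows using the dataset's column names (values are illustrative).
houses = [
    {"Square_Footage": 1500, "Year_Built": 1990, "House_Price": 300000.0},
    {"Square_Footage": 2000, "Year_Built": 2005, "House_Price": 450000.0},
    {"Square_Footage": 2500, "Year_Built": 2015, "House_Price": 600000.0},
]
enriched = [add_derived_features(h, reference_year=2024) for h in houses]

# Made-up predictions standing in for the output of some regression model.
actual = [h["House_Price"] for h in houses]
predicted = [310000.0, 440000.0, 590000.0]

mae = mean_absolute_error(actual, predicted)
r2 = r_squared(actual, predicted)
print(f"MAE: {mae:.0f}, R-squared: {r2:.4f}")
```

In practice these metrics would be computed on held-out rows (for example within each cross-validation fold) rather than on the rows used to fit the model.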

Versatility:

  • The dataset is highly versatile for a range of machine learning tasks. You can apply simple linear models to predict house prices based on one or two features, or use more complex models like Random Forest or Gradient Boosting Machines to understand interactions between variables.

  • It can also be used for dimensionality reduction techniques like PCA or to practice handling categorical variables (e.g., neighborhood quality) through encoding techniques like one-hot encoding.

  • This dataset is ideal for anyone wanting to gain practical experience in building regression models while working with real-world features.
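As a minimal sketch of the one-hot encoding mentioned above, the snippet below turns a Neighborhood_Quality rating into a vector of ten indicator values. The `one_hot` helper is hypothetical. Since the rating is an ordered 1-10 scale, one-hot encoding discards that ordering; it is shown here only as encoding practice, and keeping the column numeric is an equally reasonable choice.

```python
def one_hot(value: int, categories: list) -> list:
    """Return a 0/1 list with a single 1 at the position of `value`."""
    if value not in categories:
        raise ValueError(f"{value!r} is not one of the known categories")
    return [1 if value == category else 0 for category in categories]


# Neighborhood_Quality is described as a rating on a scale of 1-10.
quality_levels = list(range(1, 11))

encoded = one_hot(7, quality_levels)
print(encoded)
```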
