4 datasets found
  1. Wine_Test Prediction | 1600 data | yashaswi

    • kaggle.com
    Updated May 19, 2025
    Cite
    Ayushman Yashaswi (2025). Wine_Test Prediction | 1600 data | yashaswi [Dataset]. https://www.kaggle.com/datasets/ayushmanyashaswi/wine-test-prediction-1600-data-yashaswi
    Dataset updated
    May 19, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ayushman Yashaswi
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    📊 Wine Quality - Red Wine Dataset

    This dataset contains physicochemical attributes of red variants of the Portuguese "Vinho Verde" wine, along with a quality score for each sample (rated from 0 to 10). The goal is to predict wine quality from these chemical properties using various classification models.

    🧪 Features Overview (12 columns):

    • fixed acidity: most acids involved with wine are fixed/nonvolatile
    • volatile acidity: amount of acetic acid (can affect taste)
    • citric acid: adds freshness and flavor
    • residual sugar: sugar left after fermentation
    • chlorides: salt content
    • free sulfur dioxide: protects wine from microbes
    • total sulfur dioxide: total SO₂ content
    • density: wine density
    • pH: acidity level
    • sulphates: preservative and antimicrobial
    • alcohol: alcohol percentage
    • quality (target): wine quality score (0–10)

    🤖 Model Performance Summary:

    Multiple machine learning models were trained to predict wine quality. The following accuracy scores were observed:

    Model                         Training Accuracy  Testing Accuracy
    Logistic Regression           87.91%             87.0%
    Random Forest                 100%               94.0%
    Decision Tree                 100%               88.5%
    Support Vector Machine (SVM)  86.41%             86.5%

    📈 Data Visualization:

    A comparison plot of model performance was created to visually represent the accuracy of each algorithm. This helps in understanding which models generalized well and which ones may have overfit to the training data.
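
The train/test gap this paragraph refers to can be computed directly from the accuracies reported in the Model Performance Summary. The following is a minimal sketch in Python. The 5-point threshold used to flag possible overfitting is an arbitrary assumption for illustration and is not stated by the dataset author, and the helper names are hypothetical.

```python
# Accuracies (percent) as reported in the Model Performance Summary:
# model name -> (training accuracy, testing accuracy).
reported = {
    "Logistic Regression": (87.91, 87.0),
    "Random Forest": (100.0, 94.0),
    "Decision Tree": (100.0, 88.5),
    "Support Vector Machine (SVM)": (86.41, 86.5),
}

# Assumption: a gap above 5 percentage points is treated as a sign of
# possible overfitting. The threshold is illustrative only.
GAP_THRESHOLD = 5.0


def generalization_gaps(scores):
    """Return {model: training accuracy minus testing accuracy}."""
    return {name: round(train - test, 2) for name, (train, test) in scores.items()}


def possibly_overfit(scores, threshold=GAP_THRESHOLD):
    """Return the models whose train/test gap exceeds the threshold."""
    gaps = generalization_gaps(scores)
    return sorted(name for name, gap in gaps.items() if gap > threshold)


gaps = generalization_gaps(reported)
for name, gap in gaps.items():
    print(f"{name}: gap = {gap:+.2f} points")
print("Possibly overfit:", possibly_overfit(reported))
```

Under this assumed threshold the two tree-based models are flagged, although Random Forest still has the highest reported testing accuracy, so the gap alone should not be used to rank models.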

    πŸ“ File Info:

    • Filename: winequality-red.csv
    • Size: ~100 KB
    • Rows: 1,599
    • Columns: 12
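
A quick sanity check of the file against the stated shape could look like the sketch below. It uses only the Python standard library. The function name `read_wine_csv` is hypothetical, the comma delimiter is an assumption (some copies of this file are semicolon-separated), and the two sample rows are made up for illustration rather than copied from the file.

```python
import csv
import io

# Column names as listed in the Features Overview.
EXPECTED_COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol", "quality",
]


def read_wine_csv(handle, delimiter=","):
    """Parse a wine-quality CSV into a list of {column: float} rows.

    Assumption: the delimiter is a comma. Pass delimiter=";" if the
    header check fails on a semicolon-separated copy.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    header = [name.strip() for name in (reader.fieldnames or [])]
    if header != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {header}")
    return [{key.strip(): float(value) for key, value in row.items()} for row in reader]


# Illustrative two-row sample (made-up values, not copied from the file).
sample = io.StringIO(
    ",".join(EXPECTED_COLUMNS) + "\n"
    "7.0,0.5,0.1,2.0,0.08,10,30,0.997,3.3,0.6,9.5,5\n"
    "8.0,0.3,0.4,2.2,0.07,15,40,0.996,3.2,0.7,11.0,6\n"
)
rows = read_wine_csv(sample)
print(len(rows), "rows,", len(rows[0]), "columns")
```

With the real file, the same function can be called on `open("winequality-red.csv", newline="")`; according to the File Info above, it should then yield 1,599 rows of 12 columns each.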

    📌 Ideal For:

    • Classification model evaluation
    • Feature correlation analysis
    • EDA and visualization
    • ML model tuning and comparison
  2. White Wine Quality

    • kaggle.com
    Updated Sep 28, 2020
    Cite
    Piyush Agnihotri (2020). White Wine Quality [Dataset]. https://www.kaggle.com/datasets/piyushagni5/white-wine-quality/data
    Dataset updated
    Sep 28, 2020
    Dataset provided by
    Kaggle
    Authors
    Piyush Agnihotri
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, refer to [Cortez et al., 2009]. Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).

    These datasets can be viewed as classification or regression tasks. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent or poor wines. Also, we are not sure if all input variables are relevant. So it could be interesting to test feature selection methods.
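
Because the classes are ordered and unbalanced, it helps to look at the class distribution and at the accuracy of a trivial majority-class predictor before reading any model score. The sketch below shows one way to do that in Python. The helper names are hypothetical and the list of quality scores is made up for illustration; it is not taken from the dataset.

```python
from collections import Counter


def class_distribution(scores):
    """Return {quality score: share of samples}, ordered by score."""
    counts = Counter(scores)
    total = len(scores)
    return {score: counts[score] / total for score in sorted(counts)}


def majority_baseline_accuracy(scores):
    """Accuracy of always predicting the most common quality score."""
    counts = Counter(scores)
    return counts.most_common(1)[0][1] / len(scores)


# Made-up quality scores for illustration only; not taken from the dataset.
quality = [5, 6, 6, 5, 6, 7, 6, 5, 6, 4, 6, 8, 6, 5, 7, 6, 6, 5, 6, 7]

for score, share in class_distribution(quality).items():
    print(f"quality {score}: {share:.0%}")
print(f"majority-class baseline accuracy: {majority_baseline_accuracy(quality):.0%}")
```

A classifier is only informative to the extent that it beats this baseline, which is why per-class metrics or outlier detection may be more telling than raw accuracy for the rare excellent or poor wines.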

    Content

    For more information, read [Cortez et al., 2009].

    Input variables (based on physicochemical tests):
    1 - fixed acidity
    2 - volatile acidity
    3 - citric acid
    4 - residual sugar
    5 - chlorides
    6 - free sulfur dioxide
    7 - total sulfur dioxide
    8 - density
    9 - pH
    10 - sulphates
    11 - alcohol

    Output variable (based on sensory data):
    12 - quality (score between 0 and 10)

    Acknowledgements

    This dataset is also available from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/wine+quality. To get both datasets (the red and white Vinho Verde wine samples, from the north of Portugal), please visit that link.

    Please include this citation if you plan to use this database:

    P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.

    Inspiration

    We Kagglers can apply several machine-learning algorithms to determine which physicochemical properties make a wine 'good'!

  3. feature-factory-datasets

    • huggingface.co
    Cite
    Hassan Abedi, feature-factory-datasets [Dataset]. https://huggingface.co/datasets/habedi/feature-factory-datasets
    Authors
    Hassan Abedi
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Tabular Datasets

    These datasets are used in the Feature Factory project.

    Index  Dataset Name                     File Name                     Data Type           Records (Approx.)  Format   Source
    1      Wine Quality (Red Wine)          winequality-red.csv           Tabular             1,599              CSV      Link
    2      NYC Yellow Taxi Trip (Jan 2019)  yellow_tripdata_2019.parquet  Taxi Trip Data      ~7M                Parquet  Link
    3      NYC Green Taxi Trip (Jan 2019)   green_tripdata_2019.parquet   Taxi Trip Data      ~1M                Parquet  Link
    4      California Housing Prices        california_housing.csv        Real Estate Prices…

    See the full description on the dataset page: https://huggingface.co/datasets/habedi/feature-factory-datasets.

  4. White Wine Quality

    • kaggle.com
    Updated Sep 10, 2022
    Cite
    Gaurav Dutta (2022). White Wine Quality [Dataset]. https://www.kaggle.com/datasets/gauravduttakiit/white-wine-quality
    Dataset updated
    Sep 10, 2022
    Dataset provided by
    Kaggle
    Authors
    Gaurav Dutta
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    About Wine

    Wine is an alcoholic drink typically made from fermented grapes. Yeast consumes the sugar in the grapes and converts it to ethanol, carbon dioxide, and heat.

    White wine is primarily made with white grapes, and the skins are separated from the juice before the fermentation process. Red wine is made with darker red or black grapes, and the skins remain on the grapes during the fermentation process.

    Objective

    “Wine is bottled poetry.” The wine connoisseurs at a wine factory in Portugal are debating the quality of red and white wines. They decided to turn to data science for help and hired you as a data scientist, since you are the best data scientist in the world. Can you help them out?

    Data Description

    Input variables (based on physicochemical tests):

    • fixed acidity
    • volatile acidity
    • citric acid
    • residual sugar
    • chlorides
    • free sulfur dioxide
    • total sulfur dioxide
    • density
    • pH
    • sulphates
    • alcohol

    Output variable (based on sensory data):

    • quality (score between 0 and 10)

Wine_Test Prediction | 1600 data | yashaswi

Red wine classification whether it is good or bad.
