100+ datasets found
  1. Wine Quality dataset - Classification

    • kaggle.com
    Updated Jan 7, 2025
    Cite
    Ta-wei Lo (2025). Wine Quality dataset - Classification [Dataset]. https://www.kaggle.com/datasets/taweilo/wine-quality-dataset-balanced-classification
    Dataset updated
    Jan 7, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ta-wei Lo
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    1. Data Source & Process

    1. Original data is from UCI ML wine dataset: Wine Quality Dataset
    2. Use cGAN to deal with imbalanced data.
    3. Combine synthetic cGAN data with original data, ensuring each class has 3000 instances.
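
As a rough sketch of step 3, the number of synthetic rows each class needs is the gap between the target size and the original class count. Only the target of 3000 per class and the 3 to 9 quality range come from the description; the per-class counts and the helper name `synthetic_rows_needed` are hypothetical placeholders, and the cGAN itself is not shown.

```python
TARGET_PER_CLASS = 3000  # from the description: each class has 3000 instances


def synthetic_rows_needed(original_counts, target=TARGET_PER_CLASS):
    """Return how many synthetic rows each class needs to reach `target`."""
    return {label: max(target - count, 0) for label, count in original_counts.items()}


# Hypothetical original counts per quality label (illustrative only).
example_counts = {3: 30, 4: 200, 5: 2100, 6: 2800, 7: 1000, 8: 190, 9: 5}

needed = synthetic_rows_needed(example_counts)
total_after_balancing = sum(
    count + needed[label] for label, count in example_counts.items()
)
print(needed)
print(total_after_balancing)  # 7 classes x 3000 rows each
```

With seven quality classes at 3000 rows each, the balanced total matches the 21,000 records reported in the metadata.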

    2. Metadata

    The dataset contains 21,000 records and 12 variables, each described below:

    Column | Description | Type
    fixed_acidity | The amount of fixed acids in the wine, which is typically a combination of tartaric, malic, and citric acids. | float64
    volatile_acidity | The amount of volatile acids in the wine, primarily acetic acid. | float64
    citric_acid | The amount of citric acid in the wine, contributing to the overall acidity. | float64
    residual_sugar | The amount of sugar remaining after fermentation. | float64
    chlorides | The amount of chlorides in the wine, which can indicate the presence of salt. | float64
    free_sulfur_dioxide | The amount of free sulfur dioxide in the wine, used as a preservative. | float64
    total_sulfur_dioxide | The total amount of sulfur dioxide, including bound and free forms. | float64
    density | The density of the wine, related to alcohol and sugar content. | float64
    pH | The pH level of the wine, indicating its acidity. | float64
    sulphates | The amount of sulphates in the wine, contributing to its taste and preservation. | float64
    alcohol | The alcohol content of the wine in percentage. | float64
    quality | The quality of the wine, rated from 3 to 9, with higher values indicating better quality. | int64

    3. Data Usage

    The dataset can be used for multiple purposes:

    • Exploratory Data Analysis (EDA): Analyze key features, distribution patterns, and relationships to understand quality factors.
    • Multi-class Classification: Build predictive models to classify the quality variable (3 to 9) for a given wine.
    • Binary Classification: Build predictive models to classify a wine as good or bad relative to a chosen quality threshold, such as 6.

    Feel free to leave comments on the discussion. I'd appreciate your upvote if you find my dataset useful! 😀
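
The binary classification use case above can be sketched as a simple thresholding step. The function name `to_binary_label` is hypothetical, and treating scores at or above the threshold as "good" is an assumption: the description only names 6 as an example threshold.

```python
def to_binary_label(quality, threshold=6):
    """Map a quality score to 1 ("good") or 0 ("bad").

    Assumption: scores at or above the threshold count as good; the
    dataset page does not say which side the threshold itself falls on.
    """
    return 1 if quality >= threshold else 0


# Example quality scores within the 3 to 9 range used by the dataset.
qualities = [3, 5, 6, 7, 9]
labels = [to_binary_label(q) for q in qualities]
print(labels)
```

Note that the classes are balanced only for the seven-way task; after thresholding, the two binary classes will generally not be the same size.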

  2. wine_quality

    • tensorflow.org
    • kaggle.com
    Updated Nov 23, 2022
    Cite
    (2022). wine_quality [Dataset]. https://www.tensorflow.org/datasets/catalog/wine_quality
    Dataset updated
    Nov 23, 2022
    Description

    Two datasets were created, using red and white wine samples. The inputs include objective tests (e.g. pH values) and the output is based on sensory data (median of at least 3 evaluations made by wine experts). Each expert graded the wine quality between 0 (very bad) and 10 (very excellent). Several data mining methods were applied to model these datasets under a regression approach. The support vector machine model achieved the best results. Several metrics were computed: MAD, confusion matrix for a fixed error tolerance (T), etc. Also, we plot the relative importances of the input variables (as measured by a sensitivity analysis procedure).

    The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, consult: http://www.vinhoverde.pt/en/ or the reference [Cortez et al., 2009]. Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).

    Number of Instances: red wine - 1599; white wine - 4898

    Input variables (based on physicochemical tests):

    1. fixed acidity
    2. volatile acidity
    3. citric acid
    4. residual sugar
    5. chlorides
    6. free sulfur dioxide
    7. total sulfur dioxide
    8. density
    9. pH
    10. sulphates
    11. alcohol

    Output variable (based on sensory data):

    1. quality (score between 0 and 10)

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split of the wine_quality dataset.
    ds = tfds.load('wine_quality', split='train')
    # Print the first four examples.
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

  3. Data from: Wine Quality

    • kaggle.com
    Updated Nov 30, 2024
    + more versions
    Cite
    Đức Duy Nguyễn (2024). Wine Quality [Dataset]. https://www.kaggle.com/duckzuybidan/wine-quality/discussion
    Dataset updated
    Nov 30, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Đức Duy Nguyễn
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Đức Duy Nguyễn

    Released under Apache 2.0

    Contents

  4. wine-quality-6k4

    • huggingface.co
    Updated Oct 20, 2025
    Cite
    Mnemora (2025). wine-quality-6k4 [Dataset]. https://huggingface.co/datasets/mnemoraorg/wine-quality-6k4
    Dataset updated
    Oct 20, 2025
    Dataset authored and provided by
    Mnemora
    License

    https://choosealicense.com/licenses/ecl-2.0/

    Description

    Wine Quality 6k4

    Contains the original (raw) and cleaned (processed) versions of the Wine Quality datasets (red and white). The raw files are the original semicolon-delimited CSVs and the processed files are cleaned, comma-delimited CSVs suitable for standard data tools and for uploading as a single Hugging Face dataset repository.

    Columns (both red and white): fixed acidity volatile acidity citric acid residual sugar chlorides free sulfur dioxide total sulfur dioxide density pH… See the full description on the dataset page: https://huggingface.co/datasets/mnemoraorg/wine-quality-6k4.
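
The raw-to-processed step described above (semicolon-delimited to comma-delimited) might look roughly like the following, using only the Python standard library. This is a sketch under assumptions: the function name `semicolon_to_comma` is hypothetical, the sample row is made up, and the repository's actual cleaning steps may do more than change the delimiter.

```python
import csv
import io


def semicolon_to_comma(raw_text):
    """Re-encode semicolon-delimited CSV text as comma-delimited CSV text."""
    reader = csv.reader(io.StringIO(raw_text), delimiter=";")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Tiny made-up sample in the raw layout (values are illustrative only).
raw = '"fixed acidity";"volatile acidity";"pH"\n7.4;0.7;3.51\n'
processed = semicolon_to_comma(raw)
print(processed)
```

Going through `csv.reader` and `csv.writer` rather than a plain string replace keeps quoted fields intact if a value ever contains a delimiter.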

  5. Wine Quality - red or white?

    • kaggle.com
    Updated Feb 3, 2018
    Cite
    ZiheTonyXu (2018). Wine Quality - red or white? [Dataset]. https://www.kaggle.com/xuzihe2010/wine-quality-red/code
    Dataset updated
    Feb 3, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ZiheTonyXu
    Description

    Feature introduction:

    Fixed acidity: acids are major wine properties and contribute greatly to the wine’s taste. Usually, the total acidity is divided into two groups: the volatile acids and the nonvolatile or fixed acids. Among the fixed acids that you can find in wines are the following: tartaric, malic, citric, and succinic. This variable is expressed in g(tartaric acid)/dm^3 in the data sets.

    Volatile acidity: the volatile acidity is basically the process of wine turning into vinegar. In the U.S., the legal limits of volatile acidity are 1.2 g/L for red table wine and 1.1 g/L for white table wine. In these data sets, the volatile acidity is expressed in g(acetic acid)/dm^3.

    Citric acid is one of the fixed acids that you’ll find in wines. It’s expressed in g/dm^3 in the two data sets.

    Residual sugar typically refers to the sugar remaining after fermentation stops, or is stopped. It’s expressed in g/dm^3 in the red and white data.

    Chlorides can be a major contributor to saltiness in wine. Here, you’ll see that it’s expressed in g(sodium chloride)/dm^3.

    Free sulfur dioxide: the part of the sulphur dioxide that is added to a wine and that is lost into it is said to be bound, while the active part is said to be free. Winemakers will always try to get the highest proportion of free sulphur to bind. This variable is expressed in mg/dm^3 in the data.

    Total sulfur dioxide is the sum of the bound and the free sulfur dioxide (SO2). Here, it’s expressed in mg/dm^3. There are legal limits for sulfur levels in wines: in the EU, red wines can only have 160 mg/L, while white and rosé wines can have about 210 mg/L. Sweet wines are allowed to have 400 mg/L. For the US, the legal limits are set at 350 mg/L and for Australia, this is 250 mg/L.

    Density is generally used as a measure of the conversion of sugar to alcohol. Here, it’s expressed in g/cm^3.

    pH, or the potential of hydrogen, is a numeric scale to specify the acidity or basicity of the wine. As you might know, solutions with a pH less than 7 are acidic, while solutions with a pH greater than 7 are basic. With a pH of 7, pure water is neutral. Most wines have a pH between 2.9 and 3.9 and are therefore acidic.

    Sulphates are to wine as gluten is to food. You might already know sulphites from the headaches that they can cause. They are a regular part of winemaking around the world and are considered necessary. In this case, they are expressed in g(potassium sulphate)/dm^3.

    Alcohol: wine is an alcoholic beverage and, as you know, the percentage of alcohol can vary from wine to wine. It shouldn’t be surprising that this variable is included in the data sets, where it’s expressed in % vol.

    Quality: wine experts graded the wine quality between 0 (very bad) and 10 (very excellent). The eventual number is the median of at least three evaluations made by those same wine experts.
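
To make the quality definition concrete, here is a minimal sketch of taking the median of the expert grades. The function name `quality_score` and the example grades are hypothetical; the description only states that the score is the median of at least three evaluations on a 0 to 10 scale.

```python
from statistics import median


def quality_score(expert_grades):
    """Median of the expert grades; the description requires at least three."""
    if len(expert_grades) < 3:
        raise ValueError("expected at least three expert evaluations")
    return median(expert_grades)


# Hypothetical grades from three and from four experts.
print(quality_score([5, 6, 7]))
print(quality_score([4, 6, 6, 7]))
```

With an even number of graders, `statistics.median` averages the two middle grades, so the result can be a non-integer; how the original study handled that case is not stated here.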

  6. Data from: RED-WINE-QUALITY

    • kaggle.com
    Updated Sep 21, 2021
    + more versions
    Cite
    Prit Sheta (2021). RED-WINE-QUALITY [Dataset]. https://www.kaggle.com/datasets/pritsheta/redwinequality
    Dataset updated
    Sep 21, 2021
    Dataset provided by
    Kaggle
    Authors
    Prit Sheta
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, consult: [Web Link] or the reference [Cortez et al., 2009]. Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).

    These datasets can be viewed as classification or regression tasks. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent or poor wines. Also, we are not sure if all input variables are relevant. So it could be interesting to test feature selection methods.

  7. Data from: WineQuality

    • huggingface.co
    Updated Jun 21, 2023
    Cite
    Pulkit Gaur (2023). WineQuality [Dataset]. https://huggingface.co/datasets/ArthurX007/WineQuality
    Dataset updated
    Jun 21, 2023
    Authors
    Pulkit Gaur
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    ArthurX007/WineQuality dataset hosted on Hugging Face and contributed by the HF Datasets community

  8. Wine Quality Full

    • figshare.com
    txt
    Updated Jul 4, 2022
    + more versions
    Cite
    Deepchecks Data (2022). Wine Quality Full [Dataset]. http://doi.org/10.6084/m9.figshare.20223303.v1
    Available download formats: txt
    Dataset updated
    Jul 4, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Deepchecks Data
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description
  9. Data from: wine-quality

    • huggingface.co
    Updated Feb 5, 2024
    + more versions
    Cite
    CodeSignal (2024). wine-quality [Dataset]. https://huggingface.co/datasets/codesignal/wine-quality
    Dataset updated
    Feb 5, 2024
    Dataset authored and provided by
    CodeSignal (https://codesignal.com/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    codesignal/wine-quality dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. Data from: Wine Quality dataset

    • kaggle.com
    Updated Oct 1, 2024
    Cite
    adarshde (2024). Wine Quality dataset [Dataset]. https://www.kaggle.com/datasets/adarshde/wine-quality-dataset/code
    Dataset updated
    Oct 1, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    adarshde
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset about wine includes the following columns: fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol, quality.

  11. ‘Wine Quality’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Sep 30, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Wine Quality’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-wine-quality-1f0a/6797a120/?iid=012-455&v=presentation
    Dataset updated
    Sep 30, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘Wine Quality’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/danielpanizzo/wine-quality on 30 September 2021.

    --- Dataset description provided by original source is as follows ---

    Citation Request: This dataset is publicly available for research. The details are described in [Cortez et al., 2009]. Please include this citation if you plan to use this database:

    P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.

    Available at: [@Elsevier] http://dx.doi.org/10.1016/j.dss.2009.05.016 [Pre-press (pdf)] http://www3.dsi.uminho.pt/pcortez/winequality09.pdf [bib] http://www3.dsi.uminho.pt/pcortez/dss09.bib

    1. Title: Wine Quality

    2. Sources Created by: Paulo Cortez (Univ. Minho), Antonio Cerdeira, Fernando Almeida, Telmo Matos and Jose Reis (CVRVV) @ 2009

    3. Past Usage:

      P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.

      In the above reference, two datasets were created, using red and white wine samples. The inputs include objective tests (e.g. pH values) and the output is based on sensory data (median of at least 3 evaluations made by wine experts). Each expert graded the wine quality between 0 (very bad) and 10 (very excellent). Several data mining methods were applied to model these datasets under a regression approach. The support vector machine model achieved the best results. Several metrics were computed: MAD, confusion matrix for a fixed error tolerance (T), etc. Also, we plot the relative importances of the input variables (as measured by a sensitivity analysis procedure).

    4. Relevant Information:

      The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, consult: http://www.vinhoverde.pt/en/ or the reference [Cortez et al., 2009]. Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).

      These datasets can be viewed as classification or regression tasks. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent or poor wines. Also, we are not sure if all input variables are relevant. So it could be interesting to test feature selection methods.

    5. Number of Instances: red wine - 1599; white wine - 4898.

    6. Number of Attributes: 11 + output attribute

      Note: several of the attributes may be correlated, thus it makes sense to apply some sort of feature selection.
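
One simple way to act on the note above is to flag attribute pairs whose Pearson correlation is high before choosing features. The sketch below is only an illustration: the function names, the 0.9 cutoff, and the sample values are all assumptions, and the source does not prescribe any particular feature selection method.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.9):
    """List (name_a, name_b, r) for column pairs with |r| >= threshold."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Made-up measurements for three attributes (illustrative only).
columns = {
    "free sulfur dioxide": [10.0, 20.0, 30.0, 40.0],
    "total sulfur dioxide": [30.0, 60.0, 95.0, 120.0],
    "pH": [3.3, 3.1, 3.4, 3.2],
}
pairs = correlated_pairs(columns)
for a, b, r in pairs:
    print(f"{a} ~ {b}: r = {r:.3f}")
```

A flagged pair is a candidate for dropping one of its two attributes, though correlation alone does not say which one carries more signal for quality.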

    7. Attribute information:

      For more information, read [Cortez et al., 2009].

      Input variables (based on physicochemical tests):
      1 - fixed acidity (tartaric acid - g / dm^3)
      2 - volatile acidity (acetic acid - g / dm^3)
      3 - citric acid (g / dm^3)
      4 - residual sugar (g / dm^3)
      5 - chlorides (sodium chloride - g / dm^3)
      6 - free sulfur dioxide (mg / dm^3)
      7 - total sulfur dioxide (mg / dm^3)
      8 - density (g / cm^3)
      9 - pH
      10 - sulphates (potassium sulphate - g / dm^3)
      11 - alcohol (% by volume)

      Output variable (based on sensory data):
      12 - quality (score between 0 and 10)

    8. Missing Attribute Values: None

    9. Description of attributes:

      1 - fixed acidity: most acids involved with wine are fixed or nonvolatile (do not evaporate readily)

      2 - volatile acidity: the amount of acetic acid in wine, which at too high a level can lead to an unpleasant, vinegar taste

      3 - citric acid: found in small quantities, citric acid can add 'freshness' and flavor to wines

      4 - residual sugar: the amount of sugar remaining after fermentation stops; it's rare to find wines with less than 1 gram/liter, and wines with greater than 45 grams/liter are considered sweet

      5 - chlorides: the amount of salt in the wine

      6 - free sulfur dioxide: the free form of SO2 exists in equilibrium between molecular SO2 (as a dissolved gas) and bisulfite ion; it prevents microbial growth and the oxidation of wine

      7 - total sulfur dioxide: amount of free and bound forms of SO2; in low concentrations, SO2 is mostly undetectable in wine, but at free SO2 concentrations over 50 ppm, SO2 becomes evident in the nose and taste of wine

      8 - density: the density of wine is close to that of water, depending on the percent alcohol and sugar content

      9 - pH: describes how acidic or basic a wine is on a scale from 0 (very acidic) to 14 (very basic); most wines are between 3-4 on the pH scale

      10 - sulphates: a wine additive which can contribute to sulfur dioxide gas (SO2) levels, which acts as an antimicrobial and antioxidant

      11 - alcohol: the percent alcohol content of the wine

      Output variable (based on sensory data): 12 - quality (score between 0 and 10)

    --- Original source retains full ownership of the source dataset ---

  12. WINE dataset

    • resodate.org
    • service.tib.eu
    Updated Dec 3, 2024
    Cite
    Apimuk Sornsaeng; Ninnat Dangniam; Pantita Palittapongarnpim; Thiparat Chotibut (2024). WINE dataset [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9zZXJ2aWNlLnRpYi5ldS9sZG1zZXJ2aWNlL2RhdGFzZXQvd2luZS1kYXRhc2V0
    Dataset updated
    Dec 3, 2024
    Dataset provided by
    Leibniz Data Manager
    Authors
    Apimuk Sornsaeng; Ninnat Dangniam; Pantita Palittapongarnpim; Thiparat Chotibut
    Description

    The dataset used in this paper is a collection of 13 chemical components' concentrations of 178 wines derived from 3 different cultivars grown in the same region in Italy, taken from the WINE dataset.

  13. Data from: winequality

    • kaggle.com
    Updated Aug 28, 2021
    + more versions
    Cite
    Shweta Dalal (2021). winequality [Dataset]. https://www.kaggle.com/datasets/shwetadalal/winequality
    Dataset updated
    Aug 28, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Shweta Dalal
    Description

    Dataset

    This dataset was created by Shweta Dalal

    Contents

  14. Data from: winequality

    • huggingface.co
    Updated Dec 24, 2024
    Cite
    Thapa (2024). winequality [Dataset]. https://huggingface.co/datasets/Rakhit/winequality
    Dataset updated
    Dec 24, 2024
    Authors
    Thapa
    Description

    Rakhit/winequality dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. Data from: Quality wines in Italy and France: a dataset of protected...

    • figshare.com
    txt
    Updated Mar 15, 2024
    Cite
    Sebastian Candiago; Simon Tscholl; Leonardo Bassani; Helder Fraga; Lukas Egarter Vigl (2024). Quality wines in Italy and France: a dataset of protected designation of origin specifications [Dataset]. http://doi.org/10.6084/m9.figshare.25393261.v2
    Available download formats: txt
    Dataset updated
    Mar 15, 2024
    Dataset provided by
    figshare
    Authors
    Sebastian Candiago; Simon Tscholl; Leonardo Bassani; Helder Fraga; Lukas Egarter Vigl
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Italy, France
    Description

    Italy and France are historically among the countries that produce the most prestigious wines worldwide. In Europe, these two countries together produce more than half of the wines classified under the Protected Designation of Origin (PDO) label, the strictest quality mark of food and wines in the European Union. Due to their long tradition in wine protection, Italy and France include highly detailed regulatory information in their wine PDO regulatory documents that is usually not available for other countries, such as specific information about the main cultivars that must be used to make each wine product or the related required planting density in the vineyards. However, this information is scattered throughout the documents of each wine production area and has never been extracted and homogenised into a single dataset. Here, we present the first dataset that characterizes the PDO wines produced in Italy and France in very high detail based on the documents from the official EU geographical indication register. It includes, for each country, a standardized list of the PDO wine names, linked with their specific regulatory requirements, including the wine colour, type, cultivars used and maximum allowed yields. The unprecedented level of detail of this dataset allows for the first time the analysis of more than 5000 traditional wines and their legal and agronomic specifications. This gives insights into the interplay between the European Union quality regulation policy, the wine sector and agronomic practices, enabling researchers and practitioners to analyze wine production in the context of specific regulations or economic scenarios.

  16. Wine Quality (Red and White)

    • kaggle.com
    Updated Aug 2, 2020
    Cite
    Turhan Can Kargin (2020). Wine Quality (Red and White) [Dataset]. https://www.kaggle.com/turhancankargin/wine-quality-red-and-white
    Dataset updated
    Aug 2, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Turhan Can Kargin
    Description

    Context

    The dataset is related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, consult the reference [Cortez et al., 2009]. These datasets can be viewed as classification or regression tasks.

    Content

    Input variables:

    1 - fixed acidity
    2 - volatile acidity
    3 - citric acid
    4 - residual sugar
    5 - chlorides
    6 - free sulfur dioxide
    7 - total sulfur dioxide
    8 - density
    9 - pH
    10 - sulphates
    11 - alcohol
    12 - quality (score between 0 and 10)
    13 - color
    14 - high_quality

  17. Data from: Winequality

    • kaggle.com
    Updated Nov 20, 2024
    Cite
    Captainlin (2024). Winequality [Dataset]. https://www.kaggle.com/datasets/captainlin/winequality
    Dataset updated
    Nov 20, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Captainlin
    Description

    Dataset

    This dataset was created by Captainlin

    Contents

  18. winequality-white

    • huggingface.co
    Updated Mar 7, 2025
    Cite
    Phakon Rujireksareekul (2025). winequality-white [Dataset]. https://huggingface.co/datasets/pkmitl205/winequality-white
    Dataset updated
    Mar 7, 2025
    Authors
    Phakon Rujireksareekul
    Description

    pkmitl205/winequality-white dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. Wine Dataset

    • cubig.ai
    zip
    Updated May 2, 2025
    Cite
    CUBIG (2025). Wine Dataset [Dataset]. https://cubig.ai/store/products/210/wine-dataset
    Available download formats: zip
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
    Description

    1) Data Introduction
    • The Wine Dataset is derived from a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The dataset includes 13 attributes such as alcohol, malic acid, ash, and color intensity, providing a comprehensive overview for understanding wine characteristics and aiding in classification tasks.

    2) Data Utilization
    (1) Wine data has characteristics that:
    • It includes detailed measurements of wine attributes, allowing for analysis of chemical composition, comparison between different wine types, and identification of patterns in wine quality and flavor profiles.
    (2) Wine data can be used to:
    • Wine Industry: Assists winemakers and analysts in understanding the chemical properties that influence wine quality, helping to improve production processes and quality control.
    • Research: Supports academic studies and the development of classification models for wine quality prediction and analysis.

  20. Brazil: national wine quality perception 2018

    • statista.com
    Updated Jul 10, 2025
    Cite
    Statista (2025). Brazil: national wine quality perception 2018 [Dataset]. https://www.statista.com/statistics/981402/brazilian-wine-quality-perception-consumers-brazil/
    Dataset updated
    Jul 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jul 2018
    Area covered
    Brazil
    Description

    This statistic shows the results of a survey on Brazilian wine quality perceptions among consumers in Brazil as of July 2018. At that point in time, a total of ** percent of Brazilian respondents perceived national wine as having either high or very high quality, while only **** percent considered it low or very low quality.


Wine Quality dataset - Classification

Wine Quality Balanced Dataset / Multi-class, Binary Classification

3 scholarly articles cite this dataset (View in Google Scholar)