https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains 21,000 records and 12 variables, each described below:
| Column | Description | Type |
|---|---|---|
| fixed_acidity | The amount of fixed acids in the wine, typically a combination of tartaric, malic, and citric acids. | float64 |
| volatile_acidity | The amount of volatile acids in the wine, primarily acetic acid. | float64 |
| citric_acid | The amount of citric acid in the wine, contributing to the overall acidity. | float64 |
| residual_sugar | The amount of sugar remaining after fermentation. | float64 |
| chlorides | The amount of chlorides in the wine, which can indicate the presence of salt. | float64 |
| free_sulfur_dioxide | The amount of free sulfur dioxide in the wine, used as a preservative. | float64 |
| total_sulfur_dioxide | The total amount of sulfur dioxide, including bound and free forms. | float64 |
| density | The density of the wine, related to alcohol and sugar content. | float64 |
| pH | The pH level of the wine, indicating its acidity. | float64 |
| sulphates | The amount of sulphates in the wine, contributing to its taste and preservation. | float64 |
| alcohol | The alcohol content of the wine, as a percentage. | float64 |
| quality | The quality of the wine, rated from 3 to 9, with higher values indicating better quality. | int64 |
The dataset can be used for multiple purposes:
- Predicting the quality variable (3 to 9) for a potential wine.
- Predicting a binary quality variable (good or bad wine, judged against a chosen quality threshold such as 6) for a potential wine.
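The good-or-bad labelling mentioned above can be sketched in a few lines of Python. This is only an illustration: the helper name `binarize_quality`, the sample scores, and the choice to count a score equal to the threshold as "good" are all assumptions, since the source does not say whether the threshold is inclusive.

```python
def binarize_quality(scores, threshold=6):
    """Map quality scores to 1 (good) or 0 (bad).

    Assumption: a score equal to the threshold counts as good.
    """
    return [1 if score >= threshold else 0 for score in scores]


# Made-up scores within the 3 to 9 range described in the table.
sample_scores = [3, 5, 6, 7, 9]
labels = binarize_quality(sample_scores)
print(labels)
```

Changing `threshold` moves the boundary between the two classes, which also changes how balanced they are.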
Two datasets were created, using red and white wine samples. The inputs include objective tests (e.g. pH values) and the output is based on sensory data (median of at least 3 evaluations made by wine experts). Each expert graded the wine quality between 0 (very bad) and 10 (very excellent). Several data mining methods were applied to model these datasets under a regression approach. The support vector machine model achieved the best results. Several metrics were computed: MAD, confusion matrix for a fixed error tolerance (T), etc. Also, we plot the relative importances of the input variables (as measured by a sensitivity analysis procedure).
The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, consult: http://www.vinhoverde.pt/en/ or the reference [Cortez et al., 2009]. Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).
Number of Instances: red wine - 1599; white wine - 4898
Input variables (based on physicochemical tests):
Output variable (based on sensory data):
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('wine_quality', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset was created by Đức Duy Nguyễn
Released under Apache 2.0
https://choosealicense.com/licenses/ecl-2.0/
Wine Quality 6k4
Contains the original (raw) and cleaned (processed) versions of the Wine Quality datasets (red and white). The raw files are the original semicolon-delimited CSVs and the processed files are cleaned, comma-delimited CSVs suitable for standard data tools and for uploading as a single Hugging Face dataset repository.
Columns (both red and white): fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH… See the full description on the dataset page: https://huggingface.co/datasets/mnemoraorg/wine-quality-6k4.
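As a rough illustration of the raw-to-processed step, the following sketch converts semicolon-delimited CSV text to comma-delimited text with Python's standard `csv` module. The function name `semicolon_to_comma` and the sample rows are hypothetical, and the repository's actual cleaning steps may do more than change the delimiter.

```python
import csv
import io


def semicolon_to_comma(raw_text):
    """Re-emit semicolon-delimited CSV text as comma-delimited CSV text."""
    reader = csv.reader(io.StringIO(raw_text), delimiter=";")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Made-up sample with a quoted header row and one data row.
raw_sample = '"fixed acidity";"pH";"quality"\n7.4;3.51;5\n'
processed = semicolon_to_comma(raw_sample)
print(processed)
```

Going through `csv.reader` and `csv.writer` rather than a plain string replace keeps any quoted field that happens to contain a semicolon or comma intact.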
Feature introduction:
Fixed acidity: acids are major wine properties and contribute greatly to the wine’s taste. Usually, the total acidity is divided into two groups: the volatile acids and the nonvolatile or fixed acids. Among the fixed acids that you can find in wines are the following: tartaric, malic, citric, and succinic. This variable is expressed in g(tartaric acid)/dm^3 in the data sets.
Volatile acidity: the volatile acidity is basically the process of wine turning into vinegar. In the U.S., the legal limits of volatile acidity are 1.2 g/L for red table wine and 1.1 g/L for white table wine. In these data sets, the volatile acidity is expressed in g(acetic acid)/dm^3.
Citric acid is one of the fixed acids that you’ll find in wines. It’s expressed in g/dm^3 in the two data sets.
Residual sugar typically refers to the sugar remaining after fermentation stops, or is stopped. It’s expressed in g/dm^3 in the red and white data.
Chlorides can be a major contributor to saltiness in wine. Here, you’ll see that it’s expressed in g(sodium chloride)/dm^3.
Free sulfur dioxide: the part of the sulphur dioxide that is added to a wine and that is lost into it is said to be bound, while the active part is said to be free. Winemakers will always try to get the highest proportion of free sulphur to bind. This variable is expressed in mg/dm^3 in the data.
Total sulfur dioxide is the sum of the bound and the free sulfur dioxide (SO2). Here, it’s expressed in mg/dm^3. There are legal limits for sulfur levels in wines: in the EU, red wines can only have 160 mg/L, while white and rosé wines can have about 210 mg/L. Sweet wines are allowed to have 400 mg/L. For the US, the legal limits are set at 350 mg/L and for Australia, this is 250 mg/L.
Density is generally used as a measure of the conversion of sugar to alcohol. Here, it’s expressed in g/cm^3.
pH, or the potential of hydrogen, is a numeric scale to specify the acidity or basicity of the wine. As you might know, solutions with a pH less than 7 are acidic, while solutions with a pH greater than 7 are basic. With a pH of 7, pure water is neutral. Most wines have a pH between 2.9 and 3.9 and are therefore acidic.
Sulphates are to wine as gluten is to food. You might already know sulphites from the headaches that they can cause. They are a regular part of winemaking around the world and are considered necessary. In this case, they are expressed in g(potassium sulphate)/dm^3.
Alcohol: wine is an alcoholic beverage and, as you know, the percentage of alcohol can vary from wine to wine. It shouldn’t be surprising that this variable is included in the data sets, where it’s expressed in % vol.
Quality: wine experts graded the wine quality between 0 (very bad) and 10 (very excellent). The eventual number is the median of at least three evaluations made by those same wine experts.
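The way a final quality score is derived from the expert grades can be sketched as follows. The function name `final_quality` and the example grades are made up for illustration; only the 0 to 10 range, the median, and the minimum of three evaluations come from the description above.

```python
import statistics


def final_quality(expert_grades):
    """Return the median of the expert grades for one wine."""
    if len(expert_grades) < 3:
        raise ValueError("expected at least three expert evaluations")
    if any(grade < 0 or grade > 10 for grade in expert_grades):
        raise ValueError("grades must lie between 0 and 10")
    return statistics.median(expert_grades)


# Made-up grades from three hypothetical experts.
example = final_quality([5, 7, 6])
print(example)
```

With an even number of grades, `statistics.median` averages the two middle values, so the result is not necessarily an integer.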
Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically
The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, consult: [Web Link] or the reference [Cortez et al., 2009]. Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).
These datasets can be viewed as classification or regression tasks. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent or poor wines. Also, we are not sure if all input variables are relevant. So it could be interesting to test feature selection methods.
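To see how unbalanced the ordered classes are, one can tabulate the share of each quality score. The sketch below uses a small made-up list of scores rather than the real data, and the helper name `class_shares` is hypothetical.

```python
from collections import Counter


def class_shares(quality_scores):
    """Return the fraction of samples per quality score, in score order."""
    counts = Counter(quality_scores)
    total = len(quality_scores)
    return {score: counts[score] / total for score in sorted(counts)}


# Made-up scores: many "normal" wines, few poor or excellent ones.
toy_scores = [5, 5, 6, 6, 6, 6, 7, 7, 3, 9]
shares = class_shares(toy_scores)
for score, share in shares.items():
    print(f"quality {score}: {share:.0%}")
```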
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
ArthurX007/WineQuality dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Part of the dataset supplied at https://www.kaggle.com/datasets/uciml/red-wine-quality-cortez-et-al-2009 and https://archive.ics.uci.edu/ml/datasets/wine+quality
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
codesignal/wine-quality dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset is about wine and includes the following columns: fixed acidity, volatile acidity, citric acid, residual sugar, chlorides, free sulfur dioxide, total sulfur dioxide, density, pH, sulphates, alcohol, quality.
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Analysis of ‘Wine Quality’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/danielpanizzo/wine-quality on 30 September 2021.
--- Dataset description provided by original source is as follows ---
Citation Request: This dataset is publicly available for research. The details are described in [Cortez et al., 2009]. Please include this citation if you plan to use this database:
P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.
Available at: [@Elsevier] http://dx.doi.org/10.1016/j.dss.2009.05.016 [Pre-press (pdf)] http://www3.dsi.uminho.pt/pcortez/winequality09.pdf [bib] http://www3.dsi.uminho.pt/pcortez/dss09.bib
Title: Wine Quality
Sources Created by: Paulo Cortez (Univ. Minho), Antonio Cerdeira, Fernando Almeida, Telmo Matos and Jose Reis (CVRVV) @ 2009
Past Usage:
P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553. ISSN: 0167-9236.
In the above reference, two datasets were created, using red and white wine samples. The inputs include objective tests (e.g. pH values) and the output is based on sensory data (median of at least 3 evaluations made by wine experts). Each expert graded the wine quality between 0 (very bad) and 10 (very excellent). Several data mining methods were applied to model these datasets under a regression approach. The support vector machine model achieved the best results. Several metrics were computed: MAD, confusion matrix for a fixed error tolerance (T), etc. Also, we plot the relative importances of the input variables (as measured by a sensitivity analysis procedure).
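The two kinds of metric named above can be illustrated with a short sketch. It assumes that MAD denotes the mean absolute deviation between actual and predicted scores, and it reduces the tolerance-based confusion matrix to a single accuracy figure (the share of predictions within T of the true score). The function names and sample values are hypothetical and are not taken from the paper.

```python
def mean_absolute_deviation(actual, predicted):
    """Average absolute difference between actual and predicted scores."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def accuracy_within_tolerance(actual, predicted, tolerance):
    """Share of predictions whose absolute error is at most `tolerance`."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Made-up regression outputs for four wines.
actual = [5, 6, 7, 5]
predicted = [5.4, 6.1, 5.5, 5.0]
mad = mean_absolute_deviation(actual, predicted)
acc = accuracy_within_tolerance(actual, predicted, tolerance=0.5)
print(f"MAD={mad:.2f}, accuracy within T=0.5: {acc:.2f}")
```

A larger tolerance accepts predictions that are further from the true score, so the accuracy can only stay the same or rise as T grows.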
Relevant Information:
The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, consult: http://www.vinhoverde.pt/en/ or the reference [Cortez et al., 2009]. Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).
These datasets can be viewed as classification or regression tasks. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones). Outlier detection algorithms could be used to detect the few excellent or poor wines. Also, we are not sure if all input variables are relevant. So it could be interesting to test feature selection methods.
Number of Instances: red wine - 1599; white wine - 4898.
Number of Attributes: 11 + output attribute
Note: several of the attributes may be correlated, thus it makes sense to apply some sort of feature selection.
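One simple way to look for correlated attributes before feature selection is to compute pairwise Pearson correlations and flag the strongest pairs. The sketch below does this in plain Python on made-up values. The helper names, the toy numbers, and the 0.9 cutoff are assumptions chosen for illustration, not properties of the real data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlated_pairs(columns, cutoff=0.9):
    """List (name_a, name_b, r) for column pairs with |r| >= cutoff."""
    names = sorted(columns)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= cutoff:
                pairs.append((a, b, r))
    return pairs


# Toy values only; the real attributes would come from the CSV files.
toy = {
    "free sulfur dioxide": [10.0, 20.0, 30.0, 40.0],
    "total sulfur dioxide": [30.0, 60.0, 90.0, 120.0],
    "pH": [3.5, 3.1, 3.4, 3.2],
}
flagged = correlated_pairs(toy)
print(flagged)
```

A flagged pair is a candidate for dropping one of its two attributes, although Pearson correlation only captures linear relationships.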
Attribute information:
For more information, read [Cortez et al., 2009].
Input variables (based on physicochemical tests):
1 - fixed acidity (tartaric acid - g / dm^3)
2 - volatile acidity (acetic acid - g / dm^3)
3 - citric acid (g / dm^3)
4 - residual sugar (g / dm^3)
5 - chlorides (sodium chloride - g / dm^3)
6 - free sulfur dioxide (mg / dm^3)
7 - total sulfur dioxide (mg / dm^3)
8 - density (g / cm^3)
9 - pH
10 - sulphates (potassium sulphate - g / dm^3)
11 - alcohol (% by volume)
Output variable (based on sensory data):
12 - quality (score between 0 and 10)
Missing Attribute Values: None
Description of attributes:
1 - fixed acidity: most acids involved with wine are fixed or nonvolatile (do not evaporate readily)
2 - volatile acidity: the amount of acetic acid in wine, which at too high levels can lead to an unpleasant, vinegar taste
3 - citric acid: found in small quantities, citric acid can add 'freshness' and flavor to wines
4 - residual sugar: the amount of sugar remaining after fermentation stops; it's rare to find wines with less than 1 gram/liter, and wines with greater than 45 grams/liter are considered sweet
5 - chlorides: the amount of salt in the wine
6 - free sulfur dioxide: the free form of SO2 exists in equilibrium between molecular SO2 (as a dissolved gas) and bisulfite ion; it prevents microbial growth and the oxidation of wine
7 - total sulfur dioxide: amount of free and bound forms of SO2; in low concentrations, SO2 is mostly undetectable in wine, but at free SO2 concentrations over 50 ppm, SO2 becomes evident in the nose and taste of wine
8 - density: the density of wine is close to that of water, depending on the percent alcohol and sugar content
9 - pH: describes how acidic or basic a wine is on a scale from 0 (very acidic) to 14 (very basic); most wines are between 3-4 on the pH scale
10 - sulphates: a wine additive which can contribute to sulfur dioxide gas (SO2) levels, which acts as an antimicrobial and antioxidant
11 - alcohol: the percent alcohol content of the wine
Output variable (based on sensory data): 12 - quality (score between 0 and 10)
--- Original source retains full ownership of the source dataset ---
The dataset used in this paper is a collection of 13 chemical components' concentrations of 178 wines derived from 3 different cultivars grown in the same region in Italy, taken from the WINE dataset.
This dataset was created by Shweta Dalal
Rakhit/winequality dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Italy and France are historically among the countries that produce the most prestigious wines worldwide. In Europe, these two countries together produce more than half of the wines classified under the Protected Designation of Origin (PDO) label, the strictest quality mark of food and wines in the European Union. Due to their long tradition in wine protection, Italy and France include highly detailed regulatory information in their wine PDO regulatory documents that is usually not available for other countries, such as specific information about the main cultivars that must be used to make each wine product or the related required planting density in the vineyards. However, this information is scattered throughout the documents of each wine production area and has never been extracted and homogenised in a single dataset. Here, we present the first dataset that characterizes the PDO wines produced in Italy and France in very high detail, based on the documents from the official EU geographical indication register. It includes, for each country, a standardized list of the PDO wine names, linked with their specific regulatory requirements, including the wine colour, type, cultivars used and maximum allowed yields. The unprecedented level of detail of this dataset allows for the first time the analysis of more than 5000 traditional wines and their legal and agronomic specifications. This gives insights into the interplay between the European Union quality regulation policy, the wine sector and agronomic practices, enabling researchers and practitioners to analyze wine production in the context of specific regulations or economic scenarios.
The dataset is related to red and white variants of the Portuguese "Vinho Verde" wine. For more details, consult the reference [Cortez et al., 2009]. These datasets can be viewed as classification or regression tasks.
Input variables:
1 - fixed acidity
2 - volatile acidity
3 - citric acid
4 - residual sugar
5 - chlorides
6 - free sulfur dioxide
7 - total sulfur dioxide
8 - density
9 - pH
10 - sulphates
11 - alcohol
12 - quality (score between 0 and 10)
13 - color
14 - high_quality
This dataset was created by Captainlin
pkmitl205/winequality-white dataset hosted on Hugging Face and contributed by the HF Datasets community
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Wine Dataset is derived from a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The dataset includes 13 attributes such as alcohol, malic acid, ash, and color intensity, providing a comprehensive overview for understanding wine characteristics and aiding in classification tasks.
2) Data Utilization
(1) Wine data has characteristics that:
• It includes detailed measurements of wine attributes, allowing for analysis of chemical composition, comparison between different wine types, and identification of patterns in wine quality and flavor profiles.
(2) Wine data can be used to:
• Wine Industry: Assists winemakers and analysts in understanding the chemical properties that influence wine quality, helping to improve production processes and quality control.
• Research: Supports academic studies and the development of classification models for wine quality prediction and analysis.
This statistic shows the results of a survey on Brazilian wine quality perceptions among consumers in Brazil as of July 2018. At that point in time, a total of ** percent of Brazilian respondents perceived national wine as having either high or very high quality, while only **** percent considered it low or very low quality.