7 datasets found
  1. Classification Analysis Using Python

    • kaggle.com
    Updated Jul 3, 2023
    Cite
    Nibedita Sahu (2023). Classification Analysis Using Python [Dataset]. https://www.kaggle.com/datasets/nibeditasahu/classification-analysis-using-python
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 3, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nibedita Sahu
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    The Iris dataset is a classic and widely used dataset in machine learning for classification tasks. It consists of measurements of different iris flowers, including sepal length, sepal width, petal length, and petal width, along with their corresponding species. With a total of 150 samples, the dataset is balanced and serves as an excellent choice for understanding and implementing classification algorithms. This notebook explores the dataset, preprocesses the data, builds a decision tree classification model, and evaluates its performance, showcasing the effectiveness of decision trees in solving classification problems.
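
    As a rough illustration of the workflow the notebook describes (this is a sketch, not the notebook's own code), a decision tree can be trained and evaluated on the Iris data with scikit-learn:

    # Minimal sketch: train and evaluate a decision tree on the Iris dataset.
    # Uses scikit-learn's bundled copy of the data; the notebook's own code may differ.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score, classification_report

    X, y = load_iris(return_X_y=True)  # 150 samples, 4 features, 3 balanced classes
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )

    clf = DecisionTreeClassifier(max_depth=3, random_state=42)  # shallow tree for readability
    clf.fit(X_train, y_train)

    y_pred = clf.predict(X_test)
    print(accuracy_score(y_test, y_pred))
    print(classification_report(y_test, y_pred))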

  2. Customer Churn - Decision Tree & Random Forest

    • kaggle.com
    Updated Jul 6, 2023
    Cite
    vikram amin (2023). Customer Churn - Decision Tree & Random Forest [Dataset]. https://www.kaggle.com/datasets/vikramamin/customer-churn-decision-tree-and-random-forest
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 6, 2023
    Dataset provided by
    Kaggle
    Authors
    vikram amin
    License

    CC0 1.0 Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)

    Description
    • Main objective: identify which customers will churn and which will not.
    • Methodology: this is a classification problem; a decision tree and a random forest are used to predict the outcome.
    • Steps Involved
    • Read the data
    • Check for data types
    1. Change character vectors to factor vectors, as this is a classification problem
    2. Drop the variable which is not significant for the analysis. We drop "customerID".
    3. Check for missing values. None are found.
    4. Split the data into train and test so we can use the train data for building the model and use test data for prediction. We split this into 80-20 ratio (train/test) using the sample function.
    5. Install and run libraries (rpart, rpart.plot, rattle, RColorBrewer, caret)
    6. Run the decision tree using the rpart function. The dependent variable is Churn, with 19 independent variables

    7. Plot the decision tree

    Average customer churn is 27%. Churn can take place if tenure is >= 7.5 and there is no internet service.

    1. Tuning the model
    2. Define the search grid using the expand.grid function
    3. Set up the control parameters through 5-fold cross-validation
    4. When we print the model we get the best CP = 0.01 and an accuracy of 79.00%

    1. Generate predictions from the model
    2. Find out which variables are most and least significant

    The most significant variables are Internet Service and Tenure; the least significant are Streaming Movies and Tech Support.
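
    The workflow above is written in R (rpart, with caret for tuning). For reference, here is a rough Python analogue of the same fit, tune, and inspect loop, sketched with scikit-learn; the file name telco_churn.csv and the use of ccp_alpha in place of rpart's cp are assumptions, not the author's code:

    # Rough Python analogue of the R workflow above (rpart + caret grid search).
    # Assumes a CSV with a "Churn" target and a "customerID" column; the file name is hypothetical.
    import pandas as pd
    from sklearn.model_selection import train_test_split, GridSearchCV
    from sklearn.tree import DecisionTreeClassifier

    df = pd.read_csv("telco_churn.csv")                  # hypothetical file name
    df = df.drop(columns=["customerID"])                 # step 2: drop the ID column
    X = pd.get_dummies(df.drop(columns=["Churn"]))       # step 1: encode categorical variables
    y = df["Churn"]

    X_train, X_test, y_train, y_test = train_test_split( # step 4: 80/20 split
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    # Analogue of caret's expand.grid + 5-fold CV: tune the tree's complexity
    # (ccp_alpha plays a role similar to rpart's cp).
    grid = GridSearchCV(
        DecisionTreeClassifier(random_state=42),
        param_grid={"ccp_alpha": [0.0, 0.001, 0.01, 0.02]},
        cv=5,
        scoring="accuracy",
    )
    grid.fit(X_train, y_train)
    print(grid.best_params_, grid.best_score_)

    # Analogue of the variable-importance inspection.
    importances = pd.Series(grid.best_estimator_.feature_importances_,
                            index=X.columns).sort_values()
    print(importances.tail(5))   # most significant
    print(importances.head(5))   # least significant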

    USE RANDOM FOREST

    1. Run library(randomForest). Here we use the default ntree (500) and the default mtry (sqrt(p) for classification, which is 4 here), where p is the number of independent variables.

      From the confusion matrix, accuracy is 79.27%, marginally higher than the decision tree's 79.00%. The error rate is quite low when predicting "No" and much higher when predicting "Yes".

    2. Plot the model to show which variables reduce the Gini impurity the most and the least. Total charges and tenure reduce the Gini impurity the most, while phone service has the least impact.

    1. Generate predictions and create a new data frame showing actual vs. predicted values

    1. Plot the model to find where the OOB (out-of-bag) error stops decreasing or becomes constant. The error stops decreasing between 100 and 200 trees, so we take ntree = 200 when tuning the model.

    Tune the model: mtry = 2 has the lowest OOB error rate.

    Use random forest with mtry = 2 and ntree = 200

    From the confusion matrix, accuracy is 79.71%, marginally higher than the default random forest (ntree = 500, mtry = 4) at 79.27% and the decision tree at 79.00%. The error rate is quite low when predicting "No" and much higher when predicting "Yes".
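
    Likewise, a rough scikit-learn counterpart to the randomForest steps above (a default fit, then a tuned fit with fewer trees and two features per split), again assuming the hypothetical telco_churn.csv:

    # Sketch of the random forest steps: default settings, then ntree = 200 / mtry = 2 analogues.
    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score, confusion_matrix

    df = pd.read_csv("telco_churn.csv").drop(columns=["customerID"])  # hypothetical file name
    X = pd.get_dummies(df.drop(columns=["Churn"]))
    y = df["Churn"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42, stratify=y
    )

    # Default-style fit: 500 trees, sqrt(p) features per split (R's classification default).
    rf_default = RandomForestClassifier(n_estimators=500, max_features="sqrt",
                                        oob_score=True, random_state=42)
    rf_default.fit(X_train, y_train)
    print("default OOB score:", rf_default.oob_score_)

    # Tuned fit, mirroring ntree = 200 and mtry = 2 from the description.
    rf_tuned = RandomForestClassifier(n_estimators=200, max_features=2,
                                      oob_score=True, random_state=42)
    rf_tuned.fit(X_train, y_train)

    y_pred = rf_tuned.predict(X_test)
    print("accuracy:", accuracy_score(y_test, y_pred))
    print(confusion_matrix(y_test, y_pred))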

  3. Example Churn Data

    • kaggle.com
    Updated Nov 11, 2021
    Cite
    George Hayduke (2021). Example Churn Data [Dataset]. https://www.kaggle.com/ban7002/example-churn-data/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 11, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    George Hayduke
    License

    CC0 1.0 Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    Context

    Basic telco churn dataset used to challenge students and academics

    Content

    file: churn_100k.csv
    n_samples: 101K
    n_features: 28
    pct_missing: 1%

    suggested model features

    numeric_features = ['monthly_minutes', 'customerServiceCalls', 'streaming_minutes', 'TotalBilled', 'PrevBalance', 'latePayments']

    categorical_features = ['ip_address_asn', 'phone_area_code', 'customer_reg_date', 'email_domain', 'phoneModel', 'billing_city', 'billing_postal', 'billing_state', 'partner', 'PhoneService', 'MultipleLines', 'streamingPlan', 'mobileHotspot', 'wifiCallingText', 'OnlineBackup', 'device_protection', 'number_phones', 'contract_code', 'currency_code', 'maling_code', 'paperlessBilling', 'paymentMethod']

    dataset performance

    random sampling 70/15/15

    Train AUC: 0.967279
    Eval AUC: 0.958073
    Test AUC: 0.946909
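
    A minimal sketch of how the suggested features could be wired into a preprocessing and modeling pipeline. Assumptions not stated in the listing: the target column is named "churn" and coded 0/1, and only a subset of the categorical features is used for brevity:

    # Sketch: preprocessing + model pipeline for churn_100k.csv using the suggested features.
    import pandas as pd
    from sklearn.compose import ColumnTransformer
    from sklearn.pipeline import Pipeline
    from sklearn.impute import SimpleImputer
    from sklearn.preprocessing import OneHotEncoder, StandardScaler
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    numeric_features = ['monthly_minutes', 'customerServiceCalls', 'streaming_minutes',
                        'TotalBilled', 'PrevBalance', 'latePayments']
    categorical_features = ['phone_area_code', 'email_domain', 'phoneModel', 'billing_state',
                            'partner', 'PhoneService', 'MultipleLines', 'streamingPlan',
                            'paperlessBilling', 'paymentMethod']   # subset of the list above

    df = pd.read_csv("churn_100k.csv")
    X = df[numeric_features + categorical_features]
    y = df["churn"]                                    # target name and 0/1 coding assumed

    # Roughly the 70/15/15 random split reported above.
    X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30, random_state=0)
    X_eval, X_test, y_eval, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=0)

    preprocess = ColumnTransformer([
        ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                          ("scale", StandardScaler())]), numeric_features),
        ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                          ("onehot", OneHotEncoder(handle_unknown="ignore"))]), categorical_features),
    ])
    model = Pipeline([("pre", preprocess), ("clf", LogisticRegression(max_iter=1000))])
    model.fit(X_train, y_train)

    for name, Xs, ys in [("Train", X_train, y_train), ("Eval", X_eval, y_eval), ("Test", X_test, y_test)]:
        print(name, "AUC:", roc_auc_score(ys, model.predict_proba(Xs)[:, 1]))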

    Inspiration

    Fun and simple dataset to practice with.

  4. Fruits-360 dataset

    • kaggle.com
    • paperswithcode.com
    • +1 more
    Updated Jun 7, 2025
    Cite
    Mihai Oltean (2025). Fruits-360 dataset [Dataset]. https://www.kaggle.com/datasets/moltean/fruits
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 7, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mihai Oltean
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Fruits-360 dataset: A dataset of images containing fruits, vegetables, nuts and seeds

    Version: 2025.06.07.0

    Content

    The following fruits, vegetables, nuts and seeds are included: Apples (different varieties: Crimson Snow, Golden, Golden-Red, Granny Smith, Pink Lady, Red, Red Delicious), Apricot, Avocado, Avocado ripe, Banana (Yellow, Red, Lady Finger), Beans, Beetroot Red, Blackberry, Blueberry, Cabbage, Caju seed, Cactus fruit, Cantaloupe (2 varieties), Carambula, Carrot, Cauliflower, Cherimoya, Cherry (different varieties, Rainier), Cherry Wax (Yellow, Red, Black), Chestnut, Clementine, Cocos, Corn (with husk), Cucumber (ripened, regular), Dates, Eggplant, Fig, Ginger Root, Goosberry, Granadilla, Grape (Blue, Pink, White (different varieties)), Grapefruit (Pink, White), Guava, Hazelnut, Huckleberry, Kiwi, Kaki, Kohlrabi, Kumsquats, Lemon (normal, Meyer), Lime, Lychee, Mandarine, Mango (Green, Red), Mangostan, Maracuja, Melon Piel de Sapo, Mulberry, Nectarine (Regular, Flat), Nut (Forest, Pecan), Onion (Red, White), Orange, Papaya, Passion fruit, Peach (different varieties), Pepino, Pear (different varieties, Abate, Forelle, Kaiser, Monster, Red, Stone, Williams), Pepper (Red, Green, Orange, Yellow), Physalis (normal, with Husk), Pineapple (normal, Mini), Pistachio, Pitahaya Red, Plum (different varieties), Pomegranate, Pomelo Sweetie, Potato (Red, Sweet, White), Quince, Rambutan, Raspberry, Redcurrant, Salak, Strawberry (normal, Wedge), Tamarillo, Tangelo, Tomato (different varieties, Maroon, Cherry Red, Yellow, not ripened, Heart), Walnut, Watermelon, Zucchini (green and dark).

    Branches

    The dataset has 5 major branches:

    - The 100x100 branch, where all images have 100x100 pixels. See the _fruits-360_100x100_ folder.

    - The original-size branch, where all images are at their original (captured) size. See the _fruits-360_original-size_ folder.

    - The meta branch, which contains additional information about the objects in the Fruits-360 dataset. See the _fruits-360_dataset_meta_ folder.

    - The multi branch, which contains images with multiple fruits, vegetables, nuts and seeds. These images are not labeled. See the _fruits-360_multi_ folder.

    - The _3_body_problem_ branch, where the Training and Test folders contain different varieties of 3 fruits and vegetables (Apples, Cherries and Tomatoes). See the _fruits-360_3-body-problem_ folder.

    How to cite

    Mihai Oltean, Fruits-360 dataset, 2017-

    Dataset properties

    For the 100x100 branch

    Total number of images: 138704.

    Training set size: 103993 images.

    Test set size: 34711 images.

    Number of classes: 206 (fruits, vegetables, nuts and seeds).

    Image size: 100x100 pixels.

    For the original-size branch

    Total number of images: 58363.

    Training set size: 29222 images.

    Validation set size: 14614 images

    Test set size: 14527 images.

    Number of classes: 90 (fruits, vegetables, nuts and seeds).

    Image size: various (original, captured, size) pixels.

    For the 3-body-problem branch

    Total number of images: 47033.

    Training set size: 34800 images.

    Test set size: 12233 images.

    Number of classes: 3 (Apples, Cherries, Tomatoes).

    Number of varieties: Apples = 29; Cherries = 12; Tomatoes = 19.

    Image size: 100x100 pixels.

    For the meta branch

    Number of classes: 26 (fruits, vegetables, nuts and seeds).

    For the multi branch

    Number of images: 150.

    Filename format:

    For the 100x100 branch

    image_index_100.jpg (e.g. 31_100.jpg) or

    r_image_index_100.jpg (e.g. r_31_100.jpg) or

    r?_image_index_100.jpg (e.g. r2_31_100.jpg)

    where "r" stands for rotated fruit. "r2" means that the fruit was rotated around the 3rd axis. "100" comes from image size (100x100 pixels).

    Different varieties of the same fruit (apple, for instance) are stored as belonging to different classes.

    For the original-size branch

    r?_image_index.jpg (e.g. r2_31.jpg)

    where "r" stands for rotated fruit. "r2" means that the fruit was rotated around the 3rd axis.

    The name of the image files in the new version does NOT contain the "_100" suffix anymore. This will help you to make the distinction between the original-size branch and the 100x100 branch.

    For the multi branch

    The file's name is the concatenation of the names of the fruits inside that picture.
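
    As a small illustration of the naming rules above, image file names can be parsed along these lines (the regular expression and field names are assumptions based on the formats listed, not part of the dataset):

    # Sketch: parse Fruits-360 file names such as "31_100.jpg", "r_31_100.jpg",
    # "r2_31_100.jpg" (100x100 branch) and "r2_31.jpg" (original-size branch).
    import re

    PATTERN = re.compile(r"^(?:r(?P<axis>\d?)_)?(?P<index>\d+)(?P<size>_100)?\.jpg$")

    def parse_name(filename: str) -> dict:
        """Return rotation marker, image index, and branch guess for one file name."""
        m = PATTERN.match(filename)
        if m is None:
            raise ValueError(f"unrecognized file name: {filename}")
        axis = m.group("axis")
        return {
            "rotated": axis is not None,
            "rotation_marker": None if axis is None else "r" + axis,
            "index": int(m.group("index")),
            "branch": "100x100" if m.group("size") else "original-size",
        }

    for name in ["31_100.jpg", "r_31_100.jpg", "r2_31_100.jpg", "r2_31.jpg"]:
        print(name, parse_name(name))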

    Alternate download

    The Fruits-360 dataset can be downloaded from:

    Kaggle https://www.kaggle.com/moltean/fruits

    GitHub https://github.com/fruits-360

    How fruits were filmed

    Fruits and vegetables were planted in the shaft of a low-speed motor (3 rpm) and a short movie of 20 seconds was recorded.

    A Logitech C920 camera was used for filming the fruits. This is one of the best webcams available.

    Behind the fruits, we placed a white sheet of paper as a background.

    Here i...

  5. ATIS Dataset Clean Resplit

    • kaggle.com
    Updated Feb 8, 2019
    Cite
    kpe (2019). ATIS Dataset Clean Resplit [Dataset]. https://www.kaggle.com/siddhadev/atis-dataset-clean/tasks
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 8, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    kpe
    Description

    ATIS DataSet

    The ATIS dataset is a standard benchmark widely used for intent classification and slot filling tasks.

    Content

    The data is imported from https://github.com/yvchen/JointSLU but has been cleaned and re-split by removing duplicated samples and uncommon labels (see the Kaggle kernel atis-dataset-clean-re-split-kernel used to create this dataset for additional details).

    Acknowledgements

    Thanks to Yun-Nung (Vivian) Chen for publishing the original dataset.

    I previously found a version of the ATIS dataset in MS CNTK and wrote a converter as a Jupyter notebook (kpe/notebooks) to make the dataset easily accessible in Python. However, I much prefer the simplicity of the text-format representation used by yvchen/JointSLU.

  6. Iris Species Dataset and Database

    • kaggle.com
    Updated May 15, 2025
    Cite
    Ghanshyam Saini (2025). Iris Species Dataset and Database [Dataset]. https://www.kaggle.com/datasets/ghnshymsaini/iris-species-dataset-and-database
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 15, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ghanshyam Saini
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Iris Flower Dataset

    This is a classic and very widely used dataset in machine learning and statistics, often serving as a first dataset for classification problems. Introduced by the British statistician and biologist Ronald Fisher in his 1936 paper "The use of multiple measurements in taxonomic problems," it is a foundational resource for learning classification algorithms.

    Overview:

    The dataset contains measurements for 150 samples of iris flowers. Each sample belongs to one of three species of iris:

    • Iris setosa
    • Iris versicolor
    • Iris virginica

    For each flower, four features were measured:

    • Sepal length (in cm)
    • Sepal width (in cm)
    • Petal length (in cm)
    • Petal width (in cm)

    The goal is typically to build a model that can classify iris flowers into their correct species based on these four features.

    File Structure:

    The dataset is usually provided as a single CSV (Comma Separated Values) file, often named iris.csv or similar. This file typically contains the following columns:

    1. sepal_length (cm): Numerical. The length of the sepal of the iris flower.
    2. sepal_width (cm): Numerical. The width of the sepal of the iris flower.
    3. petal_length (cm): Numerical. The length of the petal of the iris flower.
    4. petal_width (cm): Numerical. The width of the petal of the iris flower.
    5. species: Categorical. The species of the iris flower (either 'setosa', 'versicolor', or 'virginica'). This is the target variable for classification.

    Content of the Data:

    The dataset contains an equal number of samples (50) for each of the three iris species. The measurements of the sepal and petal dimensions vary between the species, allowing for their differentiation using machine learning models.

    How to Use This Dataset:

    1. Download the iris.csv file.
    2. Load the data using libraries like Pandas in Python.
    3. Explore the data through visualization and statistical analysis to understand the relationships between the features and the different species.
    4. Build classification models (e.g., Logistic Regression, Support Vector Machines, Decision Trees, K-Nearest Neighbors) using the sepal and petal measurements as features and the 'species' column as the target variable, as in the sketch after this list.
    5. Evaluate the performance of your model using appropriate metrics (e.g., accuracy, precision, recall, F1-score).
    6. The dataset is small and well-behaved, making it excellent for learning and experimenting with various classification techniques.
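
    A minimal sketch of steps 2-5 above, assuming the column names from the "File Structure" section (the header of the downloaded iris.csv may differ slightly):

    # Sketch: load iris.csv, train a classifier, and evaluate it.
    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.metrics import accuracy_score, classification_report

    df = pd.read_csv("iris.csv")
    feature_cols = ["sepal_length", "sepal_width", "petal_length", "petal_width"]  # assumed header
    X, y = df[feature_cols], df["species"]

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=0
    )

    model = KNeighborsClassifier(n_neighbors=5)   # any of the listed classifiers works here
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)
    print("accuracy:", accuracy_score(y_test, y_pred))
    print(classification_report(y_test, y_pred))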

    Citation:

    When using the Iris dataset, it is common to cite Ronald Fisher's original work:

    Fisher, R. A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2), 179-188.

    Data Contribution:

    Thank you for providing this classic and fundamental dataset to the Kaggle community. The Iris dataset remains an invaluable resource for both beginners learning the basics of classification and experienced practitioners testing new algorithms. Its simplicity and clear class separation make it an ideal starting point for many data science projects.

    If you find this dataset description helpful and the dataset itself useful for your learning or projects, please consider giving it an upvote after downloading. Your appreciation is valuable!

  7. Listening-to-Earthquakes

    • kaggle.com
    Updated Apr 30, 2025
    Cite
    PenguinGUI (2025). Listening-to-Earthquakes [Dataset]. https://www.kaggle.com/datasets/penguingui/listening-to-earthquakes/versions/3
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 30, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    PenguinGUI
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Data Card: LANL Earthquake Prediction

    1. Data Source and Description

    • Source: The data originates from the LANL Earthquake Prediction Kaggle competition, provided by Los Alamos National Laboratory.
    • Description: The dataset comprises continuous seismic acoustic data collected from laboratory experiments simulating earthquake conditions. The objective is to predict the "time to failure"—the time remaining until the next laboratory earthquake occurs—making this a regression task.
    • Licensing and Ethical Considerations: The data is publicly available under the competition’s terms of use on Kaggle. Ethically, any models or insights derived should be applied responsibly, considering the potential implications of earthquake prediction in real-world scenarios.

    2. Data Preprocessing

    • Sliding-Window Approach: Given the large size and continuous nature of the seismic data (each segment contains 150,000 data points), training time series models like Temporal Convolutional Networks (TCN) or Long Short-Term Memory (LSTM) networks on full-length samples is computationally intensive. To address this, a sliding-window strategy was implemented:
      • Window Sizes: Multiple sizes were used—150,000, 15,000, and 1,500 data points—to capture patterns across different temporal scales.
      • Strides: Strides of 150,000, 15,000, 7,500, 1,500, and 750 were applied, creating overlapping or non-overlapping windows and resulting in five distinct processed datasets.
    • Data Segmentation: The continuous acoustic signals were segmented into fixed-length chunks of 150,000 data points each. For training data, the label assigned to each segment was the "time to failure" at its endpoint, aligning with the prediction task.
    • Data Storage: Extracted feature sequences were saved in NumPy’s compressed .npz format, ensuring efficient storage, accessibility, and consistency across training and testing phases.
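
    An illustrative sketch of the sliding-window segmentation described above. The raw column names (acoustic_data, time_to_failure) are an assumption about the competition's train.csv, and the window and stride values are taken from the list above:

    # Sketch: cut the continuous signal into windows and save them as compressed .npz.
    import numpy as np
    import pandas as pd

    WINDOW = 150_000   # one of the window sizes listed above
    STRIDE = 150_000   # non-overlapping here; other listed strides can be substituted

    df = pd.read_csv("train.csv",
                     dtype={"acoustic_data": np.int16, "time_to_failure": np.float64})
    signal = df["acoustic_data"].to_numpy()
    ttf = df["time_to_failure"].to_numpy()

    windows, labels = [], []
    for start in range(0, len(signal) - WINDOW + 1, STRIDE):
        end = start + WINDOW
        windows.append(signal[start:end])
        labels.append(ttf[end - 1])        # label = time to failure at the window's endpoint

    X = np.stack(windows)
    y = np.asarray(labels)
    np.savez_compressed("train_windows.npz", X=X, y=y)   # compressed storage, as described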

    3. Feature Extraction

    • Statistical Features: For each window, eleven statistical features were calculated to summarize the seismic signals:
      • Mean, standard deviation, minimum, maximum, median, skewness, and kurtosis.
      • Quantile-based features at 1%, 5%, 95%, and 99% to detect variations and potential anomalies linked to seismic events.
    • Advanced Techniques: Drawing from top competition solutions (e.g., the 26th place approach), consider enhancing your feature set with:
      • Matched Filtering: Identifies recurring patterns in the time series, which could signal precursors to earthquakes.
      • Hilbert Transform: Extracts the analytical envelope of the signal, highlighting peak behaviors and dynamic changes.
    • Multi-Scale Analysis: Using varied window sizes and strides enables the capture of both short-term fluctuations and longer-term trends, critical for understanding seismic signal dynamics.
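
    The eleven per-window statistics can be computed along these lines (a sketch; it reads the .npz file written by the windowing sketch above):

    # Sketch: compute the 11 summary statistics for each window.
    import numpy as np
    from scipy.stats import skew, kurtosis

    def window_features(x: np.ndarray) -> np.ndarray:
        """Mean, std, min, max, median, skewness, kurtosis, and 1/5/95/99% quantiles."""
        x = x.astype(np.float64)
        return np.array([
            x.mean(), x.std(), x.min(), x.max(), np.median(x),
            skew(x), kurtosis(x),
            np.quantile(x, 0.01), np.quantile(x, 0.05),
            np.quantile(x, 0.95), np.quantile(x, 0.99),
        ])

    data = np.load("train_windows.npz")
    features = np.apply_along_axis(window_features, 1, data["X"])
    print(features.shape)   # (n_windows, 11)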

    4. Predictive Modeling

    • Model Selection: A diverse set of models was planned to leverage their unique strengths:
      • Random Forest: Offers a baseline and insights into feature importance.
      • Multi-Layer Perceptron (MLP): Provides a simple neural network baseline.
      • Temporal Convolutional Network (TCN): Excels at modeling temporal dependencies with computational efficiency.
      • Long Short-Term Memory (LSTM): Captures long-term sequential relationships, ideal for time series like seismic data.
    • Ensemble Potential: The competition’s winning team combined neural networks with gradient boosted decision trees (e.g., LightGBM or XGBoost), suggesting that integrating tree-based models could boost performance.
    • Training Considerations:
      • Use cross-validation strategies tailored to the data’s structure, such as "Leave One Earthquake Out," to ensure generalization (inspired by the 1st place writeup).
      • Tune hyperparameters carefully, especially for tree-based models, to optimize for the target metric.
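
    A sketch of the "Leave One Earthquake Out" idea using scikit-learn's grouped splitting. The earthquake-cycle id is not a provided column (it would have to be derived, e.g., from points where the time to failure resets upward), and the file and array names below are hypothetical:

    # Sketch: grouped cross-validation where each fold holds out one earthquake cycle.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import LeaveOneGroupOut
    from sklearn.metrics import mean_absolute_error

    data = np.load("train_windows_features.npz")          # hypothetical: features, labels, cycle ids
    X, y, groups = data["features"], data["y"], data["cycle_id"]

    maes = []
    for train_idx, val_idx in LeaveOneGroupOut().split(X, y, groups):
        model = RandomForestRegressor(n_estimators=200, n_jobs=-1, random_state=0)
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[val_idx])
        maes.append(mean_absolute_error(y[val_idx], pred))  # competition metric: MAE

    print("per-cycle MAE:", np.round(maes, 3))
    print("mean MAE:", float(np.mean(maes)))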

    5. Evaluation

    • Metric: The competition evaluates submissions using Mean Absolute Error (MAE), measuring the accuracy of predicted "time to failure" against actual values.
    • Validation: Robust validation is key. Consider splitting the data by earthquake cycles to mimic the test set’s structure and avoid overfitting.