Machine learning methods and agent-based models enable the optimization of the operation of high-capacity facilities. In this paper, we propose a method for automatically extracting and cleaning pedestrian traffic detector data for subsequent calibration of the ingress pedestrian model. The data were obtained from the waiting-room traffic of a vaccination center. The walking speed distribution, the number of stops, the distribution of waiting times, and the locations of waiting points were extracted. Of the nine machine learning algorithms tested, the random forest model achieved the highest accuracy in classifying valid data and noise. The proposed microscopic calibration allows for more accurate testing of capacity assessments, procedural changes, and geometric modifications in the parts of the facility adjacent to the calibrated parts. The results show that the proposed method achieves state-of-the-art performance on a violent-flows dataset. The proposed method has the potential to significantly improve the accuracy and efficiency of input model predictions and to optimize the operation of high-capacity facilities.
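As an illustration of the valid-vs-noise classification step, the sketch below trains a random forest on synthetic track features. The features (walking speed, stop count, waiting time) and the labeling rule are stand-ins for illustration; the real detector features and labels are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Illustrative per-track features: mean walking speed [m/s],
# number of stops, and total waiting time [s] (all synthetic).
n = 1000
X = np.column_stack([
    rng.normal(1.3, 0.3, n),   # walking speed
    rng.poisson(2, n),         # number of stops
    rng.exponential(60, n),    # waiting time
])
# Synthetic labels: 1 = valid track, 0 = noise (invented rule).
y = (X[:, 0] > 0.5).astype(int) & (X[:, 2] < 300).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(accuracy_score(y_te, clf.predict(X_te)))
```

On real detector data, the features and the ground-truth valid/noise labels would come from the annotated traffic recordings rather than a synthetic rule.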
Statistical analysis of classification method accuracy.
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Introduction
Color classification is an important application used in many areas. For example, systems that perform daily-life analysis can benefit from this classification process. Many classification algorithms are available; among the most popular machine learning algorithms are neural networks, decision trees, k-nearest neighbors, Bayesian networks, and support vector machines (SVMs). In this work, SVMs are used for training to obtain a classifier model. The SVM algorithm is a supervised learning method that, like other supervised learning methods, addresses both regression and classification problems. It is typically trained to separate and classify differently labeled samples. Training with an SVM aims to create an optimal hyperplane that separates the data into different classes; this hyperplane is located as far away from the data as possible to reduce classification errors.

Figure: SVM classifier with an optimal hyperplane.
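A minimal illustration of this idea, using scikit-learn's `SVC` with a linear kernel on toy 2-D data (the data points are invented for the example):

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated toy clusters, labeled 0 and 1.
X = np.array([[0, 0], [1, 1], [1, 0], [4, 4], [5, 5], [4, 5]])
y = np.array([0, 0, 0, 1, 1, 1])

# Fit a linear SVM; the learned hyperplane maximizes the margin
# between the two classes.
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.predict([[0.5, 0.5], [4.5, 4.5]]))  # → [0 1]
```

For color classification the inputs would be color features (e.g., RGB pixel values) rather than these toy coordinates.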
Dataset
The datasets contain about 80 training images covering all color classes and 90 images for the test set. The colors prepared for this application are yellow, black, white, green, red, orange, blue, and violet. Basic colors were preferred for this classification task, and a dataset containing images of these basic colors was created. The dataset also includes masks for all images. We created these masks by binarizing the collected images: pixels belonging to the class color were painted white and the remaining pixels black.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Classification accuracies obtained with the proposed hybrid model and the other state-of-the-art classifiers from the recent literature for the data sets under consideration.
The scores for the precision, recall, F1-score, and accuracy metrics are shown for all four classification methods. The method with the highest score for each metric is highlighted in bold. The (†) symbol by the thresholding methods indicates that these methods only classified 18 zones, whereas the neural network methods classified 21 zones.
MULTI-LABEL ASRS DATASET CLASSIFICATION USING SEMI-SUPERVISED SUBSPACE CLUSTERING
MOHAMMAD SALIM AHMED, LATIFUR KHAN, NIKUNJ OZA, AND MANDAVA RAJESWARI
Abstract. There has been a great deal of research targeting text classification. Much of it focuses on a particular characteristic of text data: multi-labelity. This arises from the fact that a document may be associated with multiple classes at the same time. The consequence of this characteristic is the low performance of traditional binary or multi-class classification techniques on multi-label text data. In this paper, we propose a text classification technique that takes this characteristic into account and provides very good performance. Our multi-label text classification approach is an extension of our previously formulated [3] multi-class text classification approach called SISC (Semi-supervised Impurity based Subspace Clustering). We call this new classification model SISC-ML (SISC Multi-Label). Empirical evaluation on the real-world multi-label NASA ASRS (Aviation Safety Reporting System) data set reveals that our approach outperforms state-of-the-art text classification as well as subspace clustering algorithms.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The assessment results of the proposed model in comparison with all thirteen other methods on the two multi-class data sets, using the nine commonly applied performance evaluation criteria.
A hierarchically ordered distribution of 3D points was created with MATLAB. It contains 120,000 data points in five hierarchical levels with one to four child nodes per parent. Data values for the three axes range between 0 and 1. The structure can be seen in the attached figure. Different distributions of data points are implemented in each hierarchical level, which allows classifiers to be tested under various conditions. The most common distribution in the dataset is a simple Gaussian point cloud. Other sampled distributions are a spherical distribution (a sphere in 3D) and a circular (donut) distribution along different axes. XOR distributions are implemented in different patterns, e.g., four batches with crossed classes or eight batches with two or four classes. The most complex data distribution is the spring roll, where the data points are intertwined with one another. To create indistinguishable cases, where a classifier is expected to perform poorly, some data points are simply intermixed at random with another class.
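Two of the sampled distributions, the Gaussian cloud and the sphere, could be generated along these lines; the counts, centers, and radii below are illustrative assumptions, not the parameters used for the published dataset:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_cloud(n, center, scale=0.05):
    """Simple Gaussian point cloud around a 3-D center."""
    return rng.normal(center, scale, size=(n, 3))

def sphere_shell(n, center, radius=0.1):
    """Points on a sphere: sample random directions and scale
    them to a fixed radius around the center."""
    v = rng.normal(size=(n, 3))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    return np.asarray(center) + radius * v

pts = np.vstack([gaussian_cloud(100, [0.5, 0.5, 0.5]),
                 sphere_shell(100, [0.5, 0.5, 0.5])])
```

The donut, XOR, and spring-roll distributions follow the same pattern with different geometric constructions.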
The .csv-file contains four columns: label | x-coordinate | y-coordinate | z-coordinate
The label for each sample provides all hierarchical information needed. Each label is composed of five digits, one for each hierarchical level. As an example:
Sample '11421':
Hierarchical level 1: class 1
Hierarchical level 2: class 1
Hierarchical level 3: class 4
Hierarchical level 4: class 2
Hierarchical level 5: class 1
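Decoding such a label can be sketched in a few lines; the helper name is hypothetical, not part of the dataset:

```python
def parse_label(label):
    """Split a five-digit hierarchical label into per-level classes,
    e.g. '11421' -> {1: 1, 2: 1, 3: 4, 4: 2, 5: 1}."""
    return {level + 1: int(digit) for level, digit in enumerate(str(label))}

print(parse_label("11421"))  # {1: 1, 2: 1, 3: 4, 4: 2, 5: 1}
```

The same function applies to the label column of the .csv file after reading it as strings (leading digits would be lost if labels were parsed as integers with leading zeros, which this dataset's 1-to-4 class range avoids).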
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context: Malware detection is a critical task in cybersecurity, aimed at identifying malicious software that can harm systems or exploit their vulnerabilities. With the increasing complexity and volume of malware, traditional detection methods are often inadequate. This project leverages machine learning to enhance the accuracy and efficiency of malware detection.
Source: The dataset used in this project is synthetically generated to simulate a realistic distribution of malware and benign samples. It includes various features that represent typical characteristics observed in malware behavior. The dataset has been preprocessed and scaled to ensure optimal performance of machine learning models.
Inspiration: The inspiration behind this project stems from the ongoing challenge in the cybersecurity field to stay ahead of evolving threats. By applying machine learning algorithms, we aim to develop more robust and adaptive detection mechanisms. This project is also inspired by the potential of AI to transform cybersecurity practices, making systems more secure and resilient against attacks. The ultimate goal is to contribute to the development of advanced tools that can better protect individuals and organizations from malicious software.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by YuanCHEN_AG
Released under MIT
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The "Lions or Cheetahs - Image Classification" dataset is a collection of images downloaded from the Open Images Dataset V6, containing photographs of both lions and cheetahs. This dataset has been compiled for the purpose of training and evaluating image classification algorithms.
The dataset contains a total of 200 images. The images have been labeled as either "lion" or "cheetah" and are stored in separate directories within the dataset.
This dataset can be used for a variety of tasks related to image classification, including developing and testing deep learning algorithms, evaluating the effectiveness of different image features and classification techniques, and comparing the performance of different models.
Researchers and practitioners interested in using this dataset are encouraged to cite the original source, Open Images Dataset V6, and to acknowledge any modifications made to the dataset for their particular use.
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Fruit Classification Dataset is designed to classify different types of fruits based on their spatial coordinates. It includes data points with 'x' and 'y' coordinates and their corresponding fruit class labels (apple, banana, orange), facilitating the development and testing of classification models for simple geometric data.
2) Data Utilization
(1) Fruit Classification data has characteristics that:
• It contains detailed coordinates (x and y) for each fruit class, allowing for the visualization and analysis of fruit distribution in a two-dimensional space. This dataset is ideal for understanding basic classification algorithms and testing their performance.
(2) Fruit Classification data can be used to:
• Machine Learning Education: Supports the teaching and learning of classification techniques, data visualization, and feature extraction in an accessible and engaging manner.
• Algorithm Testing: Provides a straightforward dataset for evaluating and comparing the performance of various classification algorithms in distinguishing between different fruit types based on coordinates.
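A minimal example of the kind of coordinate-based classification this dataset supports, here with a k-nearest-neighbors classifier on made-up fruit coordinates (the actual dataset values are not reproduced):

```python
from sklearn.neighbors import KNeighborsClassifier

# Invented (x, y) points for three fruit classes.
X = [[1.0, 1.1], [1.2, 0.9], [5.0, 5.2], [4.8, 5.1], [9.0, 1.0], [8.8, 1.2]]
y = ["apple", "apple", "banana", "banana", "orange", "orange"]

# Classify a new point by majority vote among its 3 nearest neighbors.
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(clf.predict([[1.1, 1.0], [5.0, 5.0]]))  # → ['apple' 'banana']
```

Any scikit-learn classifier with the same `fit`/`predict` interface (SVM, decision tree, etc.) can be swapped in for comparison.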
With the advent and expansion of social networking, the amount of generated text data has increased sharply. To handle such a huge volume of text data, new and improved text mining techniques are a necessity. One of the characteristics of text data that makes text mining difficult is multi-labelity. To build a robust and effective text classification method, which is an integral part of text mining research, we must consider this property more closely. This property is not unique to text data, as it can also be found in non-text (e.g., numeric) data; however, it is most prevalent in text data. It also places the text classification problem in the domain of multi-label classification (MLC), where each instance is associated with a subset of class labels instead of a single class, as in conventional classification. In this paper, we explore how the generation of pseudo labels (i.e., combinations of existing class labels) can help us perform better text classification, and under what circumstances. The high and sparse dimensionality of text data has also been taken into account during classification. Although we propose and evaluate a text classification technique here, our main focus is on handling the multi-labelity of text data while utilizing the correlation among the multiple labels existing in the data set. Our text classification technique is called pseudo-LSC (pseudo-Label Based Subspace Clustering). It is a subspace clustering algorithm that considers the high and sparse dimensionality as well as the correlation among different class labels during the classification process to provide better performance than existing approaches. Results on three real-world multi-label data sets provide insight into how multi-labelity is handled in our classification process and show the effectiveness of our approach.
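The pseudo-label idea, treating combinations of existing class labels as classes in their own right, resembles the label-powerset transformation from the multi-label literature. The sketch below shows that transformation only; it is not the pseudo-LSC algorithm itself, and the label names are invented:

```python
def pseudo_labels(label_sets):
    """Map each instance's set of class labels to a single pseudo
    label: one new class per distinct observed label combination."""
    mapping = {}   # label combination -> pseudo-label id
    codes = []
    for labels in label_sets:
        key = tuple(sorted(labels))
        mapping.setdefault(key, len(mapping))
        codes.append(mapping[key])
    return codes, mapping

y = [{"weather", "mechanical"}, {"weather"}, {"mechanical", "weather"}]
codes, mapping = pseudo_labels(y)
print(codes)  # [0, 1, 0]
```

Instances sharing the same label combination receive the same pseudo label, so a conventional single-label classifier can then be trained on the transformed data.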
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The "Carrots vs Rockets - Image Classification" dataset is a collection of images downloaded from various sources, containing photographs of both carrots and rockets. This dataset has been compiled for the purpose of training and evaluating image classification algorithms.
The dataset contains a total of 306 images. The images have been labeled as either "carrot" or "rocket" and are stored in separate directories.
This dataset can be used for a variety of tasks related to image classification, including developing and testing deep learning algorithms, evaluating the effectiveness of different image features and classification techniques, and comparing the performance of different models.
Researchers and practitioners interested in using this dataset are encouraged to cite the original sources of the images and to acknowledge any modifications made to the dataset for their particular use. The dataset may be useful for tasks such as automated vegetable sorting or satellite image analysis.
We compared the predictive ability of 52 classification algorithms that were available in ShinyLearner and had been implemented across 4 open-source machine-learning libraries. The abbreviation for each algorithm contains a prefix indicating which machine-learning library implemented the algorithm (mlr = Machine learning in R, sklearn = scikit-learn, weka = WEKA: The workbench for machine learning; keras = Keras). For each algorithm, we provide a brief description of the algorithmic approach; we extracted these descriptions from the libraries that implemented the algorithms. In addition, we assigned high-level categories that characterize the algorithmic methodology used by each algorithm. In some cases, the individual machine-learning libraries aggregated algorithm implementations from third-party packages. In these cases, we cite the machine-learning library and the third-party package. When available, we also cite papers that describe the algorithmic methodologies used. Finally, for each algorithm, we indicate the number of unique hyperparameter combinations evaluated in Analysis 4.
The number of subjects in the binary task was 12, and the number of subjects in the multi-task BCIs was 9. The number in parentheses corresponds to the average rank of the algorithm across subjects. For each feature extraction method, the classifiers typed in bold are the recommended ones. The recommended classifiers are selected based on the results of the statistical tests.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
🇦🇹 Austria
Cor (Correspondence): correspondence number with the Bollet/Servant cohort. Scores: scores obtained with partial identity (PIS) or methylation (MS). Time: time elapsed between diagnosis of the PT and diagnosis of the recurrence. Classification: classification of the recurrence based on copy number (PIS), methylation (MS), or clinical features (clinical). Divergence: which method deviated from the others.
MULTI-TEMPORAL REMOTE SENSING IMAGE CLASSIFICATION - A MULTI-VIEW APPROACH
VARUN CHANDOLA AND RANGA RAJU VATSAVAI
Abstract. Multispectral remote sensing images have been widely used for automated land use and land cover classification tasks. Thematic classification is often done using a single-date image; however, in many instances a single-date image is not informative enough to distinguish between different land cover types. In this paper, we show how multiple images, collected at different times of the year (for example, during the crop growing season), can be used to learn a better classifier. We propose two approaches, an ensemble-of-classifiers approach and a co-training based approach, and show that both outperform the straightforward stacked-vector approach often used in multi-temporal image classification. Additionally, the co-training based method addresses the challenge of limited labeled training data in supervised classification, as this classification scheme utilizes a large number of unlabeled samples (which come for free) in conjunction with a small set of labeled training data.
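The ensemble-of-classifiers idea can be sketched as one classifier per acquisition date, combined by majority vote. The data below is synthetic and the decision-tree base learner is a simplified stand-in for the paper's setup:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in: 3 acquisition dates, 4 spectral bands each.
n, dates, bands = 300, 3, 4
X = [rng.normal(size=(n, bands)) for _ in range(dates)]
# Toy land-cover label driven by two of the dates.
y = (X[0][:, 0] + X[1][:, 1] > 0).astype(int)

# Train one classifier per date on that date's bands only.
models = [DecisionTreeClassifier(max_depth=3, random_state=0).fit(Xd, y)
          for Xd in X]

# Combine per-date predictions by majority vote (2 of 3).
votes = np.stack([m.predict(Xd) for m, Xd in zip(models, X)])
pred = (votes.sum(axis=0) >= 2).astype(int)
print((pred == y).mean())
```

The stacked-vector baseline mentioned in the abstract would instead concatenate all dates' bands into one feature vector and train a single classifier on it.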