81 datasets found
  1. Name classification

    • kaggle.com
    zip
    Updated Dec 20, 2023
    Cite
    Shubham Patel (2023). Name classification [Dataset]. https://www.kaggle.com/datasets/shubhampatel231/name-classification/discussion
    Explore at:
    Available download formats: zip (63211 bytes)
    Dataset updated
    Dec 20, 2023
    Authors
    Shubham Patel
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Name: Multilingual Names for RNN Classification

    Description:
    This dataset, sourced from PyTorch's official tutorial, comprises popular names across 18 distinct languages, namely Arabic, Chinese, Czech, Dutch, English, French, German, Greek, Irish, Italian, Japanese, Korean, Polish, Portuguese, Russian, Scottish, Spanish, and Vietnamese. Each language's names are contained in separate text files for easy extraction and categorization.

    Usage:
    The dataset is particularly useful for tasks like Recurrent Neural Network (RNN) classification, where the aim might be to predict the language origin of a given name based on its character sequence.

    Source:
    PyTorch Official Tutorial - Char RNN Classification
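The per-language file layout described above lends itself to a small loader. The following is a minimal sketch, not the tutorial's own code: it assumes one `<Language>.txt` file per language with one name per line. The directory variable `names_dir`, the helper names `load_names` and `one_hot`, and the choice of alphabet are illustrative assumptions.

```python
import string
import tempfile
from pathlib import Path

# Assumed alphabet: ASCII letters plus a few punctuation marks seen in names.
ALPHABET = string.ascii_letters + " .,;'-"


def load_names(names_dir):
    """Map each language (taken from the file stem) to its list of names."""
    names = {}
    for path in sorted(Path(names_dir).glob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        names[path.stem] = [line.strip() for line in lines if line.strip()]
    return names


def one_hot(name):
    """Encode a name as a list of one-hot rows, one row per character."""
    rows = []
    for ch in name:
        row = [0] * len(ALPHABET)
        idx = ALPHABET.find(ch)
        if idx >= 0:  # characters outside the assumed alphabet stay all-zero
            row[idx] = 1
        rows.append(row)
    return rows


# Tiny self-contained demo using made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as names_dir:
    Path(names_dir, "Irish.txt").write_text("Adam\nBrady\n", encoding="utf-8")
    Path(names_dir, "Italian.txt").write_text("Rossi\n", encoding="utf-8")
    names_by_language = load_names(names_dir)

encoded = one_hot("Adam")
```

Each name becomes a sequence of per-character vectors, which is the kind of input a character-level RNN classifier would consume; the language key serves as the class label.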

  2. Create Your image Classifier - Mabdelkarim

    • kaggle.com
    zip
    Updated Dec 20, 2020
    Cite
    Mohamed Abdelkarim (2020). Create Your image Classifier - Mabdelkarim [Dataset]. https://www.kaggle.com/mohamedabdelkarim/create-your-image-classifier-mabdelkarim
    Explore at:
    Available download formats: zip (282784 bytes)
    Dataset updated
    Dec 20, 2020
    Authors
    Mohamed Abdelkarim
    Description

    Dataset

    This dataset was created by Mohamed Abdelkarim

    Contents

  3. Tumor Classification Model

    • figshare.com
    zip
    Updated Jan 25, 2021
    Cite
    Emirhan Kurtuluş (2021). Tumor Classification Model [Dataset]. http://doi.org/10.6084/m9.figshare.13637966.v2
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 25, 2021
    Dataset provided by
    figshare
    Authors
    Emirhan Kurtuluş
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    A PyTorch-loadable model for tumor classification.

  4. Chinese-Text-Classification-Pytorch-master

    • kaggle.com
    zip
    Updated Nov 20, 2023
    Cite
    Tikadisplay (2023). Chinese-Text-Classification-Pytorch-master [Dataset]. https://www.kaggle.com/datasets/tikadisplay/chinese-text-classification-pytorch-master
    Explore at:
    Available download formats: zip (16734585 bytes)
    Dataset updated
    Nov 20, 2023
    Authors
    Tikadisplay
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Tikadisplay

    Released under Apache 2.0

    Contents

  5. Melanoma Classification PyTorch

    • kaggle.com
    zip
    Updated Aug 29, 2022
    Cite
    MohamedAmine SAIGHI (2022). Melanoma Classification PyTorch [Dataset]. https://www.kaggle.com/datasets/mohamedaminesaighi/melanoma-classification-pytorch
    Explore at:
    Available download formats: zip (82890419 bytes)
    Dataset updated
    Aug 29, 2022
    Authors
    MohamedAmine SAIGHI
    Description

    Dataset

    This dataset was created by MohamedAmine SAIGHI

    Contents

  6. mriModalityClassification.h5

    • figshare.com
    zip
    Updated Jul 12, 2023
    Cite
    Nick Tustison (2023). mriModalityClassification.h5 [Dataset]. http://doi.org/10.6084/m9.figshare.23669232.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 12, 2023
    Dataset provided by
    figshare (http://figshare.com/)
    Authors
    Nick Tustison
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    weights mri classification pytorch

  7. imagenet_pretrained_softmax_output

    • kaggle.com
    zip
    Updated Mar 22, 2025
    Cite
    my1nonly (2025). imagenet_pretrained_softmax_output [Dataset]. https://www.kaggle.com/datasets/my1nonly/imagenet-pretrained-softmax-output
    Explore at:
    Available download formats: zip (63627666676 bytes)
    Dataset updated
    Mar 22, 2025
    Authors
    my1nonly
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Softmax output when passing ImageNet-1K data (train & test sets) to PyTorch's pretrained classification models.

    AlexNet

    v1: {'acc@1': 0.56522, 'acc@5': 0.79066, 'num_params': 61.10M}

    DenseNet (121, 161, 169, 201)

    v1: {'acc@1': 0.74434, 'acc@5': 0.91972, 'num_params': 7.98M}
    v1: {'acc@1': 0.77138, 'acc@5': 0.93560, 'num_params': 28.68M}
    v1: {'acc@1': 0.75600, 'acc@5': 0.92806, 'num_params': 14.15M}
    v1: {'acc@1': 0.76896, 'acc@5': 0.93370, 'num_params': 20.01M}

    VGG (11, 13, 16, 19)

    v1: {'acc@1': 0.69020, 'acc@5': 0.88628, 'num_params': 132.86M}
    v1: {'acc@1': 0.69928, 'acc@5': 0.89246, 'num_params': 133.05M}
    v1: {'acc@1': 0.71592, 'acc@5': 0.90382, 'num_params': 138.36M}
    v1: {'acc@1': 0.72376, 'acc@5': 0.90876, 'num_params': 143.67M}
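The `acc@1` and `acc@5` figures above are top-k accuracies, which can be recomputed from stored softmax outputs. The following is a minimal sketch of that calculation in plain Python. The storage format of this dataset is not described here, so the in-memory layout (one list of class probabilities per sample), the helper name `top_k_accuracy`, and the sample values are all assumptions made for illustration.

```python
def top_k_accuracy(probabilities, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for probs, label in zip(probabilities, labels):
        # Indices of the k largest probabilities for this sample.
        top_k = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Made-up softmax rows over four classes (not taken from the dataset).
softmax_rows = [
    [0.70, 0.10, 0.15, 0.05],  # true class 0: top-1 hit
    [0.20, 0.30, 0.45, 0.05],  # true class 1: top-2 hit only
    [0.40, 0.35, 0.05, 0.20],  # true class 2: miss for k <= 3
]
true_labels = [0, 1, 2]

acc_at_1 = top_k_accuracy(softmax_rows, true_labels, k=1)
acc_at_2 = top_k_accuracy(softmax_rows, true_labels, k=2)
```

For ImageNet-1K the same function would be called with `k=1` and `k=5` over 1000-class rows.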

  8. Code for Hyperbolic Embedding Inference for Structured Multi-Label...

    • darus.uni-stuttgart.de
    Updated Jul 5, 2024
    Cite
    Bo Xiong; Mojtaba Nayyeri; Michael Cochez; Steffen Staab (2024). Code for Hyperbolic Embedding Inference for Structured Multi-Label Prediction [Dataset]. http://doi.org/10.18419/DARUS-3988
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 5, 2024
    Dataset provided by
    DaRUS
    Authors
    Bo Xiong; Mojtaba Nayyeri; Michael Cochez; Steffen Staab
    License

    https://darus.uni-stuttgart.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.18419/DARUS-3988

    Dataset funded by
    European Commission
    Description

    This is a PyTorch implementation of the paper Hyperbolic Embedding Inference for Structured Multi-Label Prediction published in NeurIPS 2022. The code provides the Python scripts to reproduce the experiments in the paper, as well as a proof-of-concept example of the method. To execute the code, follow the instructions in the README.md file. For more info, please check the paper. Please have no hesitation to contact the authors for any inquiries.

  9. Flower_classification_model

    • kaggle.com
    zip
    Updated Dec 17, 2023
    Cite
    AYUSH KUMAR SINGH (2023). Flower_classification_model [Dataset]. https://www.kaggle.com/datasets/akawizard/flower-classification-model
    Explore at:
    Available download formats: zip (435764401 bytes)
    Dataset updated
    Dec 17, 2023
    Authors
    AYUSH KUMAR SINGH
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by AYUSH KUMAR SINGH

    Released under MIT

    Contents

  10. Geospatial Deep Learning Seminar Online Course

    • ckan.americaview.org
    Updated Nov 2, 2021
    Cite
    ckan.americaview.org (2021). Geospatial Deep Learning Seminar Online Course [Dataset]. https://ckan.americaview.org/dataset/geospatial-deep-learning-seminar-online-course
    Explore at:
    Dataset updated
    Nov 2, 2021
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This seminar is an applied study of deep learning methods for extracting information from geospatial data, such as aerial imagery, multispectral imagery, digital terrain data, and other digital cartographic representations. We first provide an introduction and conceptualization of artificial neural networks (ANNs). Next, we explore appropriate loss and assessment metrics for different use cases followed by the tensor data model, which is central to applying deep learning methods. Convolutional neural networks (CNNs) are then conceptualized with scene classification use cases. Lastly, we explore semantic segmentation, object detection, and instance segmentation. The primary focus of this course is semantic segmentation for pixel-level classification.

    The associated GitHub repo provides a series of applied examples. We hope to continue to add examples as methods and technologies further develop. These examples make use of a variety of datasets (e.g., SAT-6, topoDL, Inria, LandCover.ai, vfillDL, and wvlcDL). Please see the repo for links to the data and associated papers. All examples have associated videos that walk through the process, which are also linked to the repo. A variety of deep learning architectures are explored including UNet, UNet++, DeepLabv3+, and Mask R-CNN. Currently, two examples use ArcGIS Pro and require no coding. The remaining five examples require coding and make use of PyTorch, Python, and R within the RStudio IDE. It is assumed that you have prior knowledge of coding in the Python and R environments. If you do not have experience coding, please take a look at our Open-Source GIScience and Open-Source Spatial Analytics (R) courses, which explore coding in Python and R, respectively.

    After completing this seminar you will be able to:
    • explain how ANNs work, including weights, bias, activation, and optimization.
    • describe and explain different loss and assessment metrics and determine appropriate use cases.
    • use the tensor data model to represent data as input for deep learning.
    • explain how CNNs work, including convolutional operations/layers, kernel size, stride, padding, max pooling, activation, and batch normalization.
    • use PyTorch, Python, and R to prepare data, produce and assess scene classification models, and infer to new data.
    • explain common semantic segmentation architectures, how these methods allow for pixel-level classification, and how they differ from traditional CNNs.
    • use PyTorch, Python, and R (or ArcGIS Pro) to prepare data, produce and assess semantic segmentation models, and infer to new data.

  11. Overview of deep learning terminology.

    • plos.figshare.com
    xls
    Updated Dec 5, 2024
    Cite
    Aaron E. Maxwell; Sarah Farhadpour; Srinjoy Das; Yalin Yang (2024). Overview of deep learning terminology. [Dataset]. http://doi.org/10.1371/journal.pone.0315127.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Dec 5, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Aaron E. Maxwell; Sarah Farhadpour; Srinjoy Das; Yalin Yang
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Convolutional neural network (CNN)-based deep learning (DL) methods have transformed the analysis of geospatial, Earth observation, and geophysical data due to their ability to model spatial context information at multiple scales. Such methods are especially applicable to pixel-level classification or semantic segmentation tasks. A variety of R packages have been developed for processing and analyzing geospatial data. However, there are currently no packages available for implementing geospatial DL in the R language and data science environment. This paper introduces the geodl R package, which supports pixel-level classification applied to a wide range of geospatial or Earth science data that can be represented as multidimensional arrays where each channel or band holds a predictor variable. geodl is built on the torch package, which supports the implementation of DL using the R and C++ languages without the need for installing a Python/PyTorch environment. This greatly simplifies the software environment needed to implement DL in R. Using geodl, geospatial raster-based data with varying numbers of bands, spatial resolutions, and coordinate reference systems are read and processed using the terra package, which makes use of C++ and allows for processing raster grids that are too large to fit into memory. Training loops are implemented with the luz package. The geodl package provides utility functions for creating raster masks or labels from vector-based geospatial data and image chips and associated masks from larger files and extents. It also defines a torch dataset subclass for geospatial data for use with torch dataloaders. UNet-based models are provided with a variety of optional ancillary modules or modifications. 
Common assessment metrics (i.e., overall accuracy, class-level recalls or producer’s accuracies, class-level precisions or user’s accuracies, and class-level F1-scores) are implemented along with a modified version of the unified focal loss framework, which allows for defining a variety of loss metrics using one consistent implementation and set of hyperparameters. Users can assess models using standard geospatial and remote sensing metrics and methods and use trained models to predict to large spatial extents. This paper introduces the geodl workflow, design philosophy, and goals for future development.

  12. flower_classification

    • kaggle.com
    zip
    Updated Apr 28, 2019
    Cite
    Aritra Sen (2019). flower_classification [Dataset]. https://www.kaggle.com/datasets/aritrase/flower-classification
    Explore at:
    Available download formats: zip (471790900 bytes)
    Dataset updated
    Apr 28, 2019
    Authors
    Aritra Sen
    Description

    Dataset

    This dataset was created by Aritra Sen

    Contents

  13. Mens & Womens Images for Fashion, Classification

    • gts.ai
    json
    Updated Dec 13, 2024
    + more versions
    Cite
    GTS (2024). Mens & Womens Images for Fashion, Classification [Dataset]. https://gts.ai/dataset-download/mens-womens-images-for-fashion-classification/
    Explore at:
    Available download formats: json
    Dataset updated
    Dec 13, 2024
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Fashion Gender Classification Dataset with labeled male and female images for binary classification tasks. Ideal for training, validating, and testing machine learning models using frameworks like TensorFlow, Keras, and PyTorch

  14. MountainScape Segmentation Dataset

    • search.dataone.org
    • borealisdata.ca
    Updated Dec 11, 2024
    Cite
    Mountain Legacy Project (2024). MountainScape Segmentation Dataset [Dataset]. http://doi.org/10.5683/SP3/CEYU10
    Explore at:
    Dataset updated
    Dec 11, 2024
    Dataset provided by
    Borealis
    Authors
    Mountain Legacy Project
    Time period covered
    Jan 1, 1870 - Aug 30, 2023
    Description

    This dataset contains the MountainScape Segmentation Dataset (MS2D), a collection of oblique mountain images from the Mountain Legacy Project and corresponding manually annotated land cover masks. The dataset is split into 144 historic grayscale images collected by early phototopographic surveyors and 140 modern repeat images captured by the Mountain Legacy Project. The image resolutions range from 16 to 80 megapixels and the corresponding masks are RGB images with 8 landcover classes. The image dataset was used to train and test the Python Landscape Classifier (PyLC), a trainable segmentation network and land cover classification tool for oblique landscape photography. The dataset also contains PyTorch models trained with PyLC using the collection of images and masks.

  15. Confusion matrix and class-level user’s and producer’s accuracies for...

    • plos.figshare.com
    xls
    Updated Dec 5, 2024
    Cite
    Aaron E. Maxwell; Sarah Farhadpour; Srinjoy Das; Yalin Yang (2024). Confusion matrix and class-level user’s and producer’s accuracies for landcover.ai [53] classification. [Dataset]. http://doi.org/10.1371/journal.pone.0315127.t008
    Explore at:
    Available download formats: xls
    Dataset updated
    Dec 5, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Aaron E. Maxwell; Sarah Farhadpour; Srinjoy Das; Yalin Yang
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Overall accuracy = 0.908, macro-averaged producer’s accuracy = 0.885, macro-averaged user’s accuracy = 0.770, and macro-averaged F1-score = 0.823.
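For readers unfamiliar with the remote sensing terms, producer's accuracy corresponds to class-level recall and user's accuracy to class-level precision. The following is a minimal sketch of how such summary metrics can be derived from a confusion matrix. The matrix orientation (rows are reference classes, columns are predictions), the function name `confusion_metrics`, the toy counts, and the definition of macro F1 as the mean of per-class F1-scores are assumptions of this sketch, and the source table may define its macro-averaged F1-score differently.

```python
def confusion_metrics(matrix):
    """Compute summary metrics from a square confusion matrix.

    Assumed layout: matrix[r][p] counts samples with reference class r
    and predicted class p. Every class is assumed to occur at least once
    as a reference and as a correct prediction, so no division by zero.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    producers, users, f1s = [], [], []
    for i in range(n):
        reference_total = sum(matrix[i])                       # row sum
        predicted_total = sum(matrix[r][i] for r in range(n))  # column sum
        recall = matrix[i][i] / reference_total      # producer's accuracy
        precision = matrix[i][i] / predicted_total   # user's accuracy
        producers.append(recall)
        users.append(precision)
        f1s.append(2 * precision * recall / (precision + recall))
    return {
        "overall_accuracy": correct / total,
        "macro_producers_accuracy": sum(producers) / n,
        "macro_users_accuracy": sum(users) / n,
        "macro_f1": sum(f1s) / n,
    }


# Made-up two-class matrix for illustration (not the landcover.ai results).
toy_matrix = [
    [8, 2],  # reference class 0: 8 predicted as 0, 2 predicted as 1
    [1, 9],  # reference class 1: 1 predicted as 0, 9 predicted as 1
]
metrics = confusion_metrics(toy_matrix)
```

Macro averaging weights every class equally, which is why the macro-averaged values can differ noticeably from overall accuracy when classes are imbalanced.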

  16. Model assessment metrics based on ten model replicates with different random...

    • plos.figshare.com
    xls
    Updated Dec 5, 2024
    Cite
    Aaron E. Maxwell; Sarah Farhadpour; Srinjoy Das; Yalin Yang (2024). Model assessment metrics based on ten model replicates with different random seeds and training subsets. [Dataset]. http://doi.org/10.1371/journal.pone.0315127.t007
    Explore at:
    Available download formats: xls
    Dataset updated
    Dec 5, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Aaron E. Maxwell; Sarah Farhadpour; Srinjoy Das; Yalin Yang
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Model assessment metrics based on ten model replicates with different random seeds and training subsets.

  17. processed-jigsaw-toxic-comments

    • huggingface.co
    Updated Oct 10, 2023
    Cite
    K Koushik Reddy (2023). processed-jigsaw-toxic-comments [Dataset]. https://huggingface.co/datasets/Koushim/processed-jigsaw-toxic-comments
    Explore at:
    Dataset updated
    Oct 10, 2023
    Authors
    K Koushik Reddy
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Processed Jigsaw Toxic Comments Dataset

    This is a preprocessed and tokenized version of the original Jigsaw Toxic Comment Classification Challenge dataset, prepared for multi-label toxicity classification using transformer-based models like BERT. ⚠️ Important Note: I am not the original creator of the dataset. This dataset is a cleaned and restructured version made for quick use in PyTorch deep learning models.

      📦 Dataset Features
    

    Each example contains:

    text: The… See the full description on the dataset page: https://huggingface.co/datasets/Koushim/processed-jigsaw-toxic-comments.

  18. PyTorchVGG19

    • kaggle.com
    zip
    Updated Nov 16, 2022
    Cite
    Ibrahim Salameh (2022). PyTorchVGG19 [Dataset]. https://www.kaggle.com/datasets/ibrahimsalameh/pytorchvgg19
    Explore at:
    Available download formats: zip (533106630 bytes)
    Dataset updated
    Nov 16, 2022
    Authors
    Ibrahim Salameh
    Description

    Dataset

    This dataset was created by Ibrahim Salameh

    Contents

  19. IMDB 50K Movie Reviews (TEST your BERT)

    • kaggle.com
    zip
    Updated Dec 18, 2019
    Cite
    Atul Anand {Jha} (2019). IMDB 50K Movie Reviews (TEST your BERT) [Dataset]. https://www.kaggle.com/atulanandjha/imdb-50k-movie-reviews-test-your-bert
    Explore at:
    Available download formats: zip (26933554 bytes)
    Dataset updated
    Dec 18, 2019
    Authors
    Atul Anand {Jha}
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description

    Context

    Large Movie Review Dataset v1.0 . 😃

    This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. It provides a set of 25,000 highly polar movie reviews for training and 25,000 for testing. There is additional unlabeled data for use as well. Raw text and already processed bag-of-words formats are provided.

    In the entire collection, no more than 30 reviews are allowed for any given movie because reviews for the same movie tend to have correlated ratings. Further, the train and test sets contain a disjoint set of movies, so no significant performance is obtained by memorising movie-unique terms and their association with observed labels. In the labelled train/test sets, a negative review has a score <= 4 out of 10, and a positive review has a score >= 7 out of 10. Thus reviews with more neutral ratings are not included in the train/test sets. In the unsupervised set, reviews of any rating are included and there are an even number of reviews > 5 and <= 5.
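    The labelling thresholds described above can be written as a small rule. This is a minimal sketch for illustration only: the function name `sentiment_label` and the use of the strings "pos" and "neg" with `None` for excluded scores are assumptions, not part of the dataset's tooling.

```python
def sentiment_label(score):
    """Map a 1-10 review score to the label rule described above.

    Returns "neg" for scores <= 4, "pos" for scores >= 7, and None for
    the more neutral scores that the labelled sets leave out.
    """
    if score <= 4:
        return "neg"
    if score >= 7:
        return "pos"
    return None


# Made-up scores for illustration.
labels = [sentiment_label(s) for s in (1, 4, 5, 6, 7, 10)]
```

    Scores of 5 and 6 map to `None`, mirroring the exclusion of neutral reviews from the labelled train/test sets.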

    Reference: http://ai.stanford.edu/~amaas/data/sentiment/

    NOTE

    A starter kernel is here : https://www.kaggle.com/atulanandjha/bert-testing-on-imdb-dataset-starter-kernel

    A kernel to expose Dataset collection :

    Content

    Now let’s understand the task in hand: given a movie review, predict whether it’s positive or negative.

    The dataset we use is 50,000 IMDB reviews (25K for train and 25K for test) from the PyTorch-NLP library.

    Each review is tagged pos or neg .

    There are 50% positive reviews and 50% negative reviews both in train and test sets.

    Columns:

    text : Reviews from people.

    Sentiment : Negative or Positive tag on the review/feedback (Boolean).

    Acknowledgements

    When using this Dataset Please Cite this ACL paper using :

    @InProceedings{maas-EtAl:2011:ACL-HLT2011,
      author = {Maas, Andrew L. and Daly, Raymond E. and Pham, Peter T. and Huang, Dan and Ng, Andrew Y. and Potts, Christopher},
      title = {Learning Word Vectors for Sentiment Analysis},
      booktitle = {Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies},
      month = {June},
      year = {2011},
      address = {Portland, Oregon, USA},
      publisher = {Association for Computational Linguistics},
      pages = {142--150},
      url = {http://www.aclweb.org/anthology/P11-1015}
    }

    Link to ref Dataset: https://pytorchnlp.readthedocs.io/en/latest/_modules/torchnlp/datasets/imdb.html

    https://www.samyzaf.com/ML/imdb/imdb.html

    Inspiration

    BERT and other Transformer-architecture models have attracted a lot of hype recently, thanks to the breakthrough of transfer learning in NLP. So let's use this simple yet efficient dataset to test these models and compare our results with theirs. I also invite fellow researchers to try their state-of-the-art algorithms on this dataset.

  20. Confusion matrix and derived metrics for topoDL [52] classification.

    • plos.figshare.com
    • figshare.com
    xls
    Updated Dec 5, 2024
    Cite
    Aaron E. Maxwell; Sarah Farhadpour; Srinjoy Das; Yalin Yang (2024). Confusion matrix and derived metrics for topoDL [52] classification. [Dataset]. http://doi.org/10.1371/journal.pone.0315127.t006
    Explore at:
    Available download formats: xls
    Dataset updated
    Dec 5, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Aaron E. Maxwell; Sarah Farhadpour; Srinjoy Das; Yalin Yang
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Confusion matrix and derived metrics for topoDL [52] classification.
