100+ datasets found
  1. NLP project

    • kaggle.com
    zip
    Updated Dec 21, 2024
    + more versions
    Cite
    Rawan7544 (2024). NLP project [Dataset]. https://www.kaggle.com/datasets/rawan7544/nlp-project
    Available download formats: zip (1584256901 bytes)
    Dataset updated
    Dec 21, 2024
    Authors
    Rawan7544
    Description

    Dataset

    This dataset was created by Rawan1652002

    Contents

  2. Sentiment Analysis Dataset for NLP Projects

    • kaggle.com
    zip
    Updated Nov 16, 2025
    Cite
    AlyAhmedTS13 (2025). Sentiment Analysis Dataset for NLP Projects [Dataset]. https://www.kaggle.com/datasets/alyahmedts13/reddit-sentiment-analysis-dataset-for-nlp-projects
    Available download formats: zip (1204347 bytes)
    Dataset updated
    Nov 16, 2025
    Authors
    AlyAhmedTS13
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    🕹️ About Dataset

    This dataset contains short Reddit posts (≤280 characters) about pop music and pop stars, labeled for sentiment analysis.

    We collected ~124k posts using keywords like Taylor Swift, Olivia Rodrigo, Grammy, Billboard, and subreddits like popheads, Music, and Billboard. After cleaning and filtering, we kept only short-form, English posts and combined each post’s title and body into a single text column.

    The final dataset contains more than 32,000 rows.

    Sentiment labels (positive, neutral, negative) were generated using a BERT-based model fine-tuned for social media (CardiffNLP’s Twitter RoBERTa).

    This version is ready for NLP sentiment projects — train your own model, explore pop fandom discourse, or benchmark transformer performance on real-world Reddit data.
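The preparation described above (combining title and body into one text column, then keeping posts of at most 280 characters) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the field names `title` and `body`, the space used to join them, and the order of the two steps are assumptions, and the English-language filter is left out.

```python
MAX_CHARS = 280  # length cap stated in the dataset description


def build_text(title, body):
    """Join title and body into one text field (joining with a space is an assumption)."""
    parts = [p.strip() for p in (title, body) if p and p.strip()]
    return " ".join(parts)


def keep_short(posts):
    """Yield the combined text of posts at or under MAX_CHARS characters."""
    for post in posts:
        text = build_text(post.get("title"), post.get("body"))
        if text and len(text) <= MAX_CHARS:
            yield text


# Hypothetical sample posts, not taken from the dataset.
sample_posts = [
    {"title": "New album thoughts", "body": "Loved the second track."},
    {"title": "Long rant", "body": "x" * 400},
    {"title": "Grammy predictions", "body": ""},
]
short_texts = list(keep_short(sample_posts))
```

Here the second sample post is dropped because its combined text exceeds the cap.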

  3. Nlp project dataset 4-6-2025

    • kaggle.com
    zip
    Updated Jun 4, 2025
    Cite
    Dao Xuan Tan (2025). Nlp project dataset 4-6-2025 [Dataset]. https://www.kaggle.com/datasets/daoxuantan/nlp-project-dataset-4-6-2025
    Available download formats: zip (173109225 bytes)
    Dataset updated
    Jun 4, 2025
    Authors
    Dao Xuan Tan
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset contains both true and fake news articles.

    • True Articles:

      • Sources: Reputable media outlets like Reuters, The New York Times, The Washington Post, etc.
    • Fake/Misinformation/Propaganda Articles:

      • Sources: American right-wing extremist websites (e.g., Redflag Newsdesk, Breitbart, Truth Broadcast Network)
      • Public dataset from:

        • Ahmed, H., Traore, I., & Saad, S. (2017): "Detection of Online Fake News Using N-Gram Analysis and Machine Learning Techniques" (Springer LNCS 10618)

    Columns:

    • Column 1: Index
    • Column 2: The articles
    • Column 3: The label of the article. 1 if true, 0 if fake

    Preprocessing

    The author has dropped NaN and duplicate values.
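A cleaning pass of the kind described (dropping missing values and duplicates, with label 1 for true and 0 for fake) might look like the sketch below. The column names `index`, `text`, and `label` and the sample rows are hypothetical; the listing only states the column order and meaning.

```python
import csv
import io

# Hypothetical CSV with the three columns described above:
# index, article text, label (1 = true, 0 = fake).
RAW = """index,text,label
0,Central bank holds rates steady,1
1,Aliens endorse local candidate,0
2,,1
3,Central bank holds rates steady,1
"""


def clean_rows(handle):
    """Drop rows with a missing text or label, then drop duplicate articles."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(handle):
        text = (row.get("text") or "").strip()
        label = (row.get("label") or "").strip()
        if not text or label not in {"0", "1"}:
            continue  # treat empty or invalid cells as missing values
        if text in seen:
            continue  # keep only the first copy of a repeated article
        seen.add(text)
        cleaned.append((text, int(label)))
    return cleaned


rows = clean_rows(io.StringIO(RAW))
true_count = sum(label for _, label in rows)
fake_count = len(rows) - true_count
```

Counting the two classes after cleaning is a quick way to check for label imbalance before training.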

  4. NLP Project Data

    • kaggle.com
    zip
    Updated Apr 26, 2022
    Cite
    KGopichand (2022). NLP Project Data [Dataset]. https://www.kaggle.com/datasets/kgopichand/nlp-project-data
    Available download formats: zip (393292513 bytes)
    Dataset updated
    Apr 26, 2022
    Authors
    KGopichand
    Description

    Dataset

    This dataset was created by KGopichand

    Contents

  5. NLP Project Dataset

    • kaggle.com
    zip
    Updated Nov 9, 2024
    Cite
    Nikunj Phutela (2024). NLP Project Dataset [Dataset]. https://www.kaggle.com/datasets/nikunjphutela/nlp-project-dataset/discussion?sort=undefined
    Available download formats: zip (46444 bytes)
    Dataset updated
    Nov 9, 2024
    Authors
    Nikunj Phutela
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Nikunj Phutela

    Released under MIT

    Contents

  6. NLP Project - Paraphrase Detection

    • kaggle.com
    zip
    Updated Oct 21, 2023
    Cite
    Big D Dang (2023). NLP Project - Paraphrase Detection [Dataset]. https://www.kaggle.com/datasets/bigddang/nlp-project-paraphrase-detection
    Available download formats: zip (522141 bytes)
    Dataset updated
    Oct 21, 2023
    Authors
    Big D Dang
    Description

    Dataset

    This dataset was created by Big D Dang

    Contents

  7. arxiv papers dataset for NLP project

    • kaggle.com
    zip
    Updated May 11, 2022
    Cite
    Mao Lee (2022). arxiv papers dataset for NLP project [Dataset]. https://www.kaggle.com/datasets/maolee/arxiv-papers-dataset-for-nlp-project
    Available download formats: zip (173635469 bytes)
    Dataset updated
    May 11, 2022
    Authors
    Mao Lee
    Description

    This file contains arXiv article titles, subject categories, and abstracts. NLP techniques such as topic modelling can be used to analyze the dataset.

  8. NLP PROJECT

    • kaggle.com
    zip
    Updated Nov 16, 2024
    + more versions
    Cite
    Owner (2024). NLP PROJECT [Dataset]. https://www.kaggle.com/datasets/mazens2/nlp-project/code
    Available download formats: zip (240728 bytes)
    Dataset updated
    Nov 16, 2024
    Authors
    Owner
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Owner

    Released under Apache 2.0

    Contents

  9. NLP Mental Health Conversations

    • kaggle.com
    zip
    Updated Nov 24, 2023
    Cite
    The Devastator (2023). NLP Mental Health Conversations [Dataset]. https://www.kaggle.com/datasets/thedevastator/nlp-mental-health-conversations
    Available download formats: zip (1552188 bytes)
    Dataset updated
    Nov 24, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    NLP Mental Health Conversations

    Stimulating AI-Driven Mental Health Guidance

    By Huggingface Hub [source]

    About this dataset

    This dataset contains conversations between users and experienced psychologists on mental health topics. Carefully collected and anonymized, the data can be used to develop Natural Language Processing (NLP) models that provide mental health advice and guidance. It consists of a variety of questions that help train NLP models to give users appropriate advice in response to their queries. Whether you're an AI developer building mental health applications or a therapist looking for insights into how technology helps people connect, this dataset supports work on AI-driven mental health guidance.

    How to use the dataset

    This guide will provide you with the necessary knowledge to effectively use this dataset for Natural Language Processing (NLP)-based applications.

    • Download and install the dataset: To begin using the dataset, download it from Kaggle onto your system. Once downloaded, unzip and extract the .csv file into a directory of your choice.

    • Familiarize yourself with the columns: Before working with the data, it’s important to familiarize yourself with all of its components. This dataset contains two columns - Context and Response - which are intentionally structured to produce conversations between users and psychologists related to mental health topics for NLP models dedicated to providing mental health advice and guidance.

    • Analyze data entries: Optionally, take time to review what each entry contains. This is not required for the later steps, but it can help you untangle problems that come up during them. Examine the questions asked by users and the answers provided by experts to get an overall picture of the kinds of conversations in the data, which can guide later work on NLP models for AI-driven mental health guidance.

    • Remove information not relevant to your application's goals: Keep only entries that are meaningful for the intended NLP task in a clean copy of the dataset. Consider removing verbatim duplicate entries and other unneeded content, and check that the remaining content fits the specific purpose of your end goal before proceeding.
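The steps above (unzip, read the Context and Response columns, remove verbatim duplicates) could be sketched like this. The archive member name `train.csv` and the sample rows are hypothetical, and an in-memory archive stands in for the real download so the example runs on its own.

```python
import csv
import io
import zipfile


def read_pairs(zip_bytes, member):
    """Read (Context, Response) pairs from a CSV stored inside a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member) as raw:
            reader = csv.DictReader(
                io.TextIOWrapper(raw, encoding="utf-8", newline="")
            )
            return [(row["Context"], row["Response"]) for row in reader]


def drop_verbatim_duplicates(pairs):
    """Keep the first occurrence of each exact (Context, Response) pair."""
    return list(dict.fromkeys(pairs))


# Build a tiny in-memory archive so the sketch runs without the real download.
# The file name and rows are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "train.csv",
        "Context,Response\n"
        "I feel anxious before exams.,Try slow breathing and a study plan.\n"
        "I feel anxious before exams.,Try slow breathing and a study plan.\n"
        "I can't sleep.,A regular bedtime routine may help.\n",
    )

pairs = drop_verbatim_duplicates(read_pairs(buffer.getvalue(), "train.csv"))
```

With a real download, the bytes would come from reading the downloaded zip file instead of the in-memory buffer.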

    Research Ideas

    • Creating sentence-matching algorithms for natural language processing to accurately match given questions with appropriate advice and guidance.
    • Analyzing the psychological conversations to gain insights into topics such as stress, anxiety, and depression.
    • Developing personalized natural language processing models tailored to provide users with appropriate advice based on their queries and their individual state of mental health.
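As one possible starting point for the sentence-matching idea, the sketch below picks the response whose context shares the most words with a new question, using Jaccard similarity over token sets. This is a deliberately simple baseline chosen for illustration, not a method the dataset authors describe, and the sample pairs are hypothetical.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens in the text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def best_response(question, pairs):
    """Return the response whose context overlaps most with the question."""
    q = tokens(question)
    _, response = max(pairs, key=lambda pair: jaccard(q, tokens(pair[0])))
    return response


# Hypothetical (Context, Response) pairs, not taken from the dataset.
pairs = [
    ("I feel anxious before exams", "Try slow breathing and a study plan."),
    ("I have trouble sleeping at night", "A regular bedtime routine may help."),
]
answer = best_response("Why am I so anxious before every exam?", pairs)
```

Word overlap ignores synonyms and word order, so embedding-based similarity would be a natural next step.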

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    **License: [CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication](https://creativec...

  10. cornell-nlp-project-lstm

    • kaggle.com
    zip
    Updated Apr 16, 2025
    Cite
    Cristiano Battistini (2025). cornell-nlp-project-lstm [Dataset]. https://www.kaggle.com/datasets/cristianobattistini/cornell-nlp-project-lstm
    Available download formats: zip (10042993 bytes)
    Dataset updated
    Apr 16, 2025
    Authors
    Cristiano Battistini
    Description

    Dataset

    This dataset was created by Cristiano Battistini

    Contents

  11. nlp_project_dataset

    • kaggle.com
    zip
    Updated Nov 10, 2024
    Cite
    hayfay27 (2024). nlp_project_dataset [Dataset]. https://www.kaggle.com/datasets/hayfay27/nlp-project-dataset/code
    Available download formats: zip (1906922210 bytes)
    Dataset updated
    Nov 10, 2024
    Authors
    hayfay27
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by hayfay27

    Released under Apache 2.0

    Contents

  12. dataset for nlp project

    • kaggle.com
    zip
    Updated Nov 9, 2024
    Cite
    Naman Gautam (2024). dataset for nlp project [Dataset]. https://www.kaggle.com/datasets/namang04/dataset-for-nlp-project/code
    Available download formats: zip (308714 bytes)
    Dataset updated
    Nov 9, 2024
    Authors
    Naman Gautam
    Description

    Dataset

    This dataset was created by Naman Gautam

    Contents

  13. Natural Language Processing - IntenCareer Project

    • kaggle.com
    zip
    Updated May 22, 2024
    Cite
    Nature (2024). Natural Language Processing - IntenCareer Project [Dataset]. https://www.kaggle.com/datasets/marknature/natural-language-processing-intencareer-project
    Available download formats: zip (141866520 bytes)
    Dataset updated
    May 22, 2024
    Authors
    Nature
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Task 1: Natural Language Processing (NLP) - IntenCareer Project

    Overview

    This project aims to develop an NLP model for tasks like sentiment analysis, text classification, or named entity recognition.

    Steps

    1. Project Selection: Choose a specific NLP task.
    2. Data Collection: Gather and prepare a dataset relevant to the task. (The dataset was too big to push.)
    3. Preprocessing: Clean and preprocess the text data.
    4. Model Development: Develop an NLP model using ML or DL techniques.
    5. Training and Evaluation: Train the model and evaluate its performance.
    6. Results Presentation: Present the results, including model accuracy and insights.
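Steps 3 to 5 above can be illustrated with a very small sketch. A majority-class baseline stands in for a real ML or DL model here, purely so the example stays self-contained; the labelled examples and the fixed train/test split are hypothetical.

```python
import re
from collections import Counter


def preprocess(text):
    """Step 3: lowercase, strip punctuation, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(text.split())


class MajorityBaseline:
    """Step 4 stand-in: always predicts the most common training label."""

    def fit(self, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, texts):
        return [self.label for _ in texts]


def accuracy(predicted, actual):
    """Step 5: fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical labelled examples.
data = [
    ("Great movie!!", "pos"), ("Loved it.", "pos"), ("Terrible plot", "neg"),
    ("So good", "pos"), ("Not my thing", "neg"), ("Would watch again", "pos"),
]
texts = [preprocess(t) for t, _ in data]
labels = [y for _, y in data]
train_y, test_x, test_y = labels[:4], texts[4:], labels[4:]

model = MajorityBaseline().fit(train_y)
score = accuracy(model.predict(test_x), test_y)
```

A baseline like this gives a floor that any trained model should beat before its accuracy is reported in step 6.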

    For more details, refer to the project guidelines. LinkedIn: https://www.linkedin.com/in/marknature-c/ GitHub: https://github.com/marknature/

  14. Reviews Dataset for NLP Project

    • kaggle.com
    zip
    Updated Mar 22, 2025
    Cite
    Aman J (2025). Reviews Dataset for NLP Project [Dataset]. https://www.kaggle.com/python4sp/reviews-dataset-for-nlp-project
    Available download formats: zip (2712937270 bytes)
    Dataset updated
    Mar 22, 2025
    Authors
    Aman J
    Description

    Dataset

    This dataset was created by Aman J

    Contents

  15. nlp-project

    • kaggle.com
    zip
    Updated Nov 5, 2025
    Cite
    bhavya (2025). nlp-project [Dataset]. https://www.kaggle.com/datasets/bhavya260421/nlp-project
    Available download formats: zip (1888326962 bytes)
    Dataset updated
    Nov 5, 2025
    Authors
    bhavya
    Description

    Dataset

    This dataset was created by bhavya

    Contents

  16. nlp_project_dataset

    • kaggle.com
    zip
    Updated Nov 28, 2025
    Cite
    Abylay Zhumagaliyev (2025). nlp_project_dataset [Dataset]. https://www.kaggle.com/datasets/abylayzhumagaliyev/nlp-project-dataset
    Available download formats: zip (19218 bytes)
    Dataset updated
    Nov 28, 2025
    Authors
    Abylay Zhumagaliyev
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Abylay Zhumagaliyev

    Released under Apache 2.0

    Contents

  17. nlp-project-tokenized

    • kaggle.com
    zip
    Updated Apr 22, 2023
    Cite
    WatweQwee (2023). nlp-project-tokenized [Dataset]. https://www.kaggle.com/datasets/watweqwee/nlp-project-tokenized
    Available download formats: zip (1451087385 bytes)
    Dataset updated
    Apr 22, 2023
    Authors
    WatweQwee
    Description

    The data were tokenized with the Deepcut engine of pythainlp's word_tokenize. The Token folder holds the tokens of the training data. Input to TF-IDF must also be pre-tokenized with Deepcut.
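To show what feeding pre-tokenized input into TF-IDF means, here is a small standard-library sketch that computes TF-IDF weights directly from token lists. The weighting formula (relative term frequency times the log of inverse document frequency) is one common variant chosen as an assumption, and the sample token lists are hypothetical stand-ins for tokenizer output.

```python
import math
from collections import Counter


def tfidf(docs):
    """Compute TF-IDF weights for documents that are already lists of tokens."""
    n_docs = len(docs)
    # Number of documents each token appears in.
    doc_freq = Counter(token for doc in docs for token in set(doc))
    weights = []
    for doc in docs:
        if not doc:
            weights.append({})  # an empty document has no weighted terms
            continue
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            token: (count / total) * math.log(n_docs / doc_freq[token])
            for token, count in counts.items()
        })
    return weights


# Hypothetical pre-tokenized documents; in the dataset's setting these token
# lists would come from the Deepcut-based tokenizer rather than being typed in.
docs = [["แมว", "กิน", "ปลา"], ["หมา", "กิน", "ข้าว"]]
weights = tfidf(docs)
```

Because the function never splits text itself, the same tokenizer must be applied to training and inference data for the vocabulary to line up.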

  18. Student Review dataset

    • kaggle.com
    zip
    Updated Feb 24, 2021
    Cite
    JoyEtike (2021). Student Review dataset [Dataset]. https://www.kaggle.com/atk510/student-review-dataset
    Available download formats: zip (16732898 bytes)
    Dataset updated
    Feb 24, 2021
    Authors
    JoyEtike
    Description

    Dataset

    This dataset was created by JoyEtike

    Contents

  19. nlp_project_data

    • kaggle.com
    zip
    Updated Jun 25, 2024
    Cite
    Xiaodong Shi (2024). nlp_project_data [Dataset]. https://www.kaggle.com/datasets/xiaodongshiprince/nlp-project-data/code
    Available download formats: zip (2764 bytes)
    Dataset updated
    Jun 25, 2024
    Authors
    Xiaodong Shi
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Xiaodong Shi

    Released under MIT

    Contents

  20. NLP(national language processing) Project

    • kaggle.com
    zip
    Updated Sep 18, 2022
    Cite
    HIMANSHU_SURYAVANSHI1 (2022). NLP(national language processing) Project [Dataset]. https://www.kaggle.com/datasets/himanshusuryavanshi1/nlpnational-language-processing-project
    Available download formats: zip (36701 bytes)
    Dataset updated
    Sep 18, 2022
    Authors
    HIMANSHU_SURYAVANSHI1
    Description

    Dataset

    This dataset was created by HIMANSHU_SURYAVANSHI1

    Contents
