100+ datasets found

NLP project
kaggle.com
zip
Updated Dec 21, 2024
+ more versions
Cite
Rawan7544 (2024). NLP project [Dataset]. https://www.kaggle.com/datasets/rawan7544/nlp-project
Explore at:
Available download formats: zip (1584256901 bytes)
Dataset updated
Dec 21, 2024
Authors
Rawan7544
Description
Dataset

This dataset was created by Rawan1652002

Contents
Sentiment Analysis Dataset for NLP Projects
kaggle.com
zip
Updated Nov 16, 2025
Cite
AlyAhmedTS13 (2025). Sentiment Analysis Dataset for NLP Projects [Dataset]. https://www.kaggle.com/datasets/alyahmedts13/reddit-sentiment-analysis-dataset-for-nlp-projects
Explore at:
Available download formats: zip (1204347 bytes)
Dataset updated
Nov 16, 2025
Authors
AlyAhmedTS13
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
🕹️ About Dataset

This dataset contains short Reddit posts (≤280 characters) about pop music and pop stars, labeled for sentiment analysis.

We collected ~124k posts using keywords like Taylor Swift, Olivia Rodrigo, Grammy, Billboard, and subreddits like popheads, Music, and Billboard. After cleaning and filtering, we kept only short-form, English posts and combined each post’s title and body into a single text column.

The final dataset has more than 32,000 rows.

Sentiment labels (positive, neutral, negative) were generated using a BERT-based model fine-tuned for social media (CardiffNLP’s Twitter RoBERTa).

This version is ready for NLP sentiment projects — train your own model, explore pop fandom discourse, or benchmark transformer performance on real-world Reddit data.
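The cleaning described above (joining title and body into one text column and keeping only posts of at most 280 characters) can be sketched roughly as follows. This is an illustrative sketch, not the authors' actual pipeline: the field names `title` and `body`, the sample posts, and the sample labels are all assumptions made up for the example.

```python
from collections import Counter

MAX_CHARS = 280  # length cap stated in the dataset description


def build_text(title, body):
    """Join a post's title and body into one text field, skipping empty parts."""
    parts = [p.strip() for p in (title, body) if p and p.strip()]
    return " ".join(parts)


def keep_short(posts, max_chars=MAX_CHARS):
    """Return the combined texts that fit within max_chars."""
    texts = (build_text(p.get("title"), p.get("body")) for p in posts)
    return [t for t in texts if t and len(t) <= max_chars]


# Hypothetical raw posts; the field names are assumptions for illustration.
raw_posts = [
    {"title": "New album is out", "body": "Honestly my favorite so far."},
    {"title": "Long rant", "body": "x" * 400},
    {"title": "Grammy predictions?", "body": ""},
]
short_texts = keep_short(raw_posts)

# Hypothetical labels standing in for the model-generated sentiment column.
labels = ["positive", "neutral"]
label_counts = Counter(labels)
```

Checking `label_counts` on the real label column is a quick way to see how balanced the three sentiment classes are before training.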
Nlp project dataset 4-6-2025
kaggle.com
zip
Updated Jun 4, 2025
Cite
Dao Xuan Tan (2025). Nlp project dataset 4-6-2025 [Dataset]. https://www.kaggle.com/datasets/daoxuantan/nlp-project-dataset-4-6-2025
Explore at:
Available download formats: zip (173109225 bytes)
Dataset updated
Jun 4, 2025
Authors
Dao Xuan Tan
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This dataset contains both true and fake news articles.

True Articles:

Sources: Reputable media outlets like Reuters, The New York Times, The Washington Post, etc.

Fake/Misinformation/Propaganda Articles:

Sources: American right-wing extremist websites (e.g., Redflag Newsdesk, Breitbart, Truth Broadcast Network)

Public dataset from:

Ahmed, H., Traore, I., & Saad, S. (2017): "Detection of Online Fake News Using N-Gram Analysis and Machine Learning Techniques" (Springer LNCS 10618)

Columns:

Column 1: Index

Column 2: The articles

Column 3: The label of the article. 1 if true, 0 if fake

Preprocess

The author has dropped NaN and duplicate values.
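For readers who want to re-check the three-column layout and the NaN/duplicate cleaning on their own copy, a minimal sketch follows. The header names `index`, `text`, and `label` and the sample rows are assumptions; the real file may name its columns differently.

```python
import csv
import io


def clean_rows(rows):
    """Drop rows with missing text or label, then drop duplicate articles."""
    seen = set()
    cleaned = []
    for row in rows:
        text = (row.get("text") or "").strip()
        label = (row.get("label") or "").strip()
        if not text or label not in {"0", "1"}:
            continue  # missing or malformed value
        if text in seen:
            continue  # duplicate article
        seen.add(text)
        cleaned.append({"text": text, "label": int(label)})
    return cleaned


# Tiny in-memory stand-in for the real file; header names are assumptions.
sample_csv = io.StringIO(
    "index,text,label\n"
    "0,Senate passes budget bill,1\n"
    "1,Aliens endorse candidate,0\n"
    "2,Senate passes budget bill,1\n"
    "3,,1\n"
)
articles = clean_rows(csv.DictReader(sample_csv))
true_count = sum(a["label"] for a in articles)  # label 1 = true
fake_count = len(articles) - true_count         # label 0 = fake
```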
NLP Project Data
kaggle.com
zip
Updated Apr 26, 2022
Cite
KGopichand (2022). NLP Project Data [Dataset]. https://www.kaggle.com/datasets/kgopichand/nlp-project-data
Explore at:
Available download formats: zip (393292513 bytes)
Dataset updated
Apr 26, 2022
Authors
KGopichand
Description
Dataset

This dataset was created by KGopichand

Contents
NLP Project Dataset
kaggle.com
zip
Updated Nov 9, 2024
Cite
Nikunj Phutela (2024). NLP Project Dataset [Dataset]. https://www.kaggle.com/datasets/nikunjphutela/nlp-project-dataset/discussion?sort=undefined
Explore at:
Available download formats: zip (46444 bytes)
Dataset updated
Nov 9, 2024
Authors
Nikunj Phutela
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset

This dataset was created by Nikunj Phutela

Released under MIT

Contents
NLP Project - Paraphrase Detection
kaggle.com
zip
Updated Oct 21, 2023
Cite
Big D Dang (2023). NLP Project - Paraphrase Detection [Dataset]. https://www.kaggle.com/datasets/bigddang/nlp-project-paraphrase-detection
Explore at:
Available download formats: zip (522141 bytes)
Dataset updated
Oct 21, 2023
Authors
Big D Dang
Description
Dataset

This dataset was created by Big D Dang

Contents
arxiv papers dataset for NLP project
kaggle.com
zip
Updated May 11, 2022
Cite
Mao Lee (2022). arxiv papers dataset for NLP project [Dataset]. https://www.kaggle.com/datasets/maolee/arxiv-papers-dataset-for-nlp-project
Explore at:
Available download formats: zip (173635469 bytes)
Dataset updated
May 11, 2022
Authors
Mao Lee
Description
This file contains arXiv article titles, subject categories, and abstracts. NLP techniques such as topic modelling can be used to analyze the dataset.
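Before fitting a real topic model, a cheap first look is to count the most frequent words per subject category. The sketch below does only that; it is not a topic model. The record keys `category`, `title`, and `abstract`, the stopword list, and the sample records are assumptions for illustration.

```python
import re
from collections import Counter, defaultdict

# Small illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "we", "with"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def top_terms_by_category(records, k=3):
    """Return the k most frequent terms for each subject category."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec["category"]].update(tokenize(rec["title"] + " " + rec["abstract"]))
    return {cat: [w for w, _ in c.most_common(k)] for cat, c in counts.items()}


# Hypothetical records; the key names are assumptions about the file layout.
records = [
    {"category": "cs.CL", "title": "Neural translation",
     "abstract": "We study translation with attention for translation quality."},
    {"category": "math.PR", "title": "Random walks",
     "abstract": "Random walks on random graphs."},
]
top_terms = top_terms_by_category(records)
```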
NLP PROJECT
kaggle.com
zip
Updated Nov 16, 2024
+ more versions
Cite
Owner (2024). NLP PROJECT [Dataset]. https://www.kaggle.com/datasets/mazens2/nlp-project/code
Explore at:
Available download formats: zip (240728 bytes)
Dataset updated
Nov 16, 2024
Authors
Owner
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset

This dataset was created by Owner

Released under Apache 2.0

Contents
NLP Mental Health Conversations
kaggle.com
zip
Updated Nov 24, 2023
Cite
The Devastator (2023). NLP Mental Health Conversations [Dataset]. https://www.kaggle.com/datasets/thedevastator/nlp-mental-health-conversations
Explore at:
Available download formats: zip (1552188 bytes)
Dataset updated
Nov 24, 2023
Authors
The Devastator
License
CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)
Description
NLP Mental Health Conversations

Stimulating AI-Driven Mental Health Guidance

By Huggingface Hub [source]

About this dataset

This dataset contains conversations between users and experienced psychologists on mental health topics. Carefully collected and anonymized, the data can be used to develop Natural Language Processing (NLP) models that provide mental health advice and guidance. It includes a variety of questions that can help train NLP models to respond to user queries with appropriate advice. Whether you are an AI developer building mental health applications or a therapist looking for insight into how technology helps people connect, this dataset supports work on understanding human relationships through artificial intelligence.


How to use the dataset

This guide will provide you with the necessary knowledge to effectively use this dataset for Natural Language Processing (NLP)-based applications.

Download and install the dataset: To begin using the dataset, download it from Kaggle onto your system. Once downloaded, unzip and extract the .csv file into a directory of your choice.

Familiarize yourself with the columns: Before working with the data, it is important to understand its components. This dataset contains two columns, Context and Response, which together form conversations between users and psychologists on mental health topics, for use by NLP models that provide mental health advice and guidance.

Analyze data entries: If you wish, take time now to look at what each entry contains. This step is optional for most of what follows, but it may help you untangle problems that come up later. Examine the questions asked by users and the answers provided by experts to get an overall picture of the kinds of conversations in the data, which can guide later work on NLP models for AI-driven mental health guidance.

Remove information that is not relevant to your application goals: Keep only entries that serve your intended use in a clean copy of the dataset. Consider removing verbatim duplicate entries and other unneeded items, and make sure the remaining content fits the purpose of your end goal before proceeding.
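The steps above (open the downloaded archive, read the Context and Response columns, drop verbatim duplicates) can be sketched with the standard library alone. The file name `train.csv` and the sample rows are assumptions; check the actual archive for the real CSV name. The sketch builds a tiny in-memory zip so it runs without the download.

```python
import csv
import io
import zipfile


def load_pairs(zip_bytes, csv_name):
    """Read (Context, Response) pairs from a CSV stored inside a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(csv_name) as raw:
            text_stream = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text_stream)
            return [(r["Context"].strip(), r["Response"].strip()) for r in reader]


def drop_exact_duplicates(pairs):
    """Keep the first occurrence of each complete (Context, Response) pair."""
    seen, unique = set(), []
    for pair in pairs:
        if all(pair) and pair not in seen:  # skip pairs with an empty side
            seen.add(pair)
            unique.append(pair)
    return unique


# Build a tiny in-memory archive; "train.csv" is a hypothetical file name.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "train.csv",
        "Context,Response\n"
        "I feel anxious,Try slow breathing\n"
        "I feel anxious,Try slow breathing\n"
        "I cannot sleep,\n",
    )
conversation_pairs = drop_exact_duplicates(load_pairs(buffer.getvalue(), "train.csv"))
```

For the real download, `zip_bytes` would be the contents of the archive file read in binary mode.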

Research Ideas

Creating sentence-matching algorithms for natural language processing to accurately match given questions with appropriate advice and guidance.

Analyzing the psychological conversations to gain insights into topics such as stress, anxiety, and depression.

Developing personalized natural language processing models that give users appropriate advice based on their queries and their individual state of mental health.
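As a rough illustration of the sentence-matching idea, the sketch below picks the stored Context whose word set overlaps most with a new question, using Jaccard similarity. This is a deliberately simple baseline chosen for the example, not a method the dataset authors describe, and the sample pairs are made up.

```python
import re


def tokens(text):
    """Return the set of lowercase word tokens in text."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(question, pairs):
    """Return the (context, response) pair whose context best matches question."""
    q = tokens(question)
    return max(pairs, key=lambda pair: jaccard(q, tokens(pair[0])))


# Hypothetical (Context, Response) pairs for illustration only.
sample_pairs = [
    ("I feel anxious before exams", "Preparation and breathing exercises can help."),
    ("I cannot sleep at night", "A regular bedtime routine may help."),
]
matched_context, matched_response = best_match(
    "Why can I not sleep at night?", sample_pairs
)
```

Word overlap ignores meaning, so a sentence-embedding model would likely match paraphrases better; the structure of `best_match` stays the same if `jaccard` is swapped for another similarity function.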

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

**License: [CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication](https://creativec...
cornell-nlp-project-lstm
kaggle.com
zip
Updated Apr 16, 2025
Cite
Cristiano Battistini (2025). cornell-nlp-project-lstm [Dataset]. https://www.kaggle.com/datasets/cristianobattistini/cornell-nlp-project-lstm
Explore at:
Available download formats: zip (10042993 bytes)
Dataset updated
Apr 16, 2025
Authors
Cristiano Battistini
Description
Dataset

This dataset was created by Cristiano Battistini

Contents
nlp_project_dataset
kaggle.com
zip
Updated Nov 10, 2024
Cite
hayfay27 (2024). nlp_project_dataset [Dataset]. https://www.kaggle.com/datasets/hayfay27/nlp-project-dataset/code
Explore at:
Available download formats: zip (1906922210 bytes)
Dataset updated
Nov 10, 2024
Authors
hayfay27
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset

This dataset was created by hayfay27

Released under Apache 2.0

Contents
dataset for nlp project
kaggle.com
zip
Updated Nov 9, 2024
Cite
Naman Gautam (2024). dataset for nlp project [Dataset]. https://www.kaggle.com/datasets/namang04/dataset-for-nlp-project/code
Explore at:
Available download formats: zip (308714 bytes)
Dataset updated
Nov 9, 2024
Authors
Naman Gautam
Description
Dataset

This dataset was created by Naman Gautam

Contents
Natural Language Processing - IntenCareer Project
kaggle.com
zip
Updated May 22, 2024
Cite
Nature (2024). Natural Language Processing - IntenCareer Project [Dataset]. https://www.kaggle.com/datasets/marknature/natural-language-processing-intencareer-project
Explore at:
Available download formats: zip (141866520 bytes)
Dataset updated
May 22, 2024
Authors
Nature
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Task 1: Natural Language Processing (NLP) - IntenCareer Project

Overview

This project aims to develop an NLP model for tasks like sentiment analysis, text classification, or named entity recognition.

Steps

Project Selection: Choose a specific NLP task.

Data Collection: Gather and prepare a dataset relevant to the task. The dataset was too big to push.

Preprocessing: Clean and preprocess the text data.

Model Development: Develop an NLP model using ML or DL techniques.

Training and Evaluation: Train the model and evaluate its performance.

Results Presentation: Present the results, including model accuracy and insights.
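The preprocessing, model development, and evaluation steps above can be sketched end to end on toy data. The project description does not fix a task or a model, so this example assumes text classification with a small multinomial Naive Bayes classifier written from scratch; the training sentences and the `pos`/`neg` labels are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def preprocess(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(preprocess(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in preprocess(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


def accuracy(model, texts, labels):
    """Fraction of texts whose predicted label equals the true label."""
    hits = sum(model.predict(t) == y for t, y in zip(texts, labels))
    return hits / len(labels)


# Toy data invented for this sketch.
train_texts = ["great movie loved it", "wonderful acting great plot",
               "terrible film hated it", "boring plot awful acting"]
train_labels = ["pos", "pos", "neg", "neg"]
test_texts = ["loved the wonderful plot", "awful boring film"]
test_labels = ["pos", "neg"]

model = NaiveBayes().fit(train_texts, train_labels)
test_accuracy = accuracy(model, test_texts, test_labels)
```

Accuracy on two hand-picked sentences says nothing about real performance; on an actual dataset the same `accuracy` helper would be applied to a held-out split.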

For more details, refer to the project guidelines. LinkedIn: https://www.linkedin.com/in/marknature-c/ GitHub: https://github.com/marknature/
Reviews Dataset for NLP Project
kaggle.com
zip
Updated Mar 22, 2025
Cite
Aman J (2025). Reviews Dataset for NLP Project [Dataset]. https://www.kaggle.com/python4sp/reviews-dataset-for-nlp-project
Explore at:
Available download formats: zip (2712937270 bytes)
Dataset updated
Mar 22, 2025
Authors
Aman J
Description
Dataset

This dataset was created by Aman J

Contents
nlp-project
kaggle.com
zip
Updated Nov 5, 2025
Cite
bhavya (2025). nlp-project [Dataset]. https://www.kaggle.com/datasets/bhavya260421/nlp-project
Explore at:
Available download formats: zip (1888326962 bytes)
Dataset updated
Nov 5, 2025
Authors
bhavya
Description
Dataset

This dataset was created by bhavya

Contents
nlp_project_dataset
kaggle.com
zip
Updated Nov 28, 2025
Cite
Abylay Zhumagaliyev (2025). nlp_project_dataset [Dataset]. https://www.kaggle.com/datasets/abylayzhumagaliyev/nlp-project-dataset
Explore at:
Available download formats: zip (19218 bytes)
Dataset updated
Nov 28, 2025
Authors
Abylay Zhumagaliyev
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset

This dataset was created by Abylay Zhumagaliyev

Released under Apache 2.0

Contents
nlp-project-tokenized
kaggle.com
zip
Updated Apr 22, 2023
Cite
WatweQwee (2023). nlp-project-tokenized [Dataset]. https://www.kaggle.com/datasets/watweqwee/nlp-project-tokenized
Explore at:
Available download formats: zip (1451087385 bytes)
Dataset updated
Apr 22, 2023
Authors
WatweQwee
Description
The data were tokenized with the Deepcut engine of PyThaiNLP's word_tokenize. The Token folder contains the tokens of the training data. Text used for TF-IDF must also be pre-tokenized with Deepcut.
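A minimal sketch of this workflow follows: tokenize with PyThaiNLP's `word_tokenize` using `engine="deepcut"`, then compute TF-IDF over the resulting token lists. It assumes the optional `pythainlp` and `deepcut` packages are installed for the tokenizer; the import is done lazily so the TF-IDF helper runs without them. The TF-IDF formula here (term frequency times log of inverse document frequency) is one common variant chosen for the example, and the two sample documents are hand-tokenized stand-ins, not data from this dataset.

```python
import math
from collections import Counter


def deepcut_tokenize(text):
    """Tokenize Thai text with PyThaiNLP's word_tokenize and the deepcut engine.

    Needs the pythainlp and deepcut packages; imported lazily so the
    TF-IDF helper below can be used without them.
    """
    from pythainlp.tokenize import word_tokenize
    return word_tokenize(text, engine="deepcut")


def tfidf(token_docs):
    """Compute TF-IDF weights for documents that are already token lists."""
    n_docs = len(token_docs)
    doc_freq = Counter(tok for doc in token_docs for tok in set(doc))
    weights = []
    for doc in token_docs:
        tf = Counter(doc)
        weights.append({
            tok: (count / len(doc)) * math.log(n_docs / doc_freq[tok])
            for tok, count in tf.items()
        })
    return weights


# Hand-tokenized stand-in documents (hypothetical); real input would come
# from deepcut_tokenize(...) applied to each raw Thai string.
docs = [["แมว", "กิน", "ปลา"], ["หมา", "กิน", "ข้าว"]]
weights = tfidf(docs)
```

A token that appears in every document, such as "กิน" in the sample, gets weight zero under this formula, which is the usual reason for applying the same tokenizer to both training and query text.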
Student Review dataset
kaggle.com
zip
Updated Feb 24, 2021
Cite
JoyEtike (2021). Student Review dataset [Dataset]. https://www.kaggle.com/atk510/student-review-dataset
Explore at:
Available download formats: zip (16732898 bytes)
Dataset updated
Feb 24, 2021
Authors
JoyEtike
Description
Dataset

This dataset was created by JoyEtike

Contents
nlp_project_data
kaggle.com
zip
Updated Jun 25, 2024
Cite
Xiaodong Shi (2024). nlp_project_data [Dataset]. https://www.kaggle.com/datasets/xiaodongshiprince/nlp-project-data/code
Explore at:
Available download formats: zip (2764 bytes)
Dataset updated
Jun 25, 2024
Authors
Xiaodong Shi
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset

This dataset was created by Xiaodong Shi

Released under MIT

Contents
NLP(national language processing) Project
kaggle.com
zip
Updated Sep 18, 2022
Cite
HIMANSHU_SURYAVANSHI1 (2022). NLP(national language processing) Project [Dataset]. https://www.kaggle.com/datasets/himanshusuryavanshi1/nlpnational-language-processing-project
Explore at:
Available download formats: zip (36701 bytes)
Dataset updated
Sep 18, 2022
Authors
HIMANSHU_SURYAVANSHI1
Description
Dataset

This dataset was created by HIMANSHU_SURYAVANSHI1

Contents
