MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset consists of daily-updated user reviews and ratings for the ChatGPT Android App. The dataset includes several key attributes that capture various aspects of the reviews, providing insights into user experiences and feedback over time.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
We have compiled a dataset of textual articles covering common terminology, concepts and definitions in the fields of computer science, artificial intelligence, and cybersecurity. The dataset contains both human-generated text and text generated by OpenAI's ChatGPT. Human-generated answers were collected from computer science dictionaries and encyclopedias, including "The Encyclopedia of Computer Science and Technology" and "Encyclopedia of Human-Computer Interaction". AI-generated content was produced by posting questions to OpenAI's ChatGPT and manually documenting the resulting responses. A rigorous data-cleaning process was performed to remove unwanted Unicode characters and styling and formatting tags. To structure the dataset for binary classification, we combined the AI-generated and human-generated answers into a single column and assigned a label to each data point (Human-generated = 0 and AI-generated = 1).
This creates our article-level dataset (article_level_data.csv) which consists of a total of 1018 articles, 509 AI-generated and 509 Human-generated. Additionally, we have divided each article into its sentences and labelled them accordingly. This is mainly to evaluate the performance of classification models and pipelines when it comes to shorter sentence-level data points. This constructs our sentence-level dataset (sentence_level_data.csv) which consists of a total of 7344 entries (4008 AI-generated and 3336 Human-generated).
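The step from article-level to sentence-level rows can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the example texts are invented, the regex splitter is a naive stand-in for whatever sentence segmentation was actually used, and `split_sentences` is a hypothetical helper name.

```python
import re

# Hypothetical article-level rows: (text, label), with 0 = human and 1 = AI,
# following the labelling scheme described above. The texts are invented.
articles = [
    ("A compiler translates source code. It emits machine code.", 0),
    ("A firewall filters network traffic. It enforces rules. It logs events.", 1),
]

def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

# Each sentence inherits the label of the article it came from.
sentence_rows = [
    (sentence, label)
    for text, label in articles
    for sentence in split_sentences(text)
]

# Class balance at the sentence level.
counts = {0: 0, 1: 0}
for _, label in sentence_rows:
    counts[label] += 1
print(counts)
```

Because articles differ in length, the sentence-level class balance need not match the article-level balance, which is consistent with the 4008 / 3336 split reported above.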
If you use this dataset in a scientific publication, we would appreciate a citation of the following article:
Maktab Dar Oghaz, M., Dhame, K., Singaram, G., & Babu Saheer, L. (2023). Detection and Classification of ChatGPT Generated Contents Using Deep Transformer Models. Frontiers in Artificial Intelligence.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents ChatGPT usage patterns across different age groups, showing the percentage of users who have followed its advice, used it without following advice, or have never used it, based on a 2025 U.S. survey.
https://creativecommons.org/publicdomain/zero/1.0/
The dataset consists of user reviews of ChatGPT, including textual feedback, ratings, and review dates. The reviews range from brief comments to more detailed feedback, covering a wide range of user sentiments. Ratings are on a scale of 1 to 5, representing varying levels of satisfaction. The dataset spans multiple months, providing a temporal dimension for analysis. Each review is accompanied by a timestamp, allowing for time-series analysis of sentiment trends.
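As a rough illustration of the time-series use mentioned above, the sketch below averages ratings per calendar month. The rows are invented and the timestamp format is an assumption; the real file's column names and date format may differ.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical reviews: (timestamp, rating on the 1-5 scale). Values are invented.
reviews = [
    ("2023-05-03 10:15:00", 5),
    ("2023-05-20 18:40:00", 3),
    ("2023-06-02 09:05:00", 4),
    ("2023-06-28 21:30:00", 2),
    ("2023-06-30 07:00:00", 3),
]

# Group ratings by calendar month (YYYY-MM).
by_month = defaultdict(list)
for timestamp, rating in reviews:
    month = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").strftime("%Y-%m")
    by_month[month].append(rating)

# Average rating per month, in chronological order.
monthly_mean = {month: mean(ratings) for month, ratings in sorted(by_month.items())}
for month, value in monthly_mean.items():
    print(month, value)
```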
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset shows the types of advice users sought from ChatGPT based on a 2025 U.S. survey, including education, financial, medical, and legal topics.
The rapid advancements in generative AI models present new opportunities in the education sector. However, it is imperative to acknowledge and address the potential risks and concerns that may arise with their use. We collected Twitter data to identify key concerns related to the use of ChatGPT in education. This dataset is used to support the study "ChatGPT in education: A discourse analysis of worries and concerns on social media."
In this study, we particularly explored two research questions. RQ1 (Concerns): What are the key concerns that Twitter users perceive with using ChatGPT in education? RQ2 (Accounts): Which accounts are implicated in the discussion of these concerns? In summary, our study underscores the importance of responsible and ethical use of AI in education and highlights the need for collaboration among stakeholders to regulate AI policy.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains academic and behavioral information of computer science students, including their CGPA, ChatGPT usage patterns, and an evaluated aptitude score. It is designed to study the correlation between AI tool usage and critical thinking ability.
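A correlation study of this kind typically starts from a Pearson coefficient between a usage measure and the aptitude score. The sketch below shows only the computation; `pearson` is a hypothetical helper, the variable names are assumptions rather than the dataset's actual columns, and the numbers are invented placeholders that say nothing about the real data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Placeholder values only: weekly usage hours and aptitude scores are invented.
usage_hours = [2, 4, 6, 8, 10]
aptitude_score = [55, 70, 62, 68, 60]

r = pearson(usage_hours, aptitude_score)
print(f"Pearson r = {r:.3f}")
```

A coefficient near zero would indicate no linear relationship; the sign and size found in the real data would still need a significance test and attention to confounders such as CGPA.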
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset shows how men and women in the U.S. reported using ChatGPT in a 2025 survey, including whether they followed its advice or chose not to use it.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all available conversations from chatlogs.net between users and ChatGPT. Version 1 contains all conversations available up to the cutoff date of April 4, 2023. Version 2 contains all conversations available up to the cutoff date of April 20, 2023.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents how much users trust ChatGPT across different advice categories, including career, education, financial, legal, and medical advice, based on a 2025 U.S. survey.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset for this research project was meticulously constructed to investigate the adoption of ChatGPT among students in the United States. The primary objective was to gain insights into the technological barriers and resistances faced by students in integrating ChatGPT into their information systems. The dataset was designed to capture the diverse adoption patterns among students in various public and private schools and universities across the United States. By examining adoption rates, frequency of usage, and the contexts in which ChatGPT is employed, the research sought to provide a comprehensive understanding of how students are incorporating this technology into their information systems. Moreover, by including participants from diverse educational institutions, the research sought to ensure a comprehensive representation of the student population in the United States. This approach aimed to provide nuanced insights into how factors such as educational background, institution type, and technological familiarity influence ChatGPT adoption.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset summarizes how ChatGPT users rated the outcomes of the advice they received, including whether it was helpful, harmful, neutral, or uncertain, based on a 2025 U.S. survey.
Don't forget to upvote, comment, and follow if you are using this dataset. If you have any questions about the dataset I uploaded, feel free to leave them in the comments. Thank you! :)
Column Descriptions (English)
1. reviewId: A unique ID for each user review.
2. userName: The name of the user who submitted the review.
3. userImage: The URL of the user's profile picture.
4. content: The text content of the review provided by the user.
5. score: The review score given by the user, typically on a scale of 1-5.
6. thumbsUpCount: The number of likes (thumbs up) received by the review.
7. reviewCreatedVersion: The app version used by the user when creating the review (not always available).
8. at: The date and time when the review was submitted.
9. replyContent: The developer's response to the review (no data available in this column).
10. repliedAt: The date and time when the developer's response was submitted (no data available in this column).
11. appVersion: The app version used by the user when submitting the review (not always available).
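A short pandas sketch of how these columns might be used together. The column names come from the descriptions above; the rows, including the version strings, are invented for illustration, and only a subset of the columns is shown.

```python
import pandas as pd

# Invented rows using a subset of the documented column names.
df = pd.DataFrame(
    {
        "reviewId": ["r1", "r2", "r3", "r4"],
        "content": ["Great app", "Crashes often", "Useful", "Okay"],
        "score": [5, 1, 4, 3],
        "thumbsUpCount": [10, 25, 0, 2],
        "at": [
            "2024-01-05 08:00:00",
            "2024-01-06 12:30:00",
            "2024-02-01 09:15:00",
            "2024-02-03 17:45:00",
        ],
        "appVersion": ["1.2023.1", None, "1.2024.2", "1.2024.2"],
    }
)

# Parse the submission timestamp so it can be used for date filtering.
df["at"] = pd.to_datetime(df["at"])

# appVersion is not always available, so count the gaps before relying on it.
missing_versions = int(df["appVersion"].isna().sum())

# Mean score per app version, ignoring reviews with no recorded version.
score_by_version = (
    df.dropna(subset=["appVersion"]).groupby("appVersion")["score"].mean()
)

# Text of the review with the most thumbs up.
top_review = df.loc[df["thumbsUpCount"].idxmax(), "content"]

print(missing_versions)
print(score_by_version)
print(top_review)
```

Since replyContent and repliedAt carry no data, they can be dropped before analysis without losing information.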
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Most Used Features and Usage Statistics
By September 2025, ChatGPT boasts ~800 million weekly users and 355 million monthly active users, with 2.6 billion daily messages across 700 million users. It's the #1 AI tool globally (60.6% market share), especially among under-25s (45% of users). Daily usage: 9% of 18-24-year-olds.
Top use cases (from OpenAI's 2025 user study):
- Everyday Productivity (52% of sessions): Email drafting, brainstorming, task lists.
- Content Creation/Writing (28%): Essays, social media, code snippets.
- Research & Learning (15%): Summaries, explanations, tutoring.
- Coding/Development (12%): Debugging, automation scripts.
- Creative Tools (e.g., Image Gen in GPT-4o/5, 10%): Art, voice chats.
Enterprise adoption hit 80% of Fortune 500 companies by mid-2025, with mobile apps driving 40% of traffic.
Efficiency Metrics
Efficiency in ChatGPT refers to speed, cost, productivity gains, and resource use. GPT-5 processes queries 3x faster than GPT-4o (under 1s for simple tasks) while using 50% less compute. Key metrics:
https://creativecommons.org/publicdomain/zero/1.0/
Chatbots are AI-powered programs designed to replicate human conversation. They are capable of performing a wide range of tasks, including answering questions, offering directions, controlling smart home thermostats, and playing music, among other functions. ChatGPT is a popular AI-based chatbot that generates meaningful responses to queries, aiding people in learning. While some individuals support ChatGPT, others view it as a disruptive tool in the field of education. Discussions about this tool can be found across different social media platforms. Analyzing the sentiment of such social media data, which comprises people’s opinions, is crucial for assessing public sentiment regarding the success and shortcomings of such tools. This study performs sentiment analysis and topic modeling on ChatGPT-based tweets. The ChatGPT-based tweets were extracted from Twitter by the authors using ChatGPT hashtags; in them, users share their reviews and opinions about ChatGPT. The Latent Dirichlet Allocation (LDA) approach is employed to identify the most frequently discussed topics in the ChatGPT tweets. For the sentiment analysis, a deep transformer-based Bidirectional Encoder Representations from Transformers (BERT) model with three dense layers is proposed. Additionally, machine learning and deep learning models with fine-tuned parameters are used for a comparative analysis. Experimental results demonstrate the superior performance of the proposed BERT model, which achieves an accuracy of 96.49%.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset reflects how Americans perceive ChatGPT's broader societal impact, based on a 2025 survey that asked whether the AI will help or harm humanity.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
A comprehensive, research-grade dataset capturing the adoption, usage, and impact of leading AI tools—such as ChatGPT, Midjourney, Stable Diffusion, Bard, and Claude—across multiple industries, countries, and user demographics. This dataset is designed for advanced analytics, machine learning, natural language processing, and business intelligence applications.
This dataset provides a panoramic view of how AI technologies are transforming business, industry, and society worldwide. Drawing inspiration from real-world adoption surveys, academic research, and industry reports, it enables users to:
Column Descriptions
| Column Name | Description |
|---|---|
| country | Country where the organization or user is located (e.g., USA, India, China, etc.) |
| industry | Industry sector of the organization (e.g., Technology, Healthcare, Retail, etc.) |
| ai_tool | Name of the AI tool used (e.g., ChatGPT, Midjourney, Bard, Stable Diffusion, Claude) |
| adoption_rate | Percentage representing the adoption rate of the AI tool within the sector or company (0–100) |
| daily_active_users | Estimated number of daily active users for the AI tool in the given context |
| year | Year in which the data was recorded (2023 or 2024) |
| user_feedback | Free-text feedback from users about their experience with the AI tool (up to 150 characters) |
| age_group | Age group of users (e.g., 18-24, 25-34, 35-44, 45-54, 55+) |
| company_size | Size category of the organization (Startup, SME, Enterprise) |
country,industry,ai_tool,adoption_rate,daily_active_users,year,user_feedback,age_group,company_size
USA,Technology,ChatGPT,78.5,5423,2024,"Great productivity boost for our team!",25-34,Enterprise
India,Healthcare,Midjourney,62.3,2345,2024,"Improved patient engagement and workflow.",35-44,SME
Germany,Manufacturing,Stable Diffusion,45.1,1842,2023,"Enhanced our design process.",45-54,Enterprise
Brazil,Retail,Bard,33.2,1200,2024,"Helped automate our customer support.",18-24,Startup
UK,Finance,Claude,55.7,2100,2023,"Increased accuracy in financial forecasting.",25-34,SME
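The constraints stated in the column descriptions (adoption_rate in 0–100, year 2023 or 2024, user_feedback up to 150 characters, three company_size categories) can be checked before analysis. The sketch below runs such a check on the five sample rows above, copied verbatim; `validate` is a hypothetical helper, not part of the dataset's tooling.

```python
import csv
import io

# The sample rows shown above, copied verbatim.
SAMPLE = '''country,industry,ai_tool,adoption_rate,daily_active_users,year,user_feedback,age_group,company_size
USA,Technology,ChatGPT,78.5,5423,2024,"Great productivity boost for our team!",25-34,Enterprise
India,Healthcare,Midjourney,62.3,2345,2024,"Improved patient engagement and workflow.",35-44,SME
Germany,Manufacturing,Stable Diffusion,45.1,1842,2023,"Enhanced our design process.",45-54,Enterprise
Brazil,Retail,Bard,33.2,1200,2024,"Helped automate our customer support.",18-24,Startup
UK,Finance,Claude,55.7,2100,2023,"Increased accuracy in financial forecasting.",25-34,SME
'''

def validate(row):
    """Return a list of problems found in one row (empty if the row is clean)."""
    problems = []
    if not 0 <= float(row["adoption_rate"]) <= 100:
        problems.append("adoption_rate outside 0-100")
    if int(row["year"]) not in (2023, 2024):
        problems.append("year not 2023 or 2024")
    if len(row["user_feedback"]) > 150:
        problems.append("user_feedback longer than 150 characters")
    if row["company_size"] not in ("Startup", "SME", "Enterprise"):
        problems.append("unknown company_size")
    return problems

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Map row index to its problems, keeping only rows that have any.
issues = {}
for index, row in enumerate(rows):
    problems = validate(row)
    if problems:
        issues[index] = problems

print(len(rows), "rows checked;", len(issues), "with problems")
```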
# Load the dataset and inspect its structure
import pandas as pd
df = pd.read_csv('/path/to/ai_adoption_dataset.csv')
print(df.head())
df.info()  # info() prints its summary itself and returns None
# Average adoption rate by industry and country (top 10)
industry_adoption = df.groupby(['industry', 'country'])['adoption_rate'].mean().reset_index()
print(industry_adoption.sort_values(by='adoption_rate', ascending=False).head(10))
# Number of records per AI tool
import matplotlib.pyplot as plt
tool_counts = df['ai_tool'].value_counts()
tool_counts.plot(kind='bar', title='AI Tool Usage Distribution')
plt.xlabel('AI Tool')
plt.ylabel('Number of Records')
plt.show()
# Sentiment polarity of the free-text feedback (requires the textblob package)
from textblob import TextBlob
# str(x) guards against missing (NaN) feedback values
df['feedback_sentiment'] = df['user_feedback'].apply(lambda x: TextBlob(str(x)).sentiment.polarity)
print(df[['user_feedback', 'feedback_sentiment']].head())
# Average adoption rate per tool, by year
yearly_trends = df.groupby(['year', 'ai_tool'])['adoption_rate'].mean().unstack()
yearly_trends.plot(marker='o', title='AI Tool Adoption Rate Over Time')
plt.xlabel('Year')
plt.ylabel('Average Adoption Rate (%)')
plt.show()
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset and the documents used to execute the study, as provided by the authors. The raw data includes the scripts used by the evaluators and the TAM forms they submitted during the evaluation process. It is part of the work published at ICEIS 2025 under the title "Exploring the Use of ChatGPT for the Generation of User Story Based Test Cases: An Experimental Study". All data presented here is licensed under CC BY 4.0, and the terms of that license apply to any copyright matter.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
*both authors contributed equally
Automated query script for automated language bias studies in GPT 3-5
Dataset for the paper "How User Language Affects Conflict Fatality Estimates in ChatGPT" (preprint available on arXiv).
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for "Collective Cognition ChatGPT Conversations"
Dataset Description
Dataset Summary
The "Collective Cognition ChatGPT Conversations" dataset is a collection of chat logs between users and the ChatGPT model. These conversations have been shared by users on the "Collective Cognition" website. The dataset provides insights into user interactions with language models and can be utilized for multiple purposes, including training, research, and… See the full description on the dataset page: https://huggingface.co/datasets/CollectiveCognition/chats-data-2023-09-27.