60 datasets found

ChatGPT Classification Dataset
kaggle.com
zip
Updated Sep 7, 2023
Cite
Mahdi (2023). ChatGPT Classification Dataset [Dataset]. https://www.kaggle.com/datasets/mahdimaktabdar/chatgpt-classification-dataset
Explore at:
Available download formats: zip (718710 bytes)
Dataset updated
Sep 7, 2023
Authors
Mahdi
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
We have compiled a dataset of textual articles covering common terminology, concepts, and definitions in the fields of computer science, artificial intelligence, and cyber security. The dataset contains both human-generated text and text generated by OpenAI’s ChatGPT. Human-generated answers were collected from computer science dictionaries and encyclopedias, including “The Encyclopedia of Computer Science and Technology” and “Encyclopedia of Human-Computer Interaction”. AI-generated content was produced by posing questions to OpenAI’s ChatGPT and manually documenting the responses. A rigorous data-cleaning process removed unwanted Unicode characters and styling and formatting tags. To structure the dataset for binary classification, we combined the AI-generated and human-generated answers into a single column and assigned a label to each data point (human-generated = 0, AI-generated = 1).

This creates our article-level dataset (article_level_data.csv), which consists of 1018 articles: 509 AI-generated and 509 human-generated. Additionally, we divided each article into its sentences and labelled them accordingly, mainly to evaluate how classification models and pipelines perform on shorter, sentence-level data points. The resulting sentence-level dataset (sentence_level_data.csv) consists of 7344 entries (4008 AI-generated and 3336 human-generated).
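
The labelling scheme and the reported sizes can be illustrated with a minimal sketch. The toy articles and the naive sentence splitter below are assumptions made for illustration; the description does not say how the authors split sentences, and the real rows live in article_level_data.csv and sentence_level_data.csv.

```python
import re
from collections import Counter

HUMAN, AI = 0, 1  # label values as stated in the description

# Hypothetical toy articles standing in for rows of article_level_data.csv.
articles = [
    ("A compiler translates source code. It emits machine code.", HUMAN),
    ("A firewall filters network traffic. It enforces rules. It logs events.", AI),
]


def split_sentences(text):
    # Naive splitter (an assumption): break on whitespace that follows ., ! or ?
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


# Each sentence inherits the label of the article it came from.
sentence_rows = [(s, label) for text, label in articles for s in split_sentences(text)]
sentence_counts = Counter(label for _, label in sentence_rows)

# Totals reported in the description.
article_total = 509 + 509        # article-level rows
sentence_total = 4008 + 3336     # sentence-level rows
ai_share = 4008 / sentence_total  # fraction of AI-generated sentences

print(sentence_counts, article_total, sentence_total, round(ai_share, 3))
```

The article-level set is balanced, while the sentence-level set leans slightly towards AI-generated entries, which is worth keeping in mind when reading accuracy figures on that split.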

If you use this dataset in a scientific publication, we would appreciate a citation of the following article:

Maktab Dar Oghaz, M., Dhame, K., Singaram, G., & Babu Saheer, L. (2023). Detection and Classification of ChatGPT Generated Contents Using Deep Transformer Models. Frontiers in Artificial Intelligence.

https://www.techrxiv.org/users/692552/articles/682641/master/file/data/ChatGPT_generated_Content_Detection/ChatGPT_generated_Content_Detection.pdf
Test dataset of ChatGPT in medical field
scidb.cn
Updated Mar 3, 2023
Cite
robin shen (2023). Test dataset of ChatGPT in medical field [Dataset]. http://doi.org/10.57760/sciencedb.o00130.00001
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.57760/sciencedb.o00130.00001
Dataset updated
Mar 3, 2023
Dataset provided by
Science Data Bank
Authors
robin shen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The researcher tests the QA capability of ChatGPT in the medical field from the following aspects:

1. Test its reserve of medical knowledge
2. Check its ability to read and understand medical literature
3. Test its ability to assist diagnosis after reading case data
4. Test its error correction ability for case data
5. Test its ability to standardize medical terms
6. Test its ability to evaluate experts
7. Check its ability to evaluate medical institutions

The conclusions are:

ChatGPT has great potential in medical and health care applications, and may directly replace human beings or even professionals at a certain level in some fields.
The researcher preliminarily believes that ChatGPT has basic medical knowledge and the ability to hold multi-round dialogues, and that its ability to understand Chinese is not weak.
ChatGPT has the ability to read, understand and correct cases.
ChatGPT has the ability to extract information and standardize terminology, and does so very well.
ChatGPT has the ability to reason about medical knowledge.
ChatGPT has the ability of continuous learning. After continuous training, its level has improved significantly.
ChatGPT does not have the ability to evaluate Chinese medical talents academically, and the results are not ideal.
ChatGPT does not have the ability to evaluate Chinese medical institutions academically, and the results are not ideal.
ChatGPT is an epoch-making product, which can become a useful assistant for medical diagnosis and treatment, knowledge service, literature reading, review and paper writing.
ChatGPT User Reviews
kaggle.com
zip
Updated Jun 30, 2024
Cite
Bhavik Jikadara (2024). ChatGPT User Reviews [Dataset]. https://www.kaggle.com/datasets/bhavikjikadara/chatgpt-user-feedback
Explore at:
Available download formats: zip (5709734 bytes)
Dataset updated
Jun 30, 2024
Authors
Bhavik Jikadara
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Description

This dataset consists of daily-updated user reviews and ratings for the ChatGPT Android App. The dataset includes several key attributes that capture various aspects of the reviews, providing insights into user experiences and feedback over time.

Columns Explanation

userName: The display name of the user who posted the review.

content: The text content of the review. This column contains the actual review text written by the user. It includes user opinions, feedback, and detailed descriptions of their experiences with the ChatGPT app.

score: The rating given by the user, typically ranging from 1 to 5. This column captures the numerical rating provided by the user. Higher scores indicate better experiences, while lower scores indicate dissatisfaction.

thumbsUpCount: The number of thumbs up (likes) the review received. This column shows how many other users found the review helpful or agreed with the sentiments expressed. It serves as a measure of the review's relevancy and impact.

at: The timestamp of when the review was posted. This column includes the date and time when the review was submitted. It is crucial for tracking the temporal distribution of reviews and analyzing trends over time.
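
As an illustration of how these columns can support the trend analysis mentioned for the `at` field, here is a minimal sketch that computes the mean `score` per month. The sample rows and the `YYYY-MM-DD HH:MM:SS` timestamp format are assumptions; check the actual CSV before relying on them.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows using the column names listed above.
SAMPLE = """userName,content,score,thumbsUpCount,at
Alice,Very helpful app,5,12,2024-06-01 09:15:00
Bob,Crashes on login,1,3,2024-06-15 18:40:00
Cara,Good but slow,4,0,2024-07-02 07:05:00
"""


def monthly_mean_score(csv_text):
    """Return {"YYYY-MM": mean score} for the reviews in csv_text."""
    totals = defaultdict(lambda: [0, 0])  # month -> [score sum, review count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed timestamp format; adjust if the real file differs.
        posted = datetime.strptime(row["at"], "%Y-%m-%d %H:%M:%S")
        month = posted.strftime("%Y-%m")
        totals[month][0] += int(row["score"])
        totals[month][1] += 1
    return {month: s / n for month, (s, n) in sorted(totals.items())}


means = monthly_mean_score(SAMPLE)
print(means)
```

The same loop could weight each review by `thumbsUpCount` if reviews that other users found helpful should count for more.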

Collection Methods

Data Source: The data is collected from user reviews submitted through the ChatGPT Android App's review section on the Google Play Store.

Frequency: The dataset is updated daily to capture the most recent user feedback and ratings.

Automation: An automated script is used to scrape and compile the reviews, ensuring that the dataset is current and comprehensive.

Data Cleaning: Basic preprocessing is performed to ensure data quality, such as removing duplicates and handling missing values.
The Impact of AI and ChatGPT on Bangladeshi University Students
data.mendeley.com
Updated Jan 6, 2025
+ more versions
Cite
Md Jhirul Islam (2025). The Impact of AI and ChatGPT on Bangladeshi University Students [Dataset]. http://doi.org/10.17632/zykphpvbr7.2
Explore at:
Unique identifier
https://doi.org/10.17632/zykphpvbr7.2
Dataset updated
Jan 6, 2025
Authors
Md Jhirul Islam
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Bangladesh
Description
The dataset records the perceptions of Bangladeshi university students on the influence that AI tools, especially ChatGPT, have on their academic practices, learning experiences, and problem-solving abilities. It covers the varying role of AI in education, including common usage statistics, the effect of AI on creative abilities, its impact on learning, and whether it could invade privacy. The dataset offers a perspective on how AI tools are changing education in the country and provides valuable information for researchers, educators, and policymakers seeking to understand trends, challenges, and opportunities in the adoption of AI in the academic context.

Methodology

Data Collection Method: Online survey using Google Forms
Participants: A total of 3,512 students from various Bangladeshi universities participated.
Survey Questions: The survey included questions on demographic information, frequency of AI tool usage, perceived benefits, concerns regarding privacy, and impacts on creativity and learning.
Sampling Technique: Random sampling of university students
Data Collection Period: June 2024 to December 2024

Privacy Compliance

This dataset has been anonymized to remove any personally identifiable information (PII). It adheres to relevant privacy regulations to ensure the confidentiality of participants.

For further inquiries, please contact:
Name: Md Jhirul Islam, Daffodil International University
Email: jhirul15-4063@diu.edu.bd
Phone: 01316317573
Chatgpt
huggingface.co
Updated Apr 12, 2023
+ more versions
Cite
Rajdeep Chatterjee (2023). Chatgpt [Dataset]. https://huggingface.co/datasets/RajChat/Chatgpt
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 12, 2023
Authors
Rajdeep Chatterjee
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
OpenAssistant Conversations Dataset (OASST1)

Dataset Summary

In an effort to democratize research on large-scale alignment, we release OpenAssistant Conversations (OASST1), a human-generated, human-annotated assistant-style conversation corpus consisting of 161,443 messages in 35 different languages, annotated with 461,292 quality ratings, resulting in over 10,000 fully annotated conversation trees. The corpus is a product of a worldwide crowd-sourcing effort… See the full description on the dataset page: https://huggingface.co/datasets/RajChat/Chatgpt.
awesome-chatgpt-prompts
huggingface.co
Updated Dec 15, 2023
+ more versions
Cite
Fatih Kadir Akın (2023). awesome-chatgpt-prompts [Dataset]. https://huggingface.co/datasets/fka/awesome-chatgpt-prompts
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 15, 2023
Authors
Fatih Kadir Akın
License
https://choosealicense.com/licenses/cc0-1.0/
Description
🧠 Awesome ChatGPT Prompts [CSV dataset]

This is a dataset repository of Awesome ChatGPT Prompts. View all prompts on GitHub.

License

CC-0
scraped-chatgpt-conversations
huggingface.co
Updated Apr 6, 2023
Cite
Arya Nistane (2023). scraped-chatgpt-conversations [Dataset]. https://huggingface.co/datasets/ar852/scraped-chatgpt-conversations
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Apr 6, 2023
Authors
Arya Nistane
Description
Dataset Card for Dataset Name

Dataset Summary

scraped-chatgpt-conversations contains ~100k conversations between a user and ChatGPT that were shared online through Reddit, Twitter, or ShareGPT. For ShareGPT, the conversations were scraped directly from the website. For Reddit and Twitter, images were downloaded from submissions, segmented, and run through an OCR pipeline to obtain a conversation list. For information on how each JSON file is structured, please see… See the full description on the dataset page: https://huggingface.co/datasets/ar852/scraped-chatgpt-conversations.
Datasets .csv
figshare.com
txt
Updated Jan 24, 2024
Cite
Yaser Alhasawi (2024). Datasets .csv [Dataset]. http://doi.org/10.6084/m9.figshare.25053146.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.25053146.v1
Dataset updated
Jan 24, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
Yaser Alhasawi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset for this research project was meticulously constructed to investigate the adoption of ChatGPT among students in the United States. The primary objective was to gain insights into the technological barriers and resistances faced by students in integrating ChatGPT into their information systems. The dataset was designed to capture the diverse adoption patterns among students in various public and private schools and universities across the United States. By examining adoption rates, frequency of usage, and the contexts in which ChatGPT is employed, the research sought to provide a comprehensive understanding of how students are incorporating this technology into their information systems. Moreover, by including participants from diverse educational institutions, the research sought to ensure a comprehensive representation of the student population in the United States. This approach aimed to provide nuanced insights into how factors such as educational background, institution type, and technological familiarity influence ChatGPT adoption.
ASRS-ChatGPT
huggingface.co
Updated Jun 29, 2023
Cite
Archana Tikayat Ray (2023). ASRS-ChatGPT [Dataset]. http://doi.org/10.57967/hf/0830
Explore at:
Unique identifier
https://doi.org/10.57967/hf/0830
Dataset updated
Jun 29, 2023
Authors
Archana Tikayat Ray
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset Summary

The dataset contains a total of 9984 incident records and 9 columns. Some of the columns contain ground truth values, whereas others contain information generated by ChatGPT based on the incident narratives. The dataset aims to provide researchers with columns generated using the ChatGPT API, which is not freely available.

Dataset Structure

The column names present in the dataset and their descriptions are provided below:

Column… See the full description on the dataset page: https://huggingface.co/datasets/archanatikayatray/ASRS-ChatGPT.
Data Sheet 2_Large language models generating synthetic clinical datasets: a...
frontiersin.figshare.com
figshare.com
xlsx
Updated Feb 5, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Austin A. Barr; Joshua Quan; Eddie Guo; Emre Sezgin (2025). Data Sheet 2_Large language models generating synthetic clinical datasets: a feasibility and comparative analysis with real-world perioperative data.xlsx [Dataset]. http://doi.org/10.3389/frai.2025.1533508.s002
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/frai.2025.1533508.s002
Dataset updated
Feb 5, 2025
Dataset provided by
Frontiers
Authors
Austin A. Barr; Joshua Quan; Eddie Guo; Emre Sezgin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background

Clinical data is instrumental to medical research, machine learning (ML) model development, and advancing surgical care, but access is often constrained by privacy regulations and missing data. Synthetic data offers a promising solution to preserve privacy while enabling broader data access. Recent advances in large language models (LLMs) provide an opportunity to generate synthetic data with reduced reliance on domain expertise, computational resources, and pre-training.

Objective

This study aims to assess the feasibility of generating realistic tabular clinical data with OpenAI’s GPT-4o using zero-shot prompting, and evaluate the fidelity of LLM-generated data by comparing its statistical properties to the Vital Signs DataBase (VitalDB), a real-world open-source perioperative dataset.

Methods

In Phase 1, GPT-4o was prompted to generate a dataset with qualitative descriptions of 13 clinical parameters. The resultant data was assessed for general errors, plausibility of outputs, and cross-verification of related parameters. In Phase 2, GPT-4o was prompted to generate a dataset using descriptive statistics of the VitalDB dataset. Fidelity was assessed using two-sample t-tests, two-sample proportion tests, and 95% confidence interval (CI) overlap.

Results

In Phase 1, GPT-4o generated a complete and structured dataset comprising 6,166 case files. The dataset was plausible in range and correctly calculated body mass index for all case files based on respective heights and weights. Statistical comparison between the LLM-generated datasets and VitalDB revealed that Phase 2 data achieved significant fidelity. Phase 2 data demonstrated statistical similarity in 12/13 (92.31%) parameters, whereby no statistically significant differences were observed in 6/6 (100.0%) categorical/binary and 6/7 (85.71%) continuous parameters. Overlap of 95% CIs was observed in 6/7 (85.71%) continuous parameters.

Conclusion

Zero-shot prompting with GPT-4o can generate realistic tabular synthetic datasets, which can replicate key statistical properties of real-world perioperative data. This study highlights the potential of LLMs as a novel and accessible modality for synthetic data generation, which may address critical barriers in clinical data access and eliminate the need for technical expertise, extensive computational resources, and pre-training. Further research is warranted to enhance fidelity and investigate the use of LLMs to amplify and augment datasets, preserve multivariate relationships, and train robust ML models.
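
The 95% CI overlap criterion named in the methods can be sketched as follows. This is an illustration under stated assumptions, not the study's analysis code: the two samples are made up, and the interval uses a normal approximation (z = 1.96) where the authors may have used a t-based interval.

```python
import math
import statistics


def ci95(sample):
    """Approximate 95% CI for the mean (normal approximation, z = 1.96)."""
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(len(sample))
    half_width = 1.96 * sem
    return mean - half_width, mean + half_width


def intervals_overlap(a, b):
    """True when the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical values for one continuous parameter (for example, weight in kg).
real = [62.0, 58.5, 71.2, 66.4, 59.9, 64.3]
synthetic = [61.1, 60.2, 69.8, 65.0, 62.7, 63.5]
overlap = intervals_overlap(ci95(real), ci95(synthetic))

# Percentages reported in the results, recomputed from the fractions.
similar_share = round(100 * 12 / 13, 2)
continuous_share = round(100 * 6 / 7, 2)

print(overlap, similar_share, continuous_share)
```

Interval overlap is a looser criterion than a two-sample test, which is presumably why the abstract reports it alongside the t-tests and proportion tests rather than in place of them.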
Data from: Dataset of the study: "Chatbots put to the test in math and logic...
researchdata.bath.ac.uk
Updated May 20, 2023
Cite
Vagelis Plevris; George Papazafeiropoulos; Alejandro Jimenez Rios (2023). Dataset of the study: "Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard" [Dataset]. http://doi.org/10.5281/zenodo.7940781
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.7940781
Dataset updated
May 20, 2023
Dataset provided by
Zenodo
Authors
Vagelis Plevris; George Papazafeiropoulos; Alejandro Jimenez Rios
Dataset funded by
Oslo Metropolitan University
Description
This dataset contains the 30 questions that were posed to the chatbots (i) ChatGPT-3.5; (ii) ChatGPT-4; and (iii) Google Bard, in May 2023 for the study “Chatbots put to the test in math and logic problems: A preliminary comparison and assessment of ChatGPT-3.5, ChatGPT-4, and Google Bard”. These 30 questions describe mathematics and logic problems that have a unique correct answer. The questions are fully described with plain text only, without the need for any images or special formatting. The questions are divided into two sets of 15 questions each (Set A and Set B). The questions of Set A are 15 “Original” problems that cannot be found online, at least in their exact wording, while Set B contains 15 “Published” problems that one can find online by searching on the internet, usually with their solution. Each question is posed three times to each chatbot.

This dataset contains the following: (i) The full set of the 30 questions, A01-A15 and B01-B15; (ii) the correct answer for each one of them; (iii) an explanation of the solution, for the problems where such an explanation is needed, (iv) the 30 (questions) × 3 (chatbots) × 3 (answers) = 270 detailed answers of the chatbots. For the published problems of Set B, we also provide a reference to the source where each problem was taken from.
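
The size of the answer grid follows directly from the design described above. The sketch below enumerates it; the tuple layout is an assumption for illustration and says nothing about how the files in the dataset are actually organised.

```python
from itertools import product

# Question IDs A01-A15 (Set A, "Original") and B01-B15 (Set B, "Published").
question_ids = [f"{set_name}{number:02d}" for set_name in "AB" for number in range(1, 16)]
chatbots = ["ChatGPT-3.5", "ChatGPT-4", "Google Bard"]
attempts = [1, 2, 3]  # each question is posed three times to each chatbot

# One slot per (question, chatbot, attempt) combination.
answer_slots = list(product(question_ids, chatbots, attempts))

print(len(question_ids), len(answer_slots))
```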
ChatGPT Usage by Age Group – Survey Data
expresslegalfunding.com
html
Updated Sep 10, 2025
Cite
Express Legal Funding (2025). ChatGPT Usage by Age Group – Survey Data [Dataset]. https://expresslegalfunding.com/chatgpt-study/
Explore at:
Available download formats: html
Dataset updated
Sep 10, 2025
Dataset authored and provided by
Express Legal Funding
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
60+, 18–29, 30–44, 45–60
Description
This dataset presents ChatGPT usage patterns across different age groups, showing the percentage of users who have followed its advice, used it without following advice, or have never used it, based on a 2025 U.S. survey.
Data from: Higher Education Students’ Evolving Perceptions of ChatGPT:...
data.mendeley.com
Updated Apr 21, 2025
Cite
Aleksander Aristovnik (2025). Higher Education Students’ Evolving Perceptions of ChatGPT: Global Survey Data from the Academic Year 2024–2025 [Dataset]. http://doi.org/10.17632/nv2343nwsb.1
Explore at:
Unique identifier
https://doi.org/10.17632/nv2343nwsb.1
Dataset updated
Apr 21, 2025
Authors
Aleksander Aristovnik
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The introduction of ChatGPT in November 2022 marked a significant milestone in the application of artificial intelligence in higher education. Due to its advanced natural language processing capabilities, ChatGPT quickly became popular among students worldwide. However, the increasing acceptance of ChatGPT among students has attracted significant attention, sparking both excitement and scepticism globally. Building on students' early perceptions of ChatGPT after the first year of its introduction, a comprehensive and large-scale global survey was repeated between October 2024 and February 2025. The questionnaire was distributed in seven different languages: English, Italian, Spanish, Turkish, Japanese, Arabic, and Hebrew. It covered several aspects relevant to ChatGPT, including sociodemographic characteristics, usage, capabilities, regulation and ethical concerns, satisfaction and attitude, study issues and outcomes, skills development, labour market and skills mismatch, emotions, study and personal information, and general reflections. The survey targeted higher education students who are currently enrolled at any level in a higher education institution, are at least 18 years old, and have the legal capacity to provide free and voluntary consent to participate in an anonymous survey. Survey participants were recruited using a convenience sampling method, which involved promoting the survey in classrooms and through advertisements on university communication systems. The final dataset consists of 22,963 student responses from 120 different countries and territories. The data may prove useful for researchers studying students' perceptions of ChatGPT, including its implications across various aspects. Higher education stakeholders may also benefit from these data: educators in formulating curricula, including designing teaching methods and assessment tools, and policymakers in formulating strategies for the future development of higher education systems.
chats-data-2023-09-27
huggingface.co
Updated Sep 27, 2023
+ more versions
Cite
Collective Cognition (2023). chats-data-2023-09-27 [Dataset]. https://huggingface.co/datasets/CollectiveCognition/chats-data-2023-09-27
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Sep 27, 2023
Dataset authored and provided by
Collective Cognition
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Card for "Collective Cognition ChatGPT Conversations"

Dataset Description Dataset Summary

The "Collective Cognition ChatGPT Conversations" dataset is a collection of chat logs between users and the ChatGPT model. These conversations have been shared by users on the "Collective Cognition" website. The dataset provides insights into user interactions with language models and can be utilized for multiple purposes, including training, research, and… See the full description on the dataset page: https://huggingface.co/datasets/CollectiveCognition/chats-data-2023-09-27.
ChatGPT Usage by Gender – Survey Data
expresslegalfunding.com
html
Updated Sep 10, 2025
Cite
Express Legal Funding (2025). ChatGPT Usage by Gender – Survey Data [Dataset]. https://expresslegalfunding.com/chatgpt-study/
Explore at:
Available download formats: html
Dataset updated
Sep 10, 2025
Dataset authored and provided by
Express Legal Funding
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Men, Women
Description
This dataset shows how men and women in the U.S. reported using ChatGPT in a 2025 survey, including whether they followed its advice or chose not to use it.
Higher Education Students' Early Perceptions of ChatGPT: Global Survey Data
data.mendeley.com
Updated Aug 13, 2024
+ more versions
Cite
Dejan Ravšelj (2024). Higher Education Students' Early Perceptions of ChatGPT: Global Survey Data [Dataset]. http://doi.org/10.17632/ymg9nsn6kn.1
Explore at:
Unique identifier
https://doi.org/10.17632/ymg9nsn6kn.1
Dataset updated
Aug 13, 2024
Authors
Dejan Ravšelj
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The introduction of ChatGPT in November 2022 marked a significant milestone in the application of artificial intelligence in higher education. Due to its advanced natural language processing capabilities, ChatGPT quickly became popular among students worldwide. However, the increasing acceptance of ChatGPT among students has attracted significant attention, sparking both excitement and scepticism globally. To capture students' early perceptions of ChatGPT, the most comprehensive and large-scale global survey to date was conducted between the beginning of October 2023 and the end of February 2024. The questionnaire was prepared in seven different languages: English, Italian, Spanish, Turkish, Japanese, Arabic, and Hebrew. It covered several aspects relevant to ChatGPT, including sociodemographic characteristics, usage, capabilities, regulation and ethical concerns, satisfaction and attitude, study issues and outcomes, skills development, labour market and skills mismatch, emotions, study and personal information, and general reflections. The survey targeted higher education students who are currently enrolled at any level in a higher education institution, are at least 18 years old, and have the legal capacity to provide free and voluntary consent to participate in an anonymous survey. Survey participants were recruited using a convenience sampling method, which involved promoting the survey in classrooms and through advertisements on university communication systems. The final dataset consists of 23,218 student responses from 109 different countries and territories. The data may prove useful for researchers studying students' perceptions of ChatGPT, including its implications across various aspects. Higher education stakeholders may also benefit from these data: educators in formulating curricula, including designing teaching methods and assessment tools, and policymakers in formulating strategies for the future development of higher education systems.
Beliefs About ChatGPT’s Impact on Humanity – Survey Data
expresslegalfunding.com
html
Updated Sep 10, 2025
Cite
Express Legal Funding (2025). Beliefs About ChatGPT’s Impact on Humanity – Survey Data [Dataset]. https://expresslegalfunding.com/chatgpt-study/
Explore at:
Available download formats: html
Dataset updated
Sep 10, 2025
Dataset authored and provided by
Express Legal Funding
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Agree, Neutral, Disagree, Strongly disagree, Strongly agree – Will help humanity
Description
This dataset reflects how Americans perceive ChatGPT's broader societal impact, based on a 2025 survey that asked whether the AI will help or harm humanity.
Data from an Evaluation of ChatGPT for Nutrient Content Estimation from Meal...
figshare.com
xlsx
Updated Feb 10, 2025
Cite
Cathal O'Hara; Gráinne Kent; Angela Flynn; Eileen Gibney; Claire Timon (2025). Data from an Evaluation of ChatGPT for Nutrient Content Estimation from Meal Photographs [Dataset]. http://doi.org/10.6084/m9.figshare.28271003.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.28271003.v1
Dataset updated
Feb 10, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
Cathal O'Hara; Gráinne Kent; Angela Flynn; Eileen Gibney; Claire Timon
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background/Objectives: Advances in artificial intelligence now allow combined use of large language and vision models; however, there has been limited evaluation of their potential in dietary assessment. These data arose from a study that aimed to evaluate the accuracy of ChatGPT-4 in estimating the nutritional content of commonly consumed meals from meal photographs.

Methods: Meal photographs (n=114) were uploaded to ChatGPT, and it was asked to identify the foods in each meal, estimate their weight, and estimate the nutrient content of the meals for 16 nutrients for comparison with the known values. There were a total of 39 unique meals with each one photographed 3 times for 3 different portion sizes giving rise to 114 photographs.

This dataset is in the form of an Excel workbook containing four worksheets. The worksheet titled "ChatGPT Foods & Weights" contains the foods identified by ChatGPT in each of the 114 meal photographs as well as its estimate for the weight of each of those foods. The worksheet titled "Actual Foods & Weights" contains the true foods and weights for each of the meal photographs. The worksheet "ChatGPT Nutrition Estimates" contains ChatGPT's estimates of the nutrition content of each of the 114 meal photographs for 16 different nutrients. The worksheet "Actual Nutrition Content" contains the true nutrition content of the meals in the photographs.
chatGPT reviews from google play store
kaggle.com
zip
Updated Dec 13, 2024
Cite
Ahmad Selo Abadi (2024). chatGPT reviews from google play store [Dataset]. https://www.kaggle.com/datasets/ahmadseloabadi/chatgpt-reviews-from-google-play-store
Explore at:
Available download formats: zip (10517568 bytes)
Dataset updated
Dec 13, 2024
Authors
Ahmad Selo Abadi
Description
Don't forget to upvote, comment, and follow if you are using this dataset. If you have any questions about the dataset I uploaded, feel free to leave them in the comments. Thank you! :)

Column Descriptions (English)

1. reviewId: A unique ID for each user review.
2. userName: The name of the user who submitted the review.
3. userImage: The URL of the user's profile picture.
4. content: The text content of the review provided by the user.
5. score: The review score given by the user, typically on a scale of 1-5.
6. thumbsUpCount: The number of likes (thumbs up) received by the review.
7. reviewCreatedVersion: The app version used by the user when creating the review (not always available).
8. at: The date and time when the review was submitted.
9. replyContent: The developer's response to the review (no data available in this column).
10. repliedAt: The date and time when the developer's response was submitted (no data available in this column).
11. appVersion: The app version used by the user when submitting the review (not always available).
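
Because several of these columns are empty or only sometimes populated, a loader has to treat them as optional. The sketch below shows one way to do that with a dataclass. The `Review` class, the `parse_review` helper, and the sample row are hypothetical and only mirror the column names listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Review:
    reviewId: str
    userName: str
    userImage: str
    content: str
    score: int
    thumbsUpCount: int
    at: str
    # Columns described as not always available, or with no data at all.
    reviewCreatedVersion: Optional[str] = None
    replyContent: Optional[str] = None
    repliedAt: Optional[str] = None
    appVersion: Optional[str] = None


def _optional(row, key):
    """Return the stripped value for key, or None when missing or blank."""
    value = (row.get(key) or "").strip()
    return value or None


def parse_review(row):
    """Build a Review from a dict such as one yielded by csv.DictReader."""
    return Review(
        reviewId=row["reviewId"],
        userName=row["userName"],
        userImage=row["userImage"],
        content=row["content"],
        score=int(row["score"]),
        thumbsUpCount=int(row["thumbsUpCount"]),
        at=row["at"],
        reviewCreatedVersion=_optional(row, "reviewCreatedVersion"),
        replyContent=_optional(row, "replyContent"),
        repliedAt=_optional(row, "repliedAt"),
        appVersion=_optional(row, "appVersion"),
    )


# Hypothetical row: the reply columns are empty, as the description says.
sample_row = {
    "reviewId": "r-001", "userName": "Alice", "userImage": "https://example.com/a.png",
    "content": "Works well", "score": "5", "thumbsUpCount": "2",
    "reviewCreatedVersion": "", "at": "2024-12-01 10:00:00",
    "replyContent": "", "repliedAt": "", "appVersion": "1.2024.100",
}
review = parse_review(sample_row)
print(review)
```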

Types of ChatGPT Advice Used – Survey Data
expresslegalfunding.com
html
Updated Sep 10, 2025
Cite
Express Legal Funding (2025). Types of ChatGPT Advice Used – Survey Data [Dataset]. https://expresslegalfunding.com/chatgpt-study/
Available download formats: html
Dataset updated
Sep 10, 2025
Dataset authored and provided by
Express Legal Funding
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Legal Advice, Career Advice, Educational Help, Financial Advice, Medical Information, Relationship Advice, Mental Health Topics, News / Current Events, Product Recommendations
Description
This dataset shows the types of advice users sought from ChatGPT based on a 2025 U.S. survey, including education, financial, medical, and legal topics.

Cite

Mahdi (2023). ChatGPT Classification Dataset [Dataset]. https://www.kaggle.com/datasets/mahdimaktabdar/chatgpt-classification-dataset

ChatGPT Classification Dataset

Classification of ChatGPT generated text from human generated text

114 scholarly articles cite this dataset (View in Google Scholar)

Available download formats: zip (718710 bytes)

Dataset updated

Sep 7, 2023

Authors

Mahdi

License

Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically

Description

We have compiled a dataset that consists of textual articles including common terminology, concepts and definitions in the field of computer science, artificial intelligence, and cyber security. This dataset consists of both human-generated text and OpenAI’s ChatGPT-generated text. Human-generated answers were collected from different computer science dictionaries and encyclopedias including “The Encyclopedia of Computer Science and Technology” and "Encyclopedia of Human-Computer Interaction". AI-generated content in our dataset was produced by simply posting questions to OpenAI’s ChatGPT and manually documenting the resulting responses. A rigorous data-cleaning process has been performed to remove unwanted Unicode characters, styling and formatting tags. To structure our dataset for binary classification, we combined both AI-generated and Human-generated answers into a single column and assigned appropriate labels to each data point (Human-generated = 0 and AI-generated = 1).

This creates our article-level dataset (article_level_data.csv) which consists of a total of 1018 articles, 509 AI-generated and 509 Human-generated. Additionally, we have divided each article into its sentences and labelled them accordingly. This is mainly to evaluate the performance of classification models and pipelines when it comes to shorter sentence-level data points. This constructs our sentence-level dataset (sentence_level_data.csv) which consists of a total of 7344 entries (4008 AI-generated and 3336 Human-generated).
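The step from article-level to sentence-level rows can be sketched as below. This is a minimal illustration under stated assumptions: the description does not say how the authors split articles into sentences, so the regex splitter here is a naive stand-in, and the two articles are invented rather than taken from article_level_data.csv. Only the label convention (Human-generated = 0, AI-generated = 1) comes from the description.

```python
import re

HUMAN, AI = 0, 1  # label convention stated in the description

# Made-up articles for illustration; not rows from article_level_data.csv.
articles = [
    ("A compiler translates source code. It emits machine code.", HUMAN),
    ("A firewall filters network traffic. It enforces rules! Is it enough?", AI),
]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def to_sentence_level(rows):
    """Give every sentence the label of the article it came from."""
    return [
        (sentence, label)
        for text, label in rows
        for sentence in split_sentences(text)
    ]


sentence_rows = to_sentence_level(articles)
```

A splitter this simple breaks on abbreviations such as "e.g." or "Dr.", so a dedicated sentence tokenizer would likely be a better choice for real articles.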

We would appreciate it if you cited the following article when using this dataset in a scientific publication:

Maktab Dar Oghaz, M., Dhame, K., Singaram, G., & Babu Saheer, L. (2023). Detection and Classification of ChatGPT Generated Contents Using Deep Transformer Models. Frontiers in Artificial Intelligence.

https://www.techrxiv.org/users/692552/articles/682641/master/file/data/ChatGPT_generated_Content_Detection/ChatGPT_generated_Content_Detection.pdf
