100+ datasets found

Data_Sheet_1_Advanced large language models and visualization tools for data...
frontiersin.figshare.com
txt
Updated Aug 8, 2024
Cite
Jorge Valverde-Rebaza; Aram González; Octavio Navarro-Hinojosa; Julieta Noguez (2024). Data_Sheet_1_Advanced large language models and visualization tools for data analytics learning.csv [Dataset]. http://doi.org/10.3389/feduc.2024.1418006.s001
Available download formats: txt
Unique identifier
https://doi.org/10.3389/feduc.2024.1418006.s001
Dataset updated
Aug 8, 2024
Dataset provided by
Frontiers
Authors
Jorge Valverde-Rebaza; Aram González; Octavio Navarro-Hinojosa; Julieta Noguez
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: In recent years, numerous AI tools have been employed to equip learners with diverse technical skills such as coding, data analysis, and other competencies related to computational sciences. However, the desired outcomes have not been consistently achieved. This study aims to analyze the perspectives of students and professionals from non-computational fields on the use of generative AI tools, augmented with visualization support, to tackle data analytics projects. The focus is on promoting the development of coding skills and fostering a deep understanding of the solutions generated. Consequently, our research seeks to introduce innovative approaches for incorporating visualization and generative AI tools into educational practices.

Methods: This article examines how learners perform, and what their perspectives are, when using traditional tools vs. LLM-based tools to acquire data analytics skills. To explore this, we conducted a case study with a cohort of 59 participants, comprising students and professionals without computational thinking skills. These participants developed a data analytics project in the context of a Data Analytics short session. Our case study focused on examining the participants' performance using traditional programming tools, ChatGPT, and LIDA with GPT as an advanced generative AI tool.

Results: The results show the transformative potential of approaches based on integrating advanced generative AI tools like GPT with specialized frameworks such as LIDA. The higher levels of participant preference indicate the superiority of these approaches over traditional development methods. Additionally, our findings suggest that the learning curves for the different approaches vary significantly, since learners encountered technical difficulties in developing the project and interpreting the results. Our findings suggest that the integration of LIDA with GPT can significantly enhance the learning of advanced skills, especially those related to data analytics. We aim to establish this study as a foundation for the methodical adoption of generative AI tools in educational settings, paving the way for more effective and comprehensive training in these critical areas.

Discussion: It is important to highlight that when using general-purpose generative AI tools such as ChatGPT, users must be aware of the data analytics process and take responsibility for filtering out potential errors or incompleteness in the requirements of a data analytics project. These deficiencies can be mitigated by using more advanced tools specialized in supporting data analytics tasks, such as LIDA with GPT. However, users still need advanced programming knowledge to properly configure this connection via API. There is a significant opportunity for generative AI tools to improve their performance, providing accurate, complete, and convincing results for data analytics projects, thereby increasing user confidence in adopting these technologies. We hope this work underscores the opportunities and needs for integrating advanced LLMs into educational practices, particularly in developing computational thinking skills.
Replication Data for: ChatGPT on ChatGPT: An Exploratory Analysis of its...
search.dataone.org
Updated Sep 24, 2024
Cite
Wang, Jieshu; Kiran, Elif; S.R. Aurora (also known as Mai P. Trinh); Simeone, Michael; Lobo, José (2024). Replication Data for: ChatGPT on ChatGPT: An Exploratory Analysis of its Performance in the Public Sector Workforce [Dataset]. http://doi.org/10.7910/DVN/P3CDHS
Unique identifier
https://doi.org/10.7910/DVN/P3CDHS
Dataset updated
Sep 24, 2024
Dataset provided by
Harvard Dataverse
Authors
Wang, Jieshu; Kiran, Elif; S.R. Aurora (also known as Mai P. Trinh); Simeone, Michael; Lobo, José
Description
This repository contains two datasets used in the study exploring the impact of Generative AI, specifically ChatGPT, on the public sector workforce in the United States. The datasets provide detailed information on the core tasks of public sector occupations and their estimated performance metrics, including potential for automation and augmentation by ChatGPT. These estimates were generated by OpenAI’s GPT-4 model (GPT-4-1106-preview) through the OpenAI API.
Data from: Analyzing student prompts and their effect on ChatGPT’s...
tandf.figshare.com
txt
Updated Dec 12, 2024
Cite
Ghadeer Sawalha; Imran Taj; Abdulhadi Shoufan (2024). Analyzing student prompts and their effect on ChatGPT’s performance [Dataset]. http://doi.org/10.6084/m9.figshare.26970708.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.26970708.v1
Dataset updated
Dec 12, 2024
Dataset provided by
Taylor & Francis
Authors
Ghadeer Sawalha; Imran Taj; Abdulhadi Shoufan
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Large language models present new opportunities for teaching and learning. The response accuracy of these models, however, is believed to depend on the prompt quality which can be a challenge for students. In this study, we aimed to explore how undergraduate students use ChatGPT for problem-solving, what prompting strategies they develop, the link between these strategies and the model’s response accuracy, the existence of individual prompting tendencies, and the impact of gender in this context. Our students used ChatGPT to solve five problems related to embedded systems and provided the solutions and the conversations with this model. We analyzed the conversations thematically to identify prompting strategies and applied different quantitative analyses to establish relationships between these strategies and the response accuracy and other factors. The findings indicate that students predominantly employ three types of prompting strategies: single copy-and-paste prompting (SCP), single reformulated prompting (SRP), and multiple-question prompting (MQP). ChatGPT’s response accuracy using SRP and MQP was significantly higher than using SCP, with effect sizes of -0.94 and -0.69, respectively. The student-by-student analysis revealed some tendencies. For example, 26 percent of the students consistently copied and pasted the questions into ChatGPT without any modification. Students who used MQP showed better performance in the final exam than those who did not use this prompting strategy. As for gender, female students tended to make extensive use of SCP, whereas male students tended to mix SCP and MQP. We conclude that students develop different prompting strategies that lead to different response qualities and learning. More research is needed to deepen our understanding and inform effective educational practices in the AI era.
DeepSeek vs ChatGPT: AI Platform Comparison
kaggle.com
Updated Feb 24, 2025
Cite
Aakif Khan (2025). DeepSeek vs ChatGPT: AI Platform Comparison [Dataset]. https://www.kaggle.com/datasets/khanaakif/deepseek-vs-chatgpt-ai-platform-comparison
Dataset updated
Feb 24, 2025
Dataset provided by
Kaggle http://kaggle.com/
Authors
Aakif Khan
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
DeepSeek vs. ChatGPT: AI Performance & User Behavior (July 2023 - Feb 2025)

This synthetically generated dataset provides a realistic AI performance comparison between ChatGPT (GPT-4-turbo) and DeepSeek (DeepSeek-Chat 1.5) over a 1.5-year period. With 10,000+ rows, it captures key user interaction metrics, platform performance indicators, and AI response characteristics to analyze trends in accuracy, engagement, and adoption.

Key Features:

Time-Series Ready – Granular date and time columns for trend and seasonality analysis.

Comparative AI Analysis – Compare user engagement, retention rates, and response quality.

User Behavior Insights – Analyze session durations, input text complexity, and user ratings.

Technical Performance Metrics – Evaluate AI response accuracy and processing speed.

Data Cleaning Practice – Includes intentionally introduced null values for preprocessing exercises.

Ideal For:

AI benchmarking and platform performance studies

Time-series forecasting and trend analysis

Data preprocessing and feature engineering

Power BI, SQL, and Python-based analytical dashboards

📜 License: MIT – Free for research, projects, and analysis.
How are Chat GPT and AI used in medical diagnosis
dataone.org
Updated Nov 8, 2023
Cite
Maher Asaad Baker (2023). How are Chat GPT and AI used in medical diagnosis [Dataset]. http://doi.org/10.7910/DVN/2HMJ58
Unique identifier
https://doi.org/10.7910/DVN/2HMJ58
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Maher Asaad Baker
Description
This entry discusses the potential of Chat GPT and AI to revolutionize the way we interact with computers, specifically in the field of medical diagnostics. Chat GPT can make conversations between doctors and patients more natural, while AI can analyze vast amounts of patient data to identify trends and estimate a patient’s health. Patients can use Chat GPT to better understand their medical conditions, and both Chat GPT and AI can be used to automate tasks such as scheduling appointments and processing test results. However, there are limitations to using AI, including data bias, complex results, and analysis errors. To reduce errors, it is important to validate findings using various techniques and ensure that data is accurate and up-to-date. Chat GPT also employs security measures to protect patient data privacy and confidentiality.
ChatGPT Study
ieee-dataport.org
Updated Sep 1, 2023
Cite
Natasha Randall (2023). ChatGPT Study [Dataset]. https://ieee-dataport.org/documents/chatgpt-study
Dataset updated
Sep 1, 2023
Authors
Natasha Randall
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset comprises data created during research on AI-generated code.
Dataset for Comparative Analysis of AI Models DeepSeek and ChatGPT in...
zenodo.org
Updated May 28, 2025
Cite
Aryo Aryo Fajar Pamungkas; Rollando Rollando Marcellino Himmel Madison; Arya Pradipta Arya Wismaya (2025). Dataset for Comparative Analysis of AI Models DeepSeek and ChatGPT in Command Execution within Higher Education [Dataset]. http://doi.org/10.5281/zenodo.15516984
Unique identifier
https://doi.org/10.5281/zenodo.15516984
Dataset updated
May 28, 2025
Dataset provided by
Zenodo http://zenodo.org/
Authors
Aryo Aryo Fajar Pamungkas; Rollando Rollando Marcellino Himmel Madison; Arya Pradipta Arya Wismaya
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was compiled to support a comparative analysis of two artificial intelligence models, ChatGPT and DeepSeek, in executing academic-related commands within the context of higher education. The data was collected using a Likert-scale questionnaire designed around the key dimensions of the DeLone & McLean Information Systems Success Model, which include System Quality (SQ), Information Quality (IQ), Service Quality (SEQ), Intention to Use (ITU), User Satisfaction (US), and Individual Impact (II).
500k ChatGPT-related Tweets Jan-Mar 2023
opendatabay.com
.csv
Updated Jun 16, 2025
Cite
Datasimple (2025). 500k ChatGPT-related Tweets Jan-Mar 2023 [Dataset]. https://www.opendatabay.com/data/ai-ml/4f84df82-9852-490b-b41a-00a4a4191f47
Available download formats: .csv
Dataset updated
Jun 16, 2025
Dataset authored and provided by
Datasimple
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Social Media and Networking
Description
This dataset contains a CSV file of tweets about ChatGPT, OpenAI's conversational AI model, matched by keywords (chatgpt, chat gpt), #hashtags, and @mentions. The file includes information on 500,000 tweets. The dataset aims to help understand public opinion, trends, and potential applications of ChatGPT by analyzing tweet volume, sentiment, user engagement, and the influence of key AI events. The dataset offers valuable insights for companies, researchers, and policymakers, allowing them to make informed decisions and shape the future of AI-powered conversational technologies.

Check out my Comprehensive Analysis on this dataset: Medium article "Cracking the ChatGPT Code: A Deep Dive into 500,000 Tweets using Advanced NLP Techniques"

Learn about the collection process in Medium article "Effortlessly Scraping Massive Twitter Data"

License

CC0

Original Data Source: 500k ChatGPT-related Tweets Jan-Mar 2023
Data from: Efficient spheroid morphology assessment with a ChatGPT data...
tandf.figshare.com
docx
Updated May 23, 2025
Cite
Takuya Sakamoto; Hiroto Koma; Ayane Kuwano; Tetsuhiro Horie; Atsushi Fuku; Hironori Kitajima; Yuka Nakamura; Ikuhiro Tanida; Yujiro Nakade; Hiroaki Hirata; Yoshiyuki Tachi; Hiroshi Sunami; Daisuke Sakamoto; Sohsuke Yamada; Naoki Yamamoto; Yusuke Shimizu; Yasuhito Ishigaki; Toru Ichiseki; Ayumi Kaneuji; Satoshi Osawa; Norio Kawahara (2025). Efficient spheroid morphology assessment with a ChatGPT data analyst: implications for cell therapy [Dataset]. http://doi.org/10.6084/m9.figshare.29038988.v1
Available download formats: docx
Unique identifier
https://doi.org/10.6084/m9.figshare.29038988.v1
Dataset updated
May 23, 2025
Dataset provided by
Taylor & Francis
Authors
Takuya Sakamoto; Hiroto Koma; Ayane Kuwano; Tetsuhiro Horie; Atsushi Fuku; Hironori Kitajima; Yuka Nakamura; Ikuhiro Tanida; Yujiro Nakade; Hiroaki Hirata; Yoshiyuki Tachi; Hiroshi Sunami; Daisuke Sakamoto; Sohsuke Yamada; Naoki Yamamoto; Yusuke Shimizu; Yasuhito Ishigaki; Toru Ichiseki; Ayumi Kaneuji; Satoshi Osawa; Norio Kawahara
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Adipose-derived stem cells (ADSCs) exhibit promising potential for the treatment of various diseases, including osteoarthritis. Spheroids derived from ADSCs are a viable treatment option with enhanced anti-inflammatory effects and tissue repair capabilities. SphereRing® is a rotating donut-shaped tube that efficiently produces large quantities of spheroids. However, accurately measuring spheroid size for spheroid quality assessment is challenging. This study aimed to develop an automated method for measuring spheroid size using deep learning through the ChatGPT Data Analyst for image recognition and processing. The area, perimeter, and circularity of spheroids generated with the SphereRing system were analyzed using ChatGPT Data Analyst and ImageJ. Measurement accuracy was validated using Bland–Altman analysis and scatter plot correlation coefficients. ChatGPT Data Analyst was consistent with ImageJ for all parameters. Bland–Altman plots demonstrated strong agreement; most data points were within the 95% limits. The ChatGPT Data Analyst provides a reliable and efficient alternative for assessing spheroid quality. This method reduces human error and improves reproducibility to enhance spheroid quality control. Thus, this method has potential applications in regenerative medicine. We developed an automated spheroid assessment method using ChatGPT Data Analyst to precisely quantify spheroid area, perimeter, and circularity. Microscopic images were subjected to grayscale conversion, binarization, and contour detection. The results were compared with manual ImageJ measurements to validate the method. The method exhibited high accuracy, providing a user-friendly and efficient alternative for spheroid assessment in regenerative medicine research. 
We established an automated method for assessing spheroid size and morphology using ChatGPT Data Analyst. The method using ChatGPT Data Analyst exhibited high agreement with manual measurements performed using ImageJ for quantifying the spheroid area, perimeter, and circularity. Bland–Altman analysis demonstrated high reliability and agreement in measurements. Automated analysis using ChatGPT Data Analyst substantially reduces human error and labor-intensive manual processing required for spheroid analysis. This approach is a practical and accessible alternative for researchers and clinicians working on spheroid-based cell therapies.
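As a rough illustration of the two quantities named in this description, the sketch below computes circularity and Bland–Altman 95% limits of agreement. It assumes the usual ImageJ-style circularity definition (4πA/P²) and the conventional bias ± 1.96 SD limits; the description does not state the exact formulas used in the study, and the measurement values are hypothetical, not taken from the dataset.

```python
import math
import statistics


def circularity(area: float, perimeter: float) -> float:
    """Circularity as 4*pi*A / P^2; equals 1.0 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    The limits are bias +/- 1.96 * sample SD of the paired differences.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired area measurements (arbitrary units), NOT study data.
chatgpt_areas = [10.2, 11.9, 9.8, 12.4, 10.9]
imagej_areas = [10.0, 12.0, 10.0, 12.0, 11.0]

bias, lower, upper = bland_altman(chatgpt_areas, imagej_areas)
print(f"bias={bias:.3f}, limits of agreement=({lower:.3f}, {upper:.3f})")
print(f"circularity of a unit square: {circularity(1.0, 4.0):.3f}")
```

Agreement between two methods is typically judged by how many paired differences fall inside the two limits, which matches the "most data points were within the 95% limits" statement above.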
Data from: Accuracy of ChatGPT in answering cardiology board-style questions...
dataverse.harvard.edu
Updated Feb 26, 2025
Cite
Albert Andrew (2025). Accuracy of ChatGPT in answering cardiology board-style questions [Dataset]. http://doi.org/10.7910/DVN/MYQKNY
Unique identifier
https://doi.org/10.7910/DVN/MYQKNY
Dataset updated
Feb 26, 2025
Dataset provided by
Harvard Dataverse
Authors
Albert Andrew
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Studies were included in the analysis if they met all of the following criteria: (1) the article was written in English, (2) the study assessed ChatGPT’s accuracy on questions that were set at a level or retrieved from an appropriate resource representing board-style (specialist) cardiology certification examination questions, (3) the questions inputted into ChatGPT were either text-based, image-based, or a combination of both, and (4) the study provided data on the number of questions inputted into ChatGPT and the number (or percentage) of correct responses reported separately for each question format. Studies were excluded if they failed to meet any of the aforementioned inclusion criteria or did not disclose original data, such as review papers or descriptive replies/correspondence to previously published articles. Key study characteristic data from included studies were extracted and entered into a predefined data abstraction template. Statistical analysis pooled the reported accuracy from each study to calculate an overall pooled accuracy with a 95% confidence interval (CI), subgrouped by model version, using a random-effects model. The meta-analysis software used was STATA ver. 18.0 (Stata Corp.). P-values <0.05 were considered statistically significant. Heterogeneity was assessed using the I² statistic.
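To make the pooling step concrete, here is a minimal sketch of one common random-effects approach, the DerSimonian-Laird estimator applied to raw proportions, together with the I² statistic. This is an assumption for illustration: the description does not say which estimator or transformation was used in STATA, and the study counts below are hypothetical.

```python
import math


def pool_random_effects(correct, total):
    """Pool accuracies with DerSimonian-Laird random effects.

    Assumes at least two studies and 0 < accuracy < 1 in each study.
    Returns (pooled, ci_low, ci_high, i_squared_percent).
    """
    if len(correct) < 2:
        raise ValueError("need at least two studies")
    props = [c / n for c, n in zip(correct, total)]
    variances = [p * (1 - p) / n for p, n in zip(props, total)]
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    fixed = sum(wi * pi for wi, pi in zip(w, props)) / sum(w)
    # Cochran's Q measures dispersion of studies around the fixed estimate.
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, props))
    df = len(props) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    w_re = [1 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * pi for wi, pi in zip(w_re, props)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se, i2


# Hypothetical counts of correct answers and questions per study.
correct = [60, 45, 80]
total = [100, 90, 100]

pooled, ci_low, ci_high, i2 = pool_random_effects(correct, total)
print(f"pooled accuracy={pooled:.3f} (95% CI {ci_low:.3f} to {ci_high:.3f}), I2={i2:.1f}%")
```

A subgroup analysis by model version would call the same function once per subgroup.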
Exploring Student Engagement with ChatGPT in Simulated Learning: A...
zenodo.org
Updated Jun 8, 2025
Cite
PABLO ROSSER; Seila Soler (2025). Exploring Student Engagement with ChatGPT in Simulated Learning: A Multivariable Analysis of Pedagogical Preferences [Dataset]. http://doi.org/10.5281/zenodo.15620343
Unique identifier
https://doi.org/10.5281/zenodo.15620343
Dataset updated
Jun 8, 2025
Dataset provided by
Zenodo http://zenodo.org/
Authors
PABLO ROSSER; Seila Soler
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This study investigates students' perceptions across different subjects and demographic profiles regarding the use of ChatGPT in educational activities, with a specific focus on its implementation in simulation-based tasks, compared to UrbanGame and Museum Assignment activities. A range of quantitative analyses—including ANOVA, parametric and non-parametric correlations, multivariate analysis, and chi-square tests—were employed to examine the influence of variables such as age, gender, and academic discipline on pedagogical interest and activity preferences. Results indicate significantly higher pedagogical interest among younger students and moderate correlations between Likert-scale learning dimensions. Despite being the least preferred activity, simulation demonstrates comparable educational value in terms of content comprehension, skill acquisition, and value awareness. No statistically significant differences in activity perception were found across age, gender, or subject, suggesting a broadly inclusive pedagogical design. These findings provide guidance for improving instructional strategies and integrating AI-based tools like ChatGPT into the classroom.
Data from: Medical students’ patterns of using ChatGPT as a feedback tool...
search.dataone.org
dataverse.harvard.edu
Updated Dec 16, 2023
Cite
Janghee Park (2023). Medical students’ patterns of using ChatGPT as a feedback tool and perceptions of ChatGPT in a Leadership and Communication course in Korea: a cross-sectional study [Dataset]. http://doi.org/10.7910/DVN/IDSD81
Unique identifier
https://doi.org/10.7910/DVN/IDSD81
Dataset updated
Dec 16, 2023
Dataset provided by
Harvard Dataverse
Authors
Janghee Park
Description
This study aimed to analyze patterns of using ChatGPT before and after group activities and to explore medical students’ perceptions of ChatGPT as a feedback tool in the classroom. The study included 99 2nd-year pre-medical students who participated in a “Leadership and Communication” course from March to June 2023. Students engaged in both individual and group activities related to negotiation strategies. ChatGPT was used to provide feedback on their solutions.
A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to...
zenodo.org
data.niaid.nih.gov
bin
Updated Jun 4, 2024
Cite
Scott McGrath (2024). A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to selected genetics questions - Full study data [Dataset]. http://doi.org/10.5061/dryad.s4mw6m9cv
Available download formats: bin
Unique identifier
https://doi.org/10.5061/dryad.s4mw6m9cv
Dataset updated
Jun 4, 2024
Dataset provided by
Zenodo http://zenodo.org/
Authors
Scott McGrath
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Measurement technique
Study Design:

This study was conducted to evaluate the performance of ChatGPT 4 (March 23rd, 2023 model) in the context of genetic counseling and education. The evaluation involved a structured questionnaire, which included questions selected from the Brief User Survey (BUS-15) and additional custom questions designed to assess the clinical value of ChatGPT 4's responses.

Questionnaire Development:

The questionnaire, built on Qualtrics, comprised twelve questions: seven selected from the BUS-15, preceded by two additional questions that we designed. The initial questions focused on quality and answer relevancy:

1. The overall quality of the Chatbot's response is: (5-point Likert: Very poor to Very Good)
2. The Chatbot delivered an answer that provided the relevant information you would include if asked the question. (5-point Likert: Strongly disagree to Strongly agree)

The BUS-15 questions (7-point Likert: Strongly disagree to Strongly agree) focused on:

1. Recognition and facilitation of users' goal and intent: Chatbot seems able to recognize the user's intent and guide the user to its goals.
2. Relevance of information: The chatbot provides relevant and appropriate information/answer to people at each stage to make them closer to their goal.
3. Maxim of quantity: The chatbot responds in an informative way without adding too much information.
4. Resilience to failure: Chatbot seems able to find ways to respond appropriately even when it encounters situations or arguments it is not equipped to handle.
5. Understandability and politeness: The chatbot seems able to understand input and convey correct statements and answers without ambiguity and with acceptable manners.
6. Perceived conversational credibility: The chatbot responds in a credible and informative way without adding too much information.
7. Meet the neurodiverse needs: Chatbot seems able to meet needs and be used by users independently from their health conditions, well-being, age, etc.

Expert Panel and Data Collection:

A panel of experts (two genetic counselors and two clinical geneticists) was provided with a link to the survey containing the questions. They independently evaluated the responses from ChatGPT 4 without discussing the questions or answers among themselves until after the survey submission. This approach ensured unbiased evaluation.
Description
Objective:

Our objective is to evaluate the efficacy of ChatGPT 4 in accurately and effectively delivering genetic information, building on previous findings with ChatGPT 3.5. We focus on assessing the utility, limitations, and ethical implications of using ChatGPT in medical settings.

Materials and Methods:

A structured questionnaire, including the Brief User Survey (BUS-15) and custom questions, was developed to assess ChatGPT 4's clinical value. An expert panel of genetic counselors and clinical geneticists independently evaluated ChatGPT 4's responses to these questions. We also involved comparative analysis with ChatGPT 3.5, utilizing descriptive statistics and using R for data analysis.

Results:

ChatGPT 4 demonstrated improvements over 3.5 in context recognition, relevance, and informativeness. However, performance variability and concerns about the naturalness of the output were noted. No significant difference in accuracy was found between ChatGPT 3.5 and 4.0. Notably, the efficacy of ChatGPT 4 varied significantly across different genetic conditions, with specific differences identified between responses related to BRCA1 and HFE.

Discussion and Conclusion:

This study highlights ChatGPT 4's potential in genomics, noting significant advancements over its predecessor. Despite these improvements, challenges remain, including the risk of outdated information and the necessity of ongoing refinement. The variability in performance across different genetic conditions underscores the need for expert oversight and continuous AI training. ChatGPT 4, while showing promise, emphasizes the importance of balancing technological innovation with ethical responsibility in healthcare information delivery.
Supplementary materials for the article: Using ChatGPT for Human Computer...
data.4tu.nl
zip
Updated Sep 4, 2023
Cite
Wilbert Tabone; Joost de Winter (2023). Supplementary materials for the article: Using ChatGPT for Human Computer Interaction Research: A Primer [Dataset]. http://doi.org/10.4121/21916017.v1
Available download formats: zip
Unique identifier
https://doi.org/10.4121/21916017.v1
Dataset updated
Sep 4, 2023
Dataset provided by
4TU.ResearchData
Authors
Wilbert Tabone; Joost de Winter
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Supplementary data for the paper: Tabone, W., & De Winter, J. C. F. (2023). Using ChatGPT for Human–Computer Interaction Research: A Primer. Royal Society Open Science, 10, 231053.

This repository contains MATLAB scripts (tested in MATLAB R2023a),
input data (source data read by the MATLAB scripts),
and saved ChatGPT outputs for Study 1, 2, and 3 of the following research paper:

- Study1.m contains code for 2 different prompts, as well as different batch sizes (25 vs. 992), as described in the paper.
- Study1_bootstrapping.m contains the code for the bootstrapping approach, also described in the paper.
- Study1_randomness_test.m contains the code corresponding to the systematic variation of the temperature parameter described in the paper.
- Study2.m contains the code that corresponds to the interview summary for the Virtual fence augmented reality (AR) interface.
- Study2_16k_test.m contains the code for a trial in which the entire interview was submitted all at once, which has recently become possible with the 16k variant of GPT-3.5 (not described in the paper).
- Study2_content_analysis.m and Study2_content_analysis_16k contain code for attempts at letting GPT perform a content analysis of the interview (described in the discussion of the paper).
- Study2_gpt4.m is similar to Study2.m but now adopts GPT-4 instead of GPT-3.5
- Study3_bootstrapping.m is code for the bootstrapping approach of the transcripts, as described in the paper.
Supplementary data for the article: 'The use of ChatGPT for personality...
data.4tu.nl
zip
Updated May 8, 2024
Cite
Joost de Winter; Tom Driessen; Dimitra Dodou (2024). Supplementary data for the article: 'The use of ChatGPT for personality research: Administering questionnaires using generated personas' [Dataset]. http://doi.org/10.4121/6e0f2f2b-f1fc-4300-b8ca-eb9031a7b257.v1
Available download formats: zip
Unique identifier
https://doi.org/10.4121/6e0f2f2b-f1fc-4300-b8ca-eb9031a7b257.v1
Dataset updated
May 8, 2024
Dataset provided by
4TU.ResearchData
Authors
Joost de Winter; Tom Driessen; Dimitra Dodou
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Personality research has traditionally relied on questionnaires, which bring with them inherent limitations, such as response style bias. With the emergence of large language models such as ChatGPT, the question arises as to what extent these models can be used in personality research. In this study, ChatGPT (GPT-4) generated 2000 text-based personas. Next, for each persona, ChatGPT completed a short form of the Big Five Inventory (BFI-10), the Brief Sensation Seeking Scale (BSSS), and a Short Dark Triad (SD3). The mean scores on the BFI-10 items were found to correlate strongly with means from previously published research, and principal component analysis revealed a clear five-component structure. Certain relationships between traits, such as a negative correlation between the age of the persona and the BSSS score, were clearly interpretable, while some other correlations diverged from the literature. An additional analysis using four new sets of 2000 personas each, including a set of ‘realistic’ personas and a set of cinematic personas, showed that the correlation matrix among personality constructs was affected by the persona set. It is concluded that evaluating questionnaires and research hypotheses prior to engaging with real individuals holds promise.
Toward multimodal information and AI interaction: a quasi-experiment with...
data.niaid.nih.gov
zenodo.org
Updated Aug 5, 2024
Cite
Crudele, Francesca (2024). Toward multimodal information and AI interaction: a quasi-experiment with ChatGPT [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13220545
Dataset updated
Aug 5, 2024
Dataset provided by
Crudele, Francesca
Raffaghelli, Juliana Elisa
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The development of argumentative text and information comprehension (CoI) skills related to the critical reconstruction of meaning (CT) is crucial in undergraduate education, especially now in the era of social media and AI-mediated information. Generative AI aids in information creation, but using it without awareness can make navigating complex information harder. Argument maps (AM), commonly used for analyzing analog and static texts, can help visualize, understand, and rework multimodal and dynamic arguments and information.

Stemming from the Vygotskian idea, our study used a design-based research approach on the use of AMs and ChatGPT as socio-technical artifacts to stimulate and support the understanding of information (CoI) and thus the development of critical thinking (CT). The workshop introduced the multimodal element through a 3-group quasi-experiment. The first group dealt with fully analog texts, the second group used maps with multimodal textual modes, and the third group only interacted with ChatGPT. The research compared the three groups, with a particular focus on the two experimental groups (experimental macro-focus).

The research had three main objectives: 1) to test whether AMs improved students' CoI enhancement and critical processing (CT); 2) to determine whether interaction with ChatGPT supported information reprocessing and critical construction of opinions and assessment tools; and 3) to determine whether interaction with ChatGPT alone, without AMs, still fostered greater integration of information and viewpoints.

Our preliminary analysis showed that AMs improved students' CoI and CT, especially when exposed to multimodal information. ChatGPT interaction increased critical reflection and awareness of AI's role in education. Students using only ChatGPT performed well in argumentative reworking, suggesting that interaction with the chatbot can be effective. However, integrating AMs and ChatGPT could provide optimal support for comprehension and critical thinking skills.

This Zenodo record documents the full analysis process with R (https://cran.r-project.org/bin/windows/base/) and Nvivo (https://lumivero.com/products/nvivo/) and comprises the following datasets, scripts, and results:

Comprehension of Text and AMs Results - Arg_Map.xlsx

Critical Thinking level - CriThink.xlsx

Descriptive and Inferential Statistics Comprehension and Critical Thinking - Preliminary Analysis.R

Elaboration and Integration Opinion - Opi_G1.xlsx; Opi_G2.xlsx & Opi_G3.xlsx

Descriptive and Inferential Statistics Opinion level - Preliminary Analysis_opi.R

Sentiment Analysis - Sentiment Analysis.R

Vocabulary Frequent words - Vocabulary.csv

Codebook qualitative Analysis with Nvivo (Codebook.xlsx)

Results Nvivo Analysis G1 & G2 - Codebook-ChatGPT_G1&G2.docx

Any comments or improvements are welcome!
Chatbot Market Analysis, Size, and Forecast 2025-2029: North America (US and...
technavio.com
Updated Feb 15, 2025
Cite
Technavio (2025). Chatbot Market Analysis, Size, and Forecast 2025-2029: North America (US and Canada), Europe (France, Germany, Italy, and UK), Middle East and Africa (Egypt, KSA, Oman, and UAE), APAC (China, India, and Japan), South America (Argentina and Brazil), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/chatbot-market-industry-analysis
Dataset updated
Feb 15, 2025
Dataset provided by
TechNavio
Authors
Technavio
Time period covered
2021 - 2025
Area covered
Global
Description
Chatbot Market Size 2025-2029

The chatbot market size is forecast to increase by USD 9.63 billion, at a CAGR of 42.9% between 2024 and 2029.
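
To make the compound annual growth rate (CAGR) figure above more concrete, the following is a minimal sketch of the arithmetic behind it. It assumes five annual compounding periods (2024 to 2029) and that the quoted rate applies to total market size; neither assumption is stated in the report summary, and the function names are illustrative only.

```python
# Illustrative arithmetic only. Assumptions (not stated in the source):
# five annual compounding periods, and the CAGR applies to total market size.

def growth_multiplier(cagr: float, years: int) -> float:
    """Total growth factor implied by compounding `cagr` over `years` periods."""
    return (1.0 + cagr) ** years


def implied_base(increase: float, cagr: float, years: int) -> float:
    """Starting size that would grow by `increase` at the given CAGR."""
    return increase / (growth_multiplier(cagr, years) - 1.0)


multiplier = growth_multiplier(0.429, 5)       # 42.9% per year for 5 years
base_usd_bn = implied_base(9.63, 0.429, 5)     # USD 9.63 billion increase

print(f"growth multiplier: {multiplier:.2f}x")
print(f"implied starting size: USD {base_usd_bn:.2f} billion")
```

Under these assumptions the market would grow roughly sixfold over the period; the implied starting size is a back-of-the-envelope figure, not a number taken from the report.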

The market is witnessing significant growth, driven by the integration of chatbots with various communication channels such as social media, websites, and messaging apps. This integration enables businesses to engage with customers in real-time, providing instant responses and enhancing customer experience. However, the market faces challenges, including the lack of awareness and standardization of chatbot services. Despite these obstacles, the potential benefits of chatbots, including cost savings, increased efficiency, and improved customer engagement, make it an attractive investment for businesses seeking to enhance their digital presence and streamline operations. Companies looking to capitalize on this market opportunity should focus on developing chatbot solutions that offer customizable features, seamless integration with existing systems, and natural language processing capabilities to deliver human-like interactions. Navigating the challenges of awareness and standardization will require targeted marketing efforts and collaborations with industry partners to establish best practices and industry standards.

What will be the Size of the Chatbot Market during the forecast period?

Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.
The market continues to evolve, with shifting dynamics shaping its growth and applications across various sectors. Conversational AI, a key component of chatbots, is advancing with the integration of sentiment analysis, emotional intelligence, and the METEOR score to enhance user experience. Pre-trained models and language understanding are being utilized to improve performance metrics, while neural networks and contextual awareness enable more accurate intent recognition. Deployment strategies, including policy learning and cloud platforms, are evolving to support cross-platform compatibility and multi-lingual support. Performance metrics, such as F1-score and response time, are crucial in evaluating model effectiveness. Reinforcement learning and knowledge base integration are essential for chatbot development and lead generation. Error rate and character error rate are critical in speech recognition, while API integration and dialogue state tracking facilitate seamless conversational experiences. Technical support and customer engagement are primary applications of chatbots, with sales conversion and automated responses optimizing business operations. Deep learning architectures and transfer learning are driving advancements in question answering and natural language processing. Contextualized word embeddings and dialogue management are essential for effective user interaction. Overall, the market is an ever-evolving landscape, with continuous innovation and integration of advanced technologies shaping its future.

How is this Chatbot Industry segmented?

The chatbot industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

End-user: Retail; BFSI; Government; Travel and hospitality; Others

Product: Solutions; Services

Deployment: Cloud-Based; On-Premise; Hybrid

Application: Customer Service; Sales and Marketing; Healthcare Support; E-Commerce Assistance

Geography: North America (US, Canada); Europe (France, Germany, Italy, UK); Middle East and Africa (Egypt, KSA, Oman, UAE); APAC (China, India, Japan); South America (Argentina, Brazil); Rest of World (ROW)

By End-user Insights

The retail segment is estimated to witness significant growth during the forecast period. The market is experiencing significant growth, particularly in the retail sector. E-commerce giants like Amazon, Flipkart, Alibaba, and Snapdeal are leading this trend, integrating chatbots to improve customer experience during online product searches. These AI-powered bots facilitate quick and effective resolution of payment-related queries, enhancing the shopping experience. However, retailers face challenges in ensuring a seamless user experience, as consumers increasingly prefer mobile shopping. Deep learning architectures and natural language processing (NLP) are crucial components of chatbot development. NLP enables intent recognition, sentiment analysis, and entity extraction, while deep learning models provide contextual awareness and dialogue management. Speech recognition and dialogue state tracking further enhance the user experience. Cross-platform compatibility and multi-lingual support are essential features for chatbots, catering to diverse user bases. Pre-trained models and transfer learning enable faster development and deployment. Reinforcement learning and policy learning optimize bot
Composing alt text using large language models: dataset in English
data.mendeley.com
Updated Jun 17, 2024
Cite
Yekaterina Kosova (2024). Composing alt text using large language models: dataset in English [Dataset]. http://doi.org/10.17632/szh5zhpgxh.1
Unique identifier
https://doi.org/10.17632/szh5zhpgxh.1
Dataset updated
Jun 17, 2024
Authors
Yekaterina Kosova
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains the results of developing alternative text for images using chatbots based on large language models. The study was carried out in April-June 2024. Microsoft Copilot, Google Gemini, and YandexGPT chatbots were used to generate 108 text descriptions for 12 images. Descriptions were generated by chatbots using keywords specified by a person. The experts then rated the resulting descriptions on a Likert scale (from 1 to 5). The data set is presented in a Microsoft Excel table on the “Data” sheet with the following fields: record number; image number; chatbot; image type (photo, logo); request date; list of keywords; number of keywords; length of keywords; time of compilation of keywords; generated descriptions; required length of descriptions; actual length of descriptions; description generation time; usefulness; reliability; completeness; accuracy; literacy. The “Images” sheet contains links to the original images. Alternative descriptions are presented in English.
Customer Shopping Trends Dataset
kaggle.com
Updated Oct 5, 2023
Cite
Sourav Banerjee (2023). Customer Shopping Trends Dataset [Dataset]. https://www.kaggle.com/datasets/iamsouravbanerjee/customer-shopping-trends-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Oct 5, 2023
Dataset provided by
Kaggle http://kaggle.com/
Authors
Sourav Banerjee
Description
Context

The Customer Shopping Preferences Dataset offers valuable insights into consumer behavior and purchasing patterns. Understanding customer preferences and trends is critical for businesses to tailor their products, marketing strategies, and overall customer experience. This dataset captures a wide range of customer attributes including age, gender, purchase history, preferred payment methods, frequency of purchases, and more. Analyzing this data can help businesses make informed decisions, optimize product offerings, and enhance customer satisfaction. The dataset stands as a valuable resource for businesses aiming to align their strategies with customer needs and preferences. Note that this is a synthetic dataset created for beginners to learn more about data analysis and machine learning.

Content

This dataset encompasses various features related to customer shopping preferences, gathering essential information for businesses seeking to enhance their understanding of their customer base. The features include customer age, gender, purchase amount, preferred payment methods, frequency of purchases, and feedback ratings. Additionally, data on the type of items purchased, shopping frequency, preferred shopping seasons, and interactions with promotional offers is included. With a collection of 3900 records, this dataset serves as a foundation for businesses looking to apply data-driven insights for better decision-making and customer-centric strategies.

Dataset Glossary (Column-wise)

Customer ID - Unique identifier for each customer

Age - Age of the customer

Gender - Gender of the customer (Male/Female)

Item Purchased - The item purchased by the customer

Category - Category of the item purchased

Purchase Amount (USD) - The amount of the purchase in USD

Location - Location where the purchase was made

Size - Size of the purchased item

Color - Color of the purchased item

Season - Season during which the purchase was made

Review Rating - Rating given by the customer for the purchased item

Subscription Status - Indicates if the customer has a subscription (Yes/No)

Shipping Type - Type of shipping chosen by the customer

Discount Applied - Indicates if a discount was applied to the purchase (Yes/No)

Promo Code Used - Indicates if a promo code was used for the purchase (Yes/No)

Previous Purchases - The total count of transactions concluded by the customer at the store, excluding the ongoing transaction

Payment Method - Customer's most preferred payment method

Frequency of Purchases - Frequency at which the customer makes purchases (e.g., Weekly, Fortnightly, Monthly)
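
As a minimal sketch of how the glossary above could be used in practice, the snippet below checks whether a CSV header contains the listed columns. It assumes the file's header names match the glossary labels exactly, which the description does not confirm; the helper name `missing_columns` and the tiny inline sample are hypothetical and are not taken from the dataset.

```python
import csv
import io

# Column names copied from the glossary; assumed (not confirmed) to match
# the header row of the actual CSV file exactly.
EXPECTED_COLUMNS = [
    "Customer ID", "Age", "Gender", "Item Purchased", "Category",
    "Purchase Amount (USD)", "Location", "Size", "Color", "Season",
    "Review Rating", "Subscription Status", "Shipping Type",
    "Discount Applied", "Promo Code Used", "Previous Purchases",
    "Payment Method", "Frequency of Purchases",
]


def missing_columns(csv_text: str) -> list[str]:
    """Return the glossary columns that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    return [name for name in EXPECTED_COLUMNS if name not in header]


# Made-up three-column sample, used only to exercise the check.
sample = "Customer ID,Age,Gender\n1,34,Male\n"
missing = missing_columns(sample)
print(f"{len(missing)} of {len(EXPECTED_COLUMNS)} glossary columns are missing")
```

For a real download, the same function could be given the text of the file (for example via `pathlib.Path(...).read_text()`), with the path supplied by the reader.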

Structure of the Dataset

https://i.imgur.com/6UEqejq.png

Acknowledgement

This dataset is a synthetic creation generated using ChatGPT to simulate a realistic customer shopping experience. Its purpose is to provide a platform for beginners and data enthusiasts, allowing them to create, enjoy, practice, and learn from a dataset that mirrors real-world customer shopping behavior. The aim is to foster learning and experimentation in a simulated environment, encouraging a deeper understanding of data analysis and interpretation in the context of consumer preferences and retail scenarios.

Cover Photo by: Freepik

Thumbnail by: Clothing icons created by Flat Icons - Flaticon
Supplementary data for the paper 'Personality and acceptance as predictors...
data.4tu.nl
zip
Updated Mar 28, 2024
Cite
Joost de Winter; Dimitra Dodou; Yke Bauke Eisma (2024). Supplementary data for the paper 'Personality and acceptance as predictors of ChatGPT use' [Dataset]. http://doi.org/10.4121/e2e3ac25-e264-4592-b413-254eb4ac5022.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.4121/e2e3ac25-e264-4592-b413-254eb4ac5022.v1
Dataset updated
Mar 28, 2024
Dataset provided by
4TU.ResearchData
Authors
Joost de Winter; Dimitra Dodou; Yke Bauke Eisma
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Within a year of its launch, ChatGPT has seen a surge in popularity. While many are drawn to its effectiveness and user-friendly interface, ChatGPT also introduces moral concerns, such as the temptation to present generated text as one’s own. This led us to theorize that personality traits such as Machiavellianism and sensation-seeking may be predictive of ChatGPT usage. We launched two online questionnaires with 2,000 respondents each, in September 2023 and March 2024, respectively. In Questionnaire 1, 22% of respondents were students, and 54% were full-time employees; 32% indicated they used ChatGPT at least weekly. Analysis of our ChatGPT Acceptance Scale revealed two factors, Effectiveness and Concerns, which correlated positively and negatively, respectively, with ChatGPT use frequency. A specific aspect of Machiavellianism (manipulation tactics) was found to predict ChatGPT usage. Questionnaire 2 was a replication of Questionnaire 1, with 21% students and 54% full-time employees, of which 43% indicated using ChatGPT weekly. In Questionnaire 2, more extensive personality scales were used. We found a moderate correlation between Machiavellianism and ChatGPT usage (r = .22) and with an opportunistic attitude towards undisclosed use (r = .30), relationships that largely remained intact after controlling for gender, age, education level, and the respondents’ country. We conclude that covert use of ChatGPT is associated with darker personality traits, something that requires further attention.

Cite

Jorge Valverde-Rebaza; Aram González; Octavio Navarro-Hinojosa; Julieta Noguez (2024). Data_Sheet_1_Advanced large language models and visualization tools for data analytics learning.csv [Dataset]. http://doi.org/10.3389/feduc.2024.1418006.s001

Data_Sheet_1_Advanced large language models and visualization tools for data analytics learning.csv

Explore at:

Available download formats: txt

Unique identifier

https://doi.org/10.3389/feduc.2024.1418006.s001

Dataset updated

Aug 8, 2024

Dataset provided by

Frontiers

Authors

Jorge Valverde-Rebaza; Aram González; Octavio Navarro-Hinojosa; Julieta Noguez

License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Introduction: In recent years, numerous AI tools have been employed to equip learners with diverse technical skills such as coding, data analysis, and other competencies related to computational sciences. However, the desired outcomes have not been consistently achieved. This study aims to analyze the perspectives of students and professionals from non-computational fields on the use of generative AI tools, augmented with visualization support, to tackle data analytics projects. The focus is on promoting the development of coding skills and fostering a deep understanding of the solutions generated. Consequently, our research seeks to introduce innovative approaches for incorporating visualization and generative AI tools into educational practices.

Methods: This article examines how learners perform and their perspectives when using traditional tools vs. LLM-based tools to acquire data analytics skills. To explore this, we conducted a case study with a cohort of 59 participants among students and professionals without computational thinking skills. These participants developed a data analytics project in the context of a Data Analytics short session. Our case study focused on examining the participants' performance using traditional programming tools, ChatGPT, and LIDA with GPT as an advanced generative AI tool.

Results: The results show the transformative potential of approaches based on integrating advanced generative AI tools like GPT with specialized frameworks such as LIDA. The higher levels of participant preference indicate the superiority of these approaches over traditional development methods. Additionally, our findings suggest that the learning curves for the different approaches vary significantly, since learners encountered technical difficulties in developing the project and interpreting the results. Our findings suggest that the integration of LIDA with GPT can significantly enhance the learning of advanced skills, especially those related to data analytics. We aim to establish this study as a foundation for the methodical adoption of generative AI tools in educational settings, paving the way for more effective and comprehensive training in these critical areas.

Discussion: It is important to highlight that when using general-purpose generative AI tools such as ChatGPT, users must be aware of the data analytics process and take responsibility for filtering out potential errors or incompleteness in the requirements of a data analytics project. These deficiencies can be mitigated by using more advanced tools specialized in supporting data analytics tasks, such as LIDA with GPT. However, users still need advanced programming knowledge to properly configure this connection via API. There is a significant opportunity for generative AI tools to improve their performance, providing accurate, complete, and convincing results for data analytics projects, thereby increasing user confidence in adopting these technologies. We hope this work underscores the opportunities and needs for integrating advanced LLMs into educational practices, particularly in developing computational thinking skills.
