4 datasets found
  1. A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to...

    • data.niaid.nih.gov
    • zenodo.org
    • +1 more
    zip
    Updated Jun 4, 2024
    Cite
    Scott McGrath (2024). A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to selected genetics questions - Full study data [Dataset]. http://doi.org/10.5061/dryad.s4mw6m9cv
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 4, 2024
    Dataset provided by
    University of California, Berkeley
    Authors
    Scott McGrath
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Objective: Our objective is to evaluate the efficacy of ChatGPT 4 in accurately and effectively delivering genetic information, building on previous findings with ChatGPT 3.5. We focus on assessing the utility, limitations, and ethical implications of using ChatGPT in medical settings.

    Materials and Methods: A structured questionnaire, including the Brief User Survey (BUS-15) and custom questions, was developed to assess ChatGPT 4's clinical value. An expert panel of genetic counselors and clinical geneticists independently evaluated ChatGPT 4's responses to these questions. We also performed a comparative analysis with ChatGPT 3.5, using descriptive statistics and R for data analysis.

    Results: ChatGPT 4 demonstrated improvements over 3.5 in context recognition, relevance, and informativeness. However, performance variability and concerns about the naturalness of the output were noted. No significant difference in accuracy was found between ChatGPT 3.5 and 4.0. Notably, the efficacy of ChatGPT 4 varied significantly across different genetic conditions, with specific differences identified between responses related to BRCA1 and HFE.

    Discussion and Conclusion: This study highlights ChatGPT 4's potential in genomics, noting significant advancements over its predecessor. Despite these improvements, challenges remain, including the risk of outdated information and the necessity of ongoing refinement. The variability in performance across different genetic conditions underscores the need for expert oversight and continuous AI training. While ChatGPT 4 shows promise, these findings emphasize the importance of balancing technological innovation with ethical responsibility in healthcare information delivery.

    Methods

    Study Design: This study was conducted to evaluate the performance of ChatGPT 4 (March 23rd, 2023 model) in the context of genetic counseling and education. The evaluation involved a structured questionnaire, which included questions selected from the Brief User Survey (BUS-15) and additional custom questions designed to assess the clinical value of ChatGPT 4's responses.

    Questionnaire Development: The questionnaire, built in Qualtrics, comprised twelve questions: seven selected from the BUS-15, preceded by two additional questions that we designed. The initial questions focused on quality and answer relevancy:
    1. The overall quality of the Chatbot's response is: (5-point Likert: Very poor to Very good)
    2. The Chatbot delivered an answer that provided the relevant information you would include if asked the question. (5-point Likert: Strongly disagree to Strongly agree)
    The BUS-15 questions (7-point Likert: Strongly disagree to Strongly agree) focused on:
    1. Recognition and facilitation of users' goal and intent: the chatbot seems able to recognize the user's intent and guide the user to their goals.
    2. Relevance of information: the chatbot provides relevant and appropriate information/answers to people at each stage to bring them closer to their goal.
    3. Maxim of quantity: the chatbot responds in an informative way without adding too much information.
    4. Resilience to failure: the chatbot seems able to find ways to respond appropriately even when it encounters situations or arguments it is not equipped to handle.
    5. Understandability and politeness: the chatbot seems able to understand input and convey correct statements and answers without ambiguity and with acceptable manners.
    6. Perceived conversational credibility: the chatbot responds in a credible and informative way without adding too much information.
    7. Meeting neurodiverse needs: the chatbot seems able to meet users' needs and be usable independently from their health conditions, well-being, age, etc.

    Expert Panel and Data Collection: A panel of experts (two genetic counselors and two clinical geneticists) was provided with a link to the survey containing the questions. They independently evaluated the responses from ChatGPT 4 without discussing the questions or answers among themselves until after survey submission. This approach ensured unbiased evaluation.
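
    The comparative analysis above was performed in R with descriptive statistics; the dataset description does not specify the exact tests used. As a rough, non-authoritative Python sketch of that kind of comparison, the following uses made-up panel ratings and a Wilcoxon signed-rank test as one plausible choice for paired ordinal Likert data.

    # Minimal sketch, not the study's actual analysis (which was done in R).
    # Ratings below are hypothetical 7-point Likert scores assigned by the
    # expert panel to ChatGPT 3.5 and ChatGPT 4 answers to the same questions.
    import numpy as np
    from scipy import stats

    ratings_gpt35 = np.array([4, 5, 3, 6, 4, 5, 4, 3, 5, 4])  # hypothetical
    ratings_gpt4  = np.array([5, 6, 5, 6, 5, 6, 4, 5, 6, 5])  # hypothetical

    # Descriptive statistics per model.
    for name, r in [("ChatGPT 3.5", ratings_gpt35), ("ChatGPT 4", ratings_gpt4)]:
        print(f"{name}: median={np.median(r):.1f}, IQR={stats.iqr(r):.1f}, mean={r.mean():.2f}")

    # Paired, non-parametric comparison of ratings on the same questions
    # (one reasonable option for ordinal data; the study may have used another test).
    stat, p = stats.wilcoxon(ratings_gpt35, ratings_gpt4)
    print(f"Wilcoxon signed-rank: statistic={stat:.1f}, p={p:.3f}")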

  2. Data Sheet 1_Comparison of medical history documentation efficiency and...

    • frontiersin.figshare.com
    docx
    Updated May 14, 2025
    Cite
    Xiaojing Lu; Xinqi Gao; Xinyi Wang; Zhenye Gong; Jie Cheng; Weiguo Hu; Shaun Wu; Rong Wang; Xiaoyang Li (2025). Data Sheet 1_Comparison of medical history documentation efficiency and quality based on GPT-4o: a study on the comparison between residents and artificial intelligence.docx [Dataset]. http://doi.org/10.3389/fmed.2025.1545730.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    May 14, 2025
    Dataset provided by
    Frontiers
    Authors
    Xiaojing Lu; Xinqi Gao; Xinyi Wang; Zhenye Gong; Jie Cheng; Weiguo Hu; Shaun Wu; Rong Wang; Xiaoyang Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: As medical technology advances, physicians' responsibilities in clinical practice continue to increase, with medical history documentation becoming an essential component. Artificial Intelligence (AI) technologies, particularly advances in Natural Language Processing (NLP), have introduced new possibilities for medical documentation. This study aims to evaluate the efficiency and quality of medical history documentation by ChatGPT-4o compared to resident physicians and explore the potential applications of AI in clinical documentation.

    Methods: Using a non-inferiority design, this study compared the documentation time and quality scores between 5 resident physicians from the hematology department (with an average of 2.4 years of clinical experience) and ChatGPT-4o based on identical case materials. Medical history quality was evaluated by two attending physicians with over 10 years of clinical experience using ten case content criteria. Data were analyzed using paired t-tests and Wilcoxon signed-rank tests, with Kappa coefficients used to assess scoring consistency. Detailed scoring criteria included completeness (coverage of history elements), accuracy (correctness of information), logic (organization and coherence of content), and professionalism (appropriate use of medical terminology and format), each rated on a 10-point scale.

    Results: In terms of medical history quality, ChatGPT-4o achieved an average score of 88.9, while resident physicians scored 89.6, with no statistically significant difference between the two (p = 0.25). The Kappa coefficient between the two evaluators was 0.82, indicating good consistency in scoring. Non-inferiority testing showed that ChatGPT-4o's quality scores fell within the preset non-inferiority margin (5 points), indicating that its documentation quality was not inferior to that of resident physicians. ChatGPT-4o's average documentation time was 40.1 s, significantly shorter than the resident physicians' average of 14.9 min (p < 0.001).

    Conclusion: While maintaining quality comparable to resident physicians, ChatGPT-4o significantly reduced the time required for medical history documentation. Despite these positive results, practical considerations such as data preprocessing, data security, and privacy protection must be addressed in real-world applications. Future research should further explore ChatGPT-4o's capabilities in handling complex cases and its applicability across different clinical settings.
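
    To make the analysis described above concrete, the sketch below illustrates a paired non-inferiority check against the 5-point margin, a Wilcoxon signed-rank comparison, and a Cohen's kappa for rater agreement, all on invented numbers; it is a rough Python approximation, not the study's actual procedure or data (which are in the attached data sheet).

    # Hedged sketch with hypothetical paired quality scores (0-100) per case;
    # the study compared resident physicians and ChatGPT-4o on identical cases.
    import numpy as np
    from scipy import stats
    from sklearn.metrics import cohen_kappa_score

    margin = 5.0  # pre-specified non-inferiority margin from the abstract

    resident = np.array([90, 88, 91, 87, 92, 89, 90, 88])  # hypothetical
    gpt4o    = np.array([89, 87, 90, 88, 90, 88, 89, 87])  # hypothetical

    # Non-inferiority: is ChatGPT-4o worse than residents by more than the margin?
    # H0: mean(resident - gpt4o) >= margin  vs  H1: mean(resident - gpt4o) < margin,
    # implemented by shifting the paired differences and running a one-sided t-test.
    diff = resident - gpt4o
    t, p_one_sided = stats.ttest_1samp(diff - margin, 0.0, alternative="less")
    print(f"non-inferiority t={t:.2f}, one-sided p={p_one_sided:.4f}")

    # Non-parametric check on the paired quality scores.
    w, p_w = stats.wilcoxon(resident, gpt4o)
    print(f"Wilcoxon signed-rank: statistic={w:.1f}, p={p_w:.3f}")

    # Agreement between the two attending-physician evaluators (hypothetical labels).
    rater1 = [9, 8, 9, 7, 8, 9, 8, 8]
    rater2 = [9, 8, 8, 7, 8, 9, 8, 9]
    print("Cohen's kappa:", cohen_kappa_score(rater1, rater2))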

  3. Replication Data for: Context Matters: Understanding Student Usage, Skills,...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Sep 24, 2024
    Cite
    Cahill, Christine; McCabe, Katherine (2024). Replication Data for: Context Matters: Understanding Student Usage, Skills, and Attitudes Toward AI to Inform Classroom Policies [Dataset]. http://doi.org/10.7910/DVN/1QA5PC
    Explore at:
    Dataset updated
    Sep 24, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Cahill, Christine; McCabe, Katherine
    Description

    With the growing prevalence of AI tools such as ChatGPT, political science instructors are navigating how to manage the use and misuse of AI in the classroom. This study underscores the prevalence of AI in academic settings and suggests pedagogical practices for integrating AI in the classroom in ways informed by students' current interactions with and attitudes toward AI. Using a survey of undergraduate students in political science courses, the study finds that ChatGPT usage is widespread at the university level but that students are not confident in their skills for using AI appropriately to help improve their writing or prepare for exams. These findings point to key areas where instructors can intervene and integrate AI in ways that enhance student learning, reduce potential achievement gaps that may emerge from differences in AI usage across student backgrounds, and help students develop critical AI literacy skills to prepare for careers that are increasingly affected by AI.

  4. Most popular AI apps for information search India 2024-2025

    • statista.com
    Updated Jul 18, 2025
    Cite
    Tanushree Basuroy (2025). Most popular AI apps for information search India 2024-2025 [Dataset]. https://www.statista.com/topics/2157/internet-usage-in-india/
    Explore at:
    Dataset updated
    Jul 18, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Tanushree Basuroy
    Area covered
    India
    Description

    According to a survey on AI usage in India, ChatGPT was the most popular AI platform used to search for information as of February 2025. Perplexity was the next most popular among Indian internet users, accounting for nine percent of responses. However, a significant share of internet users in India remained wary of AI platforms and chose to look up information using Google and other search engines.

