https://spdx.org/licenses/CC0-1.0.html
Objective: Our objective is to evaluate the efficacy of ChatGPT 4 in accurately and effectively delivering genetic information, building on previous findings with ChatGPT 3.5. We focus on assessing the utility, limitations, and ethical implications of using ChatGPT in medical settings.

Materials and Methods: A structured questionnaire, including items from the Brief User Survey (BUS-15) and custom questions, was developed to assess the clinical value of ChatGPT 4's responses. An expert panel of genetic counselors and clinical geneticists independently evaluated ChatGPT 4's responses to these questions. We also conducted a comparative analysis with ChatGPT 3.5, using descriptive statistics computed in R.

Results: ChatGPT 4 demonstrated improvements over 3.5 in context recognition, relevance, and informativeness. However, performance variability and concerns about the naturalness of the output were noted. No significant difference in accuracy was found between ChatGPT 3.5 and 4.0. Notably, the efficacy of ChatGPT 4 varied significantly across genetic conditions, with specific differences identified between responses related to BRCA1 and HFE.

Discussion and Conclusion: This study highlights ChatGPT 4's potential in genomics, noting significant advancements over its predecessor. Despite these improvements, challenges remain, including the risk of outdated information and the need for ongoing refinement. The variability in performance across genetic conditions underscores the need for expert oversight and continuous AI training. While ChatGPT 4 shows promise, its use calls for balancing technological innovation with ethical responsibility in healthcare information delivery.

Methods

Study Design: This study was conducted to evaluate the performance of ChatGPT 4 (March 23, 2023 model) in the context of genetic counseling and education. The evaluation used a structured questionnaire that included questions selected from the Brief User Survey (BUS-15) and additional custom questions designed to assess the clinical value of ChatGPT 4's responses.

Questionnaire Development: The questionnaire was built in Qualtrics and comprised twelve questions: seven selected from the BUS-15, preceded by two additional questions that we designed. The initial questions focused on quality and answer relevancy:

1. The overall quality of the chatbot's response is: (5-point Likert: Very poor to Very good)
2. The chatbot delivered an answer that provided the relevant information you would include if asked the question. (5-point Likert: Strongly disagree to Strongly agree)

The BUS-15 questions (7-point Likert: Strongly disagree to Strongly agree) focused on:

1. Recognition and facilitation of users' goal and intent: The chatbot seems able to recognize the user's intent and guide the user toward their goals.
2. Relevance of information: The chatbot provides relevant and appropriate information/answers at each stage to bring people closer to their goal.
3. Maxim of quantity: The chatbot responds in an informative way without adding too much information.
4. Resilience to failure: The chatbot seems able to find ways to respond appropriately even when it encounters situations or arguments it is not equipped to handle.
5. Understandability and politeness: The chatbot seems able to understand input and convey correct statements and answers without ambiguity and with acceptable manners.
6. Perceived conversational credibility: The chatbot responds in a credible and informative way without adding too much information.
7. Meeting neurodiverse needs: The chatbot seems able to meet users' needs and be usable independently of their health conditions, well-being, age, etc.

Expert Panel and Data Collection: A panel of experts (two genetic counselors and two clinical geneticists) was provided with a link to the survey containing the questions. They evaluated the ChatGPT 4 responses independently, without discussing the questions or answers among themselves until after submitting the survey, which ensured unbiased evaluation.
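The methods note that the comparative analysis with ChatGPT 3.5 relied on descriptive statistics computed in R. The sketch below illustrates what such an analysis could look like for a single BUS-15 item; the data frame layout, column names, placeholder scores, and the choice of a Wilcoxon rank-sum test are illustrative assumptions, not the study's actual code or data.

```r
# Minimal sketch: summarizing expert Likert ratings and comparing
# ChatGPT 3.5 vs. 4 on one BUS-15 item. Column names, scores, and the
# test choice are illustrative assumptions, not the study's own script.

# ratings: one row per expert evaluation (4 experts per model)
ratings <- data.frame(
  model     = rep(c("GPT-3.5", "GPT-4"), each = 4),
  relevance = c(4, 5, 3, 4,   6, 5, 6, 5)   # 7-point Likert scores
)

# Descriptive statistics per model
aggregate(relevance ~ model, data = ratings,
          FUN = function(x) c(mean = mean(x), sd = sd(x), median = median(x)))

# Non-parametric comparison of the two models' ratings
wilcox.test(relevance ~ model, data = ratings, exact = FALSE)
```

A non-parametric test is a common default for ordinal Likert ratings from a small expert panel, though the study's actual statistical choices may differ.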
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: As medical technology advances, physicians' responsibilities in clinical practice continue to increase, with medical history documentation becoming an essential component. Artificial Intelligence (AI) technologies, particularly advances in Natural Language Processing (NLP), have introduced new possibilities for medical documentation. This study aims to evaluate the efficiency and quality of medical history documentation by ChatGPT-4o compared to resident physicians and explore the potential applications of AI in clinical documentation.

Methods: Using a non-inferiority design, this study compared documentation time and quality scores between 5 resident physicians from the hematology department (with an average of 2.4 years of clinical experience) and ChatGPT-4o based on identical case materials. Medical history quality was evaluated by two attending physicians with over 10 years of clinical experience using ten case content criteria. Data were analyzed using paired t-tests and Wilcoxon signed-rank tests, with Kappa coefficients used to assess scoring consistency. Detailed scoring criteria included completeness (coverage of history elements), accuracy (correctness of information), logic (organization and coherence of content), and professionalism (appropriate use of medical terminology and format), each rated on a 10-point scale.

Results: In terms of medical history quality, ChatGPT-4o achieved an average score of 88.9, while resident physicians scored 89.6, with no statistically significant difference between the two (p = 0.25). The Kappa coefficient between the two evaluators was 0.82, indicating good consistency in scoring. Non-inferiority testing showed that ChatGPT-4o's quality scores fell within the preset non-inferiority margin (5 points), indicating that its documentation quality was not inferior to that of resident physicians. ChatGPT-4o's average documentation time was 40.1 s, significantly shorter than the resident physicians' average of 14.9 min (p < 0.001).

Conclusion: While maintaining quality comparable to resident physicians, ChatGPT-4o significantly reduced the time required for medical history documentation. Despite these positive results, practical considerations such as data preprocessing, data security, and privacy protection must be addressed in real-world applications. Future research should further explore ChatGPT-4o's capabilities in handling complex cases and its applicability across different clinical settings.
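The methods above describe paired t-tests, Wilcoxon signed-rank tests, a preset non-inferiority margin of 5 points, and a Kappa coefficient for scoring consistency. The R sketch below shows one way such analyses could be run; the score vectors, the binary per-criterion judgments used for the kappa calculation, and the shifted paired t-test formulation of the non-inferiority check are illustrative assumptions, not the authors' actual analysis code or data.

```r
# Minimal sketch, assuming placeholder data: non-inferiority check
# (margin = 5 points), paired comparisons, and Cohen's kappa.

resident <- c(91, 88, 90, 92, 87)   # quality scores, resident documentation
chatgpt  <- c(89, 87, 90, 90, 88)   # quality scores, ChatGPT-4o documentation

# Non-inferiority: one-sided shifted paired t-test of
# H0: mean(chatgpt - resident) <= -5  vs  H1: mean difference > -5
t.test(chatgpt, resident, paired = TRUE, mu = -5, alternative = "greater")

# Conventional paired comparisons, as described in the methods
t.test(chatgpt, resident, paired = TRUE)
wilcox.test(chatgpt, resident, paired = TRUE, exact = FALSE)

# Cohen's kappa between the two attending-physician raters
# (binary per-criterion judgments are an illustrative simplification)
rater1 <- factor(c("pass", "pass", "fail", "pass", "pass", "fail"),
                 levels = c("pass", "fail"))
rater2 <- factor(c("pass", "pass", "fail", "pass", "fail", "fail"),
                 levels = c("pass", "fail"))
tab <- table(rater1, rater2)
po  <- sum(diag(tab)) / sum(tab)                      # observed agreement
pe  <- sum(rowSums(tab) * colSums(tab)) / sum(tab)^2  # chance agreement
kappa_val <- (po - pe) / (1 - pe)
kappa_val
```

The shifted paired t-test is a standard way to frame a non-inferiority hypothesis around a fixed margin; the study itself may have used a confidence-interval formulation instead.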
With the growing prevalence of AI tools such as ChatGPT, political science instructors are navigating how to manage the use and misuse of AI in the classroom. This study underscores the prevalence of AI in academic settings and suggests pedagogical practices for integrating AI in the classroom in ways informed by students' current interactions with and attitudes toward AI. Using a survey of undergraduate students in political science courses, the study finds that ChatGPT usage is widespread at the university level but that students are not confident in their ability to use AI appropriately to improve their writing or prepare for exams. These findings point to key areas where instructors can intervene and integrate AI in ways that enhance student learning, reduce potential achievement gaps that may emerge from differences in AI usage across student backgrounds, and help students develop critical AI literacy skills to prepare for careers that are increasingly affected by AI.
As indicated by a survey on AI usage in India, ChatGPT was the most popular AI platform used to search for information as of February 2025. Perplexity was the next most popular among Indian internet users, cited in nine percent of responses. However, a significant share of internet users in India remained wary of AI platforms and chose to look up information using Google and other search engines.