100+ datasets found
  1. Replication Data for: ChatGPT on ChatGPT: An Exploratory Analysis of its...

    • dataverse.harvard.edu
    Updated May 31, 2024
    Cite
    Jieshu Wang; Elif Kiran; Aurora Mai (also known as Mai P. Trinh); Michael Simeone; José Lobo (2024). Replication Data for: ChatGPT on ChatGPT: An Exploratory Analysis of its Performance in the Public Sector Workforce [Dataset]. http://doi.org/10.7910/DVN/P3CDHS
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 31, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Jieshu Wang; Elif Kiran; Aurora Mai (also known as Mai P. Trinh); Michael Simeone; José Lobo
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This repository contains two datasets used in the study exploring the impact of Generative AI, specifically ChatGPT, on the public sector workforce in the United States. The datasets provide detailed information on the core tasks of public sector occupations and their estimated performance metrics, including potential for automation and augmentation by ChatGPT. These estimates were generated by OpenAI’s GPT-4 model (GPT-4-1106-preview) through the OpenAI API.

  2. Data from: ChatGPT in education: A discourse analysis of worries and...

    • socialmediaarchive.org
    csv, json, txt
    Updated Sep 26, 2023
    Cite
    (2023). ChatGPT in education: A discourse analysis of worries and concerns on social media [Dataset]. https://socialmediaarchive.org/record/54
    Explore at:
    Available download formats: csv (6528597), json (248465998), txt (4908229)
    Dataset updated
    Sep 26, 2023
    Description

    The rapid advancements in generative AI models present new opportunities in the education sector. However, it is imperative to acknowledge and address the potential risks and concerns that may arise with their use. We collected Twitter data to identify key concerns related to the use of ChatGPT in education. This dataset is used to support the study "ChatGPT in education: A discourse analysis of worries and concerns on social media."

    In this study, we particularly explored two research questions. RQ1 (Concerns): What are the key concerns that Twitter users perceive with using ChatGPT in education? RQ2 (Accounts): Which accounts are implicated in the discussion of these concerns? In summary, our study underscores the importance of responsible and ethical use of AI in education and highlights the need for collaboration among stakeholders to regulate AI policy.

  3. ChatGPT Usage Survey Data

    • webfx.com
    Updated Sep 2, 2025
    Cite
    WebFX (2025). ChatGPT Usage Survey Data [Dataset]. https://www.webfx.com/blog/ai/chatgpt-usage-statistics/
    Dataset updated
    Sep 2, 2025
    Dataset authored and provided by
    WebFX
    Variables measured
    Average words in first message, Average words per ChatGPT conversation, Average number of messages per conversation, Percentage of conversations that are commands, Percentage of conversations that start as questions, Percentage of conversations in the "learning & understanding" category, Percentage of conversations using advanced features (persona assignment / data upload)
    Description

    Analysis of 13,252 publicly shared ChatGPT conversations by WebFX to uncover usage statistics: prompt length, message count, question vs. command distribution, and use-case categories.

  4. Producing Charts with AI - Data Analysis

    • tomtunguz.com
    Updated Jul 17, 2023
    Cite
    Tomasz Tunguz (2023). Producing Charts with AI - Data Analysis [Dataset]. https://tomtunguz.com/data-analysis-gpt/
    Dataset updated
    Jul 17, 2023
    Dataset provided by
    Theory Ventures
    Authors
    Tomasz Tunguz
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Discover how AI code interpreters are revolutionizing data visualization, reducing chart creation time from 20 to 5 minutes while simplifying complex statistical analysis.

  5. How are Chat GPT and AI used in medical diagnosis

    • dataone.org
    Updated Nov 8, 2023
    Cite
    Maher Asaad Baker (2023). How are Chat GPT and AI used in medical diagnosis [Dataset]. http://doi.org/10.7910/DVN/2HMJ58
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Maher Asaad Baker
    Description

    Chat GPT and AI have the potential to revolutionize the way we interact with computers, specifically in the field of medical diagnostics. Chat GPT can make conversations between doctors and patients more natural, while AI can analyze vast amounts of patient data to identify trends and estimate a patient’s health. Patients can use Chat GPT to better understand their medical conditions, and both Chat GPT and AI can be used to automate tasks such as scheduling appointments and processing test results. However, there are limitations to using AI, including data bias, complex results, and analysis errors. To reduce errors, it is important to validate findings using various techniques and ensure that data is accurate and up-to-date. Chat GPT also employs security measures to protect patient data privacy and confidentiality.

  6. ChatGPT Reddit

    • kaggle.com
    zip
    Updated Jan 29, 2023
    Cite
    Armita Razavi (2023). ChatGPT Reddit [Dataset]. https://www.kaggle.com/datasets/armitaraz/chatgpt-reddit/data
    Explore at:
    Available download formats: zip (5282154 bytes)
    Dataset updated
    Jan 29, 2023
    Authors
    Armita Razavi
    License

    https://www.reddit.com/wiki/api

    Description

    This dataset contains about 50K Reddit comments regarding ChatGPT. The comments were gathered from Reddit posts in 4 subreddits.

    The data includes comment_id, comment_parent_id, comment_body and subreddit:

    • comment_id: the comment's id
    • comment_parent_id: the id of the comment that the current comment replies to
    • comment_body: the text of the comment
    • subreddit: the community/subreddit name of the comment
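
    As a rough sketch of how the four columns above fit together, the snippet below groups replies under their parent ids. It assumes the data is a CSV with a header row using exactly those column names; the sample rows, ids, and subreddit values are hypothetical and only stand in for the real file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real ids, text, and subreddits will differ.
SAMPLE_CSV = """comment_id,comment_parent_id,comment_body,subreddit
c1,p0,ChatGPT wrote my cover letter,ChatGPT
c2,c1,Did it get you the job?,ChatGPT
c3,c1,Same here,ChatGPT
c4,p9,It still gets math wrong,technology
"""

def load_comments(handle):
    """Read rows into dicts keyed by the four documented column names."""
    return list(csv.DictReader(handle))

def replies_by_parent(comments):
    """Map each parent id to the ids of the comments that reply to it."""
    children = defaultdict(list)
    for row in comments:
        children[row["comment_parent_id"]].append(row["comment_id"])
    return dict(children)

comments = load_comments(io.StringIO(SAMPLE_CSV))
threads = replies_by_parent(comments)
print(threads["c1"])  # ids of the comments that reply to c1
```

    To try this on the actual download, replace the in-memory sample with an open file handle; since the dataset is described as uncleaned, missing or malformed rows may need handling first.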

    The date and other information related to the comments will be added in the next version. This dataset is useful for gaining insight into the public's view of ChatGPT, and for text analysis, text visualization, inline question answering, text summarization, NER, and other tasks such as clustering.

    Please note that this dataset is not cleaned or preprocessed so if you want to get your hands dirty with data, it's a good practice to level up your skills in data cleaning too :)

    And please don't forget to UPVOTE it in case you find it useful and enjoy it.

  7. Table 1_Generative Artificial Intelligence for Data Analysis: A Randomised...

    • frontiersin.figshare.com
    docx
    Updated Oct 1, 2025
    Cite
    Tafadzwa Dhokotera; Nandi Joubert; Aline Veillat; Christoph Pimmer; Karin Gross; Marco Waser; Jan Hattendorf; Julia Bohlius (2025). Table 1_Generative Artificial Intelligence for Data Analysis: A Randomised Controlled Trial in a Public Health Research Institute.docx [Dataset]. http://doi.org/10.3389/ijph.2025.1608572.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    Oct 1, 2025
    Dataset provided by
    Frontiers Media http://www.frontiersin.org/
    Authors
    Tafadzwa Dhokotera; Nandi Joubert; Aline Veillat; Christoph Pimmer; Karin Gross; Marco Waser; Jan Hattendorf; Julia Bohlius
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Objective: To assess the competence of students and academic staff to use generative artificial intelligence (GenAI) as a tool in epidemiological data analyses in a randomised controlled trial (RCT).

    Methods: We invited postgraduate students and academic staff at the Swiss Tropical and Public Health Institute to the RCT. Participants were randomized to analyse a simulated cross-sectional dataset using ChatGPT’s code interpreter (integrated analysis arm) vs. a statistical software (R/Stata) with ChatGPT as a support tool (distributed analysis arm). The primary outcome was the trial task score (out of 17, using an assessment rubric). Secondary outcome was the time to complete the task.

    Results: We invited 338 and randomized 31 participants equally to the two study arms and 30 participants submitted results. Overall, there was no statistically significant difference in mean task scores between the distributed analysis arm (8.5, ±4.6) and the integrated analysis arm (9.4, ±3.8), with a mean difference of 0.93 (p = 0.55). Mean task completion time was significantly shorter in the integrated analysis arm compared to the distributed analysis arm.

    Conclusion: While ChatGPT offers advantages, its effective use requires a careful balance of GenAI capabilities and human expertise.

  8. ChatGPT Discussion Trends

    • tickertrends.io
    html
    Updated Oct 11, 2025
    Cite
    TickerTrends (2025). ChatGPT Discussion Trends [Dataset]. https://tickertrends.io/chatgpt-trends
    Explore at:
    Available download formats: html
    Dataset updated
    Oct 11, 2025
    Dataset authored and provided by
    TickerTrends
    License

    https://tickertrends.io/terms

    Time period covered
    Nov 2022 - Present
    Area covered
    Global
    Variables measured
    Keyword Volume, Topic Mentions, Trend Momentum
    Description

    Monthly dataset tracking topic frequency, keyword volume, and conversation patterns across ChatGPT discussions. Data is normalized on a 0 to 100 scale for easy comparison. Aggregates millions of AI interactions to reveal emerging trends, user interests, and discussion momentum across technology, finance, health, education, and business categories.

  9. 500k ChatGPT-related Tweets Jan-Mar 2023

    • kaggle.com
    zip
    Updated Apr 11, 2023
    Cite
    Khalid Ansari (2023). 500k ChatGPT-related Tweets Jan-Mar 2023 [Dataset]. https://www.kaggle.com/datasets/khalidryder777/500k-chatgpt-tweets-jan-mar-2023/code
    Explore at:
    Available download formats: zip (49816658 bytes)
    Dataset updated
    Apr 11, 2023
    Authors
    Khalid Ansari
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains a CSV file of tweets related to ChatGPT, OpenAI's conversational AI model, including keywords (chatgpt, chat gpt), #hashtags, and @mentions about ChatGPT. The file includes information on 500,000 tweets. The dataset aims to help understand public opinion, trends, and potential applications of ChatGPT by analyzing tweet volume, sentiment, user engagement, and the influence of key AI events. The dataset offers valuable insights for companies, researchers, and policymakers, allowing them to make informed decisions and shape the future of AI-powered conversational technologies.

    Check out my Comprehensive Analysis on this dataset: Medium article "Cracking the ChatGPT Code: A Deep Dive into 500,000 Tweets using Advanced NLP Techniques"

    Learn about the collection process in Medium article "Effortlessly Scraping Massive Twitter Data"

  10. Collected Data of Evaluating ChatGPT for Detecting Security Vulnerabilities...

    • data-staging.niaid.nih.gov
    Updated Jan 31, 2025
    Cite
    Alqaradaghi, Midya (2025). Collected Data of Evaluating ChatGPT for Detecting Security Vulnerabilities in Java Code [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_14505161
    Dataset updated
    Jan 31, 2025
    Authors
    Alqaradaghi, Midya
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains links to all the experiments that I ran for my article titled Using "LLM for finding security vulnerabilities."

  11. Data from: Academic Discourse on ChatGPT in Social Sciences: A Topic...

    • figshare.com
    zip
    Updated Jul 23, 2025
    Cite
    Qian Shen (2025). Academic Discourse on ChatGPT in Social Sciences: A Topic Modeling and Sentiment Analysis of Research Article Abstracts [Dataset]. http://doi.org/10.6084/m9.figshare.29625773.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 23, 2025
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Qian Shen
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the dataset and code used in the study titled “Academic Discourse on ChatGPT in Social Sciences: A Topic Modeling and Sentiment Analysis of Research Article Abstracts.” The study explores how social science scholars frame and evaluate ChatGPT by analyzing 1,227 SSCI-indexed abstracts using Latent Dirichlet Allocation (LDA) topic modeling and lexicon-based sentiment analysis. The data include the collected abstracts (with metadata), while the code files provide the full analytical pipeline in Python and R, covering preprocessing, topic modeling, sentiment scoring using the NRC Emotion Lexicon, and visualization scripts. This repository supports transparency, reproducibility, and reuse of the study’s computational methods and underlying materials.

  12. Data associated with the article: "Exploring the Viability of ChatGPT for...

    • data.4tu.nl
    zip
    Cite
    Nina van Staalduine, Data associated with the article: "Exploring the Viability of ChatGPT for Personal Data Anonymization in Government: A Comprehensive Analysis of Possibilities, Risks, and Ethical Implications" [Dataset]. http://doi.org/10.4121/a1dfacbe-b463-404f-a3d7-dab8485e6458.v1
    Explore at:
    Available download formats: zip
    Dataset provided by
    4TU.ResearchData
    Authors
    Nina van Staalduine
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Time period covered
    Feb 2023 - Jul 2023
    Dataset funded by
    Justitiële Informatiedienst
    Description

    Artificial Intelligence (AI) applications are expected to promote government service delivery and quality, more efficient handling of cases, and bias reduction in decision-making. One potential benefit of the AI tool ChatGPT is that it may support governments in the anonymization of data. However, it is not clear whether ChatGPT is appropriate to support data anonymization for public organizations. Hence, this study examines the possibilities, risks, and ethical implications for government organizations to employ ChatGPT in the anonymization of personal data. We use a case study approach, combining informal conversations, formal interviews, a literature review, document analysis and experiments to conduct a three-step study. First, we describe the technology behind ChatGPT and its operation. Second, experiments with three types of data (fake data, original literature and modified literature) show that ChatGPT exhibits strong performance in anonymizing these three types of texts. Third, an overview of significant risks and ethical issues related to ChatGPT and its use for anonymization within a specific government organization was generated, including themes such as privacy, responsibility, transparency, bias, human intervention, and sustainability. One significant risk in the current form of ChatGPT is a privacy risk, as inputs are stored and forwarded to OpenAI and potentially other parties. This is unacceptable if texts containing personal data are anonymized with ChatGPT. We discuss several potential solutions to address these risks and ethical issues. This study contributes to the scarce scientific literature on the potential value of employing ChatGPT for personal data anonymization in government. In addition, this study has practical value for civil servants who face the challenges of data anonymization in practice including resource-intensive and costly processes.

  13. Data from: The impact of using ChatGPT on academic writing among medical...

    • datasetcatalog.nlm.nih.gov
    • tandf.figshare.com
    Updated Nov 18, 2024
    Cite
    Wang, Jingyu; Shu, Jiankun; Liao, Yuxuan; Wang, Rui; Zhang, Decai; Wang, Na; Liu, Shaojun (2024). The impact of using ChatGPT on academic writing among medical undergraduates [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001479573
    Dataset updated
    Nov 18, 2024
    Authors
    Wang, Jingyu; Shu, Jiankun; Liao, Yuxuan; Wang, Rui; Zhang, Decai; Wang, Na; Liu, Shaojun
    Description

    ChatGPT is widely used for writing tasks, yet its effects on medical students’ academic writing remain underexplored. This study aims to elucidate ChatGPT’s impact on academic writing efficiency and quality among medical students, while also evaluating students’ attitudes towards its use in academic writing. We collected systematic reviews from 130 third-year medical students and administered a questionnaire to assess ChatGPT usage and student attitudes. Three independent reviewers graded the papers using EASE guidelines, and statistical analysis compared articles generated with or without ChatGPT assistance across various parameters, with rigorous quality control ensuring survey reliability and validity. In this study, 33 students (25.8%) utilized ChatGPT for writing (ChatGPT group) and 95 (74.2%) did not (Control group). The ChatGPT group exhibited significantly higher daily technology use and prior experience with ChatGPT (p < 0.05). Writing time was significantly reduced in the ChatGPT group (p = 0.04), with 69.7% completing tasks within 2–3 days compared to 48.4% in the control group. They also achieved higher article quality scores (p < 0.0001) with improvements in completeness, credibility, and scientific content. Self-assessment indicated enhanced writing skills (p < 0.01), confidence (p < 0.001), satisfaction (p < 0.001) and a positive attitude toward its future use in the ChatGPT group. Integrating ChatGPT in medical academic writing, with proper guidance, improves efficiency and quality, illustrating artificial intelligence’s potential in shaping medical education methodologies.

  14. Data Sheet 2_Large language models generating synthetic clinical datasets: a...

    • frontiersin.figshare.com
    • figshare.com
    xlsx
    Updated Feb 5, 2025
    Cite
    Austin A. Barr; Joshua Quan; Eddie Guo; Emre Sezgin (2025). Data Sheet 2_Large language models generating synthetic clinical datasets: a feasibility and comparative analysis with real-world perioperative data.xlsx [Dataset]. http://doi.org/10.3389/frai.2025.1533508.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Feb 5, 2025
    Dataset provided by
    Frontiers
    Authors
    Austin A. Barr; Joshua Quan; Eddie Guo; Emre Sezgin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Clinical data is instrumental to medical research, machine learning (ML) model development, and advancing surgical care, but access is often constrained by privacy regulations and missing data. Synthetic data offers a promising solution to preserve privacy while enabling broader data access. Recent advances in large language models (LLMs) provide an opportunity to generate synthetic data with reduced reliance on domain expertise, computational resources, and pre-training.

    Objective: This study aims to assess the feasibility of generating realistic tabular clinical data with OpenAI’s GPT-4o using zero-shot prompting, and evaluate the fidelity of LLM-generated data by comparing its statistical properties to the Vital Signs DataBase (VitalDB), a real-world open-source perioperative dataset.

    Methods: In Phase 1, GPT-4o was prompted to generate a dataset with qualitative descriptions of 13 clinical parameters. The resultant data was assessed for general errors, plausibility of outputs, and cross-verification of related parameters. In Phase 2, GPT-4o was prompted to generate a dataset using descriptive statistics of the VitalDB dataset. Fidelity was assessed using two-sample t-tests, two-sample proportion tests, and 95% confidence interval (CI) overlap.

    Results: In Phase 1, GPT-4o generated a complete and structured dataset comprising 6,166 case files. The dataset was plausible in range and correctly calculated body mass index for all case files based on respective heights and weights. Statistical comparison between the LLM-generated datasets and VitalDB revealed that Phase 2 data achieved significant fidelity. Phase 2 data demonstrated statistical similarity in 12/13 (92.31%) parameters, whereby no statistically significant differences were observed in 6/6 (100.0%) categorical/binary and 6/7 (85.71%) continuous parameters. Overlap of 95% CIs were observed in 6/7 (85.71%) continuous parameters.

    Conclusion: Zero-shot prompting with GPT-4o can generate realistic tabular synthetic datasets, which can replicate key statistical properties of real-world perioperative data. This study highlights the potential of LLMs as a novel and accessible modality for synthetic data generation, which may address critical barriers in clinical data access and eliminate the need for technical expertise, extensive computational resources, and pre-training. Further research is warranted to enhance fidelity and investigate the use of LLMs to amplify and augment datasets, preserve multivariate relationships, and train robust ML models.

  15. 89k ChatGPT conversations

    • kaggle.com
    zip
    Updated May 4, 2023
    Cite
    Noah Persaud (2023). 89k ChatGPT conversations [Dataset]. https://www.kaggle.com/datasets/noahpersaud/89k-chatgpt-conversations
    Explore at:
    Available download formats: zip (681600031 bytes)
    Dataset updated
    May 4, 2023
    Authors
    Noah Persaud
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains all available conversations from chatlogs.net between users and ChatGPT. Version 1 contains all conversations available up to the cutoff date of April 4, 2023. Version 2 contains all conversations available up to the cutoff date of April 20, 2023.

  16. S1 Data -

    • plos.figshare.com
    xlsx
    Updated Nov 20, 2024
    Cite
    Jun Qiu; Youlian Zhou (2024). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0311937.s003
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Nov 20, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Jun Qiu; Youlian Zhou
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: ChatGPT, developed by OpenAI, is an artificial intelligence software designed to generate text-based responses. The objective of this study is to evaluate the accuracy and consistency of ChatGPT’s responses to single-choice questions pertaining to carbon monoxide poisoning. This evaluation will contribute to our understanding of the reliability of ChatGPT-generated information in the medical field.

    Methods: The questions utilized in this study were selected from the "Medical Exam Assistant (Yi Kao Bang)" application and encompassed a range of topics related to carbon monoxide poisoning. A total of 44 single-choice questions were included in the study following a screening process. Each question was entered into ChatGPT ten times in Chinese, followed by a translation into English, where it was also entered ten times. The responses generated by ChatGPT were subjected to statistical analysis with the objective of assessing their accuracy and consistency in both languages. In this assessment process, the "Medical Exam Assistant (Yi Kao Bang)" reference responses were employed as benchmarks. The data analysis was conducted using Python.

    Results: In approximately 50% of the cases, the responses generated by ChatGPT exhibited a high degree of consistency, whereas in approximately one-third of the cases, the responses exhibited unacceptable blurring of the answers. Meanwhile, the accuracy of these responses was less favorable, with an accuracy rate of 61.1% in Chinese and 57% in English. This indicates that ChatGPT could be enhanced with respect to both consistency and accuracy in responding to queries pertaining to carbon monoxide poisoning.

    Conclusions: It is currently evident that the consistency and accuracy of responses generated by ChatGPT regarding carbon monoxide poisoning are inadequate. Although it offers significant insights, it should not supersede the role of healthcare professionals in making clinical decisions.

  17. Data from: ChatGPT as an education and learning tool for engineering,...

    • data.mendeley.com
    Updated May 14, 2024
    Cite
    RAVINDRA BHARDWAJ (2024). ChatGPT as an education and learning tool for engineering, technology and general studies: performance analysis of ChatGPT 3.0 on CSE, GATE and JEE examinations of India [Dataset]. http://doi.org/10.17632/995zwcz5yt.1
    Dataset updated
    May 14, 2024
    Authors
    RAVINDRA BHARDWAJ
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    This is the raw data that is used in the publication: ChatGPT as an education and learning tool for engineering, technology and general studies: performance analysis of ChatGPT 3.0 on CSE, GATE and JEE examinations of India.

  18. Data from: Dataset from the study "Analysis of the accuracy of scientific...

    • data.niaid.nih.gov
    • producciocientifica.uv.es
    Updated Mar 31, 2023
    Cite
    Sixto-Costoya, Andrea; Liu, Yiming; Vidal-Cabo, Christian; Aleixandre-Benavent, Rafael; Valderrama-Zurián, Juan Carlos (2023). Dataset from the study "Analysis of the accuracy of scientific literature references provided by ChatGPT" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7788299
    Dataset updated
    Mar 31, 2023
    Dataset provided by
    Universidad Católica de Valencia San Vicente Màrtir
    Instituto de Gestión de la Innovación y del Conocimiento – Ingenio (CSIC-Universitat Politécnica de València)
    Universitat de València
    Authors
    Sixto-Costoya, Andrea; Liu, Yiming; Vidal-Cabo, Christian; Aleixandre-Benavent, Rafael; Valderrama-Zurián, Juan Carlos
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset corresponds to a study analysing 10 bibliographic references for each of 10 Spanish authors in the field of Information Sciences, as requested from the ChatGPT chatbot.

    The file "Bibliographic_references_ analysis" contains the 10 references returned by ChatGPT for each of the 10 authors (a total of 100 references), together with the variables analysed to check their authenticity.

    The "Keywords_analysis" file contains the normalisation carried out on the words considered to be key words extracted from the titles of the works, according to which a word cloud showing the frequency of occurrence could be drawn up.

  19. ChatGPT examples in the hydrological sciences

    • hydroshare.org
    • beta.hydroshare.org
    • +1 more
    zip
    Updated Oct 9, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dylan Irvine (2023). ChatGPT examples in the hydrological sciences [Dataset]. http://doi.org/10.4211/hs.fc0552275ea14c7082218c42ebd63da6
    Explore at:
    Available download formats: zip (1.3 MB)
    Dataset updated
    Oct 9, 2023
    Dataset provided by
    HydroShare
    Authors
    Dylan Irvine
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    WGS 84 EPSG:4326,
    Description

    ChatGPT has forever changed the way that many industries operate. Much of the focus on Artificial Intelligence (AI) tools has been on their ability to generate text. However, it is likely that their ability to generate computer code and scripts will also have a major impact. We demonstrate the use of ChatGPT to generate Python scripts to perform hydrological analyses and highlight the opportunities, limitations and risks that AI poses in the hydrological sciences.

    Here, we provide four worked examples of the use of ChatGPT to generate scripts to conduct hydrological analyses. We also provide a full list of the libraries available to the ChatGPT Advanced Data Analysis plugin (only available in the paid version). These files relate to a manuscript that is to be submitted to Hydrological Processes. The authors of the manuscript are Dylan J. Irvine, Landon J.S. Halloran and Philip Brunner.

    If you find these examples useful and/or use them, we would appreciate it if you could cite the associated publication in Hydrological Processes. Details will be made available upon final publication.

  20. Supplementary data for the paper 'Can ChatGPT be used to predict citation...

    • data.4tu.nl
    zip
    Updated Jan 5, 2024
    Cite
    Joost de Winter (2024). Supplementary data for the paper 'Can ChatGPT be used to predict citation counts, readership, and social media interaction? An exploration among 2222 scientific abstracts' [Dataset]. http://doi.org/10.4121/710585da-ed2e-4d36-b8e4-ad02c3af1e65.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 5, 2024
    Dataset provided by
    4TU.ResearchData
    Authors
    Joost de Winter
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This study explores the potential of ChatGPT, a large language model, in scientometrics by assessing its ability to predict citation counts, Mendeley readers, and social media engagement. In this study, 2222 abstracts from PLOS ONE articles published during the initial months of 2022 were analyzed using ChatGPT-4, which used a set of 60 criteria to assess each abstract. Using a principal component analysis, three components were identified: Quality and Reliability, Accessibility and Understandability, and Novelty and Engagement. The Accessibility and Understandability of the abstracts correlated with higher Mendeley readership, while Novelty and Engagement and Accessibility and Understandability were linked to citation counts (Dimensions, Scopus, Google Scholar) and social media attention. Quality and Reliability showed minimal correlation with citation and altmetrics outcomes. Finally, it was found that the predictive correlations of ChatGPT-based assessments surpassed traditional readability metrics. The findings highlight the potential of large language models in scientometrics and possibly pave the way for AI-assisted peer review.
