9 datasets found
  1. Data from: Analyzing student prompts and their effect on ChatGPT’s performance

    • tandf.figshare.com
    txt
    Updated Dec 12, 2024
    Cite
    Ghadeer Sawalha; Imran Taj; Abdulhadi Shoufan (2024). Analyzing student prompts and their effect on ChatGPT’s performance [Dataset]. http://doi.org/10.6084/m9.figshare.26970708.v1
    Available download formats: txt
    Dataset updated
    Dec 12, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Ghadeer Sawalha; Imran Taj; Abdulhadi Shoufan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Large language models present new opportunities for teaching and learning. The response accuracy of these models, however, is believed to depend on the prompt quality which can be a challenge for students. In this study, we aimed to explore how undergraduate students use ChatGPT for problem-solving, what prompting strategies they develop, the link between these strategies and the model’s response accuracy, the existence of individual prompting tendencies, and the impact of gender in this context. Our students used ChatGPT to solve five problems related to embedded systems and provided the solutions and the conversations with this model. We analyzed the conversations thematically to identify prompting strategies and applied different quantitative analyses to establish relationships between these strategies and the response accuracy and other factors. The findings indicate that students predominantly employ three types of prompting strategies: single copy-and-paste prompting (SCP), single reformulated prompting (SRP), and multiple-question prompting (MQP). ChatGPT’s response accuracy using SRP and MQP was significantly higher than using SCP, with effect sizes of -0.94 and -0.69, respectively. The student-by-student analysis revealed some tendencies. For example, 26 percent of the students consistently copied and pasted the questions into ChatGPT without any modification. Students who used MQP showed better performance in the final exam than those who did not use this prompting strategy. As for gender, female students tended to make extensive use of SCP, whereas male students tended to mix SCP and MQP. We conclude that students develop different prompting strategies that lead to different response qualities and learning. More research is needed to deepen our understanding and inform effective educational practices in the AI era.
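
    As a rough illustration of the kind of quantitative comparison reported above, the following minimal Python sketch computes a Cohen's d effect size between the response-accuracy scores of two prompting strategies. The file and column names ("student_prompts.csv", "strategy", "response_accuracy") are assumptions for illustration, not the dataset's actual schema.

      # Minimal sketch: Cohen's d between response accuracy under two prompting
      # strategies (e.g., SCP vs. SRP). File and column names are hypothetical.
      import numpy as np
      import pandas as pd

      df = pd.read_csv("student_prompts.csv")  # hypothetical file name

      scp = df.loc[df["strategy"] == "SCP", "response_accuracy"].to_numpy()
      srp = df.loc[df["strategy"] == "SRP", "response_accuracy"].to_numpy()

      def cohens_d(a, b):
          """Cohen's d with a pooled standard deviation."""
          na, nb = len(a), len(b)
          pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
          return (a.mean() - b.mean()) / np.sqrt(pooled_var)

      print(f"Cohen's d (SCP vs. SRP): {cohens_d(scp, srp):.2f}")  # negative => SRP scored higher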

  2. Supplemental for ChatGPT User Study - GenAI in Learning Software Engineering

    • explore.openaire.eu
    Updated Nov 3, 2023
    Cite
    Rudrajit Choudhuri; Dylan Liu; Igor Steinmacher; Marco Gerosa; Anita Sarma (2023). Supplemental for ChatGPT User Study - GenAI in Learning Software Engineering [Dataset]. http://doi.org/10.5281/zenodo.8193821
    Dataset updated
    Nov 3, 2023
    Authors
    Rudrajit Choudhuri; Dylan Liu; Igor Steinmacher; Marco Gerosa; Anita Sarma
    Description

    Supplemental Material Contents:
    · 1-Demographic Information.xlsx: contains the demographic information of the participants in the study.
    · 2-Forms.zip: contains the forms and questionnaires used to collect data for the experiment: demographic form, pre-study, post-study, and AAR/AI questionnaires.
    · 3-GitHub-Repository.zip: a copy of the GitHub repository used in the study.
    · 4-Tutorial Scripts.zip: script used in the experiment with the groups, to be consistent with all participants.
    · 5-Logs-Rubric-Grades.zip: contains the participant data log (commit and PR), the rubric for grading submissions, and grades.
    · 6-RQ1-Data-and-Analysis.zip: contains the data and analysis with respect to RQ1.
    · 7-RQ2-Data-and-Analysis.zip: contains the data and analysis with respect to RQ2.
    · 8-Participant Prompts.xlsx: contains the experimental group participants' prompts with ChatGPT.

    2-Forms.zip contains the following files:
    · Demographics.pdf: a form used to collect demographic information from participants before the study.
    · Control Pre-Study Questionnaire.pdf: pre-study questionnaire for the control group (Self-Efficacy Questionnaire).
    · Control Post-Study Questionnaire.pdf: post-study questionnaire for the control group (NASA-TLX, Self-Efficacy Questionnaire).
    · Treatment - AAR_AI task.pdf: pre- and post-task AAR/AI questionnaire for the experimental group.
    · Experimental Pre-Study Questionnaire.pdf: pre-study questionnaire for the experimental group (Self-Efficacy Questionnaire, question on familiarity with AI).
    · Experimental-Post Study Questionnaire.pdf: post-study questionnaire for the experimental group (AAR/AI step 7, Continuance Intention, NASA-TLX, HAI Guideline Questions, Self-Efficacy Questionnaire).

    3-GitHub-Repository.zip: the GitHub repository used in the study; contains the main.py code file and the Readme.md file (with the written instructions for the participants).

    4-Tutorial Scripts.zip contains:
    · Control-Script.pdf: script for the control group.
    · Experimental-Script.pdf: script for the experimental group.

    5-Logs-Rubric-Grades.zip contains:
    · rubric.pdf: rubric created for grading task performance.
    · GitHub-Task3-Log.xlsx: data on the status of the commit made and PR raised by each participant.
    · grades.xlsx: detailed grades for each participant in the experimental (treatment) and control groups.

    6-RQ1-Data-and-Analysis.zip (note: the term 'treatment' is used in the files of this folder to denote the experimental group, i.e., participants using ChatGPT for the tasks):
    · NASA TLX: folder containing the participant data (TLX.xlsx), code for statistical analysis (Stat-TLX.py), and statistical reports (analysis-TLX.csv).
    · Task Performance: folder containing the participant data (grades.xlsx & Scores.xlsx (overall grade)), code for statistical analysis (Stat-Correctness.py), and statistical reports (analysis.csv).
    · Self-Efficacy: folder containing:
      o Self-Efficacy-detailed.xlsx: participant data.
      o Paired Stats: folder containing data (Total Self Efficacy.csv), code for statistical analysis (paired-stats.py), and statistical reports (analysis.csv).
      o Box plot: folder containing the code for generating the box plot and its output.
    · Continuance Intention.xlsx: participant data (experimental) for continuance intention of ChatGPT.
    · Stat-Table-H1-2-Paper.xlsx: statistics table for NASA TLX and task performance as presented in the paper.

    7-RQ2-Data-and-Analysis.zip contains:
    · AAR_AI-Responses.xlsx: AAR/AI responses filled in by participants in the experimental group.
    · Quotation Manager-Faults&Conseq.xlsx: quotations from the AAR/AI responses along with the corresponding codes; also contains the quotes that link faults to consequences in a separate sheet.
    · Codebook.xlsx: the final codebook (faults and consequences).
    · HAI-data.xlsx: the reported guideline violations along with a disaggregated analysis (grouped by gender).
    · Likert Plot-HAI: folder containing the code for generating the Likert plot figure presented in the paper.
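
    As a rough sketch only (not the archive's actual analysis code), the snippet below shows how the grades in 5-Logs-Rubric-Grades.zip could be compared between the treatment and control groups with a non-parametric test; the column names "group" and "grade" are assumptions about the spreadsheet layout.

      # Minimal sketch: compare task-performance grades between groups.
      # Column names are hypothetical; inspect grades.xlsx for the real schema.
      import pandas as pd
      from scipy.stats import mannwhitneyu

      grades = pd.read_excel("grades.xlsx")
      treatment = grades.loc[grades["group"] == "treatment", "grade"]
      control = grades.loc[grades["group"] == "control", "grade"]

      stat, p = mannwhitneyu(treatment, control, alternative="two-sided")
      print(f"Mann-Whitney U = {stat:.1f}, p = {p:.3f}")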

  3. Data_Sheet_2_Performance analysis of large language models in the domain of legal argument mining.pdf

    • frontiersin.figshare.com
    pdf
    Updated Nov 17, 2023
    + more versions
    Cite
    Abdullah Al Zubaer; Michael Granitzer; Jelena Mitrović (2023). Data_Sheet_2_Performance analysis of large language models in the domain of legal argument mining.pdf [Dataset]. http://doi.org/10.3389/frai.2023.1278796.s002
    Available download formats: pdf
    Dataset updated
    Nov 17, 2023
    Dataset provided by
    Frontiers
    Authors
    Abdullah Al Zubaer; Michael Granitzer; Jelena Mitrović
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Generative pre-trained transformers (GPT) have recently demonstrated excellent performance in various natural language tasks. The development of ChatGPT and the recently released GPT-4 model has shown competence in solving complex and higher-order reasoning tasks without further training or fine-tuning. However, the applicability and strength of these models in classifying legal texts in the context of argument mining are yet to be realized and have not been tested thoroughly. In this study, we investigate the effectiveness of GPT-like models, specifically GPT-3.5 and GPT-4, for argument mining via prompting. We closely study the model's performance considering diverse prompt formulation and example selection in the prompt via semantic search using state-of-the-art embedding models from OpenAI and sentence transformers. We primarily concentrate on the argument component classification task on the legal corpus from the European Court of Human Rights. To address these models' inherent non-deterministic nature and make our result statistically sound, we conducted 5-fold cross-validation on the test set. Our experiments demonstrate, quite surprisingly, that relatively small domain-specific models outperform GPT 3.5 and GPT-4 in the F1-score for premise and conclusion classes, with 1.9% and 12% improvements, respectively. We hypothesize that the performance drop indirectly reflects the complexity of the structure in the dataset, which we verify through prompt and data analysis. Nevertheless, our results demonstrate a noteworthy variation in the performance of GPT models based on prompt formulation. We observe comparable performance between the two embedding models, with a slight improvement in the local model's ability for prompt selection. This suggests that local models are as semantically rich as the embeddings from the OpenAI model. Our results indicate that the structure of prompts significantly impacts the performance of GPT models and should be considered when designing them.
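
    The example selection described above (retrieving semantically similar labeled clauses to include in the prompt) can be sketched with the sentence-transformers library; the model name, example texts, and prompt wording below are illustrative assumptions, not the paper's exact setup.

      # Minimal sketch: pick the most similar labeled clauses as few-shot examples
      # for an argument component classification prompt.
      from sentence_transformers import SentenceTransformer, util

      model = SentenceTransformer("all-MiniLM-L6-v2")  # a local embedding model

      train_clauses = [
          "The applicant was not informed of the hearing date.",          # placeholder texts
          "Therefore, there has been a violation of Article 6.",
          "The Government argued that domestic remedies were not exhausted.",
      ]
      train_labels = ["premise", "conclusion", "premise"]
      test_clause = "Accordingly, the Court finds a violation of Article 8."

      train_emb = model.encode(train_clauses, convert_to_tensor=True)
      test_emb = model.encode(test_clause, convert_to_tensor=True)
      hits = util.semantic_search(test_emb, train_emb, top_k=2)[0]

      prompt = "Classify each clause as premise or conclusion.\n"
      for h in hits:
          i = h["corpus_id"]
          prompt += f"Clause: {train_clauses[i]}\nLabel: {train_labels[i]}\n"
      prompt += f"Clause: {test_clause}\nLabel:"
      print(prompt)  # this prompt would then be sent to GPT-3.5 or GPT-4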

  4. Sentiment Analysis Dataset

    • kaggle.com
    Updated May 27, 2024
    Cite
    Samarth Kuchya (2024). Sentiment Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/samarthkumarkuchya/sentiment-analysis-dataset/code
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 27, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Samarth Kuchya
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset was created using prompt engineering over ChatGPT and has the following labels: 0 - negative, 1 - neutral, 2 - positive.
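
    A minimal loading sketch, assuming a CSV layout with a text column and an integer "label" column (the actual file and column names on Kaggle may differ):

      # Minimal sketch: load the dataset and map integer labels to sentiment names.
      import pandas as pd

      label_names = {0: "negative", 1: "neutral", 2: "positive"}

      df = pd.read_csv("sentiment_analysis_dataset.csv")  # hypothetical file name
      df["sentiment"] = df["label"].map(label_names)       # 'label' column assumed
      print(df["sentiment"].value_counts())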

  5. Model Output of GPT-3.5 and GPT-4 for ECHR-AM

    • explore.openaire.eu
    Updated Aug 16, 2023
    Cite
    The citation is currently not available for this dataset.
    Dataset updated
    Aug 16, 2023
    Authors
    Abdullah Al Zubaer; Michael Granitzer; Jelena Mitrović
    Description

    "gpt3.5-gpt4-input-output-echram.zip": input and output of GPT-3.5 and GPT-4 on the ECHR dataset, published here in JSON format, for argument component classification only, i.e., clauses that are argumentative (conclusion/premise), extracted from the JSON file. Note: the output of the model is subject to the OpenAI Terms & policies. Please also cite our paper if you use this dataset: Performance analysis of large language models in the domain of legal argument mining. The BibTeX entry is given below.
    @ARTICLE{10.3389/frai.2023.1278796,
      AUTHOR={Al Zubaer, Abdullah and Granitzer, Michael and Mitrović, Jelena},
      TITLE={Performance analysis of large language models in the domain of legal argument mining},
      JOURNAL={Frontiers in Artificial Intelligence},
      VOLUME={6},
      YEAR={2023},
      URL={https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2023.1278796},
      DOI={10.3389/frai.2023.1278796},
      ISSN={2624-8212},
      ABSTRACT={Generative pre-trained transformers (GPT) have recently demonstrated excellent performance in various natural language tasks. The development of ChatGPT and the recently released GPT-4 model has shown competence in solving complex and higher-order reasoning tasks without further training or fine-tuning. However, the applicability and strength of these models in classifying legal texts in the context of argument mining are yet to be realized and have not been tested thoroughly. In this study, we investigate the effectiveness of GPT-like models, specifically GPT-3.5 and GPT-4, for argument mining via prompting. We closely study the model's performance considering diverse prompt formulation and example selection in the prompt via semantic search using state-of-the-art embedding models from OpenAI and sentence transformers. We primarily concentrate on the argument component classification task on the legal corpus from the European Court of Human Rights. To address these models' inherent non-deterministic nature and make our result statistically sound, we conducted 5-fold cross-validation on the test set. Our experiments demonstrate, quite surprisingly, that relatively small domain-specific models outperform GPT 3.5 and GPT-4 in the F1-score for premise and conclusion classes, with 1.9% and 12% improvements, respectively. We hypothesize that the performance drop indirectly reflects the complexity of the structure in the dataset, which we verify through prompt and data analysis. Nevertheless, our results demonstrate a noteworthy variation in the performance of GPT models based on prompt formulation. We observe comparable performance between the two embedding models, with a slight improvement in the local model's ability for prompt selection. This suggests that local models are as semantically rich as the embeddings from the OpenAI model. Our results indicate that the structure of prompts significantly impacts the performance of GPT models and should be considered when designing them.}}

  6. Data and Code for: Generative AI for Economic Research: Use Cases and Implications for Economists

    • openicpsr.org
    delimited
    Updated Oct 21, 2023
    Cite
    Anton Korinek (2023). Data and Code for: Generative AI for Economic Research: Use Cases and Implications for Economists [Dataset]. http://doi.org/10.3886/E194623V1
    Available download formats: delimited
    Dataset updated
    Oct 21, 2023
    Dataset provided by
    American Economic Association (http://www.aeaweb.org/)
    Authors
    Anton Korinek
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Generative AI, in particular large language models (LLMs) such as ChatGPT, has the potential to revolutionize research. I describe dozens of use cases along six domains in which LLMs are starting to become useful as both research assistants and tutors: ideation and feedback, writing, background research, data analysis, coding, and mathematical derivations. I provide general instructions and demonstrate specific examples of how to take advantage of each of these, classifying the LLM capabilities from experimental to highly useful. I argue that economists can reap significant productivity gains by taking advantage of generative AI to automate micro tasks. Moreover, these gains will grow as the performance of AI systems across all of these domains continues to improve. I also speculate on the longer-term implications of AI-powered cognitive automation for economic research. The resources provided here contain the prompts and code to reproduce the chats with GPT-3.5, GPT-4, ChatGPT, and Claude 2 that are listed in the paper.

  7. AI-Driven Mental Health Literacy - An Interventional Study from India (Data from main study).csv

    • psycharchives.org
    Updated Oct 2, 2023
    + more versions
    Cite
    (2023). AI-Driven Mental Health Literacy - An Interventional Study from India (Data from main study).csv [Dataset]. https://psycharchives.org/handle/20.500.12034/8771
    Dataset updated
    Oct 2, 2023
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    The dataset is from an Indian study that used ChatGPT, a natural language processing model by OpenAI, to design a mental health literacy intervention for college students. Prompt engineering tactics were used to formulate prompts that acted as anchors in the conversations with the AI agent regarding mental health. An intervention lasting 20 days was designed, with sessions of 15-20 minutes on alternate days. Fifty-one students completed pre-test and post-test measures of mental health literacy, mental help-seeking attitude, stigma, mental health self-efficacy, positive and negative experiences, and flourishing in the main study, which were then analyzed using paired t-tests. The results suggest that the intervention is effective among college students, as statistically significant changes were noted in mental health literacy and mental health self-efficacy scores. The study affirms the practicality, acceptance, and initial indications of AI-driven methods in advancing mental health literacy and suggests the promising prospects of innovative platforms such as ChatGPT within the field of applied positive psychology. The file contains the data used in the analysis for the intervention study.
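
    A minimal sketch of the pre/post comparison described above, using a paired t-test; the file and column names are assumptions, not the published CSV's actual headers.

      # Minimal sketch: paired t-test on pre- vs. post-intervention mental health
      # literacy scores. File and column names are hypothetical.
      import pandas as pd
      from scipy.stats import ttest_rel

      data = pd.read_csv("main_study_data.csv")
      t, p = ttest_rel(data["mhl_pre"], data["mhl_post"])
      print(f"paired t = {t:.2f}, p = {p:.4f}")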

  8. AI Image Generator Market Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Jan 3, 2025
    Cite
    Market Research Forecast (2025). AI Image Generator Market Report [Dataset]. https://www.marketresearchforecast.com/reports/ai-image-generator-market-5135
    Available download formats: pdf, ppt, doc
    Dataset updated
    Jan 3, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The AI Image Generator Market was valued at USD 356.1 million in 2023 and is projected to reach USD 1,094.58 million by 2032, exhibiting a CAGR of 17.4% during the forecast period. An AI image generator is a software application that generates image data by means of artificial intelligence, using models such as deep learning and neural networks, including GANs (Generative Adversarial Networks), VAEs (Variational Autoencoders), and diffusion models. Essential characteristics include crystal-clear rendering of the resulting image, conversion of a source image to another style, and image enhancement. These generators are used for art creation, design, virtual fitting, and in-game design; they make visualization and image modification quick and inexpensive, based on given parameters or styles, and are changing the creative landscape of various industries by improving efficiency and creativity. Recent developments include:
    · September 2023 - OpenAI, a company specializing in the generative AI industry, introduced DALL-E 3, the latest version of its image generator. This upgrade, powered by the ChatGPT controller, produces high-quality images based on natural-language prompts and incorporates ethical safeguards.
    · May 2023 - Stability AI introduced StableStudio, an open-source version of its DreamStudio AI application, specializing in converting text into images. This open-source release enabled developers and creators to access and utilize the technology, creating a wide range of applications for text-to-image generation.
    · April 2023 - VanceAI launched an AI text-to-image generator called VanceAI Art Generator, powered by Stable Diffusion. This tool can interpret text descriptions and generate corresponding artworks; users can combine image types, styles, and artists, and adjust sizes to transform their creative ideas into visual art.
    · March 2023 - Adobe unveiled Adobe Firefly, a generative AI tool in beta, catering to users without graphic design skills and helping them create images and text effects. This announcement coincided with Microsoft’s launch of Copilot, offering automatic content generation for 365 and Dynamics 365 users. These advancements in generative AI provided valuable support and opportunities for individuals facing challenges related to writing, design, or organization.
    · March 2023 - Runway AI introduced Gen-2, a set of AI models capable of producing short video clips from text prompts. An advancement over its predecessor Gen-1, Gen-2 generates higher-quality clips and provides users with increased customization options.
    Key drivers for this market are: Growing Adoption of Augmented Reality (AR) and Virtual Reality (VR) to Fuel the Market Growth. Potential restraints include: Concerns related to Data Privacy and Creation of Malicious Content to Hamper the Market. Notable trends are: Growing Implementation of Touch-based and Voice-based Infotainment Systems to Increase Adoption of Intelligent Cars.

  9. Evaluation Criteria for Reproducibility.

    • plos.figshare.com
    xls
    Updated Jun 30, 2025
    + more versions
    Cite
    Yasuko Fukataki; Wakako Hayashi; Naoki Nishimoto; Yoichi M. Ito (2025). Evaluation Criteria for Reproducibility. [Dataset]. http://doi.org/10.1371/journal.pdig.0000695.t008
    Available download formats: xls
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    PLOS Digital Health
    Authors
    Yasuko Fukataki; Wakako Hayashi; Naoki Nishimoto; Yoichi M. Ito
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This pilot study is the first phase of a broader project aimed at developing an explainable artificial intelligence (AI) tool to support the ethical evaluation of Japanese-language clinical research documents. The tool is explicitly not intended to assist document drafting. We assessed the baseline performance of generative AI—Generative Pre-trained Transformer (GPT)-4 and GPT-4o—in analyzing clinical research protocols and informed consent forms (ICFs). The goal was to determine whether these models could accurately and consistently extract ethically relevant information, including the research objectives and background, research design, and participant-related risks and benefits. First, we compared the performance of GPT-4 and GPT-4o using custom agents developed via OpenAI’s Custom GPT functionality (hereafter “GPTs”). Then, using GPT-4o alone, we compared outputs generated by GPTs optimized with customized Japanese prompts to those generated by standard prompts. GPT-4o achieved 80% agreement in extracting research objectives and background and 100% in extracting research design, while both models demonstrated high reproducibility across ten trials. GPTs with customized prompts produced more accurate and consistent outputs than standard prompts. This study suggests the potential utility of generative AI in pre-institutional review board (IRB) review tasks; it also provides foundational data for future validation and standardization efforts involving retrieval-augmented generation and fine-tuning. Importantly, this tool is intended not to automate ethical review but rather to support IRB decision-making. Limitations include the absence of gold standard reference data, reliance on a single evaluator, lack of convergence and inter-rater reliability analysis, and the inability of AI to substitute for in-person elements such as site visits.
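
    A minimal sketch of the kind of reproducibility check described above: repeat the same extraction ten times and report the share of runs matching a reference answer. The run_extraction function is a stand-in for a GPT-4/GPT-4o call and is not part of the published materials.

      # Minimal sketch: agreement rate over repeated extraction trials.
      def run_extraction(document: str) -> str:
          """Placeholder for a generative-AI call that extracts, e.g., the research objective."""
          return "To assess baseline GPT performance on ethics-relevant extraction."

      def agreement_rate(outputs, reference):
          """Fraction of trials whose output exactly matches the reference."""
          return sum(o.strip() == reference.strip() for o in outputs) / len(outputs)

      reference = "To assess baseline GPT performance on ethics-relevant extraction."
      outputs = [run_extraction("...protocol text...") for _ in range(10)]
      print(f"Agreement over 10 trials: {agreement_rate(outputs, reference):.0%}")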
