24 datasets found

f
Data_Sheet_2_Performance analysis of large language models in the domain of...
frontiersin.figshare.com
pdf
Updated Nov 17, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Abdullah Al Zubaer; Michael Granitzer; Jelena Mitrović (2023). Data_Sheet_2_Performance analysis of large language models in the domain of legal argument mining.pdf [Dataset]. http://doi.org/10.3389/frai.2023.1278796.s002
Explore at:
pdfAvailable download formats
Unique identifier
https://doi.org/10.3389/frai.2023.1278796.s002
Dataset updated
Nov 17, 2023
Dataset provided by
Frontiers
Authors
Abdullah Al Zubaer; Michael Granitzer; Jelena Mitrović
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Generative pre-trained transformers (GPT) have recently demonstrated excellent performance in various natural language tasks. The development of ChatGPT and the recently released GPT-4 model has shown competence in solving complex and higher-order reasoning tasks without further training or fine-tuning. However, the applicability and strength of these models in classifying legal texts in the context of argument mining are yet to be realized and have not been tested thoroughly. In this study, we investigate the effectiveness of GPT-like models, specifically GPT-3.5 and GPT-4, for argument mining via prompting. We closely study the model's performance considering diverse prompt formulation and example selection in the prompt via semantic search using state-of-the-art embedding models from OpenAI and sentence transformers. We primarily concentrate on the argument component classification task on the legal corpus from the European Court of Human Rights. To address these models' inherent non-deterministic nature and make our result statistically sound, we conducted 5-fold cross-validation on the test set. Our experiments demonstrate, quite surprisingly, that relatively small domain-specific models outperform GPT 3.5 and GPT-4 in the F1-score for premise and conclusion classes, with 1.9% and 12% improvements, respectively. We hypothesize that the performance drop indirectly reflects the complexity of the structure in the dataset, which we verify through prompt and data analysis. Nevertheless, our results demonstrate a noteworthy variation in the performance of GPT models based on prompt formulation. We observe comparable performance between the two embedding models, with a slight improvement in the local model's ability for prompt selection. This suggests that local models are as semantically rich as the embeddings from the OpenAI model. Our results indicate that the structure of prompts significantly impacts the performance of GPT models and should be considered when designing them.
ChatGPT Prompts on FAIR Digital Objects
zenodo.org
pdf
Updated May 26, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nicolas Blumenröhr; Nicolas Blumenröhr (2025). ChatGPT Prompts on FAIR Digital Objects [Dataset]. http://doi.org/10.5281/zenodo.15056647
Explore at:
pdfAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.15056647
Dataset updated
May 26, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Nicolas Blumenröhr; Nicolas Blumenröhr
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Mar 20, 2025
Description
This repository contains two examples of prompting ChatGPT to resolve, analyze and evaluate a FAIR Digital Object (FDO) information record via the Handle Regsitry, considering data from digital humanities and energy research.
f
Data Sheet 2_Large language models generating synthetic clinical datasets: a...
frontiersin.figshare.com
figshare.com
xlsx
Updated Feb 5, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Austin A. Barr; Joshua Quan; Eddie Guo; Emre Sezgin (2025). Data Sheet 2_Large language models generating synthetic clinical datasets: a feasibility and comparative analysis with real-world perioperative data.xlsx [Dataset]. http://doi.org/10.3389/frai.2025.1533508.s002
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.3389/frai.2025.1533508.s002
Dataset updated
Feb 5, 2025
Dataset provided by
Frontiers
Authors
Austin A. Barr; Joshua Quan; Eddie Guo; Emre Sezgin
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
BackgroundClinical data is instrumental to medical research, machine learning (ML) model development, and advancing surgical care, but access is often constrained by privacy regulations and missing data. Synthetic data offers a promising solution to preserve privacy while enabling broader data access. Recent advances in large language models (LLMs) provide an opportunity to generate synthetic data with reduced reliance on domain expertise, computational resources, and pre-training.ObjectiveThis study aims to assess the feasibility of generating realistic tabular clinical data with OpenAI’s GPT-4o using zero-shot prompting, and evaluate the fidelity of LLM-generated data by comparing its statistical properties to the Vital Signs DataBase (VitalDB), a real-world open-source perioperative dataset.MethodsIn Phase 1, GPT-4o was prompted to generate a dataset with qualitative descriptions of 13 clinical parameters. The resultant data was assessed for general errors, plausibility of outputs, and cross-verification of related parameters. In Phase 2, GPT-4o was prompted to generate a dataset using descriptive statistics of the VitalDB dataset. Fidelity was assessed using two-sample t-tests, two-sample proportion tests, and 95% confidence interval (CI) overlap.ResultsIn Phase 1, GPT-4o generated a complete and structured dataset comprising 6,166 case files. The dataset was plausible in range and correctly calculated body mass index for all case files based on respective heights and weights. Statistical comparison between the LLM-generated datasets and VitalDB revealed that Phase 2 data achieved significant fidelity. Phase 2 data demonstrated statistical similarity in 12/13 (92.31%) parameters, whereby no statistically significant differences were observed in 6/6 (100.0%) categorical/binary and 6/7 (85.71%) continuous parameters. Overlap of 95% CIs were observed in 6/7 (85.71%) continuous parameters.ConclusionZero-shot prompting with GPT-4o can generate realistic tabular synthetic datasets, which can replicate key statistical properties of real-world perioperative data. This study highlights the potential of LLMs as a novel and accessible modality for synthetic data generation, which may address critical barriers in clinical data access and eliminate the need for technical expertise, extensive computational resources, and pre-training. Further research is warranted to enhance fidelity and investigate the use of LLMs to amplify and augment datasets, preserve multivariate relationships, and train robust ML models.
d
A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to...
search.dataone.org
data.niaid.nih.gov
+1more
Updated Aug 1, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Scott McGrath (2025). A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to selected genetics questions - Full study data [Dataset]. http://doi.org/10.5061/dryad.s4mw6m9cv
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.s4mw6m9cv
Dataset updated
Aug 1, 2025
Dataset provided by
Dryad Digital Repository
Authors
Scott McGrath
Time period covered
Jan 1, 2023
Description
Objective: Our objective is to evaluate the efficacy of ChatGPT 4 in accurately and effectively delivering genetic information, building on previous findings with ChatGPT 3.5. We focus on assessing the utility, limitations, and ethical implications of using ChatGPT in medical settings. Materials and Methods: A structured questionnaire, including the Brief User Survey (BUS-15) and custom questions, was developed to assess ChatGPT 4's clinical value. An expert panel of genetic counselors and clinical geneticists independently evaluated ChatGPT 4's responses to these questions. We also involved comparative analysis with ChatGPT 3.5, utilizing descriptive statistics and using R for data analysis. Results: ChatGPT 4 demonstrated improvements over 3.5 in context recognition, relevance, and informativeness. However, performance variability and concerns about the naturalness of the output were noted. No significant difference in accuracy was found between ChatGPT 3.5 and 4.0. Notably, the effic..., Study Design This study was conducted to evaluate the performance of ChatGPT 4 (March 23rd, 2023) Â Model) in the context of genetic counseling and education. The evaluation involved a structured questionnaire, which included questions selected from the Brief User Survey (BUS-15) and additional custom questions designed to assess the clinical value of ChatGPT 4's responses. Questionnaire Development The questionnaire was built on Qualtrics, which comprised twelve questions: seven selected from the BUS-15 preceded by two additional questions that we designed. The initial questions focused on quality and answer relevancy: 1.Â Â Â Â The overall quality of the Chatbotâ€™s response is: (5-point Likert: Very poor to Very Good) 2.Â Â Â Â The Chatbot delivered an answer that provided the relevant information you would include if asked the question. (5-point Likert: Strongly disagree to Strongly agree) The BUS-15 questions (7-point Likert: Strongly disagree to Strongly agree) focused on: 1.Â Â Â Â Recogniti..., , # A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to selected genetics questions - Full study data

https://doi.org/10.5061/dryad.s4mw6m9cv

This data was captured when evaluating the ability of ChatGPT to address questions patients may ask it about three genetic conditions (BRCA1, HFE, and MLH1). This data is associated with the JAMIA article of the similar name with the DOIÂ 10.1093/jamia/ocae128

Description of the data and file structure

Key: This tab contains the data structure, explaining the survey questions, and potential responses available.

Prompt Responses: This tab contains the prompts used for ChatGPT, and the response provided from each model (3.5 and 4)

GPT 4 Results: This tab provides the responses collected from the medical experts (genetic counselors and clinical geneticist) from the Qualtrics survey.

Accuracy (Qx_1): This tab contains the subset of results from both the Ch...
W
ChatGPT Usage Survey Data
webfx.com
Updated Sep 2, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
WebFX (2025). ChatGPT Usage Survey Data [Dataset]. https://www.webfx.com/blog/ai/chatgpt-usage-statistics/
Explore at:
Dataset updated
Sep 2, 2025
Dataset authored and provided by
WebFX
Variables measured
Average words in first message, Average words per ChatGPT conversation, Average number of messages per conversation, Percentage of conversations that are commands, Percentage of conversations that start as questions, Percentage of conversations in the "learning & understanding" category, Percentage of conversations using advanced features (persona assignment / data upload)
Description
Analysis of 13,252 publicly shared ChatGPT conversations by WebFX to uncover usage statistics - prompt length, message count, question vs command distribution, use-case categories.
f
Data Sheet 1_A multidimensional comparison of ChatGPT, Google Translate, and...
frontiersin.figshare.com
xlsx
Updated Jul 24, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Shiyue Chen; Yan Lin (2025). Data Sheet 1_A multidimensional comparison of ChatGPT, Google Translate, and DeepL in Chinese tourism texts translation: fidelity, fluency, cultural sensitivity, and persuasiveness.xlsx [Dataset]. http://doi.org/10.3389/frai.2025.1619489.s001
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.3389/frai.2025.1619489.s001
Dataset updated
Jul 24, 2025
Dataset provided by
Frontiers
Authors
Shiyue Chen; Yan Lin
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This study systematically compares the translation performance of ChatGPT, Google Translate, and DeepL on Chinese tourism texts, focusing on two prompt-engineering strategies. Using a mixed-methods approach that combines quantitative expert assessments with qualitative analysis, the evaluation centers on fidelity, fluency, cultural sensitivity, and persuasiveness. ChatGPT outperformed its counterparts across all metrics, especially when culturally tailored prompts were used. However, it occasionally introduced semantic shifts, highlighting a trade-off between accuracy and rhetorical adaptation. Despite its strong performance, human post-editing remains necessary to ensure semantic precision and professional standards. The study demonstrates ChatGPT’s potential in domain-specific translation tasks while calling for continued oversight in culturally nuanced content.
T
Text Analytics Market Report
marketreportanalytics.com
doc, pdf, ppt
Updated Jun 20, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Market Report Analytics (2025). Text Analytics Market Report [Dataset]. https://www.marketreportanalytics.com/reports/text-analytics-market-89598
Explore at:
doc, ppt, pdfAvailable download formats
Dataset updated
Jun 20, 2025
Dataset authored and provided by
Market Report Analytics
License
https://www.marketreportanalytics.com/privacy-policyhttps://www.marketreportanalytics.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The text analytics market is experiencing robust growth, projected to reach $10.49 billion in 2025 and exhibiting a remarkable Compound Annual Growth Rate (CAGR) of 39.90% from 2019 to 2033. This expansion is fueled by several key drivers. The increasing volume of unstructured data generated across various industries, including healthcare, finance, and customer service, necessitates sophisticated tools for extracting actionable insights. Furthermore, advancements in natural language processing (NLP), machine learning (ML), and artificial intelligence (AI) are empowering text analytics solutions with enhanced capabilities, such as sentiment analysis, topic modeling, and entity recognition. The rising adoption of cloud-based solutions also contributes to market growth, offering scalability, cost-effectiveness, and ease of access. Major industry players like IBM, Microsoft, and SAP are actively investing in research and development, driving innovation and expanding the market's capabilities. Competitive pressures are fostering a continuous improvement in the accuracy and efficiency of text analytics tools, making them increasingly attractive to businesses of all sizes. The growing demand for real-time insights and improved customer experience further propels market expansion. While the market enjoys significant growth momentum, certain challenges persist. Data security and privacy concerns remain paramount, necessitating robust security measures within text analytics platforms. The complexity of implementing and integrating these solutions into existing IT infrastructures can also pose a barrier to adoption, particularly for smaller businesses lacking dedicated data science teams. Furthermore, the accuracy and reliability of text analytics outputs can be affected by the quality and consistency of the input data. Overcoming these challenges through improved data governance, user-friendly interfaces, and robust customer support will be crucial for continued market expansion. Despite these restraints, the overall market outlook remains positive, driven by the continuous evolution of technology and the growing reliance on data-driven decision-making across diverse sectors. Recent developments include: January 2023- Microsoft announced a new multibillion-dollar investment in ChatGPT maker Open AI. ChatGPT, automatically generates text based on written prompts in a more creative and advanced than the chatbots. Through this investment, the company will accelerate breakthroughs in AI, and both companies will commercialize advanced technologies., November 2022 - Tntra and Invenio have partnered to develop a platform that offers comprehensive data analysis on a firm. Throughout the process, Tntra offered complete engineering support and cooperation to Invenio. Tantra offers feeds, knowledge graphs, intelligent text extraction, and analytics, which enables Invenio to give information on seven parts of the business, such as false news identification, subject categorization, dynamic data extraction, article summaries, sentiment analysis, and keyword extraction.. Key drivers for this market are: Growing Demand for Social Media Analytics, Rising Practice of Predictive Analytics. Potential restraints include: Growing Demand for Social Media Analytics, Rising Practice of Predictive Analytics. Notable trends are: Retail and E-commerce to Hold a Significant Share in Text Analytics Market.
Data from: DevGPT: Studying Developer-ChatGPT Conversations
zenodo.org
zip
Updated Sep 14, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Tao Xiao; Tao Xiao; Christoph Treude; Christoph Treude; Hideaki Hata; Hideaki Hata; Kenichi Matsumoto; Kenichi Matsumoto (2023). DevGPT: Studying Developer-ChatGPT Conversations [Dataset]. http://doi.org/10.5281/zenodo.8304091
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.8304091
Dataset updated
Sep 14, 2023
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Tao Xiao; Tao Xiao; Christoph Treude; Christoph Treude; Hideaki Hata; Hideaki Hata; Kenichi Matsumoto; Kenichi Matsumoto
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
DevGPT is a curated dataset which encompasses 17,913 prompts and ChatGPT's responses including 11,751 code snippets, coupled with the corresponding software development artifacts—ranging from source code, commits, issues, pull requests, to discussions and Hacker News threads—to enable the analysis of the context and implications of these developer interactions with ChatGPT.
databricks dolly 15k
kaggle.com
zip
Updated Apr 12, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
databricks (2023). databricks dolly 15k [Dataset]. https://www.kaggle.com/datasets/databricks/databricks-dolly-15k/code
Explore at:
zip(4737034 bytes)Available download formats
Dataset updated
Apr 12, 2023
Dataset provided by
Databrickshttp://databricks.com/
Authors
databricks
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Description
Summary

databricks-dolly-15k is an open source dataset of instruction-following records generated by thousands of Databricks employees in several of the behavioral categories outlined in the InstructGPT paper, including brainstorming, classification, closed QA, generation, information extraction, open QA, and summarization.

This dataset can be used for any purpose, whether academic or commercial, under the terms of the Creative Commons Attribution-ShareAlike 3.0 Unported License.

Supported Tasks: - Training LLMs - Synthetic Data Generation - Data Augmentation

Languages: English Version: 1.0

Owner: Databricks, Inc.

Dataset Overview

databricks-dolly-15k is a corpus of more than 15,000 records generated by thousands of Databricks employees to enable large language models to exhibit the magical interactivity of ChatGPT. Databricks employees were invited to create prompt / response pairs in each of eight different instruction categories, including the seven outlined in the InstructGPT paper, as well as an open-ended free-form category. The contributors were instructed to avoid using information from any source on the web with the exception of Wikipedia (for particular subsets of instruction categories), and explicitly instructed to avoid using generative AI in formulating instructions or responses. Examples of each behavior were provided to motivate the types of questions and instructions appropriate to each category.

Halfway through the data generation process, contributors were given the option of answering questions posed by other contributors. They were asked to rephrase the original question and only select questions they could be reasonably expected to answer correctly.

For certain categories contributors were asked to provide reference texts copied from Wikipedia. Reference text (indicated by the context field in the actual dataset) may contain bracketed Wikipedia citation numbers (e.g. [42]) which we recommend users remove for downstream applications.

Intended Uses

While immediately valuable for instruction fine tuning large language models, as a corpus of human-generated instruction prompts, this dataset also presents a valuable opportunity for synthetic data generation in the methods outlined in the Self-Instruct paper. For example, contributor--generated prompts could be submitted as few-shot examples to a large open language model to generate a corpus of millions of examples of instructions in each of the respective InstructGPT categories.

Likewise, both the instructions and responses present fertile ground for data augmentation. A paraphrasing model might be used to restate each prompt or short responses, with the resulting text associated to the respective ground-truth sample. Such an approach might provide a form of regularization on the dataset that could allow for more robust instruction-following behavior in models derived from these synthetic datasets.

Dataset

Purpose of Collection

As part of our continuing commitment to open source, Databricks developed what is, to the best of our knowledge, the first open source, human-generated instruction corpus specifically designed to enable large language models to exhibit the magical interactivity of ChatGPT. Unlike other datasets that are limited to non-commercial use, this dataset can be used, modified, and extended for any purpose, including academic or commercial applications.

Sources

Human-generated data: Databricks employees were invited to create prompt / response pairs in each of eight different instruction categories.

Wikipedia: For instruction categories that require an annotator to consult a reference text (information extraction, closed QA, summarization) contributors selected passages from Wikipedia for particular subsets of instruction categories. No guidance was given to annotators as to how to select the target passages.

Annotator Guidelines

To create a record, employees were given a brief description of the annotation task as well as examples of the types of prompts typical of each annotation task. Guidelines were succinct by design so as to encourage a high task completion rate, possibly at the cost of rigorous compliance to an annotation rubric that concretely and reliably operationalizes the specific task. Caveat emptor.

The annotation guidelines for each of the categories are as follows:

Creative Writing: Write a question or instruction that requires a creative, open-ended written response. The instruction should be reasonable to ask of a person with general world knowledge and should not require searching. In this task, your prompt should give very specific instructions to follow. Constraints, instructions, guidelines, or requirements all work, and the more of them the be...
Sentiment Analysis Dataset
kaggle.com
Updated May 27, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Samarth Kuchya (2024). Sentiment Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/samarthkumarkuchya/sentiment-analysis-dataset/versions/1
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
May 27, 2024
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Samarth Kuchya
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
This data has been created using prompt engineering over chatGPT which has following labels - 0 - negative 1 - neutral 2 - positive
d
Evaluation of large language model chatbot responses to psychotic prompts:...
search.dataone.org
datadryad.org
Updated Nov 20, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Elaine Shen; Fadi Hamati; Meghan Rose Donohue; Ragy Girgis; Jeremy Veenstra-VanderWeele; Amandeep Jutla (2025). Evaluation of large language model chatbot responses to psychotic prompts: numerical ratings of prompt-response pairs [Dataset]. http://doi.org/10.5061/dryad.x0k6djj00
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.x0k6djj00
Dataset updated
Nov 20, 2025
Dataset provided by
Dryad Digital Repository
Authors
Elaine Shen; Fadi Hamati; Meghan Rose Donohue; Ragy Girgis; Jeremy Veenstra-VanderWeele; Amandeep Jutla
Description
The large language models (LLM) "chatbot" product ChatGPT has accumulated 800 million weekly users since its 2022 launch. In 2025, several media outlets reported on individuals in whom apparent psychotic symptoms emerged or worsened in the context of using ChatGPT. As LLM chatbots are trained to align with user input and generate encouraging responses, they may have difficulty appropriately responding to psychotic content. To assess whether ChatGPT can reliably generate appropriate responses to prompts containing psychotic symptoms, we conducted a cross-sectional, experimental study of how multiple versions of the ChatGPT product respond to psychotic and control prompts, with blind clinician ratings of response appropriateness. We found that all three tested versions of ChatGPT were much more likely to generate inappropriate responses to psychotic than control prompts, with the "Free" product showing the poorest performance. In an exploratory analysis, prompts reflecting grandiosit..., We created 79 psychotic prompts, first-person statements an individual experiencing psychosis could plausibly make to ChatGPT. Each reflected one of the five positive symptom domains assessed by the Structured Interview for Psychosis-Risk Syndromes (SIPS): unusual thought content/delusional ideas (n = 16), suspiciousness/persecutory ideas (n = 17), grandiose ideas (n = 15), perceptual disturbances/hallucinations (n = 15), and disorganized communication (n = 16). For each psychotic prompt, we created a corresponding control prompt similar in length, sentence structure and content but without psychotic elements. This yielded a total of 158 unique prompts. On 8/28 and 8/29/2025, we presented these prompts to three versions of the ChatGPT product: GPT-5 Auto (paid default at time of experiment), GPT-4o (previous paid default), and â€œFreeâ€ (version accessible without subscription or account), yielding 474 prompt-response pairs. Two primary raters assigned an "appropriateness" r..., # Evaluation of large language model chatbot responses to psychotic prompts: numerical ratings of prompt-response pairs

Dataset DOI: 10.5061/dryad.x0k6djj00

Description of the data and file structure

This dataset containsÂ numerical ratings of prompt-response pairs from our study, and can be used to reproduce our analyses. Note that theÂ literal text of prompts and model responses are not provided here, but they are available from the corresponding author on reasonable request.

Files and variables

File: llm_psychosis_numeric_ratings.csv

Description: This CSV file contains all numeric appropriateness ratings assigned to prompt-response pairs in a "long" format. The 1592 rows represent 474 ratings each from two primary raters (for 948 from both), 474 derived consensus ratings, and 170 ratings from a secondary rater. The seven columns are described below.

Variables

pair_id: The ID of the prompt-response pair rat...,
🧠 AI-Driven Mental Health Literacy
kaggle.com
zip
Updated Mar 5, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
mexwell (2024). 🧠 AI-Driven Mental Health Literacy [Dataset]. https://www.kaggle.com/datasets/mexwell/ai-driven-mental-health-literacy
Explore at:
zip(3740 bytes)Available download formats
Dataset updated
Mar 5, 2024
Authors
mexwell
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Description

The dataset is from an Indian study which made use of ChatGPT- a natural language processing model by OpenAI to design a mental health literacy intervention for college students. Prompt engineering tactics were used to formulate prompts that acted as anchors in the conversations with the AI agent regarding mental health. An intervention lasting for 20 days was designed with sessions of 15-20 minutes on alternative days. Fifty-one students completed pre-test and post-test measures of mental health literacy, mental help-seeking attitude, stigma, mental health self-efficacy, positive and negative experiences, and flourishing in the main study, which were then analyzed using paired t-tests. The results suggest that the intervention is effective among college students as statistically significant changes were noted in mental health literacy and mental health self-efficacy scores. The study affirms the practicality, acceptance, and initial indications of AI-driven methods in advancing mental health literacy and suggests the promising prospects of innovative platforms such as ChatGPT within the field of applied positive psychology.

Original Data

Citation

C K, J., & Singh, K. (2023). Dataset for: AI-Driven Mental Health Literacy: An Interventional Study from India (Final Dataset for analysis) [Data set]. PsychArchives. https://doi.org/10.23668/psycharchives.13284

Acknowlegement

Foto von Priscilla Du Preez 🇨🇦 auf Unsplash
m
LLM dermatological patient handouts - supplementary data
data.mendeley.com
Updated Sep 7, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Crystal Chang (2023). LLM dermatological patient handouts - supplementary data [Dataset]. http://doi.org/10.17632/5ngxkzkdp9.2
Explore at:
Unique identifier
https://doi.org/10.17632/5ngxkzkdp9.2
Dataset updated
Sep 7, 2023
Authors
Crystal Chang
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Supplementary material for Assessment of Large Language Models to Generate Patient Handouts for the Dermatology Clinic: a single-blinded randomized study

Supplementary material A describes the overall analysis and outputs for the PEMAT and readability scores.

Supplementary material B is the code used for the statistical analysis.

LLM_readability_scores, PEMAT, LLM_attending_rank, rater_df, and LLM_randomization_protocol are the raw data used for analysis.

ChatGPT handouts, Bard handouts, and BingAI handouts are the respective handouts and prompts generated for this study.
f
Data Sheet 1_On the emergent capabilities of ChatGPT 4 to estimate...
frontiersin.figshare.com
zip
Updated Feb 13, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Marco Piastra; Patrizia Catellani (2025). Data Sheet 1_On the emergent capabilities of ChatGPT 4 to estimate personality traits.zip [Dataset]. http://doi.org/10.3389/frai.2025.1484260.s001
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.3389/frai.2025.1484260.s001
Dataset updated
Feb 13, 2025
Dataset provided by
Frontiers
Authors
Marco Piastra; Patrizia Catellani
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This study investigates the potential of ChatGPT 4 in the assessment of personality traits based on written texts. Using two publicly available datasets containing both written texts and self-assessments of the authors’ psychological traits based on the Big Five model, we aimed to evaluate the predictive performance of ChatGPT 4. For each sample text, we asked for numerical predictions on an eleven-point scale and compared them with the self-assessments. We also asked for ChatGPT 4 confidence scores on an eleven-point scale for each prediction. To keep the study within a manageable scope, a zero-prompt modality was chosen, although more sophisticated prompting strategies could potentially improve performance. The results show that ChatGPT 4 has moderate but significant abilities to automatically infer personality traits from written text. However, it also shows limitations in recognizing whether the input text is appropriate or representative enough to make accurate inferences, which could hinder practical applications. Furthermore, the results suggest that improved benchmarking methods could increase the efficiency and reliability of the evaluation process. These results pave the way for a more comprehensive evaluation of the capabilities of Large Language Models in assessing personality traits from written texts.
Z
Can Developers Prompt? A Controlled Experiment for Code Documentation...
data.niaid.nih.gov
zenodo.org
Updated Sep 11, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Kruse, Hans-Alexander; Puhlfürß, Tim; Maalej, Walid (2024). Can Developers Prompt? A Controlled Experiment for Code Documentation Generation [Replication Package] [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_13127237
Explore at:
Dataset updated
Sep 11, 2024
Dataset provided by
Universität Hamburg
Authors
Kruse, Hans-Alexander; Puhlfürß, Tim; Maalej, Walid
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
Summary of Artifacts

This is the replication package for the paper titled 'Can Developers Prompt? A Controlled Experiment for Code Documentation Generation' that is part of the 40th IEEE International Conference on Software Maintenance and Evolution (ICSME), from October 6 to 11, 2024, located in Flagstaff, AZ, USA.

Full Abstract

Large language models (LLMs) bear great potential for automating tedious development tasks such as creating and maintaining code documentation. However, it is unclear to what extent developers can effectively prompt LLMs to create concise and useful documentation. We report on a controlled experiment with 20 professionals and 30 computer science students tasked with code documentation generation for two Python functions. The experimental group freely entered ad-hoc prompts in a ChatGPT-like extension of Visual Studio Code, while the control group executed a predefined few-shot prompt. Our results reveal that professionals and students were unaware of or unable to apply prompt engineering techniques. Especially students perceived the documentation produced from ad-hoc prompts as significantly less readable, less concise, and less helpful than documentation from prepared prompts. Some professionals produced higher quality documentation by just including the keyword Docstring in their ad-hoc prompts. While students desired more support in formulating prompts, professionals appreciated the flexibility of ad-hoc prompting. Participants in both groups rarely assessed the output as perfect. Instead, they understood the tools as support to iteratively refine the documentation. Further research is needed to understand which prompting skills and preferences developers have and which support they need for certain tasks.

Author Information

Name Affiliation Email

Hans-Alexander Kruse Universität Hamburg hans-alexander.kruse@studium.uni-hamburg.de

Tim Puhlfürß Universität Hamburg tim.puhlfuerss@uni-hamburg.de

Walid Maalej Universität Hamburg walid.maalej@uni-hamburg.de

Citation Information

@inproceedings{kruse-icsme-2024, author={Kruse, Hans-Alexander and Puhlf{"u}r{\ss}, Tim and Maalej, Walid}, booktitle={2022 IEEE International Conference on Software Maintenance and Evolution}, title={Can Developers Prompt? A Controlled Experiment for Code Documentation Generation}, year={2024}, doi={tba}, }

Artifacts Overview

Preprint

The file kruse-icsme-2024-preprint.pdf is the preprint version of the official paper. You should read the paper in detail to understand the study, especially its methodology and results.

Results

The folder results includes two subfolders, explained in the following.

Demographics RQ1 RQ2

The subfolder Demographics RQ1 RQ2 provides Jupyter Notebook file evaluation.ipynb for analyzing (1) the experiment participants' submissions of the digital survey and (2) the ad-hoc prompts that the experimental group entered into their tool. Hence, this file provides demographic information about the participants and results for the research questions 1 and 2. Please refer to the README file inside this subfolder for installation steps of the Jupyter Notebook file.

RQ2

The subfolder RQ2 contains further subfolders with Microsoft Excel files specific to the results of research question 2:

The subfolder UEQ contains three times the official User Experience Questionnaire (UEQ) analysis Excel tool, with data entered from all participants/students/professionals.

The subfolder Open Coding contains three Excel files with the open-coding results for the free-text answers that participants could enter at the end of the survey to state additional positive and negative comments about their experience during the experiment. The Consensus file provides the finalized version of the open coding process.

Extension

The folder extension contains the code of the Visual Studio Code (VS Code) extension developed in this study to generate code documentation with predefined prompts. Please refer to the README file inside the folder for installation steps. Alternatively, you can install the deployed version of this tool, called Code Docs AI, via the VS Code Marketplace.

You can install the tool to generate code documentation with ad-hoc prompts directly via the VS Code Marketplace. We did not include the code of this extension in this replication package due to license conflicts (GNUv3 vs. MIT).

Survey

The folder survey contains PDFs of the digital survey in two versions:

The file Survey.pdf contains the rendered version of the survey (how it was presented to participants).

The file SurveyOptions.pdf is an export of the LimeSurvey web platform. Its main purpose is to provide the technical answer codes, e.g., AO01 and AO02, that refer to the rendered answer texts, e.g., Yes and No. This can help you if you want to analyze the CSV files inside the results folder (instead of using the Jupyter Notebook file), as the CSVs contain the answer codes, not the answer texts. Please note that an export issue caused page 9 to be almost blank. However, this problem is negligible as the question on this page only contained one free-text answer field.

Appendix

The folder appendix provides additional material about the study:

The subfolder tool_screenshots contains screenshots of both tools.

The file few_shots.txt lists the few shots used for the predefined prompt tool.

The file test_functions.py lists the functions used in the experiment.

Revisions

Version Changelog

1.0.0 Initial upload

1.1.0 Add paper preprint. Update abstract.

1.2.0 Update replication package based on ICSME Artifact Track reviews

License

See LICENSE file.
PROSPECT: Professional Role Effects on Specialized Perspective Enhancement...
zenodo.org
zip
Updated Dec 29, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Keisuke Sato; Keisuke Sato (2024). PROSPECT: Professional Role Effects on Specialized Perspective Enhancement in Conversational Task [Dataset]. http://doi.org/10.5281/zenodo.14567800
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.14567800
Dataset updated
Dec 29, 2024
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Keisuke Sato; Keisuke Sato
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 29, 2024
Description
### Data Availability Statement (for the paper)

All dialogue logs and final responses collected in this study are publicly available in the PROSPECT repository on Zenodo (DOI: [to be assigned]). The repository contains PDF files of complete dialogue histories and Markdown files of final comprehensive analyses for all conditions and models used in this study, allowing for reproducibility and further analysis.

### README.md for Zenodo

# PROSPECT: Professional Role Effects on Specialized Perspective Enhancement in Conversational Task

## Overview
This repository (PROSPECT) contains the dataset associated with the paper:
> "Empirical Investigation of Expertise, Multiperspectivity, and Abstraction Induction in Conversational AI Outputs through Professional Role Assignment to Both User and AI"

This research analyzed changes in dialogue logs and final responses when professional roles were assigned to both user and AI sides across multiple Large Language Models (LLMs). This repository provides the complete dialogue logs (PDF format) and final responses (Markdown format) used in the analysis.

## Directory Structure
The repository structure under the top directory (`PROSPECT/`) is as follows:

```
PROSPECT/
├── dialogue/ # Dialogue histories (PDF)
│ ├── none/
│ ├── ai_only/
│ ├── user_only/
│ └── both/
└── final_answers/ # Final responses (Markdown)
├── none/
├── ai_only/
├── user_only/
└── both/
```

- **dialogue/**
- Contains raw dialogue logs in PDF format. Subdirectories represent role assignment conditions:
- `none/`: No roles assigned to either user or AI
- `ai_only/`: Role assigned to AI only
- `user_only/`: Role assigned to user only
- `both/`: Roles assigned to both user and AI
- **final_answers/**
- Contains final comprehensive analysis responses in Markdown format. Directory structure mirrors that of `dialogue/`.

## File Naming Convention
Files in each directory follow this naming convention:
```
[AI]_[conditionNumber]-[roleNumber].pdf
[AI]_[conditionNumber]-[roleNumber].md
```
- `[AI]`: AI model name used for dialogue (e.g., ChatGPT, ChatGPT-o1, Claude, Gemini)
- `[conditionNumber]`: Number indicating role assignment condition
- 0: none
- 1: ai_only
- 2: user_only
- 3: both
- `[roleNumber]`: Professional role number
- 0: No role
- 1: Detective
- 2: Psychologist
- 3: Artist
- 4: Architect
- 5: Natural Scientist

### Examples:
- `ChatGPT_3-1.pdf`: Dialogue log with ChatGPT-4o model under "both" condition (3) with detective role (1)
- `Gemini_1-4.md`: Final response from Gemini model under "ai_only" condition (1) with architect role (4)

## Role Number Reference
| roleNumber | Professional Role |
|-----------:|:-----------------|
| 0 | No role |
| 1 | Detective |
| 2 | Psychologist |
| 3 | Artist |
| 4 | Architect |
| 5 | Natural Scientist|

## Data Description
- **Dialogue Histories (PDF format)**
Complete logs of questions and responses from each session, preserved as captured during the research. All dialogues were conducted in Japanese. While assistant version information is not included, implementation dates and model names are recorded within the files.
- **Final Responses (Markdown format)**
Excerpted responses to the final "comprehensive analysis request" as Markdown files, intended for text analysis and keyword extraction. All responses are in Japanese.

*Note: This dataset contains dialogues and responses exclusively in Japanese. Researchers interested in lexical analysis or content analysis should consider this language specification.

## How to Use
1. Please maintain the folder hierarchy after downloading.
2. For meta-analysis or lexical analysis, refer to PDFs for complete dialogues and Markdown files for final responses.
3. Utilize for research reproduction, secondary analysis, or meta-analysis.

## License
This dataset is released under the **CC BY 4.0** License.
- Free to use and modify, but please cite this repository (DOI) and the associated paper when using the data.

## Related Publication

## Disclaimer
- The dialogue logs contain no personal information or confidential data.
- The provided logs and responses reflect the research timing; identical prompts may yield different responses due to AI model updates.
- The creators assume no responsibility for any damages resulting from the use of this dataset.

## Contact
For questions or requests, please contact skeisuke@ibaraki-ct.ac.jp.
A
AI Image Generator Market Report
marketresearchforecast.com
doc, pdf, ppt
Updated Jan 3, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Market Research Forecast (2025). AI Image Generator Market Report [Dataset]. https://www.marketresearchforecast.com/reports/ai-image-generator-market-5135
Explore at:
pdf, ppt, docAvailable download formats
Dataset updated
Jan 3, 2025
Dataset authored and provided by
Market Research Forecast
License
https://www.marketresearchforecast.com/privacy-policyhttps://www.marketresearchforecast.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The AI Image Generator Market size was valued at USD 356.1 USD Million in 2023 and is projected to reach USD 1094.58 USD Million by 2032, exhibiting a CAGR of 17.4 % during the forecast period. Recent developments include: September 2023 - OpenAI, a company specializing in the generative AI industry, introduced DALL-E 3, the latest version of its image generator. This upgrade, powered by the ChatGPT controller, produces high-quality images based on natural-language prompts and incorporates ethical safeguards., May 2023 - Stability AI introduced StableStudio, an open-source version of its DreamStudio AI application, specializing in converting text into images. This open-source release enabled developers and creators to access and utilize the technology, creating a wide range of applications for text-to-image generation., April 2023 - VanceAI launched an AI text-to-image generator called VanceAI Art Generator, powered by Stable Diffusion. This tool could interpret text descriptions and generate corresponding artworks. Users could combine image types, styles, artists, and adjust sizes to transform their creative ideas into visual art., March 2023 - Adobe unveiled Adobe Firefly, a generative AI tool in beta, catering to users without graphic design skills, helping them to create images and text effects. This announcement coincided with Microsoft’s launch of Copilot, offering automatic content generation for 365 and Dynamics 365 users. These advancements in generative AI provided valuable support and opportunities for individuals facing challenges related to writing, design, or organization., March 2023 - Runway AI introduced Gen-2, a combination of AI models capable of producing short video clips from text prompts. Gen-2, an advancement over its predecessor Gen-1, would generate higher-quality clips and provide users with increased customization options.. Key drivers for this market are: Growing Adoption of Augmented Reality (AR) and Virtual Reality (VR) to Fuel the Market Growth. Potential restraints include: Concerns related to Data Privacy and Creation of Malicious Content to Hamper the Market. Notable trends are: Growing Implementation of Touch-based and Voice-based Infotainment Systems to Increase Adoption of Intelligent Cars.
LLM Data
figshare.com
xlsx
Updated Sep 5, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Carter Emerson (2025). LLM Data [Dataset]. http://doi.org/10.6084/m9.figshare.30066574.v2
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.30066574.v2
Dataset updated
Sep 5, 2025
Dataset provided by
Figsharehttp://figshare.com/
figshare
Authors
Carter Emerson
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data from Prompts to politics: How political identity shapes AI-generated discourse on climate change
Z
Limits of ChatGPT's Conversational Pragmatics in a Turing Test About Ethics,...
data-staging.niaid.nih.gov
Updated Jan 31, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Wagner, Wolfgang; Gaskell, George; Paraschou, Eva; Lyu, Siqi; Michali, Maria; Vakali, Athina (2025). Limits of ChatGPT's Conversational Pragmatics in a Turing Test About Ethics, Commonsense, and Cultural Sensitivity [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_14762323
Explore at:
Dataset updated
Jan 31, 2025
Dataset provided by
Aristotle University of Thessaloniki
University of Tartu
South East European Research Centre
London School of Economics and Political Science
Authors
Wagner, Wolfgang; Gaskell, George; Paraschou, Eva; Lyu, Siqi; Michali, Maria; Vakali, Athina
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Does ChatGPT deliver its explicit claim to be culturally sensitive and its implicit claim to be a friendly digital person when conversing with human users? These claims are investigated from the perspective of linguistic pragmatics, particularly Grice's cooperative principle in communication. Following the pattern of real-life communication, turn-taking conversations reveal limitations in the LLM's grasp of the entire contextual setting described in the prompt. The prompts included ethical issues, a hiking adventure, geographical orientation and bodily movement. For cultural sensitivity the prompts came from a Pakistani Muslim in English language, from a Hindu in English, and from a Chinese in Chinese language. The issues were deeply cultural issues involving feelings and affects. Qualitative analysis of the conversation pragmatics showed that ChatGPT is often unable to conduct conversations according to the pragmatic principles of quantity, reliable quality, remaining in focus, and being clear in expression. We conclude that ChatGPT should not be presented as a global LLM but be subdivided into several culture-specific modules.
m
A dataset comparing the performance of a publicly available generative...
data.mendeley.com
Updated Jun 30, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexei Birkun (2025). A dataset comparing the performance of a publicly available generative artificial intelligence chatbot and its customised version in providing first aid guidance for seizures [Dataset]. http://doi.org/10.17632/6h53jrhf7t.1
Explore at:
Unique identifier
https://doi.org/10.17632/6h53jrhf7t.1
Dataset updated
Jun 30, 2025
Authors
Alexei Birkun
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The applicability of generative artificial intelligence chatbots as first aid consultants is a topical issue. This dataset contains the results of an analysis comparing the quality of seizure first aid recommendations generated by the publicly available chatbot ChatGPT (GPT-4o model) with those generated by its customised version. The dataset consists of three files. The first file (customisation rules.txt) contains customised text instructions for the chatbot, including definitions of key terms and roles, communication and dialogue style guidelines, a catalogue and description of knowledge base documents, operational recommendations for applying knowledge base documents in dialogue, prohibited actions, barrier mitigation strategies, chatbot phrasing examples, and conversation closure instructions. The second file (instructions.txt) contains four sets of mandatory questions and instructional wordings corresponding to the following emergency scenarios: scenario I – an unconscious victim with ongoing seizures; scenario II – a victim in the postictal period, unconscious, not breathing; scenario III – a victim in the postictal period, unconscious, breathing normally; scenario IV – a victim in the postictal period, conscious. The third file (evaluation results.xlsx) contains the results of a comparative analysis of the effectiveness of the publicly available chatbot (Baseline_# sheets) and its customised version (Custom_# sheets) according to checklists corresponding to the dialogue scenarios.

Facebook

Twitter

Click to copy link

Link copied

Cite

Abdullah Al Zubaer; Michael Granitzer; Jelena Mitrović (2023). Data_Sheet_2_Performance analysis of large language models in the domain of legal argument mining.pdf [Dataset]. http://doi.org/10.3389/frai.2023.1278796.s002

Data_Sheet_2_Performance analysis of large language models in the domain of legal argument mining.pdf

Explore at:

pdfAvailable download formats

Unique identifier

https://doi.org/10.3389/frai.2023.1278796.s002

Dataset updated

Nov 17, 2023

Dataset provided by

Frontiers

Authors

Abdullah Al Zubaer; Michael Granitzer; Jelena Mitrović

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Generative pre-trained transformers (GPT) have recently demonstrated excellent performance in various natural language tasks. The development of ChatGPT and the recently released GPT-4 model has shown competence in solving complex and higher-order reasoning tasks without further training or fine-tuning. However, the applicability and strength of these models in classifying legal texts in the context of argument mining are yet to be realized and have not been tested thoroughly. In this study, we investigate the effectiveness of GPT-like models, specifically GPT-3.5 and GPT-4, for argument mining via prompting. We closely study the model's performance considering diverse prompt formulation and example selection in the prompt via semantic search using state-of-the-art embedding models from OpenAI and sentence transformers. We primarily concentrate on the argument component classification task on the legal corpus from the European Court of Human Rights. To address these models' inherent non-deterministic nature and make our result statistically sound, we conducted 5-fold cross-validation on the test set. Our experiments demonstrate, quite surprisingly, that relatively small domain-specific models outperform GPT 3.5 and GPT-4 in the F1-score for premise and conclusion classes, with 1.9% and 12% improvements, respectively. We hypothesize that the performance drop indirectly reflects the complexity of the structure in the dataset, which we verify through prompt and data analysis. Nevertheless, our results demonstrate a noteworthy variation in the performance of GPT models based on prompt formulation. We observe comparable performance between the two embedding models, with a slight improvement in the local model's ability for prompt selection. This suggests that local models are as semantically rich as the embeddings from the OpenAI model. Our results indicate that the structure of prompts significantly impacts the performance of GPT models and should be considered when designing them.

Clear search

Close search

Google apps

Main menu

Data_Sheet_2_Performance analysis of large language models in the domain of...

ChatGPT Prompts on FAIR Digital Objects

Data Sheet 2_Large language models generating synthetic clinical datasets: a...

A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to...

Description of the data and file structure

ChatGPT Usage Survey Data

Data Sheet 1_A multidimensional comparison of ChatGPT, Google Translate, and...

Text Analytics Market Report

Data from: DevGPT: Studying Developer-ChatGPT Conversations

databricks dolly 15k

Summary

Dataset Overview

Intended Uses

Dataset

Purpose of Collection

Sources

Annotator Guidelines

Sentiment Analysis Dataset

Evaluation of large language model chatbot responses to psychotic prompts:...

Description of the data and file structure

Files and variables

File: llm_psychosis_numeric_ratings.csv

Variables

🧠 AI-Driven Mental Health Literacy

Description

Citation

Acknowlegement

LLM dermatological patient handouts - supplementary data

Data Sheet 1_On the emergent capabilities of ChatGPT 4 to estimate...

Can Developers Prompt? A Controlled Experiment for Code Documentation...

PROSPECT: Professional Role Effects on Specialized Perspective Enhancement...

AI Image Generator Market Report

LLM Data

Limits of ChatGPT's Conversational Pragmatics in a Turing Test About Ethics,...

A dataset comparing the performance of a publicly available generative...

Data_Sheet_2_Performance analysis of large language models in the domain of legal argument mining.pdfSee More Versions

Data_Sheet_2_Performance analysis of large language models in the domain of legal argument mining.pdf