https://choosealicense.com/licenses/openrail/
This is a dataset of paraphrases created by ChatGPT. A model trained on this dataset is available: model
We used this prompt to generate paraphrases:
Generate 5 similar paraphrases for this question, show it like a numbered list without commentaries: {text} This dataset is based on the Quora paraphrase questions, texts from SQuAD 2.0, and the CNN news dataset. We generated 5 paraphrases for each sample; in total the dataset has about 420k data rows. You can make 30 rows from a row from… See the full description on the dataset page: https://huggingface.co/datasets/humarin/chatgpt-paraphrases.
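For illustration, here is a minimal sketch of how such paraphrases could be generated with the OpenAI Python client, using the prompt above. The model name and parsing logic are assumptions, since the card only says "ChatGPT":

```python
# Hypothetical sketch: generate 5 paraphrases for one question using the prompt above.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def paraphrase(text: str) -> list[str]:
    prompt = (
        "Generate 5 similar paraphrases for this question, "
        f"show it like a numbered list without commentaries: {text}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model; the card only says "ChatGPT"
        messages=[{"role": "user", "content": prompt}],
    )
    # Parse the numbered list ("1. ...") into individual paraphrases.
    lines = response.choices[0].message.content.strip().splitlines()
    return [line.split(". ", 1)[1] for line in lines if ". " in line]

print(paraphrase("How do airplanes stay in the air?"))
```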
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The researcher tests the QA capability of ChatGPT in the medical field from the following aspects:
1. Test its reserve capacity for medical knowledge
2. Check its ability to read and understand medical literature
3. Test its ability to give auxiliary diagnoses after reading case data
4. Test its error-correction ability for case data
5. Test its ability to standardize medical terms
6. Test its ability to evaluate experts
7. Check its ability to evaluate medical institutions

The conclusions are:
- ChatGPT has great potential in medical and healthcare applications, and may in some fields directly replace humans, or even professionals, at a certain level;
- The researcher preliminarily believes that ChatGPT has basic medical knowledge and the ability to hold multi-round dialogues, and its ability to understand Chinese is not weak;
- ChatGPT has the ability to read, understand, and correct cases;
- ChatGPT has the ability of information extraction and terminology standardization, and is quite excellent at it;
- ChatGPT has reasoning ability over medical knowledge;
- ChatGPT has the ability of continuous learning; after continuous training, its level improved significantly;
- ChatGPT does not have the ability to academically evaluate Chinese medical talents, and the results are not ideal;
- ChatGPT does not have the ability to academically evaluate Chinese medical institutions, and the results are not ideal;
- ChatGPT is an epoch-making product that can become a useful assistant for medical diagnosis and treatment, knowledge services, literature reading, reviewing, and paper writing.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Important Note: the text column is NOT AI generated. However, the source_text column is, and can still be used as AI-generated text. I will update the dataset accordingly. Consequently, this dataset provides 2421 student-generated texts (text column) and 2421 AI-generated texts (source_text column). I will update as soon as possible.
In the LLM - Detect AI Generated Text competition you are required to distinguish between student-written and AI-generated texts. However, the competition's data only provides student-written texts.
Luckily, for CommonLit's competition I made a dataset with AI-generated texts to use in that competition. Surprisingly, it's very much like the data we need in this competition!
My dataset not only has 2421 ChatGPT-generated texts but also their prompts and source texts! That's double the data we are given in this competition!
Also, it's very diverse, since the texts are generated from unique prompts.
The best of luck to all of you in this competition! 🍀
id: unique identifier for each text.
text: extracted text from the Feedback Prize 3 competition. Can be used as student text.
instructions: the instruction for ChatGPT to generate the text.
source_text: AI-generated text.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
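Since text holds student writing and source_text holds AI writing, one plausible way to turn this dataset into detector training data is to stack the two columns with binary labels. A minimal sketch, where the CSV file name is a hypothetical placeholder:

```python
import pandas as pd

# Hypothetical file name; adjust to the actual CSV shipped with the dataset.
df = pd.read_csv("chatgpt_generated_texts.csv")

# text = student-written (label 0), source_text = AI-generated (label 1).
train = pd.concat([
    df[["text"]].rename(columns={"text": "essay"}).assign(generated=0),
    df[["source_text"]].rename(columns={"source_text": "essay"}).assign(generated=1),
], ignore_index=True)

print(train["generated"].value_counts())  # expect 2421 of each class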
Prompts generated with ChatGPT-3.5, ChatGPT-4, Llama3-8B, and Mistral-7B on NYT and HC3 topics, under different role and parameter configurations.
The dataset is useful for studying lexical aspects of LLMs under different parameter/role configurations.
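As a minimal illustration of the kind of lexical analysis such a dataset enables, one could compute a type-token ratio per generation. This is just one simple diversity measure, not necessarily the one used in the paper below:

```python
# Sketch: type-token ratio (TTR) as a basic lexical-diversity measure.
import re

def type_token_ratio(text: str) -> float:
    # Lowercase word tokens; the tokenization scheme is an assumption.
    tokens = re.findall(r"[a-zA-Z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

sample = "The quick brown fox jumps over the lazy dog near the quick river."
print(f"TTR = {type_token_ratio(sample):.3f}")
```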
@article{10.1145/3696459,
author = {Mart\'{\i}nez, Gonzalo and Hern\'{a}ndez, Jos\'{e} Alberto and Conde, Javier and Reviriego, Pedro and Merino-G\'{o}mez, Elena},
title = {Beware of Words: Evaluating the Lexical Diversity of Conversational LLMs using ChatGPT as Case Study},
year = {2024},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
issn = {2157-6904},
url = {https://doi.org/10.1145/3696459},
doi = {10.1145/3696459},
note = {Just Accepted},
journal = {ACM Trans. Intell. Syst. Technol.},
month = sep,
keywords = {LLM, Lexical diversity, ChatGPT, Evaluation}
}
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We tested ChatGPT on 25 tasks focusing on solving common NLP problems and requiring analytical reasoning. These tasks include (1) a relatively simple binary classification of texts like spam, humor, sarcasm, aggression detection, or grammatical correctness of the text; (2) a more complex multiclass and multi-label classification of texts such as sentiment analysis, emotion recognition; (3) reasoning with the personal context, i.e., personalized versions of the problems that make use of additional information about text perception of a given user (user’s examples provided to ChatGPT); (4) semantic annotation and acceptance of the text going towards natural language understanding (NLU) like word sense disambiguation (WSD), and (5) answering questions based on the input text. More information in the paper: https://www.sciencedirect.com/science/article/pii/S156625352300177X
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Monarch Butterfly Detector is an advanced computer vision model that detects and localizes Monarch butterflies in images. With its cutting-edge technology and high accuracy, this model opens up exciting possibilities for biodiversity monitoring, migration studies, citizen science projects, identification guides, and environmental education.
Accurate Detection: The Monarch Butterfly Detector utilizes state-of-the-art computer vision algorithms to accurately identify and localize Monarch butterflies within images.
Versatile Use Cases: This powerful model has diverse applications, ranging from scientific research and conservation efforts to citizen science projects and environmental education programs.
Easy Integration: The Monarch Butterfly Detector can be seamlessly integrated into existing platforms, apps, or websites, making it accessible to many users and enabling them to contribute effortlessly to butterfly monitoring.
User-Friendly Interface: We provide a user-friendly interface/API that allows users to easily interact with the model, upload images, and obtain instant results.
For detailed documentation and tutorials on using Roboflow, refer to the Roboflow documentation: https://docs.roboflow.com/
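As a rough sketch of what integration might look like, here is inference through the Roboflow Python SDK. The workspace, project id, and version below are hypothetical placeholders, not the real identifiers for this model:

```python
# Hypothetical sketch of inference via the Roboflow Python SDK.
from roboflow import Roboflow

rf = Roboflow(api_key="YOUR_API_KEY")
project = rf.workspace("your-workspace").project("monarch-butterfly-detector")
model = project.version(1).model  # placeholder version number

# Detect and localize Monarch butterflies in a local image.
prediction = model.predict("butterflies.jpg", confidence=40, overlap=30)
print(prediction.json())          # bounding boxes, classes, confidences
prediction.save("annotated.jpg")  # image with drawn detections
```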
We welcome contributions from the open-source community to enhance the Monarch Butterfly Detector. If you're interested in contributing, please follow the guidelines outlined in [CONTRIBUTING.md] and submit your pull requests.
This project is licensed under the [Roboflow License]. For more information, see the [LICENSE] file provided by Roboflow.
For any questions, suggestions, or collaborations, please reach out to us at savetheworld at 150left.com
Congratulations if you have made it this far. 🥳
🎁🎁🎁 I thought I would use another piece of AI to get some more content about this detector I created. Here is the dialog I had with ChatGPT.
Scott: ChatGPT, can you make my readers drool over my Monarch Butterfly Detector model?
ChatGPT: Here are 10 captivating outro or call-to-action suggestions to encourage users to try out the Monarch Butterfly Detector and contribute to its success:
"Join us in preserving the delicate beauty of Monarch butterflies. Try out the Monarch Butterfly Detector today and become a guardian of these magnificent creatures."
"Unlock the secrets of Monarch butterflies with our cutting-edge detection model. Experience the wonders of computer vision and contribute to vital research efforts by using the Monarch Butterfly Detector."
"Make a difference in conservation by being a part of the Monarch Butterfly Detector community. Help protect these iconic insects and their habitats with the power of AI and computer vision."
"Calling all nature enthusiasts and citizen scientists! Contribute to the Monarch Butterfly Detector project and help us gather valuable data on Monarch butterflies. Together, we can make a lasting impact."
"Explore the hidden world of Monarch butterflies with the Monarch Butterfly Detector. Whether you're a researcher, educator, or simply curious, join us on this journey and unravel the mysteries of these captivating creatures."
"Become a Monarch detective! Unleash the potential of computer vision and aid in the conservation of Monarch butterflies. Download the Monarch Butterfly Detector and help us protect these extraordinary pollinators."
"Attention conservationists and researchers! The Monarch Butterfly Detector needs your expertise. Contribute your insights and collaborate with us to further our understanding of Monarch butterflies and support their conservation."
"Empower yourself with AI and contribute to science. With the Monarch Butterfly Detector, you can be a part of groundbreaking research on Monarch butterflies. Start detecting, start making a difference."
"Inspire the next generation of environmentalists. Use the Monarch Butterfly Detector as an educational tool and engage students in the wonders of computer vision and the importance of protecting our natural world."
"Experience the thrill of discovery with the Monarch Butterfly Detector. Join a vibrant community of nature enthusiasts and scientists, and together, let's unlock new insights into the fascinating lives of Monarch butterflies."
As you can see, AI certainly likes my idea.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ChatGPT has increasingly been utilized to generate ideas and create texts in digital media. This has raised concerns about whether the AI might produce texts that perpetuate socio-political biases. The aim of this research was thus to examine how ChatGPT constructs identities and ideologies of women represented in digital stories.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This dataset contains 14,934 instructions, contexts and responses, in several natural language categories such as classification, closed QA, generation, etc. The English original dataset was created by @databricks, who crowd-sourced the data creation via its employees. The current dataset is a translation of that dataset through ChatGPT (gpt-3.5-turbo).
Data Instances
{ "id": 14963, "instruction": "Wat zijn de duurste steden ter wereld?", "context": "", "response": "Dit is een uitgebreide lijst van de duurste steden: Singapore, Tel Aviv, New York, Hong Kong, Los Angeles, Zurich, Genève, San Francisco, Parijs en Sydney.", "category": "brainstorming" }
Data Fields
id: the ID of the item. The following 77 IDs are not included because they could not be translated (or were too long): [1502, 1812, 1868, 4179, 4541, 6347, 8851, 9321, 10588, 10835, 11257, 12082, 12319, 12471, 12701, 12988, 13066, 13074, 13076, 13181, 13253, 13279, 13313, 13346, 13369, 13446, 13475, 13528, 13546, 13548, 13549, 13558, 13566, 13600, 13603, 13657, 13668, 13733, 13765, 13775, 13801, 13831, 13906, 13922, 13923, 13957, 13967, 13976, 14028, 14031, 14045, 14050, 14082, 14083, 14089, 14110, 14155, 14162, 14181, 14187, 14200, 14221, 14222, 14281, 14473, 14475, 14476, 14587, 14590, 14667, 14685, 14764, 14780, 14808, 14836, 14891, 14966]
instruction: the instruction (question)
context: additional context that the AI can use to answer the question
response: the AI's expected response
category: the category of this type of question (see Dolly for more info)
Dataset Creation
Both the translations and the topics were translated with OpenAI's API for gpt-3.5-turbo, using max_tokens=1024 and temperature=0 as parameters.
The prompt template to translate the input is (where src_lang was English and tgt_lang Dutch):
CONVERSATION_TRANSLATION_PROMPT = """You are asked to translate a task's instruction, optional context to the task, and the response to the task, from {src_lang} to {tgt_lang}.
Here are the requirements that you should adhere to:
1. maintain the format: the task consists of a task instruction (marked instruction:), optional context to the task (marked context:) and a response for the task (marked response:);
2. do not translate the identifiers instruction:, context:, and response:, but instead copy them to your output;
3. make sure that text is fluent to read and does not contain grammatical errors. Use standard {tgt_lang} without regional bias;
4. translate the instruction and context text using informal, but standard, language;
5. make sure to avoid biases (such as gender bias, grammatical bias, social bias);
6. if the instruction is to correct grammar mistakes or spelling mistakes then you have to generate a similar mistake in the context in {tgt_lang}, and then also generate a corrected output version in the output in {tgt_lang};
7. if the instruction is to translate text from one language to another, then you do not translate the text that needs to be translated in the instruction or the context, nor the translation in the response (just copy them as-is);
8. do not translate code fragments but copy them to your output. If there are English examples, variable names or definitions in code fragments, keep them in English.
Now translate the following task with the requirements set out above. Do not provide an explanation and do not add anything else.
"""
The system message was:
You are a helpful assistant that translates English to Dutch according to the requirements that are given to you.
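Putting the pieces above together, one translation request plausibly looked like the following sketch. The model, parameters, system message, and template come from the card; the client code and the exact way the task is appended to the prompt are assumptions:

```python
# Sketch of one translation call, using the parameters stated in the card.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_MSG = ("You are a helpful assistant that translates English to Dutch "
              "according to the requirements that are given to you.")

def translate(instruction: str, context: str, response: str, template: str) -> str:
    # `template` is the CONVERSATION_TRANSLATION_PROMPT shown above.
    prompt = template.format(src_lang="English", tgt_lang="Dutch")
    task = f"instruction: {instruction}\ncontext: {context}\nresponse: {response}"
    completion = client.chat.completions.create(
        model="gpt-3.5-turbo",
        max_tokens=1024,
        temperature=0,
        messages=[
            {"role": "system", "content": SYSTEM_MSG},
            {"role": "user", "content": f"{prompt}\n{task}"},
        ],
    )
    return completion.choices[0].message.content
```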
Note that 77 items (0.5%) were not successfully translated. This can either mean that the prompt was too long for the given limit (max_tokens=1024) or that the generated translation could not be parsed into instruction, context and response fields. The missing IDs are the 77 listed above under Data Fields.
Initial Data Collection and Normalization
Initial data collection by databricks. See their repository for more information about this dataset.
Considerations for Using the Data
Note that the translations in this new dataset have not been verified by humans! Use at your own risk, both in terms of quality and biases.
Discussion of Biases
As with any machine-generated text, users should be aware of potential biases in this dataset. Although the prompt specifically includes "make sure to avoid biases (such as gender bias, grammatical bias, social bias)", the impact of such a command is not known. It is likely that biases remain in the dataset, so use it with caution.
Other Known Limitations
The translation quality has not been verified. Use at your own risk!
Licensing Information
This repository follows the original databricks license, which is CC BY-SA 3.0 but see below for a specific restriction.
This text was generated (either in part or in full) with GPT-3 (gpt-3.5-turbo), OpenAI’s large-scale language-generation model. Upon generating draft language, the author reviewed, edited, and revised the language to their own liking and takes ultimate responsibility for the content of this publication.
If you use this dataset, you must also follow the Sharing and Usage policies.
As clearly stated in their Terms of Use, specifically 2c.iii, "[you may not] use output from the Services to develop models that compete with OpenAI". That means that you cannot use this dataset to build models that are intended to commercially compete with OpenAI. As far as I am aware, that is a specific restriction that should serve as an addendum to the current license.
This dataset is also available on the Hugging Face hub, its canonical repository.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for UltraChat 200k
Dataset Description
This is a heavily filtered version of the UltraChat dataset and was used to train Zephyr-7B-β, a state-of-the-art 7B chat model. The original dataset consists of 1.4M dialogues generated by ChatGPT and spanning a wide range of topics. To create UltraChat 200k, we applied the following logic:
Selection of a subset of data for faster supervised fine tuning. Truecasing of the dataset, as we observed around 5% of the data… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k.
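A minimal sketch for pulling the dataset from the Hub with the datasets library. The split and field names below are assumptions; check the dataset page for the exact names:

```python
from datasets import load_dataset

# Split name "train_sft" is an assumption; see the dataset page for the full list.
ds = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
print(ds[0]["messages"][:2])  # assumed field holding the dialogue turns
```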
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: This dataset presents survey responses from first-year engineering students on their use of ChatGPT and other AI tools in a project-based learning environment. Collected as part of a study on AI's role in engineering education, the data captures key insights into how students utilize ChatGPT for coding assistance, conceptual understanding, and collaborative work. The dataset includes responses on frequency of AI usage, perceived benefits and challenges, ethical concerns, and the impact of AI on learning outcomes and problem-solving skills. With AI increasingly integrated into education, this dataset provides valuable empirical evidence for researchers, educators, and policymakers interested in AI-assisted learning, STEM education, and academic integrity. It enables further analysis of student perceptions, responsible AI use, and the evolving role of generative AI in higher education. By making this dataset publicly available, we aim to support future research on AI literacy, pedagogy, and best practices for integrating AI into engineering and science curricula.

Related Publication
This dataset supports the findings presented in the following peer-reviewed article:
ChatGPT in Engineering Education: A Breakthrough or a Challenge?
Davood Khodadad
Published: 7 May 2025 | Physics Education, Volume 60, Number 4
© 2025 The Author(s). Published by IOP Publishing Ltd
Citation: Davood Khodadad 2025 Phys. Educ. 60 045006
DOI: 10.1088/1361-6552/add073
If you use or reference this dataset, please consider citing the above publication.

Description of the data and file structure
Title: ChatGPT in Engineering Education: Survey Data on AI Usage, Learning Impact, and Collaboration
Description of Data Collection: This dataset was collected through a survey distributed via the Canvas learning platform following the completion of group projects in an introductory engineering course. The survey aimed to investigate how students engaged with ChatGPT and other AI tools in a project-based learning environment, particularly in relation to coding, report writing, idea generation, and collaboration.
The survey consisted of 15 questions:
- 12 multiple-choice questions to capture quantitative insights on AI usage patterns, frequency, and perceived benefits.
- 3 open-ended questions to collect qualitative perspectives on challenges, ethical concerns, and students' reflections on AI-assisted learning.
Key areas assessed in the survey include:
- Students' prior familiarity with AI tools before the course.
- Frequency and purpose of ChatGPT usage (e.g., coding assistance, conceptual learning, collaboration).
- Perceived benefits and limitations of using AI tools in an engineering learning environment.
- Ethical considerations, including concerns about over-reliance and academic integrity.
The dataset provides valuable empirical insights into the evolving role of AI in STEM education and can support further research on AI-assisted learning, responsible AI usage, and best practices for integrating AI tools in engineering education.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is version 2 of the dataset created and used to explore ChatGPT-3.5's ability to write, justify and analyse English poems. This version was created after the reviewers' decision that the paper may be published if some changes are made.
The purpose of the research was to determine if ChatGPT-3.5 would be adopted in English poetry classrooms. As none of the existing theoretical models were applicable, the Artificial Intelligence Adoption Prediction Model (AIAPM) was designed. Based on this model, an Artificial Intelligence Adoption Prediction Tool (AIAPT) was designed to calculate an Adoption Prediction Score (APS). Then, ChatGPT-3.5's ability to write, justify and analyse poems was explored.
It was found that ChatGPT-3.5 could write, justify, and analyse poems, but it could also make errors and hallucinate convincingly. Thus, the AIAPT was used to calculate the Adoption Prediction Score. The APS was 9, so all factors of the AIAPM could drive the adoption decision. It could therefore be predicted that ChatGPT-3.5 would be adopted in English poetry classrooms, both for ethical and unethical purposes. Based on the results, a few proactive strategies were suggested.
This dataset contains all data created and used during the research, including the poems which were integrated in the paper "An Artificial Intelligence Adoption Prediction Model to determine if ChatGPT-3.5 would be adopted in English poetry classrooms", which was submitted to Heliyon for publication.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A previously published paper in the official journal of the Society of Nuclear Medicine and Molecular Imaging (SNMMI) concluded that the artificial intelligence chatbot ChatGPT may offer an adequate substitute for nuclear medicine staff informational counseling to patients in an investigated setting of 18F-FDG PET/CT. To ensure consistency with the previous paper, the author and a team of experts followed a similar methodology and evaluated whether ChatGPT could adequately offer a substitute for nuclear medicine staff informational counseling to patients regarding radiopharmaceutical extravasations. We asked ChatGPT fifteen questions regarding radiopharmaceutical extravasations. Each question or prompt was queried three times. Using the same evaluation criteria as the previously published paper, the ChatGPT responses were evaluated by two nuclear medicine trained physicians and one nuclear medicine physicist for appropriateness and helpfulness. These evaluators found ChatGPT responses to be either highly appropriate or quite appropriate in 100% of questions and very helpful or quite helpful in 93% of questions. The interobserver agreement among the evaluators, assessed using the Intraclass Correlation Coefficient (ICC), was found to be 0.72, indicating good overall agreement. The evaluators also rated the inconsistency across the three ChatGPT responses for each question and found irrelevant or minor inconsistencies in 87% of questions and some differences relevant to main content in the other 13% of the questions. One physician evaluated the quality of the references listed by ChatGPT as the source material it used in generating its responses. The reference check revealed no AI hallucinations. The evaluator concluded that ChatGPT used fully validated references (appropriate, identifiable, and accessible) to generate responses for eleven of the fifteen questions and used generally available medical and ethical guidelines to generate responses for four questions. Based on these results we concluded that ChatGPT may be a reliable resource for patients interested in radiopharmaceutical extravasations. However, these validated and verified ChatGPT responses differed significantly from official positions and public comments regarding radiopharmaceutical extravasations made by the SNMMI and nuclear medicine staff. Since patients are increasingly relying on the internet for information about their medical procedures, the differences need to be addressed.
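Interobserver agreement of the kind reported here (an ICC over three evaluators rating fifteen questions) can be computed with, for example, the pingouin package. The data below are random placeholders and the long-format layout is an assumption for illustration, not the study's actual analysis code:

```python
# Sketch: intraclass correlation for 3 raters scoring 15 questions.
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(0)
ratings = pd.DataFrame({
    "question": np.repeat(np.arange(15), 3),        # 15 questions
    "rater": ["phys1", "phys2", "physicist"] * 15,  # 3 evaluators
    "score": rng.integers(3, 6, size=45),           # placeholder 3-5 ratings
})

icc = pg.intraclass_corr(data=ratings, targets="question",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC"]])  # e.g., ICC2 = two-way random effects, single rater
```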
We hypothesize that a well-trained, context-enriched GPT will perform at or above the level of an expert surgeon in generating comprehensive answers to questions commonly posed in day-to-day practice regarding vestibular schwannoma. In this study, we make three key contributions to assessing the feasibility of LLMs as a clinical decision-making adjunct:
1. We develop a framework to enrich GPT with context relevant to vestibular schwannoma.
2. We compare the performance of ChatGPT (Jan. 30, 2023 model) and a context-enriched GPT model against leading neurosurgical experts worldwide, evaluating the ability of large language models (LLMs) to assist in clinical decision-making.
3. We introduce a proof-of-concept clinical decision-making tool, neuroGPT-X, which incorporates working memory, sources with each answer, and a web-based chat platform to address challenges in using LLMs in a clinical setting, including interpretability, reliability, accountability, and safety.
Dataset Card for Dataset Name
BMI BASED ADVICE DATASET
Dataset Details
Dataset Description
THIS IS A DATASET CREATED BY SELECTIVELY CHOOSING AND MERGING MULTIPLE DATASETS FROM VARIOUS SOURCES, INCLUDING OTHER DATASETS AND GENERATED DATASETS. FEEL FREE TO USE THESE ANYWHERE, AND MAKE SURE TO CREDIT THE APPROPRIATE DATA SOURCES WHEREVER NECESSARY!! 😀
Curated by: [Navaneeth. K]
Dataset Sources
https://huggingface.co/datasets https://www.chatgpt.com… See the full description on the dataset page: https://huggingface.co/datasets/navaneeth005/bmi_based_advice.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
As ChatGPT emerges as a potential ally in healthcare decision-making, it is imperative to investigate how users leverage and perceive it. The repurposing of technology is innovative but brings risks, especially since AI's effectiveness depends on the data it is fed. In healthcare, ChatGPT might provide sound advice based on current medical knowledge, which could turn into misinformation if its data sources later include erroneous information.

Our study assesses user perceptions of ChatGPT, particularly of those who used ChatGPT for healthcare-related queries. By examining factors such as competence, reliability, transparency, trustworthiness, security, and persuasiveness of ChatGPT, the research aimed to understand how users rely on ChatGPT for health-related decision-making. A web-based survey was distributed to U.S. adults using ChatGPT at least once a month. Bayesian Linear Regression was used to understand how much ChatGPT aids in informed decision-making. This analysis was conducted on subsets of respondents: both those who used ChatGPT for healthcare decisions and those who did not. Qualitative data from open-ended questions were analyzed using content analysis, with thematic coding to extract public opinions on urban environmental policies.

Six hundred and seven individuals responded to the survey. Respondents were distributed across 306 US cities, of which 20 participants were from rural cities. Of all the respondents, 44 used ChatGPT for health-related queries and decision-making. In the healthcare context, the most effective model highlights 'Competent + Trustworthy + ChatGPT for healthcare queries', underscoring the critical importance of perceived competence and trustworthiness specifically in the realm of healthcare applications of ChatGPT. On the other hand, the non-healthcare context reveals a broader spectrum of influential factors in its best model, which includes 'Trustworthy + Secure + Benefits outweigh risks + Satisfaction + Willing to take decisions + Intent to use + Persuasive'.

In conclusion, our study findings suggest a clear demarcation in user expectations and requirements from AI systems based on the context of their use. We advocate for a balanced approach where technological advancement and user readiness are harmonized.
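The Bayesian linear regression step could look roughly like the following sketch with scikit-learn's BayesianRidge. The predictor names mirror the factors listed above, but the column names and data layout are assumptions:

```python
# Sketch: Bayesian linear regression of informed decision-making on perception factors.
import pandas as pd
from sklearn.linear_model import BayesianRidge

df = pd.read_csv("survey_responses.csv")  # hypothetical file name

# Perception factors named in the study; column names are assumptions.
factors = ["competence", "reliability", "transparency",
           "trustworthiness", "security", "persuasiveness"]

model = BayesianRidge()
model.fit(df[factors], df["informed_decision_making"])

for name, coef in zip(factors, model.coef_):
    print(f"{name}: {coef:+.3f}")
```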
As large language models (LLMs) have become more deeply integrated into various sectors, understanding how they make moral judgements has become crucial, particularly in the realm of autonomous driving. This study used the moral machine framework to investigate the ethical decision-making tendencies of prominent LLMs, including GPT-3.5, GPT-4, PaLM 2 and Llama 2, to compare their responses with human preferences. While LLMs' and humans' preferences, such as prioritizing humans over pets and favouring saving more lives, are broadly aligned, PaLM 2 and Llama 2, especially, evidence distinct deviations. Additionally, despite the qualitative similarities between the LLM and human preferences, there are significant quantitative disparities, suggesting that LLMs might lean toward more uncompromising decisions, compared with the milder inclinations of humans. These insights elucidate the ethical frameworks of LLMs and their potential implications for autonomous driving.

Using the MM methodology detailed in the supplementary information of https://www.nature.com/articles/s41586-018-0637-6, we implemented code for generating Moral Machine scenarios. After generating the MM scenarios, responses from GPT-3.5, GPT-4, PaLM 2, and Llama 2 were collected using the application programming interface (API) and relevant code. We applied the conjoint analysis framework to evaluate the relative importance of the nine preferences.

Data and Code on the Moral Machine Experiment on Large Language Models
https://doi.org/10.5061/dryad.d7wm37q6v
pip install -r requirements.txt
NOTE: The script run_chatgpt.py requires an OpenAI API key. Please obtain your API key by following OpenAI's instructions. To run the script run_palm2.py, setup is required; please refer to the Google Cloud instructions. Specifically, follow these sections in the given order: 1) Set up a project and a development environment and 2) Install the Vertex AI SDK for Python. Before running run_llama2.py, the Llama2 model files must be downloaded. Please follow [the instructi...
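For flavor, here is a stripped-down sketch of what a script like run_chatgpt.py plausibly does, i.e., posing a generated Moral Machine scenario to the chat API. The scenario wording and code are assumptions; see the repository scripts for the real implementation:

```python
# Hypothetical sketch of querying ChatGPT with one Moral Machine scenario.
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY, as noted above

scenario = (
    "A self-driving car with sudden brake failure must choose: "
    "swerve and crash into a barrier, killing the 2 passengers, or "
    "continue ahead, killing 3 pedestrians crossing legally. "
    "Which option should the car take? Answer 'swerve' or 'continue'."
)

reply = client.chat.completions.create(
    model="gpt-3.5-turbo",
    temperature=0,
    messages=[{"role": "user", "content": scenario}],
)
print(reply.choices[0].message.content)
```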
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Introduction
Recent times have witnessed an increasing number of applications of deep neural networks towards solving tasks that require superior cognitive abilities, e.g., playing Go, generating art, ChatGPT, etc. Such dramatic progress raises the question: how generalizable are neural networks in solving problems that demand broad skills? To answer this question, we propose SMART: a Simple Multimodal Algorithmic Reasoning Task (and the associated SMART-101 dataset) for evaluating the abstraction, deduction, and generalization abilities of neural networks in solving visuo-linguistic puzzles designed specifically for children of younger age (6-8). Our dataset consists of 101 unique puzzles; each puzzle comprises a picture and a question, and their solution needs a mix of several elementary skills, including pattern recognition, algebra, and spatial reasoning, among others. To train deep neural networks, we programmatically augment each puzzle to 2,000 new instances; each instance varies in appearance, associated natural language question, and its solution. To foster research and make progress in the quest for artificial general intelligence, we are publicly releasing our SMART-101 dataset, consisting of the full set of programmatically-generated instances of 101 puzzles and their solutions.
The dataset was introduced in our paper Are Deep Neural Networks SMARTer than Second Graders? by Anoop Cherian, Kuan-Chuan Peng, Suhas Lohit, Kevin A. Smith, and Joshua B. Tenenbaum, CVPR 2023.
Files in the unzipped folder:
./README.md: This Markdown file
./SMART101-Data: Folder containing all the puzzle data. See below for details.
./puzzle_type_info.csv: Puzzle categorization (into 8 skill classes).
Dataset Organization
The dataset consists of 101 folders (numbered from 1-101); each folder corresponds to one distinct puzzle (root puzzle). There are 2000 puzzle instances programmatically created for each root puzzle, numbered from 1-2000. Every root puzzle index (in [1,101]) folder contains: (i) img/ and (ii) puzzle_<id>.csv, where <id> is the root puzzle index. The folder img/ is the location where the puzzle instance images are stored, and puzzle_<id>.csv holds the non-image part of a puzzle. Specifically, a row of puzzle_<id>.csv is the tuple <id, Question, image, A, B, C, D, E, Answer>, where id is the puzzle instance id (in [1,2000]), Question is the puzzle question associated with the instance, image is the name of the image (in the img/ folder) corresponding to this instance id, A, B, C, D, E are the five answer candidates, and Answer is the answer to the question.
At a Glance
The size of the unzipped dataset is ~12GB.
The dataset consists of 101 folders (numbered from 1-101); each folder corresponds to one distinct puzzle (root puzzle).
There are 2000 puzzle instances programmatically created for each root puzzle, numbered from 1-2000.
Every root puzzle index (in [1,101]) folder contains: (i) img/ and (ii) puzzle_<id>.csv.
The folder img/ is the location where the puzzle instance images are stored, and puzzle_<id>.csv contains the non-image part of a puzzle. Specifically, a row of puzzle_<id>.csv is the tuple <id, Question, image, A, B, C, D, E, Answer>, where id is the puzzle instance id (in [1,2000]), Question is the puzzle question associated with the instance, image is the name of the image (in the img/ folder) corresponding to this instance id, A, B, C, D, E are the five answer candidates, and Answer is the correct answer to the question.
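Given this layout, puzzle instances can be read with pandas roughly as follows. The paths and exact CSV naming are assumptions based on the description above:

```python
# Sketch: load instances of root puzzle 7 (paths/filenames assumed from the layout above).
from pathlib import Path
import pandas as pd

root = Path("SMART101-Data/7")              # one root-puzzle folder
df = pd.read_csv(root / "puzzle_7.csv")     # columns: id, Question, image, A..E, Answer

row = df.iloc[0]
image_path = root / "img" / row["image"]    # the rendered puzzle picture
print(row["Question"])
print("Candidates:", [row[c] for c in "ABCDE"], "->", row["Answer"])
```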
Other Details: In our paper Are Deep Neural Networks SMARTer than Second Graders?, we provide four different dataset splits for evaluation: (i) Instance Split (IS), (ii) Answer Split (AS), (iii) Puzzle Split (PS), and (iv) Few-shot Split (FS). Below, we provide the details of each split to make fair comparisons to the results reported in our paper.
Puzzle Split (PS)
We use the following root puzzle ids for the Train and Test sets:

Test: {94, 95, 96, 97, 98, 99, 101, 61, 62, 65, 66, 67, 69, 70, 71, 72, 73, 74, 75, 76, 77}
Train: {1, 2, ..., 101} \ Test

Evaluation is done on all the Test puzzles and their accuracies are averaged. For the Test puzzles, we use instance indices 1701-2000 in the evaluation.
Few-shot Split (FS)
We randomly select k instances from the Test sets (those used in the PS split above) for training in the FS split (e.g., k=100). These k few-shot samples are taken from instance indices 1-1600 of the respective puzzles, and evaluation is conducted on all instance ids from 1701-2000.
Instance Split (IS)
We split the instances under every root puzzle as: Train = 1-1600, Val = 1601-1700, Test = 1701-2000. We train the neural network models using the Train split instances from all the root puzzles together and evaluate on the Test split of all puzzles.
Answer Split (AS)
We find the median answer value among all the 2000 instances for every root puzzle and use only the respective instances with that median answer value as the Test set for evaluation (this set is excluded from the training of the neural networks).
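A sketch of this median-answer selection for a single root puzzle, assuming the CSV layout described earlier and numeric answer values:

```python
# Sketch: Answer Split (AS) test-set selection for one root puzzle.
import pandas as pd

df = pd.read_csv("SMART101-Data/7/puzzle_7.csv")  # assumed path, as before

median_answer = df["Answer"].median()             # median over all 2000 instances
as_test = df[df["Answer"] == median_answer]       # instances held out for evaluation
as_train = df[df["Answer"] != median_answer]      # remainder available for training
print(len(as_test), "instances in the AS test set")
```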
Puzzle Categorization
Please see puzzle_type_info.csv for details on the categorization of the puzzles into eight classes, namely (i) counting, (ii) logic, (iii) measure, (iv) spatial, (v) arithmetic, (vi) algebra, (vii) pattern finding, and (viii) path tracing.
Other Resources
PyTorch code for using the dataset to train deep neural networks is available here.
Contact: Anoop Cherian (cherian@merl.com), Kuan-Chuan Peng (kpeng@merl.com), or Suhas Lohit (slohit@merl.com)
Citation: If you use the SMART-101 dataset in your research, please cite our paper:
@article{cherian2022deep,
  title={Are Deep Neural Networks SMARTer than Second Graders?},
  author={Cherian, Anoop and Peng, Kuan-Chuan and Lohit, Suhas and Smith, Kevin and Tenenbaum, Joshua B},
  journal={arXiv preprint arXiv:2212.09993},
  year={2022}
}
Copyright and Licenses
The SMART-101 dataset is released under CC-BY-SA-4.0.
Created by Mitsubishi Electric Research Laboratories (MERL), 2022-2023
SPDX-License-Identifier: CC-BY-SA-4.0
Dataset Card for Dataset Name
WORKOUT PLANNER DATASET
Dataset Details
Dataset Description
THIS IS A DATASET CREATED BY SELECTIVELY CHOOSING AND MERGING MULTIPLE DATASETS FROM VARIOUS SOURCES, INCLUDING OTHER DATASETS AND GENERATED DATASETS. FEEL FREE TO USE THESE ANYWHERE, AND MAKE SURE TO CREDIT THE APPROPRIATE DATA SOURCES WHEREVER NECESSARY!! 😀
Curated by: [Navaneeth. K]
Dataset Sources
https://huggingface.co/datasets https://www.chatgpt.com… See the full description on the dataset page: https://huggingface.co/datasets/navaneeth005/workoutplanner.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A survey conducted in Pakistan to determine the perceptions of healthcare professionals on the use of ChatGPT in clinical decision-making. The survey was conducted online, between March and April 2023, through Google Forms. The target population was any healthcare professional practicing in Pakistan, including doctors, paramedic staff, and allied health professionals (physiotherapists, occupational and speech therapists, and nurses) of any age group, who was familiar with ChatGPT and had used it in their daily practice of clinical decision-making. Undergraduate students who practice clinically for their learning were excluded due to their amateur skills.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes three distinct subsets of text:
- Open Access Academic Articles: a collection of 100 open-access articles from various academic journals focused on mental health and psychiatry, published between 2016-2018. The articles are selected from reputable journals including JAMA, The Lancet Psychiatry, WPJ, and AM J Psy.
- ChatGPT-Generated Texts: discussion-section samples generated by ChatGPT (GPT-4 model, version as of August 3, 2023, OpenAI) that are designed to imitate the style and content of academic articles in the field of mental health and psychiatry.
- Claude-Generated Texts: discussion-section samples generated by Claude (Version 2, Anthropic) with the aim of imitating academic articles in the same field.

Additionally, the dataset contains the results of tests performed using ZeroGPT and Originality.AI to evaluate the AI texts versus the academic articles for the percentage of text identified as AI-generated. Please cite this dataset if you make use of it in your research.