10 datasets found

ChatGPT Revenue and Usage Statistics (2025)
businessofapps.com
Updated Feb 9, 2023
Business of Apps (2023). ChatGPT Revenue and Usage Statistics (2025) [Dataset]. https://www.businessofapps.com/data/chatgpt-statistics/
Dataset updated
Feb 9, 2023
Dataset authored and provided by
Business of Apps
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
ChatGPT has taken the world by storm, setting a record as the fastest app to reach 100 million users, which it hit in two months. The implications of this tool are far-reaching, universities...
Top user concerns about ChatGPT SEA 2023
statista.com
Updated Jul 8, 2025
Statista (2025). Top user concerns about ChatGPT SEA 2023 [Dataset]. https://www.statista.com/statistics/1382944/sea-top-user-concerns-about-chat-gpt/
Dataset updated
Jul 8, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2023
Area covered
Asia
Description
In a survey conducted across **** Southeast Asian countries in February 2023, almost half of the respondents selected collection of personal data as one of the concerns they had regarding the usage of chatbots like ChatGPT. In contrast, ethical issues related to data privacy and intellectual property were a concern for ** percent of the respondents.
Top user concerns about ChatGPT SEA 2023, by country
statista.com
Updated Jun 24, 2025
Statista (2025). Top user concerns about ChatGPT SEA 2023, by country [Dataset]. https://www.statista.com/statistics/1383112/sea-top-user-concerns-about-chat-gpt-by-country/
Dataset updated
Jun 24, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2023
Area covered
Asia
Description
In a survey conducted across four Southeast Asian countries in February 2023, ** percent of the respondents in Singapore selected the collection of personal data as one of the concerns they had regarding the usage of chatbots like ChatGPT. In contrast, this was an issue for ** percent of respondents in Indonesia.
Global employees attempting to use ChatGPT at work 2023
statista.com
Updated Jun 23, 2025
Statista (2025). Global employees attempting to use ChatGPT at work 2023 [Dataset]. https://www.statista.com/statistics/1378709/global-employees-chatgpt-se/
Dataset updated
Jun 23, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 2023 - Jun 2023
Area covered
Worldwide
Description
As of June 2023, a reported **** percent of employees at companies worldwide had tried using ChatGPT in the workplace at least once, and *** percent had put confidential corporate data into the AI-powered tool.
toxic-chat
huggingface.co
Updated Jan 25, 2024
Large Model Systems Organization (2024). toxic-chat [Dataset]. https://huggingface.co/datasets/lmsys/toxic-chat
Available in Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
Dataset updated
Jan 25, 2024
Dataset authored and provided by
Large Model Systems Organization
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Update

[01/31/2024] We update the OpenAI Moderation API results for ToxicChat (0124) based on their updated moderation model on Jan 25, 2024.
[01/28/2024] We release an official T5-Large model trained on ToxicChat (toxicchat0124). Go and check it for your baseline comparison!
[01/19/2024] We have a new version of ToxicChat (toxicchat0124)!

Content

This dataset contains toxicity annotations on 10K user prompts collected from the Vicuna online demo. We utilize a human-AI… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/toxic-chat.
WildChat-1M
huggingface.co
Updated Jul 22, 2024
+ more versions
Ai2 (2024). WildChat-1M [Dataset]. https://huggingface.co/datasets/allenai/WildChat-1M
Available in Croissant format.
Dataset updated
Jul 22, 2024
Dataset provided by
Allen Institute for AI (http://allenai.org/)
Authors
Ai2
License
https://choosealicense.com/licenses/odc-by/
Description
Dataset Card for WildChat

Dataset Description

Paper: https://arxiv.org/abs/2405.01470

Interactive Search Tool: https://wildvisualizer.com (paper)

License: ODC-BY

Language(s) (NLP): multi-lingual

Point of Contact: Yuntian Deng

Dataset Summary

WildChat is a collection of 1 million conversations between human users and ChatGPT, alongside demographic data, including state, country, hashed IP addresses, and request headers. We collected WildChat by… See the full description on the dataset page: https://huggingface.co/datasets/allenai/WildChat-1M.
GPTFuzzer Dataset
paperswithcode.com
Updated Jan 21, 2025
Jiahao Yu; Xingwei Lin; Zheng Yu; Xinyu Xing (2025). GPTFuzzer Dataset [Dataset]. https://paperswithcode.com/dataset/gptfuzzer
Dataset updated
Jan 21, 2025
Authors
Jiahao Yu; Xingwei Lin; Zheng Yu; Xinyu Xing
Description
GPTFuzzer explores red teaming of large language models (LLMs) using auto-generated jailbreak prompts.

Project Overview: GPTFuzzer aims to assess the security and robustness of LLMs by crafting prompts that can potentially lead to harmful or unintended behavior.

The project focuses on GPT-3 and similar models.

Datasets:

The datasets used in GPTFuzzer include:

Harmful Questions: Sampled from public datasets like llm-jailbreak-study and hh-rlhf.
Human-Written Templates: Collected from llm-jailbreak-study.
Responses: Gathered by querying models like Vicuna-7B, ChatGPT, and Llama-2-7B-chat.

Models:

The judgment model is a finetuned RoBERTa-large model. The training code and data are available in the repository.

During fuzzing experiments, the model is automatically downloaded and cached.

Updates:

The project has received recognition and awards at conferences like Geekcon 2023. The team continues to improve the codebase and aims to build a general black-box fuzzing framework for LLMs.

Source: Conversation with Bing, 3/17/2024.
(1) sherdencooper/GPTFuzz: Official repo for GPTFUZZER - GitHub. https://github.com/sherdencooper/GPTFuzz
(2) GPTFUZZER : Red Teaming Large Language Models with Auto ... - GitHub. https://github.com/sherdencooper/GPTFuzz/blob/master/README.md
(3) GPTFUZZER : Red Teaming Large Language Models with Auto ... - GitHub. https://github.com/CriticalPulsar/GPTFuzz/blob/master/README.md
GPT base SPSS.sav
figshare.com
bin
Updated Oct 24, 2023
Ery Ko; Gabriel Salerni; Carlos Alonso; Esteban Covian; Ana Ochoa; Ana Torre; Barbara Hernandez; Nuria Biblioni; Luis Mazzuoccolo (2023). GPT base SPSS.sav [Dataset]. http://doi.org/10.6084/m9.figshare.24425473.v1
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.24425473.v1
Dataset updated
Oct 24, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Ery Ko; Gabriel Salerni; Carlos Alonso; Esteban Covian; Ana Ochoa; Ana Torre; Barbara Hernandez; Nuria Biblioni; Luis Mazzuoccolo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The perception and usage patterns of ChatGPT among Argentine dermatologists remain unclear. To determine this, we conducted an email survey of members of the Argentine Society of Dermatology in July 2023. While 83.7% of Argentine dermatologists are not acquainted with ChatGPT, a significant 65.4% have never used it. Argentine dermatologists seem to adopt a cautious and intermediate stance towards the tool, likely due to moderate reliability and utility perceptions (Likert scale 3/7). ChatGPT users are significantly younger (p=0.042), have a higher proportion of “early adopters” (p=0.004), have less technology anxiety (p
Bitext-customer-support-llm-chatbot-training-dataset
huggingface.co
opendatalab.com
Updated Jul 16, 2024
+ more versions
Bitext (2024). Bitext-customer-support-llm-chatbot-training-dataset [Dataset]. https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset
Available in Croissant format.
Dataset updated
Jul 16, 2024
Dataset authored and provided by
Bitext
License
https://choosealicense.com/licenses/cdla-sharing-1.0/
Description
Bitext - Customer Service Tagged Training Dataset for LLM-based Virtual Assistants

Overview

This hybrid synthetic dataset is designed to be used to fine-tune Large Language Models such as GPT, Mistral and OpenELM, and has been generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools. The goal is to demonstrate how Verticalization/Domain Adaptation for the Customer Support sector can be easily achieved using our two-step approach to LLM… See the full description on the dataset page: https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset.
Capybara
huggingface.co
Updated Jul 19, 2024
Alan Tseng (2024). Capybara [Dataset]. https://huggingface.co/datasets/agentlans/Capybara
Available in Croissant format.
Dataset updated
Jul 19, 2024
Authors
Alan Tseng
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Capybara

This is the Capybara multi-turn chat dataset in ShareGPT format for LLaMA-Factory. Each line represents a conversation between a human user and GPT AI. The first message in the conversation is by a human user. Language is English only.

Example line truncated and pretty-printed: { "conversations": [ { "from": "human", "value": "Using the given plot points, write a short blurb for a science fiction novel. Earth endangered | alien invaders | secret… See the full description on the dataset page: https://huggingface.co/datasets/agentlans/Capybara.
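The Capybara card above describes one JSON conversation per line, opened by a human message. A minimal sketch of reading one such line might look like the following. The keys `conversations`, `from`, and `value` come from the truncated example above. The `"gpt"` speaker label, the sample text, and the helper name `parse_sharegpt_line` are assumptions made for illustration and are not taken from the dataset.

```python
import json

# Hypothetical sample line. The real dataset lines are longer; the "gpt"
# speaker label and both message texts are assumed for this sketch.
sample_line = json.dumps({
    "conversations": [
        {"from": "human", "value": "Write a short blurb for a science fiction novel."},
        {"from": "gpt", "value": "When alien invaders threaten Earth, one secret may save it."},
    ]
})


def parse_sharegpt_line(line):
    """Parse one JSON line into a list of (speaker, text) pairs.

    Raises ValueError if the conversation is empty or does not start
    with a human message, as the dataset card says it should.
    """
    record = json.loads(line)
    turns = [(turn["from"], turn["value"]) for turn in record["conversations"]]
    if not turns or turns[0][0] != "human":
        raise ValueError("expected the first message to come from the human user")
    return turns


turns = parse_sharegpt_line(sample_line)
for speaker, text in turns:
    print(f"{speaker}: {text}")
```

For a full file, the same helper could be applied to each non-empty line in turn, so the whole dataset never has to be held in memory at once.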
ChatGPT Revenue and Usage Statistics (2025): 23 scholarly articles cite this dataset (per Google Scholar).