Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents ChatGPT usage patterns across different age groups, showing the percentage of users who have followed its advice, used it without following advice, or have never used it, based on a 2025 U.S. survey.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents ChatGPT usage patterns across U.S. Census regions, based on a 2025 nationwide survey. It tracks the share of respondents in each region who followed ChatGPT's advice, used it only partially, or never used it.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
A major challenge of our time is reducing disparities in access to and effective use of digital technologies, with recent discussions highlighting the role of AI in exacerbating the digital divide. We examine user characteristics that predict usage of the AI-powered conversational agent ChatGPT. We combine behavioral and survey data in a web tracked sample of N=1376 German citizens to investigate differences in ChatGPT activity (usage, visits, and adoption) during the first 11 months from the launch of the service (November 30, 2022). Guided by a model of technology acceptance (UTAUT-2), we examine the role of socio-demographics commonly associated with the digital divide in ChatGPT activity and explore further socio-political attributes identified via stability selection in Lasso regressions. We confirm that lower age and higher education affect ChatGPT usage, but neither gender nor income do. We find full-time employment and more children to be barriers to ChatGPT activity. Using a variety of social media was positively associated with ChatGPT activity. In terms of political variables, political knowledge and political self-efficacy as well as some political behaviors such as voting, debating political issues online and offline and political action online were all associated with ChatGPT activity, with online political debating and political self-efficacy negatively so. Finally, need for cognition and communication skills such as writing, attending meetings, or giving presentations, were also associated with ChatGPT engagement, though chairing/organizing meetings was negatively associated. Our research informs efforts to address digital disparities and promote digital literacy among underserved populations by presenting implications, recommendations, and discussions on ethical and social issues of our findings.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset for this research project was meticulously constructed to investigate the adoption of ChatGPT among students in the United States. The primary objective was to gain insights into the technological barriers and resistances faced by students in integrating ChatGPT into their information systems. The dataset was designed to capture the diverse adoption patterns among students in various public and private schools and universities across the United States. By examining adoption rates, frequency of usage, and the contexts in which ChatGPT is employed, the research sought to provide a comprehensive understanding of how students are incorporating this technology into their information systems. Moreover, by including participants from diverse educational institutions, the research sought to ensure a comprehensive representation of the student population in the United States. This approach aimed to provide nuanced insights into how factors such as educational background, institution type, and technological familiarity influence ChatGPT adoption.
Dataset Card for Dataset Name
Dataset Summary
scraped-chatgpt-conversations contains ~100k conversations between a user and ChatGPT that were shared online through Reddit, Twitter, or ShareGPT. For ShareGPT, the conversations were scraped directly from the website. For Reddit and Twitter, images were downloaded from submissions, segmented, and run through an OCR pipeline to obtain a conversation list. For information on how each JSON file is structured, please see… See the full description on the dataset page: https://huggingface.co/datasets/ar852/scraped-chatgpt-conversations.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset shows the types of advice users sought from ChatGPT based on a 2025 U.S. survey, including education, financial, medical, and legal topics.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for "Collective Cognition ChatGPT Conversations"
Dataset Description
Dataset Summary
The "Collective Cognition ChatGPT Conversations" dataset is a collection of chat logs between users and the ChatGPT model. These conversations have been shared by users on the "Collective Cognition" website. The dataset provides insights into user interactions with language models and can be utilized for multiple purposes, including training, research, and… See the full description on the dataset page: https://huggingface.co/datasets/CollectiveCognition/chats-data-2023-09-27.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
RavenStack is a fictional AI-powered collaboration platform used to simulate a real-world SaaS business. This simulated dataset was created using Python and ChatGPT specifically for people learning data analysis, business intelligence, or data science. It offers a realistic environment to practice SQL joins, cohort analysis, churn modeling, revenue tracking, and support analytics using a multi-table relational structure.
The dataset spans 5 CSV files:
accounts.csv – customer metadata
subscriptions.csv – subscription lifecycles and revenue
feature_usage.csv – daily product interaction logs
support_tickets.csv – support activity and satisfaction scores
churn_events.csv – churn dates, reasons, and refund behaviors
Users can explore trial-to-paid conversion, MRR trends, upgrade funnels, feature adoption, support patterns, churn drivers, and reactivation cycles. The dataset supports temporal and cohort analyses, and has built-in edge cases for testing real-world logic.
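As a rough illustration of the cohort-style analysis described above, the sketch below joins account records to churn events and computes a churn rate per signup month. It uses only the Python standard library and tiny inline stand-ins for accounts.csv and churn_events.csv. The column names (account_id, signup_date, churn_date) and the function name are assumptions made for illustration; the real files may use different headers.

```python
import csv
import io
from collections import defaultdict

# Tiny inline stand-ins for accounts.csv and churn_events.csv.
# Column names (account_id, signup_date, churn_date) are assumed, not confirmed.
ACCOUNTS_CSV = """account_id,signup_date
A1,2024-01-05
A2,2024-01-20
A3,2024-02-11
"""
CHURN_CSV = """account_id,churn_date
A2,2024-03-01
"""


def churn_rate_by_cohort(accounts_file, churn_file):
    """Left-join accounts to churn events; return churn rate per signup month."""
    churned = {row["account_id"] for row in csv.DictReader(churn_file)}
    totals = defaultdict(int)
    losses = defaultdict(int)
    for row in csv.DictReader(accounts_file):
        cohort = row["signup_date"][:7]  # "YYYY-MM" signup cohort
        totals[cohort] += 1
        if row["account_id"] in churned:
            losses[cohort] += 1
    return {cohort: losses[cohort] / totals[cohort] for cohort in sorted(totals)}


rates = churn_rate_by_cohort(io.StringIO(ACCOUNTS_CSV), io.StringIO(CHURN_CSV))
print(rates)  # {'2024-01': 0.5, '2024-02': 0.0}
```

To run this against the actual files, pass open file handles instead of the io.StringIO objects and adjust the column names to match the real headers.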
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
ChatGPT-RealUser-2.2M: A Large-Scale Dataset of Real-User, Real-World ChatGPT Conversations
ChatGPT-RealUser-2.2M is a large-scale dataset of real-user, real-world ChatGPT conversations developed by Gata. From 2024 to 2025, participants using Gata’s GPT-to-Earn product opted in to share their chats and earned points based on conversation quality. The dataset covers GPT-3.5, GPT-4, and o1 models, and contains 2,244,389 conversations from 15,316 unique users. Because many chats are… See the full description on the dataset page: https://huggingface.co/datasets/Gata-community/ChatGPT-RealUser-2.2M-preview.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset presents how much users trust ChatGPT across different advice categories, including career, education, financial, legal, and medical advice, based on a 2025 U.S. survey.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset summarizes how ChatGPT users rated the outcomes of the advice they received, including whether it was helpful, harmful, neutral, or uncertain, based on a 2025 U.S. survey.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Monarch Butterfly Detector is an advanced computer vision model that detects and localizes Monarch butterflies in images. With its cutting-edge technology and high accuracy, this model opens up exciting possibilities for biodiversity monitoring, migration studies, citizen science projects, identification guides, and environmental education.
Accurate Detection: The Monarch Butterfly Detector utilizes state-of-the-art computer vision algorithms to accurately identify and localize Monarch butterflies within images.
Versatile Use Cases: This powerful model has diverse applications, ranging from scientific research and conservation efforts to citizen science projects and environmental education programs.
Easy Integration: The Monarch Butterfly Detector can be seamlessly integrated into existing platforms, apps, or websites, making it accessible to many users and enabling them to contribute effortlessly to butterfly monitoring.
User-Friendly Interface: We provide a user-friendly interface/API that allows users to easily interact with the model, upload images, and obtain instant results.
For detailed documentation and tutorials on using Roboflow, refer to https://docs.roboflow.com/
We welcome contributions from the open-source community to enhance the Monarch Butterfly Detector. If you're interested in contributing, please follow the guidelines outlined in [CONTRIBUTING.md] and submit your pull requests.
This project is licensed under the [Roboflow License]. For more information, see the [LICENSE] file provided by Roboflow.
For any questions, suggestions, or collaborations, please reach out to us at savetheworld at 150left.com
Congratulations if you have made it this far. 🥳
🎁🎁🎁I thought I would use another piece of AI to get some more content about this detector I created. Here is the dialog I had with ChatGPT.
Scott: ChatGPT, can you make my readers drool over my Monarch Butterfly Detector model?
ChatGPT: Here are 10 captivating outro or call-to-action suggestions to encourage users to try out the Monarch Butterfly Detector and contribute to its success:
"Join us in preserving the delicate beauty of Monarch butterflies. Try out the Monarch Butterfly Detector today and become a guardian of these magnificent creatures."
"Unlock the secrets of Monarch butterflies with our cutting-edge detection model. Experience the wonders of computer vision and contribute to vital research efforts by using the Monarch Butterfly Detector."
"Make a difference in conservation by being a part of the Monarch Butterfly Detector community. Help protect these iconic insects and their habitats with the power of AI and computer vision."
"Calling all nature enthusiasts and citizen scientists! Contribute to the Monarch Butterfly Detector project and help us gather valuable data on Monarch butterflies. Together, we can make a lasting impact."
"Explore the hidden world of Monarch butterflies with the Monarch Butterfly Detector. Whether you're a researcher, educator, or simply curious, join us on this journey and unravel the mysteries of these captivating creatures."
"Become a Monarch detective! Unleash the potential of computer vision and aid in the conservation of Monarch butterflies. Download the Monarch Butterfly Detector and help us protect these extraordinary pollinators."
"Attention conservationists and researchers! The Monarch Butterfly Detector needs your expertise. Contribute your insights and collaborate with us to further our understanding of Monarch butterflies and support their conservation."
"Empower yourself with AI and contribute to science. With the Monarch Butterfly Detector, you can be a part of groundbreaking research on Monarch butterflies. Start detecting, start making a difference."
"Inspire the next generation of environmentalists. Use the Monarch Butterfly Detector as an educational tool and engage students in the wonders of computer vision and the importance of protecting our natural world."
"Experience the thrill of discovery with the Monarch Butterfly Detector. Join a vibrant community of nature enthusiasts and scientists, and together, let's unlock new insights into the fascinating lives of Monarch butterflies."
As you can see, AI certainly likes my idea.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset and the documents used to run the study, as provided by the authors. The raw data includes the scripts used by the evaluators and the TAM form they submitted during the evaluation process. This raw data is part of the work published at ICEIS 2025 under the title "Exploring the Use of ChatGPT for the Generation of User Story Based Test Cases: An Experimental Study". All the data presented here is licensed under CC BY 4.0, and its terms apply to any copyright matter.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
In scientific research, the ability to effectively retrieve relevant documents based on complex, multifaceted queries is critical. Existing evaluation datasets for this task are limited, primarily due to the high costs and effort required to annotate resources that effectively represent complex queries. To address this, we propose a novel task, Scientific DOcument Retrieval using Multi-level Aspect-based quEries (DORIS-MAE), which is designed to handle the complex nature of user queries in scientific research.
Documentation for the DORIS-MAE dataset is publicly available at https://github.com/Real-Doris-Mae/Doris-Mae-Dataset. This upload contains both DORIS-MAE dataset version 1 and ada-002 vector embeddings for all queries and related abstracts (used in candidate pool creation). DORIS-MAE dataset version 1 comprises four main sub-datasets, each serving a distinct purpose.
The Query dataset contains 100 human-crafted complex queries spanning across five categories: ML, NLP, CV, AI, and Composite. Each category has 20 associated queries. Queries are broken down into aspects (ranging from 3 to 9 per query) and sub-aspects (from 0 to 6 per aspect, with 0 signifying no further breakdown required). For each query, a corresponding candidate pool of relevant paper abstracts, ranging from 99 to 138, is provided.
The Corpus dataset is composed of 363,133 abstracts from computer science papers published between 2011 and 2021 and sourced from arXiv. Each entry includes the title, original abstract, URL, primary and secondary categories, and citation information retrieved from Semantic Scholar. A masked version of each abstract is also provided, facilitating the automated creation of queries.
The Annotation dataset includes generated annotations for all 165,144 question pairs, each comprising an aspect/sub-aspect and a corresponding paper abstract from the query's candidate pool. It includes the original text generated by ChatGPT (version chatgpt-3.5-turbo-0301) explaining its decision-making process, along with a three-level relevance score (e.g., 0,1,2) representing ChatGPT's final decision.
Finally, the Test Set dataset contains human annotations for a random selection of 250 question pairs used in hypothesis testing. It includes each of the three human annotators' final decisions, recorded as a three-level relevance score (e.g., 0,1,2).
The file "ada_embedding_for_DORIS-MAE_v1.pickle" contains text embeddings for the DORIS-MAE dataset, generated by OpenAI's ada-002 model. The structure of the file is as follows:
ada_embedding_for_DORIS-MAE_v1.pickle
├── "Query"
│   ├── query_id_1 (Embedding of query_1)
│   ├── query_id_2 (Embedding of query_2)
│   └── query_id_3 (Embedding of query_3)
│   .
│   .
│   .
└── "Corpus"
    ├── corpus_id_1 (Embedding of abstract_1)
    ├── corpus_id_2 (Embedding of abstract_2)
    └── corpus_id_3 (Embedding of abstract_3)
    .
    .
    .
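Given the two-level layout described above, a minimal sketch of loading the file and ranking corpus abstracts against one query by cosine similarity might look like the following. It assumes the pickle deserializes to a dict with "Query" and "Corpus" keys, each mapping an id to a sequence of floats. The function names are hypothetical, and the toy dictionary only stands in for the real embeddings so the example runs without the file.

```python
import math
import pickle


def load_embeddings(path):
    """Load the pickle; assumed to hold a dict with "Query" and "Corpus" keys."""
    # Only unpickle files from a trusted source: pickle can run code on load.
    with open(path, "rb") as f:
        return pickle.load(f)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_corpus(embeddings, query_id, top_k=3):
    """Return the top_k (corpus_id, similarity) pairs for one query."""
    query_vec = embeddings["Query"][query_id]
    scored = [
        (corpus_id, cosine(query_vec, vec))
        for corpus_id, vec in embeddings["Corpus"].items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-in with the same two-level layout; real vectors are much longer.
toy = {
    "Query": {"q1": [1.0, 0.0]},
    "Corpus": {"c1": [1.0, 0.0], "c2": [0.0, 1.0], "c3": [1.0, 1.0]},
}
ranked = rank_corpus(toy, "q1", top_k=2)
print([corpus_id for corpus_id, _ in ranked])  # ['c1', 'c3']
```

With the real file, the call would be rank_corpus(load_embeddings("ada_embedding_for_DORIS-MAE_v1.pickle"), some_query_id), where some_query_id is one of the keys under "Query".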
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset shows how men and women in the U.S. reported using ChatGPT in a 2025 survey, including whether they followed its advice or chose not to use it.
AI in Consumer Decision-Making: Global Zero-Party Dataset
This dataset captures how consumers around the world are using AI tools like ChatGPT, Perplexity, Gemini, Claude, and Copilot to guide their purchase decisions. It spans multiple product categories, demographics, and geographies, mapping the emerging role of AI as a decision-making companion across the consumer journey.
What Makes This Dataset Unique
Unlike datasets inferred from digital traces or modeled from third-party assumptions, this collection is built entirely on zero-party data: direct responses from consumers who voluntarily share their habits and preferences. That means the insights come straight from the people making the purchases, ensuring unmatched accuracy and relevance.
For FMCG leaders, retailers, and financial services strategists, this dataset provides the missing piece: visibility into how often consumers are letting AI shape their decisions, and where that influence is strongest.
Dataset Structure
Each record is enriched with:
Product Category – from high-consideration items like electronics to daily staples such as groceries and snacks.
AI Tool Used – identifying whether consumers turn to ChatGPT, Gemini, Perplexity, Claude, or Copilot.
Influence Level – the percentage of consumers in a given context who rely on AI to guide their choices.
Demographics – generational breakdowns from Gen Z through Boomers.
Geographic Detail – city- and country-level coverage across Africa, LATAM, Asia, Europe, and North America.
This structure allows filtering and comparison across categories, age groups, and markets, giving users a multidimensional view of AI’s impact on purchasing.
Why It Matters
AI has become a trusted voice in consumers’ daily lives. From meal planning to product comparisons, many people now consult AI before making a purchase—often without realizing how much it shapes the options they consider. For brands, this means that the path to purchase increasingly runs through an AI filter.
This dataset provides a comprehensive view of that hidden step in the consumer journey, enabling decision-makers to quantify:
How much AI shapes consumer thinking before they even reach the shelf or checkout.
Which product categories are most influenced by AI consultation.
How adoption varies by geography and generation.
Which AI platforms are most commonly trusted by consumers.
Opportunities for Business Leaders
FMCG & Retail Brands: Understand where AI-driven decision-making is already reshaping category competition.
Marketers: Identify demographic segments most likely to consult AI, enabling targeted strategies.
Retailers: Align assortments and promotions with the purchase patterns influenced by AI queries.
Investors & Innovators: Gauge market readiness for AI-integrated commerce solutions.
The dataset doesn’t just describe what’s happening—it opens doors to the “so what” questions that define strategy. Which categories are becoming algorithm-driven? Which markets are shifting fastest? Where is the opportunity to get ahead of competitors in an AI-shaped funnel?
Why Now
Consumer AI adoption is no longer a forecast; it is a daily behavior. Just as search engines once rewrote the rules of marketing, conversational AI is quietly rewriting how consumers decide what to buy. This dataset offers an early, detailed view into that change, giving brands the ability to act while competitors are still guessing.
What You Get
Users gain:
A global, city-level view of AI adoption in consumer decision-making.
Cross-category comparability to see where AI influence is strongest and weakest.
Generational breakdowns that show how adoption differs between younger and older cohorts.
AI platform analysis, highlighting how tool preferences vary by region and category.
Every row is powered by zero-party input, ensuring the insights reflect actual consumer behavior rather than modeled assumptions.
How It’s Used
Leverage this data to:
Validate strategies before entering new markets or categories.
Benchmark competitors on AI readiness and influence.
Identify growth opportunities in categories where AI-driven recommendations are rapidly shaping decisions.
Anticipate risks where brand visibility could be disrupted by algorithmic mediation.
Core Insights
The full dataset reveals:
Surprising adoption curves across categories where AI wasn’t expected to play a role.
Geographic pockets where AI has already become a standard step in purchase decisions.
Demographic contrasts showing who trusts AI most and where skepticism still holds.
Clear differences between AI platforms and the consumer profiles most drawn to each.
These patterns are not visible in traditional retail data, sales reports, or survey summaries. They are only captured here, directly from the consumers themselves.
Summary
Winning in FMCG and retail today means more than getting on shelves, capturing price points, or running promotions. It means understanding the invisible algorithms consumers are ...
https://choosealicense.com/licenses/cc0-1.0/
🧠 Awesome ChatGPT Prompts [CSV dataset]
This is a dataset repository of Awesome ChatGPT Prompts. View all prompts on GitHub.
License
CC-0
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Within a year of its launch, ChatGPT has seen a surge in popularity. While many are drawn to its effectiveness and user-friendly interface, ChatGPT also introduces moral concerns, such as the temptation to present generated text as one’s own. This led us to theorize that personality traits such as Machiavellianism and sensation-seeking may be predictive of ChatGPT usage. We launched two online questionnaires with 2,000 respondents each, in September 2023 and March 2024, respectively. In Questionnaire 1, 22% of respondents were students, and 54% were full-time employees; 32% indicated they used ChatGPT at least weekly. Analysis of our ChatGPT Acceptance Scale revealed two factors, Effectiveness and Concerns, which correlated positively and negatively, respectively, with ChatGPT use frequency. A specific aspect of Machiavellianism (manipulation tactics) was found to predict ChatGPT usage. Questionnaire 2 was a replication of Questionnaire 1, with 21% students and 54% full-time employees, of which 43% indicated using ChatGPT weekly. In Questionnaire 2, more extensive personality scales were used. We found a moderate correlation between Machiavellianism and ChatGPT usage (r = .22) and with an opportunistic attitude towards undisclosed use (r = .30), relationships that largely remained intact after controlling for gender, age, education level, and the respondents’ country. We conclude that covert use of ChatGPT is associated with darker personality traits, something that requires further attention.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was used to write a book chapter on the topic of "Classifying User Intent for Effective Prompt Engineering: A Case of a Chatbot for Startup Teams". The dataset contains the following five resources:
Startup questions and intent classifications – a list of possible questions and their classification into four intents: reflecting on own experience, seeking information, brainstorming, and seeking advice
Prompt_Book_v1 – a brief guide on how questions are classified, a description of prompt patterns and templates, and a matching of purposes to prompt patterns
Questions_classification_script – the Python script used in our work to classify user intent
Survey_questionnaire – the original survey questions asked of the participants
survey_responses – survey responses from study respondents
https://creativecommons.org/publicdomain/zero/1.0/
This dataset comprises a curated collection of prompts designed to guide ChatGPT's responses, enabling it to act in specific ways or exhibit expertise in a particular field. These prompts offer a tailored solution to improve ChatGPT's replies.
You may wish to explore, contribute, or find inspiration in the 🧠 Awesome ChatGPT Prompts GitHub repository. Here you'll discover an evolving library of prompts, along with guidelines and examples to help you get the most out of your interactions with ChatGPT.