45 datasets found
  1. ChatGPT Statistics 2025

    • sambretzmann.com
    Updated Jun 12, 2025
    Cite
    (2025). ChatGPT Statistics 2025 [Dataset]. https://sambretzmann.com/chatgpt-statistics/
    Dataset updated
    Jun 12, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comprehensive ChatGPT statistics covering 800 million weekly users, $300 billion valuation, market share, demographics, and technical specifications for 2025.

  2. ChatGPT Usage by Age Group – Survey Data

    • expresslegalfunding.com
    html
    Updated May 2, 2025
    Cite
    Express Legal Funding (2025). ChatGPT Usage by Age Group – Survey Data [Dataset]. https://expresslegalfunding.com/chatgpt-study/
    Available download formats: html
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    Express Legal Funding
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    60+, 18–29, 30–44, 45–60
    Description

    This dataset presents ChatGPT usage patterns across different age groups, showing the percentage of users who have followed its advice, used it without following advice, or have never used it, based on a 2025 U.S. survey.

  3. ChatGPT User Satisfaction Ratings

    • opendatabay.com
    Updated Jul 3, 2025
    Cite
    Datasimple (2025). ChatGPT User Satisfaction Ratings [Dataset]. https://www.opendatabay.com/data/ai-ml/fd21bbf8-e5bf-4a34-93c2-57ae36ffbaf0
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Reviews & Ratings
    Description

    This dataset provides user reviews for ChatGPT, offering valuable qualitative feedback, satisfaction ratings, and submission dates. It captures a diverse array of user sentiments, from concise remarks to more detailed feedback. The ratings are provided on a scale of 1 to 5, indicating different levels of user satisfaction. The dataset spans several months, which allows for temporal analysis of sentiment trends, as each review includes a timestamp. This data is ideal for gaining insights into user characteristics and for improving application features and services.

    Columns

    • Review Id: A unique identifier for each individual review. This is formatted as a String, typically in a UUID structure.
    • Review: The actual text of the user's feedback, offering qualitative insights into their experience with the application. This is a String data type.
    • Ratings: User-submitted numerical ratings, ranging from 1 (lowest satisfaction) to 5 (highest satisfaction), indicating their level of contentment. This is an Integer data type.
    • Review Date: The timestamp when the review was originally submitted, recorded in MM/DD/YYYY HH:MM format, serving as a Date_Time data type.
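As a rough illustration of the column layout above, the following Python sketch parses rows with these four columns and averages the ratings per month. The sample rows are invented for illustration, and the header names are assumed to match the column names in the listing; the actual file may differ.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows; the real file's contents and exact headers may differ.
SAMPLE = """Review Id,Review,Ratings,Review Date
a1,Great app,5,07/25/2023 14:05
a2,Crashes often,2,07/30/2023 09:30
a3,Very helpful,4,08/02/2023 18:45
"""

# MM/DD/YYYY HH:MM, as described for the Review Date column.
DATE_FORMAT = "%m/%d/%Y %H:%M"


def monthly_mean_ratings(csv_text):
    """Return {(year, month): mean rating} for the given CSV text."""
    buckets = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        submitted = datetime.strptime(row["Review Date"], DATE_FORMAT)
        rating = int(row["Ratings"])
        if not 1 <= rating <= 5:
            raise ValueError(f"rating out of range: {rating}")
        buckets[(submitted.year, submitted.month)].append(rating)
    return {month: sum(vals) / len(vals) for month, vals in buckets.items()}


result = monthly_mean_ratings(SAMPLE)
```

Grouping by `(year, month)` is only one possible choice; a weekly or daily bucket would follow the same pattern.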

    Distribution

    The dataset is provided as a free resource. A sample file will be uploaded to the platform separately. Data quality is rated 5 out of 5, and the current version is 1.0. The dataset was listed on 08/06/2025, with 1 view and 0 downloads recorded so far. It contains approximately 193,154 unique reviews.

    Usage

    This dataset is particularly useful for various analytical applications, including:

    • Sentiment Analysis: Developing models to predict the emotional tone or sentiment conveyed in user reviews.
    • Customer Feedback Analysis: Extracting actionable insights that can inform and guide improvements to application features and services.
    • Review Classification: Building machine learning models to categorise user reviews, for instance, as positive or negative.
    • Data Visualisation: Creating visual representations of review patterns and trends.
    • Exploratory Data Analysis: Investigating the characteristics and underlying patterns within the review data.
    • Natural Language Processing (NLP): Applying NLP techniques to understand and process the textual feedback.
    • Text Mining: Discovering patterns and insights from the large collection of text reviews.
    • Time-Series Analysis: Examining how sentiment and ratings evolve over time based on review timestamps.

    Coverage

    This dataset comprises user reviews for ChatGPT collected from 25th July 2023 to 24th August 2024. The data collection is global, reflecting feedback from users worldwide.

    License

    CC0

    Who Can Use It

    This dataset is ideal for a range of users interested in understanding user feedback and sentiment, including:

    • Data Scientists and Machine Learning Engineers for building and training sentiment analysis and classification models.
    • Product Managers and App Developers to gain actionable insights for product improvement and feature development.
    • Market Researchers to understand user satisfaction and market perception of AI applications.
    • Academic Researchers studying human-computer interaction, natural language processing, or user behaviour.

    Dataset Name Suggestions

    • ChatGPT User Reviews
    • GPT User Review Sentiment Data
    • AI App User Feedback Dataset
    • ChatGPT User Satisfaction Ratings

    Attributes

    Original Data Source: ChatGPT Users Reviews

  4. How User Language Affects Conflict Fatality Estimates in ChatGPT, Query...

    • zenodo.org
    • data.niaid.nih.gov
    text/x-python, zip
    Updated Jul 26, 2023
    Cite
    Kazenwadel Daniel*; Steinert Christoph* (2023). How User Language Affects Conflict Fatality Estimates in ChatGPT, Query script and dataset [Dataset]. http://doi.org/10.5281/zenodo.8181226
    Available download formats: zip, text/x-python
    Dataset updated
    Jul 26, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Kazenwadel Daniel*; Steinert Christoph*
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    *Both authors contributed equally.

    Automated query script for language bias studies in GPT-3.5.

    Dataset for the paper "How User Language Affects Conflict Fatality Estimates in ChatGPT". A preprint is available on arXiv.

  5. 🤖 ChatGPT App Google Store Reviews

    • kaggle.com
    Updated Nov 17, 2023
    Cite
    BwandoWando (2023). 🤖 ChatGPT App Google Store Reviews [Dataset]. http://doi.org/10.34740/kaggle/ds/4017553
    Metadata format: Croissant, a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 17, 2023
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    BwandoWando
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Based on its Wikipedia page:

    ChatGPT (Chat Generative Pre-trained Transformer) is a large language model-based chatbot developed by OpenAI and launched on November 30, 2022, that enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. Successive prompts and replies, known as prompt engineering, are considered at each conversation stage as a context.

    These reviews were extracted from the app's Google Play Store listing.

    Usage

    This dataset should paint a good picture of the public's perception of the app over time. Using this dataset, we can do the following:

    1. Extract sentiments and trends.
    2. Identify which version of the app received the most positive feedback and which received the worst.
    3. Use topic modeling to identify the pain points of the application.

    (AND MANY MORE!)

    Note

    Images generated using Bing Image Generator

  6. ChatGPT Usage by U.S. Census Region – Survey Data

    • expresslegalfunding.com
    html
    Updated May 2, 2025
    Cite
    Express Legal Funding (2025). ChatGPT Usage by U.S. Census Region – Survey Data [Dataset]. https://expresslegalfunding.com/chatgpt-study/
    Available download formats: html
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    Express Legal Funding
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Pacific, Mountain, New England, South Atlantic, Middle Atlantic, East North Central, East South Central, West North Central, West South Central
    Description

    This dataset presents ChatGPT usage patterns across U.S. Census regions, based on a 2025 nationwide survey. For each region, it tracks the share of respondents who followed ChatGPT's advice, used it only partially, or never used it.

  7. ChatGPT-Gemini-Claude-Perplexity-Human-Evaluation-Multi-Aspects-Review-Dataset...

    • huggingface.co
    Updated Nov 12, 2024
    Cite
    DeepNLP (2024). ChatGPT-Gemini-Claude-Perplexity-Human-Evaluation-Multi-Aspects-Review-Dataset [Dataset]. https://huggingface.co/datasets/DeepNLP/ChatGPT-Gemini-Claude-Perplexity-Human-Evaluation-Multi-Aspects-Review-Dataset
    Metadata format: Croissant, a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 12, 2024
    Authors
    DeepNLP
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    ChatGPT Gemini Claude Perplexity Human Evaluation Multi Aspect Review Dataset

      Introduction
    

    Human evaluations and reviews with scalar scores of AI service responses are very useful in LLM Finetuning, Human Preference Alignment, Few-Shot Learning, Bad Case Shooting, etc., but are extremely difficult to collect. This dataset is collected from the DeepNLP AI Service User Review panel (http://www.deepnlp.org/store), which is an open review website for users to give reviews and upload… See the full description on the dataset page: https://huggingface.co/datasets/DeepNLP/ChatGPT-Gemini-Claude-Perplexity-Human-Evaluation-Multi-Aspects-Review-Dataset.

  8. Mobile ChatGPT User Feedback

    • opendatabay.com
    Updated Jul 5, 2025
    Cite
    Datasimple (2025). Mobile ChatGPT User Feedback [Dataset]. https://www.opendatabay.com/data/ai-ml/3ecb80ad-dd56-4f41-afa0-dd5fd29dad80
    Dataset updated
    Jul 5, 2025
    Dataset authored and provided by
    Datasimple
    Area covered
    Reviews & Ratings
    Description

    This dataset provides a collection of user reviews for the ChatGPT mobile application on iOS. It captures valuable user insights and sentiments, making it suitable for understanding customer satisfaction, evaluating app performance, and identifying emerging trends. The data was gathered by scraping ChatGPT reviews from the App Store.

    Columns

    • date: The date the review was posted.
    • title: The heading or title of the user's review.
    • review: The full text of the user's review.
    • rating: The star rating provided by the user.

    Distribution

    The dataset is typically provided in a CSV file format. It includes 2058 unique date values and 2257 unique review texts. The reviews span from 18th May 2023 to 25th July 2023. Review counts by period are as follows:

    • 18th May 2023 - 25th May 2023: 1,475 reviews
    • 25th May 2023 - 1st June 2023: 267 reviews
    • 1st June 2023 - 7th June 2023: 117 reviews
    • 7th June 2023 - 14th June 2023: 82 reviews
    • 14th June 2023 - 21st June 2023: 60 reviews
    • 21st June 2023 - 28th June 2023: 59 reviews
    • 28th June 2023 - 4th July 2023: 73 reviews
    • 4th July 2023 - 11th July 2023: 45 reviews
    • 11th July 2023 - 18th July 2023: 57 reviews
    • 18th July 2023 - 25th July 2023: 57 reviews

    Rating distribution is also available:

    • 1.00 - 1.40 stars: 495 reviews
    • 1.80 - 2.20 stars: 139 reviews
    • 3.00 - 3.40 stars: 220 reviews
    • 3.80 - 4.20 stars: 304 reviews
    • 4.60 - 5.00 stars: 1,134 reviews
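The counts listed above can be cross-checked with a few lines of arithmetic. The sketch below copies the figures from the listing and computes the totals and percentage shares. The variable names are illustrative only.

```python
# Counts copied from the listing above.
weekly_counts = [1475, 267, 117, 82, 60, 59, 73, 45, 57, 57]
rating_bins = {
    "1.00-1.40": 495,
    "1.80-2.20": 139,
    "3.00-3.40": 220,
    "3.80-4.20": 304,
    "4.60-5.00": 1134,
}

total_by_week = sum(weekly_counts)
total_by_rating = sum(rating_bins.values())

# Share of reviews per rating bin, as a percentage rounded to one decimal.
shares = {b: round(100 * n / total_by_rating, 1) for b, n in rating_bins.items()}

# Share of all reviews posted in the first listed period.
first_week_share = round(100 * weekly_counts[0] / total_by_week, 1)
```

Both breakdowns sum to 2,292 reviews, slightly more than the 2,257 unique review texts reported, which would be consistent with some review texts appearing more than once. About 64.4% of the reviews fall in the first listed period, and about 49.5% are in the top rating bin.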

    Usage

    This dataset is ideal for:

    • Sentiment analysis to gauge user emotions and opinions regarding the ChatGPT app.
    • Performance evaluation to identify factors contributing to high or low user ratings.
    • Pattern identification to uncover recurring themes and common issues in user feedback.

    Coverage

    The dataset covers reviews globally, spanning a time range from 18th May 2023 to 25th July 2023.

    License

    CC-BY-NC

    Who Can Use It

    • Data scientists and analysts for performing sentiment analysis and machine learning tasks on user-generated text.
    • App developers and product managers seeking to understand user satisfaction and improve app features based on direct feedback.
    • Market researchers for insights into user perceptions of AI applications and large language models.

    Dataset Name Suggestions

    • ChatGPT iOS App Reviews
    • Mobile ChatGPT User Feedback
    • App Store ChatGPT Review Data
    • iOS App User Sentiment for ChatGPT

    Attributes

    Original Data Source: ChatGPT App Reviews

  9. #ChatGPT 1000 Daily 🐦 Tweets

    • kaggle.com
    Updated May 14, 2023
    Cite
    Enric Domingo (2023). #ChatGPT 1000 Daily 🐦 Tweets [Dataset]. http://doi.org/10.34740/kaggle/dsv/5685262
    Metadata format: Croissant, a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 14, 2023
    Dataset provided by
    Kaggle
    Authors
    Enric Domingo
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    UPDATE: Because of the new Twitter API conditions introduced under Elon Musk, the Twitter (X) API is no longer free to use, and the hobby plan costs $100/month. As a result, my automated ETL notebook stopped adding new tweets to this dataset on May 13th, 2023.

    This dataset was updated every day with 1,000 new tweets containing any of the words "ChatGPT", "GPT3", or "GPT4", starting from the 3rd of April 2023. Each day's tweets were uploaded 24-72 hours later, so that the counts of likes, retweets, messages, and impressions had enough time to become meaningful. Tweets are in any language and are selected randomly from all hours of the day. Some basic filters are applied to try to discard sensitive tweets and spam.

    This dataset can be used for many different applications in data analysis and visualization, as well as NLP sentiment analysis techniques and more.

    Consider upvoting this Dataset and the scheduled ETL Notebook that fed new data into it every day if you found them interesting, thanks! 🤗

    Columns Description:

    • tweet_id: Integer. unique identifier for each tweet. Older tweets have smaller IDs.

    • tweet_created: Timestamp. Time of the tweet's creation.

    • tweet_extracted: Timestamp. The UTC time when the ETL pipeline pulled the tweet and its metadata (likes count, retweets count, etc).

    • text: String. The raw payload text from the tweet.

    • lang: String. Short name for the Tweet text's language.

    • user_id: Integer. Twitter's unique user id.

    • user_name: String. The author's public name on Twitter.

    • user_username: String. The author's Twitter account username (@example)

    • user_location: String. The author's public location.

    • user_description: String. The author's public profile's bio.

    • user_created: Timestamp. Timestamp of user's Twitter account creation.

    • user_followers_count: Integer. The number of followers of the author's account at the moment of the tweet extraction

    • user_following_count: Integer. The number of followed accounts from the author's account at the moment of the Tweet extraction

    • user_tweet_count: Integer. The number of Tweets that the author has published at the moment of the Tweet extraction.

    • user_verified: Boolean. True if the user is verified (blue mark).

    • source: The device/app used to publish the tweet (apparently not working; all values are NaN so far).

    • retweet_count: Integer. Number of retweets to the Tweet at the moment of the Tweet extraction.

    • like_count: Integer. Number of Likes to the Tweet at the moment of the Tweet extraction.

    • reply_count: Integer. Number of reply messages to the Tweet.

    • impression_count: Integer. Number of times the Tweet has been seen at the moment of the Tweet extraction.
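To make the column types above concrete, here is a minimal Python sketch that converts one raw row (all strings, as read from a CSV file) into typed fields for a subset of the columns. The timestamp and boolean encodings, the sample values, and the `engagement_rate` helper are assumptions made for illustration; they are not part of the dataset description.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TweetMetrics:
    tweet_id: int
    tweet_created: datetime
    lang: str
    user_verified: bool
    retweet_count: int
    like_count: int
    reply_count: int
    impression_count: int

    def engagement_rate(self):
        """Interactions per impression; 0.0 when no impressions are recorded."""
        if self.impression_count == 0:
            return 0.0
        interactions = self.retweet_count + self.like_count + self.reply_count
        return interactions / self.impression_count


def parse_row(row):
    """Convert one raw CSV row (a dict of strings) to typed fields.

    Assumes ISO 8601 timestamps and "True"/"False" booleans; the real
    export may encode these differently.
    """
    return TweetMetrics(
        tweet_id=int(row["tweet_id"]),
        tweet_created=datetime.fromisoformat(row["tweet_created"]),
        lang=row["lang"],
        user_verified=row["user_verified"] == "True",
        retweet_count=int(row["retweet_count"]),
        like_count=int(row["like_count"]),
        reply_count=int(row["reply_count"]),
        impression_count=int(row["impression_count"]),
    )


# Hypothetical sample row, invented for illustration.
sample = {
    "tweet_id": "1642900000000000000",
    "tweet_created": "2023-04-03 12:00:00+00:00",
    "lang": "en",
    "user_verified": "False",
    "retweet_count": "3",
    "like_count": "10",
    "reply_count": "2",
    "impression_count": "300",
}
tweet = parse_row(sample)
```

Because the counts are snapshots taken at extraction time, any ratio derived from them describes the tweet at that moment, not its final reach.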

    More info:

    • Tweets API info definition: https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet
    • Users API info definition: https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/user

  10. ChatGPT Social Media Dataset

    • opendatabay.com
    Updated Jul 4, 2025
    Cite
    Datasimple (2025). ChatGPT Social Media Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/f629eb0b-473a-4d7e-8043-3d6d7a600271
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Social Media and Networking
    Description

    This dataset contains a collection of tweets featuring the hashtag #chatgpt. The tweets were collected from Twitter, providing insights into various discussions surrounding the ChatGPT language model. It offers a view into the online community's engagement, level of interest, and the diverse applications of ChatGPT. This data can be used for various Natural Language Processing (NLP) and Machine Learning (ML) tasks, such as sentiment analysis and topic modelling.

    Columns

    The dataset includes the following key information for each tweet:

    • Datetime: The timestamp of the tweet.
    • Tweet Id: A unique identifier for each tweet.
    • Text: The full content of the tweet.
    • Username: The username of the tweet's author.
    • Permalink: The direct link to the tweet.
    • User: Additional user information, potentially including user ID.
    • Outlinks: Any URLs included within the tweet.
    • CountLinks: The number of links present in the tweet.
    • ReplyCount: The number of replies to the tweet.
    • RetweetCount: The number of retweets for the tweet, which also reflects favourite counts.
    • DateTime Count: Provides counts of tweets within specific date and time intervals.
    • Label Count: Numerical labels indicating counts, for instance, of unique tweet IDs or retweet figures.

    Distribution

    The dataset is typically provided in a CSV format. It includes a substantial number of tweets, with over 50,000 unique tweet IDs identified. The collection covers a period from 22nd January 2023 to 24th January 2023, with daily tweet counts ranging from approximately 1,753 to 4,487 within various intervals. Retweet counts for individual tweets in the dataset can range from 0 to over 6,800.

    Usage

    This dataset is well-suited for a range of analytical purposes, including:

    • Performing sentiment analysis to gauge public opinion on ChatGPT.
    • Conducting topic modelling to identify key themes and discussions.
    • Developing and testing various Natural Language Processing applications.
    • Exploring Machine Learning models for social media data.
    • Gaining insights into the ChatGPT community, its level of interest, and how the language model is being used.

    Coverage

    The dataset's geographic scope is global. It covers tweets posted between 22nd January 2023 and 24th January 2023. While specific demographic details are not listed as columns, user information including location is available for analysis.

    License

    CC0

    Who Can Use It

    This dataset is ideal for:

    • Data Scientists working on social media analytics or AI trends.
    • Researchers studying language models, online discourse, or NLP applications.
    • Developers building applications that leverage social media data.
    • Analysts interested in understanding public perception and engagement with artificial intelligence.

    Dataset Name Suggestions

    • ChatGPT Twitter Dataset
    • #ChatGPT Social Media Data
    • Twitter #ChatGPT Public Conversation

    Attributes

    Original Data Source: ChatGPT Twitter Dataset

  11. awesome-chatgpt-prompts

    • huggingface.co
    Updated Dec 15, 2023
    + more versions
    Cite
    Fatih Kadir Akın (2023). awesome-chatgpt-prompts [Dataset]. https://huggingface.co/datasets/fka/awesome-chatgpt-prompts
    Metadata format: Croissant, a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 15, 2023
    Authors
    Fatih Kadir Akın
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    🧠 Awesome ChatGPT Prompts [CSV dataset]

    This is a dataset repository of Awesome ChatGPT Prompts. View all prompts on GitHub.

      License
    

    CC-0

  12. Types of ChatGPT Advice Used – Survey Data

    • expresslegalfunding.com
    html
    Updated May 2, 2025
    Cite
    Express Legal Funding (2025). Types of ChatGPT Advice Used – Survey Data [Dataset]. https://expresslegalfunding.com/chatgpt-study/
    Available download formats: html
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    Express Legal Funding
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Legal Advice, Career Advice, Educational Help, Financial Advice, Medical Information, Relationship Advice, Mental Health Topics, News / Current Events, Product Recommendations
    Description

    This dataset shows the types of advice users sought from ChatGPT based on a 2025 U.S. survey, including education, financial, medical, and legal topics.

  13. Data_Sheet_1_Advanced large language models and visualization tools for data...

    • frontiersin.figshare.com
    txt
    Updated Aug 8, 2024
    Cite
    Jorge Valverde-Rebaza; Aram González; Octavio Navarro-Hinojosa; Julieta Noguez (2024). Data_Sheet_1_Advanced large language models and visualization tools for data analytics learning.csv [Dataset]. http://doi.org/10.3389/feduc.2024.1418006.s001
    Available download formats: txt
    Dataset updated
    Aug 8, 2024
    Dataset provided by
    Frontiers
    Authors
    Jorge Valverde-Rebaza; Aram González; Octavio Navarro-Hinojosa; Julieta Noguez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: In recent years, numerous AI tools have been employed to equip learners with diverse technical skills such as coding, data analysis, and other competencies related to computational sciences. However, the desired outcomes have not been consistently achieved. This study aims to analyze the perspectives of students and professionals from non-computational fields on the use of generative AI tools, augmented with visualization support, to tackle data analytics projects. The focus is on promoting the development of coding skills and fostering a deep understanding of the solutions generated. Consequently, our research seeks to introduce innovative approaches for incorporating visualization and generative AI tools into educational practices.

    Methods: This article examines how learners perform and their perspectives when using traditional tools vs. LLM-based tools to acquire data analytics skills. To explore this, we conducted a case study with a cohort of 59 participants among students and professionals without computational thinking skills. These participants developed a data analytics project in the context of a Data Analytics short session. Our case study focused on examining the participants' performance using traditional programming tools, ChatGPT, and LIDA with GPT as an advanced generative AI tool.

    Results: The results show the transformative potential of approaches based on integrating advanced generative AI tools like GPT with specialized frameworks such as LIDA. The higher levels of participant preference indicate the superiority of these approaches over traditional development methods. Additionally, our findings suggest that the learning curves for the different approaches vary significantly, since learners encountered technical difficulties in developing the project and interpreting the results. Our findings suggest that the integration of LIDA with GPT can significantly enhance the learning of advanced skills, especially those related to data analytics. We aim to establish this study as a foundation for the methodical adoption of generative AI tools in educational settings, paving the way for more effective and comprehensive training in these critical areas.

    Discussion: It is important to highlight that when using general-purpose generative AI tools such as ChatGPT, users must be aware of the data analytics process and take responsibility for filtering out potential errors or incompleteness in the requirements of a data analytics project. These deficiencies can be mitigated by using more advanced tools specialized in supporting data analytics tasks, such as LIDA with GPT. However, users still need advanced programming knowledge to properly configure this connection via API. There is a significant opportunity for generative AI tools to improve their performance, providing accurate, complete, and convincing results for data analytics projects, thereby increasing user confidence in adopting these technologies. We hope this work underscores the opportunities and needs for integrating advanced LLMs into educational practices, particularly in developing computational thinking skills.

  14. Google Play ChatGPT Reviews Dataset

    • opendatabay.com
    Updated Jul 6, 2025
    Cite
    Datasimple (2025). Google Play ChatGPT Reviews Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/1c19202d-adb8-4778-a259-8a036c0573db
    Dataset updated
    Jul 6, 2025
    Dataset authored and provided by
    Datasimple
    Area covered
    Reviews & Ratings
    Description

    This dataset features 100,000 user reviews of the ChatGPT app, collected from the Google Play Store. It offers diverse feedback from users across ten countries, providing valuable insights into user experiences and application performance. This dataset is well-suited for natural language processing tasks, sentiment analysis, and studies on user feedback.

    Columns

    • Name: Username of the reviewer.
    • Rating: The star rating provided by the user, on a scale of 1 to 5.
    • Comment: The textual content of the user's review.
    • Date: The date on which the review was posted.
    • Country: The country code of the reviewer, limited to 'us', 'gb', 'ca', 'au', 'in', 'jp', 'de', 'fr', 'kr', 'br'.
    • Thumbs Up: The count of 'likes' received by the review.
    • Review ID: A unique identifier assigned to each review.
    • App Version: The specific version of the app that was reviewed.
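The constraints stated in the column list (ratings from 1 to 5, ten allowed country codes) lend themselves to a simple row check. The sketch below is one possible validation helper. The dictionary keys are assumed to match the column names above, and the sample rows are invented.

```python
# Country codes copied from the Country column description.
ALLOWED_COUNTRIES = {"us", "gb", "ca", "au", "in", "jp", "de", "fr", "kr", "br"}


def validate_review(row):
    """Return a list of problems found in one review row (empty if none)."""
    problems = []
    rating = row.get("Rating")
    if not isinstance(rating, int) or not 1 <= rating <= 5:
        problems.append("Rating must be an integer from 1 to 5")
    if row.get("Country") not in ALLOWED_COUNTRIES:
        problems.append("Country is not one of the ten listed codes")
    thumbs = row.get("Thumbs Up")
    if not isinstance(thumbs, int) or thumbs < 0:
        problems.append("Thumbs Up must be a non-negative integer")
    return problems


# Hypothetical rows, invented for illustration.
good_row = {"Rating": 5, "Country": "jp", "Thumbs Up": 12}
bad_row = {"Rating": 6, "Country": "xx", "Thumbs Up": 0}

good_problems = validate_review(good_row)
bad_problems = validate_review(bad_row)
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a row during an initial data-quality pass.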

    Distribution

    The dataset contains 100,000 records, typically formatted as a CSV file. It includes detailed information such as user ratings, textual comments, and application versions. The ratings distribution shows a significant majority of high ratings, with 74,403 reviews in the 4.80-5.00 range. Thumbs Up counts range from 0 to 1712, with most reviews having fewer than 85.60 likes. There are 95,666 unique app versions and 87,220 unique review dates recorded.

    Usage

    This dataset is ideal for:

    • Sentiment Analysis: To evaluate user sentiment, assess satisfaction levels, and pinpoint areas for app improvement.
    • Natural Language Processing (NLP): For applying techniques such as text classification, summarisation, and keyword extraction from user comments.
    • Trend Analysis: To observe changes in user feedback over time and across different app versions.
    • Market Research: To analyse user preferences and common issues across various geographic regions and demographic groups.

    Coverage

    The dataset covers user reviews from ten specific countries: the United States, United Kingdom, Canada, Australia, India, Japan, Germany, France, South Korea, and Brazil. The reviews span a time range from 21 November 2023 to 19 July 2024. The data originates from publicly available user reviews on the Google Play Store.

    License

    CC BY-NC-SA

    Who Can Use It

    This dataset is suitable for:

    • Researchers: Undertaking studies in natural language processing and user experience.
    • Data Analysts: For sentiment analysis and identifying key trends in user feedback.
    • Product Developers: Seeking to understand user satisfaction and pinpoint areas for product enhancement.
    • Market Researchers: Interested in consumer preferences and challenges across different markets.

    Dataset Name Suggestions

    • ChatGPT Play Store Reviews
    • ChatGPT Mobile App Feedback
    • Google Play ChatGPT Reviews
    • User Reviews for ChatGPT App

    Attributes

    Original Data Source: ChatGPT Reviews

  15. The Dataset for the book chapter on "Classifying User Intent for Effective...

    • figshare.com
    xlsx
    Updated Dec 27, 2023
    Cite
    Seyedmoein Mohsenimofidi; Akshy Sripad Raghavendra Prasad; Aida Zahid; Usman Rafiq; Xiaofeng Wang; Muhammad Attal Idris (2023). The Dataset for the book chapter on "Classifying User Intent for Effective Prompt Engineering: A Case of a Chatbot for Startup Teams" [Dataset]. http://doi.org/10.6084/m9.figshare.24847920.v1
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Dec 27, 2023
    Dataset provided by
    figshare
    Authors
    Seyedmoein Mohsenimofidi; Akshy Sripad Raghavendra Prasad; Aida Zahid; Usman Rafiq; Xiaofeng Wang; Muhammad Attal Idris
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was used to write a book chapter titled "Classifying User Intent for Effective Prompt Engineering: A Case of a Chatbot for Startup Teams". The dataset contains the following five resources:

    • Startup questions and intent classifications: A list of possible questions and the classification of those questions into four intents, i.e. reflecting on own experience, seeking information, brainstorming, and seeking advice.
    • Prompt_Book_v1: A brief guide on how questions are classified, a description of prompt patterns and templates, and the matching of purposes to prompt patterns.
    • Questions_classification_script: The Python script used in our work to classify user intent.
    • Survey_questionnaire: The original survey questions asked of the participants.
    • survey_responses: Survey responses from study respondents.

  16. e

    ChatGPT Trust Levels by Advice Category – Survey Data

    • expresslegalfunding.com
    html
    Updated May 2, 2025
    Cite
    Express Legal Funding (2025). ChatGPT Trust Levels by Advice Category – Survey Data [Dataset]. https://expresslegalfunding.com/chatgpt-study/
    Explore at:
    htmlAvailable download formats
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    Express Legal Funding
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Legal Advice, Career Advice, Educational Help, Financial Advice, Medical Information, Relationship Advice, Mental Health Topics, News / Current Events, Product Recommendations
    Description

    This dataset presents how much users trust ChatGPT across different advice categories, including career, education, financial, legal, and medical advice, based on a 2025 U.S. survey.

  17. h

    chats-data-2023-10-16

    • huggingface.co
    Updated Oct 16, 2023
    + more versions
    Cite
    Collective Cognition (2023). chats-data-2023-10-16 [Dataset]. https://huggingface.co/datasets/CollectiveCognition/chats-data-2023-10-16
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Oct 16, 2023
    Dataset authored and provided by
    Collective Cognition
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for "Collective Cognition ChatGPT Conversations"

    Dataset Description

    Dataset Summary

    The "Collective Cognition ChatGPT Conversations" dataset is a collection of chat logs between users and the ChatGPT model. These conversations have been shared by users on the "Collective Cognition" website. The dataset provides insights into user interactions with language models and can be utilized for multiple purposes, including training, research, and… See the full description on the dataset page: https://huggingface.co/datasets/CollectiveCognition/chats-data-2023-10-16.

  18. 4

    Supplementary data for the paper 'Personality and acceptance as predictors...

    • data.4tu.nl
    zip
    Updated Mar 28, 2024
    Cite
    Joost de Winter; Dimitra Dodou; Yke Bauke Eisma (2024). Supplementary data for the paper 'Personality and acceptance as predictors of ChatGPT use' [Dataset]. http://doi.org/10.4121/e2e3ac25-e264-4592-b413-254eb4ac5022.v1
    Explore at:
    zipAvailable download formats
    Dataset updated
    Mar 28, 2024
    Dataset provided by
    4TU.ResearchData
    Authors
    Joost de Winter; Dimitra Dodou; Yke Bauke Eisma
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Within a year of its launch, ChatGPT has seen a surge in popularity. While many are drawn to its effectiveness and user-friendly interface, ChatGPT also introduces moral concerns, such as the temptation to present generated text as one’s own. This led us to theorize that personality traits such as Machiavellianism and sensation-seeking may be predictive of ChatGPT usage. We launched two online questionnaires with 2,000 respondents each, in September 2023 and March 2024, respectively. In Questionnaire 1, 22% of respondents were students, and 54% were full-time employees; 32% indicated they used ChatGPT at least weekly. Analysis of our ChatGPT Acceptance Scale revealed two factors, Effectiveness and Concerns, which correlated positively and negatively, respectively, with ChatGPT use frequency. A specific aspect of Machiavellianism (manipulation tactics) was found to predict ChatGPT usage. Questionnaire 2 was a replication of Questionnaire 1, with 21% students and 54% full-time employees, of which 43% indicated using ChatGPT weekly. In Questionnaire 2, more extensive personality scales were used. We found a moderate correlation between Machiavellianism and ChatGPT usage (r = .22) and with an opportunistic attitude towards undisclosed use (r = .30), relationships that largely remained intact after controlling for gender, age, education level, and the respondents’ country. We conclude that covert use of ChatGPT is associated with darker personality traits, something that requires further attention.

  19. o

    ChatGPT Social Media Insights Dataset

    • opendatabay.com
    Updated Jul 6, 2025
    Cite
    Datasimple (2025). ChatGPT Social Media Insights Dataset [Dataset]. https://www.opendatabay.com/data/ai-ml/2cf951da-3ce1-4606-a8d6-3f865c4d8a3b
    Dataset updated
    Jul 6, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Social Media and Networking
    Description

    This dataset captures a daily collection of tweets containing keywords such as "ChatGPT", "GPT3", or "GPT4". It was designed to provide a rich source of social media data for analysis, particularly for applications concerning Natural Language Processing (NLP) and sentiment analysis. The collection process began on 3rd April 2023, with approximately 1,000 tweets added daily. Tweets were extracted 24-72 hours after creation to allow for relevant engagement metrics like likes and retweets to accumulate. However, updates to this dataset ceased on 13th May 2023, due to changes in Twitter (X) API conditions, which introduced a cost for its use. The dataset includes tweets from various languages, selected randomly throughout the day, with basic filters applied to discard sensitive content and spam.

    Columns

    • tweet_id: An integer serving as a unique identifier for each tweet. Older tweets typically have smaller IDs.
    • tweet_created: A timestamp indicating the exact time the tweet was published.
    • tweet_extracted: A UTC timestamp recording when the ETL (Extract, Transform, Load) pipeline pulled the tweet and its associated metadata (e.g., likes count, retweets count).
    • text: A string containing the raw text content of the tweet payload.
    • lang: A string providing the short name for the language of the tweet's text.
    • user_id: An integer representing the author's unique user ID on Twitter.
    • user_name: A string displaying the author's public name on Twitter.
    • user_username: A string showing the author's Twitter account username (e.g., @example).
    • user_location: A string detailing the author's publicly stated location.
    • user_description: A string containing the author's public profile biography.
    • user_created: A timestamp indicating when the user's Twitter account was created.
    • user_followers_count: An integer showing the number of followers the author's account had at the moment the tweet was extracted.
    • user_following_count: An integer indicating the number of accounts the author was following at the moment of tweet extraction.
    • user_tweet_count: An integer representing the total number of tweets the author had published at the time of tweet extraction.
    • user_verified: A boolean value (True/False) indicating if the user is verified (i.e., has a blue tick).
    • source: This column was intended to show the device or application used to publish the tweet but currently contains only NaN (Not a Number) values.
    • retweet_count: An integer displaying the number of times the tweet had been retweeted at the moment of extraction.
    • like_count: An integer showing the number of likes the tweet had received at the moment of extraction.
    • reply_count: An integer indicating the number of reply messages to the tweet.
    • impression_count: An integer representing the number of times the tweet had been seen at the moment of extraction.
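
    The column list above can be read as a flat record schema. The following is a minimal sketch, not the dataset's own tooling: it assumes a CSV export using these column names and ISO 8601 timestamps that are directly comparable (the description only states that tweet_extracted is UTC). The sample rows are invented for illustration. It parses rows with Python's standard library and checks which tweets fall inside the stated 24-72 hour extraction delay.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows; only the column names come from the description above.
SAMPLE_CSV = """tweet_id,tweet_created,tweet_extracted,text,lang,like_count,retweet_count
1001,2023-04-03T10:00:00,2023-04-04T12:00:00,Trying ChatGPT today,en,5,1
1002,2023-04-03T11:30:00,2023-04-06T11:00:00,GPT4 を試した,ja,12,3
1003,2023-04-04T09:00:00,2023-04-04T15:00:00,ChatGPT es útil,es,0,0
"""


def load_tweets(handle):
    """Parse CSV rows into dicts with typed timestamps and integer counts."""
    tweets = []
    for row in csv.DictReader(handle):
        row["tweet_id"] = int(row["tweet_id"])
        # Assumption: timestamps are ISO 8601 strings without timezone offsets.
        row["tweet_created"] = datetime.fromisoformat(row["tweet_created"])
        row["tweet_extracted"] = datetime.fromisoformat(row["tweet_extracted"])
        row["like_count"] = int(row["like_count"])
        row["retweet_count"] = int(row["retweet_count"])
        tweets.append(row)
    return tweets


def extraction_delay_hours(tweet):
    """Hours between publication and extraction by the pipeline."""
    delta = tweet["tweet_extracted"] - tweet["tweet_created"]
    return delta.total_seconds() / 3600


def within_stated_window(tweet, low=24, high=72):
    """True if the tweet was extracted 24-72 hours after creation."""
    return low <= extraction_delay_hours(tweet) <= high


tweets = load_tweets(io.StringIO(SAMPLE_CSV))
in_window = [t["tweet_id"] for t in tweets if within_stated_window(t)]
```

    With a real export, the io.StringIO wrapper would be replaced by an open file handle. Rows outside the window may be worth inspecting before comparing engagement counts, since likes and retweets had less time to accumulate.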

    Distribution

    The dataset is provided in a CSV file format, generated from a Pandas DataFrame, with each row containing the tweet's text and its metadata, along with the author's information. The collection started on 3rd April 2023, adding approximately 1,000 tweets per day, and stopped updating on 13th May 2023. While specific total row counts are not available, various segments show substantial data, such as 43,000 tweets collected between 22nd September 2022 and 12th May 2023. Daily additions of 1,000 to 7,000 tweets are noted for the period of 8th April 2023 to 14th May 2023. The dataset includes unique values for over 25,000 tweet IDs, over 37,000 unique user IDs, and over 38,000 unique user locations.

    Usage

    This dataset is ideal for various data analysis and visualisation applications. It is particularly well-suited for Natural Language Processing (NLP) techniques, including sentiment analysis, to understand public opinion and trends related to ChatGPT, GPT3, and GPT4. Researchers can use it for social media listening, trend tracking, and studying the evolution of discussions around large language models.

    Coverage

    The dataset primarily covers tweets from 3rd April 2023 to 13th May 2023, with some older tweets included, particularly from September 2022. Tweets are from any language, randomly selected globally. English (en) tweets constitute approximately 48% of the dataset, Japanese (ja) tweets make up about 23%, and other languages account for 30%. User locations vary widely, with a significant portion (41%) being null, 1% from Japan, and the remaining 59% from various other global locations.
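
    Language shares such as those quoted above are plain proportions over the lang column. As a rough sketch, assuming the column has already been read into a list of short language codes (the sample values here are made up), the shares can be computed like this:

```python
from collections import Counter


def language_shares(lang_codes):
    """Return each language's share of the rows as a percentage, rounded to one decimal."""
    counts = Counter(lang_codes)
    total = sum(counts.values())
    return {lang: round(100 * n / total, 1) for lang, n in counts.items()}


# Hypothetical values standing in for the dataset's lang column.
sample_langs = ["en", "en", "ja", "es"]
shares = language_shares(sample_langs)
```

    Because each share is rounded independently, the reported percentages need not sum to exactly 100.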

    License

    CC0

    Who Can Use It

    • Data Analysts: For exploring social media trends and user engagement related to AI.
    • Researchers: Studying the public reception, discussion patterns, and sentiment around large language models.
    • Machine Learning Engineers: Developing and testing NLP models for s
  20. e

    ChatGPT Usage by Gender – Survey Data

    • expresslegalfunding.com
    html
    Updated May 2, 2025
    Cite
    Express Legal Funding (2025). ChatGPT Usage by Gender – Survey Data [Dataset]. https://expresslegalfunding.com/chatgpt-study/
    Explore at:
    htmlAvailable download formats
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    Express Legal Funding
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Men, Women
    Description

    This dataset shows how men and women in the U.S. reported using ChatGPT in a 2025 survey, including whether they followed its advice or chose not to use it.

Cite
(2025). ChatGPT Statistics 2025 [Dataset]. https://sambretzmann.com/chatgpt-statistics/

ChatGPT Statistics 2025

Explore at:
9 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jun 12, 2025
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Comprehensive ChatGPT statistics covering 800 million weekly users, $300 billion valuation, market share, demographics, and technical specifications for 2025.
