3 datasets found
  1. #ChatGPT 1000 Daily 🐦 Tweets

    • kaggle.com
    Updated May 14, 2023
    Cite
    Enric Domingo (2023). #ChatGPT 1000 Daily 🐦 Tweets [Dataset]. http://doi.org/10.34740/kaggle/dsv/5685262
    Dataset updated
    May 14, 2023
    Dataset provided by
    Kaggle
    Authors
    Enric Domingo
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    UPDATE: Under the new Twitter (X) API conditions introduced by Elon Musk, the API is no longer free to use, and the hobby plan costs $100/month. As a result, my automated ETL notebook stopped adding new tweets to this dataset on May 13th, 2023.

    This dataset was updated every day with 1000 new tweets per day containing any of the words "ChatGPT", "GPT3", or "GPT4", starting on April 3rd, 2023. Each day's tweets were uploaded 24-72 hours later, so that the counts of likes, retweets, replies and impressions had enough time to become meaningful. Tweets are in any language and were selected randomly from all hours of the day. Some basic filters were applied to try to discard sensitive tweets and spam.
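The selection rule and the upload delay described above can be sketched locally as follows. This is only an illustration under assumptions: the function names are hypothetical, the case-insensitive substring match is a guess at how the keywords behave (the actual matching was done by the Twitter API query, which is not shown in the description), and the 24-72 hour bounds are taken from the text.

```python
import re
from datetime import datetime, timedelta

# Keywords named in the dataset description.
KEYWORDS = ("ChatGPT", "GPT3", "GPT4")
# Assumption: case-insensitive substring match; the real API query may differ.
KEYWORD_RE = re.compile("|".join(re.escape(k) for k in KEYWORDS), re.IGNORECASE)


def mentions_keyword(text):
    """Return True if the text contains any of the tracked keywords."""
    return KEYWORD_RE.search(text) is not None


def extraction_lag_hours(created, extracted):
    """Hours elapsed between tweet creation and metadata extraction."""
    return (extracted - created) / timedelta(hours=1)


def within_expected_lag(created, extracted, low=24.0, high=72.0):
    """Check the lag against the 24-72 hour window stated in the description."""
    return low <= extraction_lag_hours(created, extracted) <= high


# Made-up example values, for illustration only.
created = datetime(2023, 4, 3, 10, 0)
extracted = datetime(2023, 4, 4, 22, 0)
print(mentions_keyword("Just tried #chatgpt"), extraction_lag_hours(created, extracted))
```

A check like this can be useful for spotting rows whose engagement counts were captured unusually early or late, since those counts are only snapshots taken at extraction time.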

    This dataset can be used for many applications in data analysis and visualization, as well as NLP sentiment analysis and more.

    If you found them interesting, consider upvoting this Dataset and the scheduled ETL Notebook that fed new data into it every day. Thanks! 🤗

    Columns Description:

    • tweet_id: Integer. Unique identifier for each tweet. Older tweets have smaller IDs.

    • tweet_created: Timestamp. Time of the tweet's creation.

    • tweet_extracted: Timestamp. The UTC time when the ETL pipeline pulled the tweet and its metadata (likes count, retweets count, etc).

    • text: String. The raw payload text from the tweet.

    • lang: String. Short name for the Tweet text's language.

    • user_id: Integer. Twitter's unique user id.

    • user_name: String. The author's public name on Twitter.

    • user_username: String. The author's Twitter account username (@example)

    • user_location: String. The author's public location.

    • user_description: String. The author's public profile's bio.

    • user_created: Timestamp. Timestamp of user's Twitter account creation.

    • user_followers_count: Integer. The number of followers of the author's account at the moment of the Tweet extraction.

    • user_following_count: Integer. The number of accounts the author follows at the moment of the Tweet extraction.

    • user_tweet_count: Integer. The number of Tweets that the author has published at the moment of the Tweet extraction.

    • user_verified: Boolean. True if the user is verified (blue mark).

    • source: The device/app used to publish the tweet (apparently not working; all values are NaN so far).

    • retweet_count: Integer. Number of retweets to the Tweet at the moment of the Tweet extraction.

    • like_count: Integer. Number of Likes to the Tweet at the moment of the Tweet extraction.

    • reply_count: Integer. Number of reply messages to the Tweet.

    • impression_count: Integer. Number of times the Tweet has been seen at the moment of the Tweet extraction.

    More info:

    • Tweets API info definition: https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet

    • Users API info definition: https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/user
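The column types listed above could be applied when reading the data roughly as follows. This is a minimal sketch using only the Python standard library, not the author's ETL code. The helper names, the sample rows, and the assumption that timestamps are stored in an ISO 8601 style format are all hypothetical; the description does not state the file layout.

```python
import csv
import io
from datetime import datetime

# Column types as listed in the description above.
INT_COLUMNS = {
    "tweet_id", "user_id", "user_followers_count", "user_following_count",
    "user_tweet_count", "retweet_count", "like_count", "reply_count",
    "impression_count",
}
TIMESTAMP_COLUMNS = {"tweet_created", "tweet_extracted", "user_created"}
BOOL_COLUMNS = {"user_verified"}


def parse_row(row):
    """Convert one CSV row (all strings) into typed Python values."""
    parsed = {}
    for name, value in row.items():
        if value == "" or value == "NaN":
            parsed[name] = None  # e.g. the always-empty `source` column
        elif name in INT_COLUMNS:
            parsed[name] = int(value)
        elif name in TIMESTAMP_COLUMNS:
            # Assumption: ISO 8601 style timestamps; adjust if the file differs.
            parsed[name] = datetime.fromisoformat(value)
        elif name in BOOL_COLUMNS:
            parsed[name] = value.strip().lower() == "true"
        else:
            parsed[name] = value  # plain strings: text, lang, user_name, ...
    return parsed


def load_tweets(fileobj):
    """Read every row of a CSV file object into a list of typed dicts."""
    return [parse_row(row) for row in csv.DictReader(fileobj)]


# Made-up sample with a subset of the columns, for illustration only.
SAMPLE_CSV = """tweet_id,tweet_created,tweet_extracted,text,lang,user_verified,source,like_count
1001,2023-04-03 10:00:00+00:00,2023-04-04 12:30:00+00:00,Trying ChatGPT today,en,False,,12
1002,2023-04-03 11:15:00+00:00,2023-04-05 09:00:00+00:00,GPT4 es impresionante,es,True,,40
"""

tweets = load_tweets(io.StringIO(SAMPLE_CSV))
print(len(tweets), tweets[0]["like_count"])
```

Because the engagement counts are snapshots taken at `tweet_extracted`, keeping that column as a real timestamp makes it straightforward to compare counts only between tweets that had a similar amount of time to accumulate them.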

  2. ChatGPT reviews [DAILY UPDATED]

    • kaggle.com
    Updated Jul 10, 2025
    Cite
    Ashish Kumar (2025). ChatGPT reviews [DAILY UPDATED] [Dataset]. https://www.kaggle.com/datasets/ashishkumarak/chatgpt-reviews-daily-updated/code
    Dataset updated
    Jul 10, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ashish Kumar
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset mainly consists of daily-updated user reviews and ratings for the ChatGPT Android App. It also contains data on the relevancy of these reviews and the dates they were posted.

  3. Google Gemini Statistics By Features, Performance and AI Versions

    • enterpriseappstoday.com
    Updated Dec 20, 2023
    Cite
    EnterpriseAppsToday (2023). Google Gemini Statistics By Features, Performance and AI Versions [Dataset]. https://www.enterpriseappstoday.com/stats/google-gemini-statistics.html
    Dataset updated
    Dec 20, 2023
    Dataset authored and provided by
    EnterpriseAppsToday
    License

    https://www.enterpriseappstoday.com/privacy-policy

    Time period covered
    2022 - 2032
    Area covered
    Global
    Description

    Google Gemini Statistics: In 2023, Google unveiled the most powerful AI model to date. Google Gemini is presented as the world’s most advanced AI, leaving GPT-4 behind. Google offers the model in three sizes, each more capable than the last, which perform tasks accordingly. According to Google Gemini Statistics, these models can understand and solve complex problems on almost any subject. Google has also said it will develop AI in such a way that it shows how helpful AI can be in our daily routines. We hope the next generation won’t become fully dependent on such technologies; otherwise, we will lose all of our natural talent!

    Editor’s Choice

    • Google Gemini can follow natural and engaging conversations.

    • According to Google Gemini Statistics, Gemini Ultra scores 90.0% on the MMLU benchmark, which tests knowledge and problem-solving in subjects including history, physics, math, law, ethics, and medicine.

    • If you ask Gemini what to do with your raw material, it can provide ideas in the form of text or images, depending on the input.

    • Gemini has outperformed GPT-4 in the majority of tests.

    • According to the report, this LLM is unique because it can process multiple types of data at the same time, including video, images, computer code, and text.

    • Google refers to this development as The Gemini Era, signalling how important it considers AI to be in improving our daily lives.

    • Google Gemini can talk like a real person.

    • Gemini Ultra is the largest model and can solve extremely complex problems.

    • Gemini models are trained on multilingual and multimodal datasets.

    • Gemini Ultra has also outperformed GPT-4V on the MMMU benchmark, with the following results: Art and Design (74.2), Business (62.7), Health and Medicine (71.3), Humanities and Social Science (78.3), and Technology and Engineering (53.0).


#ChatGPT 1000 Daily 🐦 Tweets

1000 tweets a day about "ChatGPT", "GPT3", and "GPT4" with metadata
