100+ datasets found
  1. Twitter Dataset February 2024

    • kaggle.com
    Updated Mar 4, 2024
    Cite
    Ayush Kumar Singh (2024). Twitter Dataset February 2024 [Dataset]. https://www.kaggle.com/datasets/fastcurious/twitter-dataset-february-2024
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 4, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ayush Kumar Singh
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Tweets scraped with all possible data points provided by Twitter in each tweet. For data extraction or scraping, contact me on Telegram - @akaseobhw

    All datapoints present for each tweet.

    Each entry in the dataset represents a tweet along with various attributes such as the tweet's ID, URL, text content, retweet count, reply count, like count, quote count, view count, creation date, language, and more. Additionally, there are details about the tweet's author, including their username, profile URL, follower count, following count, profile picture, cover picture, description, location, creation date, and more.

    Here's a brief description of the key fields present in each tweet entry:

    • type: Indicates the type of data, in this case, it's a tweet.
    • id: Unique identifier for the tweet.
    • url: URL of the tweet.
    • twitterUrl: Twitter URL of the tweet.
    • text: Text content of the tweet.
    • retweetCount: Number of retweets.
    • replyCount: Number of replies.
    • likeCount: Number of likes (favorites).
    • quoteCount: Number of times the tweet has been quoted.
    • viewCount: Number of views.
    • createdAt: Date and time when the tweet was created.
    • lang: Language of the tweet.
    • quoteId: ID of the quoted tweet, if this tweet is a quote.
    • bookmarkCount: Number of times the tweet has been bookmarked.
    • isReply: Indicates whether the tweet is a reply to another tweet.
    • author: Information about the author of the tweet.
      • userName: Username of the author.
      • url: URL of the author's profile.
      • followers: Number of followers of the author.
      • following: Number of accounts the author is following.
      • profilePicture: URL of the author's profile picture.
      • coverPicture: URL of the author's cover picture.
      • description: Description or bio of the author.
      • location: Location of the author.
      • createdAt: Date and time when the author's account was created.
    • entities: Entities present in the tweet, such as hashtags, symbols, URLs, and user mentions.
    • isRetweet: Indicates whether the tweet is a retweet.
    • isQuote: Indicates whether the tweet is a quote.
    • quote: Information about the quoted tweet, if this tweet is a quote.
    • media: Information about any media (such as images or videos) attached to the tweet.

    This dataset can be analyzed to gain insights into trends, sentiments, and user behavior on Twitter. You can use Python libraries like pandas to load this dataset and perform various analyses and visualizations.
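As a starting point for the kind of analysis described above, the following is a minimal sketch that flattens the nested `author` object and sums the interaction counters. It assumes the records are JSON objects using the field names listed above; the sample values, the helper names `flatten_tweet` and `engagement`, and the `author.<name>` key convention are all illustrative assumptions, not part of the dataset. It uses only the standard library; pandas offers similar flattening for nested JSON.

```python
import json

# Hypothetical sample records that follow the field names listed above;
# the values are invented for illustration and are not from the dataset.
SAMPLE_JSON = """
[
  {"type": "tweet", "id": "1", "text": "hello", "retweetCount": 2,
   "replyCount": 1, "likeCount": 10, "quoteCount": 0, "lang": "en",
   "isReply": false, "author": {"userName": "alice", "followers": 100}},
  {"type": "tweet", "id": "2", "text": "hola", "retweetCount": 0,
   "replyCount": 0, "likeCount": 3, "quoteCount": 1, "lang": "es",
   "isReply": true, "author": {"userName": "bob", "followers": 50}}
]
"""


def flatten_tweet(tweet):
    """Copy top-level fields and lift nested author fields to 'author.<name>' keys."""
    flat = {k: v for k, v in tweet.items() if k != "author"}
    for key, value in (tweet.get("author") or {}).items():
        flat[f"author.{key}"] = value
    return flat


def engagement(tweet):
    """Sum the four interaction counters, treating missing ones as zero."""
    fields = ("retweetCount", "replyCount", "likeCount", "quoteCount")
    return sum(tweet.get(f) or 0 for f in fields)


tweets = [flatten_tweet(t) for t in json.loads(SAMPLE_JSON)]
for t in tweets:
    print(t["id"], t["author.userName"], t["lang"], engagement(t))
```

Keeping IDs as strings, as in the sample, avoids any numeric conversion of long tweet identifiers.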

  2. X/Twitter: Countries with the largest audience 2025

    • statista.com
    • tokrwards.com
    Updated Jun 19, 2025
    Cite
    Statista (2025). X/Twitter: Countries with the largest audience 2025 [Dataset]. https://www.statista.com/statistics/242606/number-of-active-twitter-users-in-selected-countries/
    Explore at:
    Dataset updated
    Jun 19, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 2025
    Area covered
    Worldwide
    Description

    Social network X/Twitter is particularly popular in the United States, and as of February 2025, the microblogging service had an audience reach of 103.9 million users in the country. Japan and India were ranked second and third, with more than 70 million and 25 million users respectively.

    Global Twitter usage

    As of the second quarter of 2021, X/Twitter had 206 million monetizable daily active users worldwide. The most-followed Twitter accounts include figures such as Elon Musk, Justin Bieber and former U.S. president Barack Obama.

    X/Twitter and politics

    X/Twitter has become an increasingly relevant tool in domestic and international politics. The platform has become a way to promote policies and interact with citizens and other officials, and most world leaders and foreign ministries have an official Twitter account. Former U.S. president Donald Trump used to be a prolific Twitter user before the platform permanently suspended his account in January 2021. During an August 2018 survey, 61 percent of respondents stated that Trump's use of Twitter as President of the United States was inappropriate.

  3. Twitter Key Statistics

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Twitter Key Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
    Explore at:
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    These are the key Twitter user statistics that you need to know.

  4. Twitter Users Broken Down By Age

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Twitter Users Broken Down By Age [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
    Explore at:
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This is the breakdown of Twitter users by age group.

  5. X/Twitter users in the United Kingdom 2019-2028

    • statista.com
    • tokrwards.com
    Updated Jan 13, 2025
    Cite
    Statista Research Department (2025). X/Twitter users in the United Kingdom 2019-2028 [Dataset]. https://www.statista.com/topics/11843/x-formerly-twitter-in-the-united-kingdom-uk/
    Explore at:
    Dataset updated
    Jan 13, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    United Kingdom
    Description

    The number of Twitter users in the United Kingdom was forecast to increase continuously between 2024 and 2028 by a total of 0.9 million users (+5.1 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 18.55 million users, a new peak, in 2028. Notably, the number of Twitter users has increased continuously over the past years.

    User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

  6. Data from: TWIGMA: A dataset of AI-Generated Images with Metadata From...

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 28, 2024
    Cite
    James Zou (2024). TWIGMA: A dataset of AI-Generated Images with Metadata From Twitter [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8031784
    Explore at:
    Dataset updated
    May 28, 2024
    Dataset provided by
    James Zou
    Yiqun Chen
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Update May 2024: Fixed a data type issue with "id" column that prevented twitter ids from rendering correctly.

    Recent progress in generative artificial intelligence (gen-AI) has enabled the generation of photo-realistic and artistically-inspiring photos at a single click, catering to millions of users online. To explore how people use gen-AI models such as DALLE and StableDiffusion, it is critical to understand the themes, contents, and variations present in the AI-generated photos. In this work, we introduce TWIGMA (TWItter Generative-ai images with MetadatA), a comprehensive dataset encompassing 800,000 gen-AI images collected from Jan 2021 to March 2023 on Twitter, with associated metadata (e.g., tweet text, creation date, number of likes).

    Through a comparative analysis of TWIGMA with natural images and human artwork, we find that gen-AI images possess distinctive characteristics and exhibit, on average, lower variability when compared to their non-gen-AI counterparts. Additionally, we find that the similarity between a gen-AI image and human images (i) is correlated with the number of likes; and (ii) can be used to identify human images that served as inspiration for the gen-AI creations. Finally, we observe a longitudinal shift in the themes of AI-generated images on Twitter, with users increasingly sharing artistically sophisticated content such as intricate human portraits, whereas their interest in simple subjects such as natural scenes and animals has decreased. Our analyses and findings underscore the significance of TWIGMA as a unique data resource for studying AI-generated images.

    Note that, in accordance with Twitter's privacy and control policy, NO raw content from Twitter is included in this dataset; users need to retrieve the original Twitter content used for analysis via the Twitter id. In addition, users who want to access Twitter data should consult and closely follow the rules and regulations in the official Twitter developer policy at https://developer.twitter.com/en/developer-terms/policy.
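The May 2024 update note mentions a data type issue with the "id" column. The description does not say what the issue was; one common way long identifiers get damaged is a round trip through floating point, and the sketch below illustrates that failure mode and a string-based read that avoids it. The two-row CSV, the column name "likes", and the helper names are hypothetical.

```python
import csv
import io

# Hypothetical two-row CSV; the column names "id" and "likes" and the
# values are invented for illustration, not taken from TWIGMA.
RAW = "id,likes\n1234567890123456789,12\n1234567890123456790,7\n"


def read_ids_as_text(raw_csv):
    """Read the 'id' column as strings so no numeric conversion can alter it."""
    return [row["id"] for row in csv.DictReader(io.StringIO(raw_csv))]


def read_ids_via_float(raw_csv):
    """Mimic a lossy load: parse each id as a 64-bit float, then back to int."""
    return [int(float(row["id"])) for row in csv.DictReader(io.StringIO(raw_csv))]


safe = read_ids_as_text(RAW)
lossy = read_ids_via_float(RAW)
for original, damaged in zip(safe, lossy):
    status = "changed" if str(damaged) != original else "same"
    print(original, damaged, status)
```

A 64-bit float carries only 53 bits of integer precision, so 19-digit identifiers generally cannot survive the conversion; treating ids as text sidesteps the problem.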

  7. Twitter bot profiling

    • researchdata.smu.edu.sg
    • smu.edu.sg
    • +1 more
    pdf
    Updated May 31, 2023
    + more versions
    Cite
    Living Analytics Research Centre (2023). Twitter bot profiling [Dataset]. http://doi.org/10.25440/smu.12062706.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    SMU Research Data Repository (RDR)
    Authors
    Living Analytics Research Centre
    License

    http://rightsstatements.org/vocab/InC/1.0/

    Description

    This dataset comprises a set of Twitter accounts in Singapore that are used for social bot profiling research conducted by the Living Analytics Research Centre (LARC) at Singapore Management University (SMU). Here a bot is defined as a Twitter account that generates contents and/or interacts with other users automatically (at least according to human judgment). In this research, Twitter bots have been categorized into three major types:

    • Broadcast bot. This bot aims at disseminating information to a general audience by providing, e.g., benign links to news, blogs or sites. Such a bot is often managed by an organization or a group of people (e.g., bloggers).
    • Consumption bot. The main purpose of this bot is to aggregate content from various sources and/or provide update services (e.g., horoscope reading, weather update) for personal consumption or use.
    • Spam bot. This type of bot posts malicious content (e.g., to trick people by hijacking certain accounts or redirecting them to malicious sites), or aggressively promotes harmless but invalid/irrelevant content.

    This categorization is general enough to cater for new, emerging types of bot (e.g., chatbots can be viewed as a special type of broadcast bot).

    The dataset was collected from 1 January to 30 April 2014 via the Twitter REST and streaming APIs. Starting from popular seed users (i.e., users having many followers), their follow, retweet, and user mention links were crawled. The data collection proceeded by adding those followers/followees, retweet sources, and mentioned users who state Singapore in their profile location. Using this procedure, a total of 159,724 accounts were collected.

    To identify bots, the first step was to select active accounts that tweeted at least 15 times within the month of April 2014. These accounts were then manually checked and labelled, and 589 bots were found. As many more human users are expected in the Twitter population, the remaining accounts were randomly sampled and manually checked. With this, 1,024 human accounts were identified. In total, this results in 1,613 labelled accounts.

    Related Publication: R. J. Oentaryo, A. Murdopo, P. K. Prasetyo, and E.-P. Lim. (2016). On profiling bots in social media. Proceedings of the International Conference on Social Informatics (SocInfo’16), 92-109. Bellevue, WA. https://doi.org/10.1007/978-3-319-47880-7_6
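The activity filter described above (at least 15 tweets within April 2014) can be sketched as follows. This is only an illustration under assumptions: the `(account, timestamp)` record layout, the account names, and the function `active_accounts` are hypothetical and do not reflect the authors' actual pipeline.

```python
from collections import Counter
from datetime import datetime

MIN_TWEETS = 15  # activity threshold stated in the description

# Hypothetical (account, timestamp) pairs; invented for illustration.
tweets = (
    [("newsfeed_sg", datetime(2014, 4, d, 9, 0)) for d in range(1, 21)]    # 20 in April
    + [("casual_user", datetime(2014, 4, d, 18, 0)) for d in range(1, 6)]  # 5 in April
    + [("march_only", datetime(2014, 3, d, 12, 0)) for d in range(1, 31)]  # none in April
)


def active_accounts(records, year=2014, month=4, min_tweets=MIN_TWEETS):
    """Return accounts with at least `min_tweets` tweets in the given month."""
    counts = Counter(
        account for account, ts in records if ts.year == year and ts.month == month
    )
    return sorted(a for a, n in counts.items() if n >= min_tweets)


selected = active_accounts(tweets)
print(selected)

# The labelled totals reported in the description add up as stated.
labelled_total = 589 + 1024
print(labelled_total)
```

Only accounts passing this filter would go on to the manual labelling step that the description reports.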

  8. Data from: Twitter Big Data as A Resource For Exoskeleton Research: A...

    • search.dataone.org
    Updated Nov 8, 2023
    + more versions
    Cite
    Thakur, Nirmalya (2023). Twitter Big Data as A Resource For Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions [Dataset]. http://doi.org/10.7910/DVN/VPPTRF
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Thakur, Nirmalya
    Description

    Please cite the following paper when using this dataset: N. Thakur, “Twitter Big Data as a Resource for Exoskeleton Research: A Large-Scale Dataset of about 140,000 Tweets and 100 Research Questions,” Preprints, 2022, DOI: 10.20944/preprints202206.0383.v1

    Abstract

    Exoskeleton technology has been advancing rapidly in the recent past due to its multitude of applications and use cases in assisted living, military, healthcare, firefighting, and industries. With the projected increase in the diverse uses of exoskeletons in the next few years in these application domains and beyond, it is crucial to study, interpret, and analyze user perspectives, public opinion, reviews, and feedback related to exoskeletons, for which a dataset is necessary. The Internet of Everything era of today's living, characterized by people spending more time on the Internet than ever before, holds the potential for developing such a dataset by mining relevant web behavior data from social media communications, which have increased exponentially in the last few years. Twitter, one such social media platform, is highly popular amongst all age groups, who communicate on diverse topics including but not limited to news, current events, politics, emerging technologies, family, relationships, and career opportunities, via tweets, while sharing their views, opinions, perspectives, and feedback towards the same. Therefore, this work presents a dataset of about 140,000 Tweets related to exoskeletons that were mined over a period of 5 years, from May 21, 2017, to May 21, 2022. The tweets contain diverse forms of communications and conversations which communicate user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons.

    Instructions

    This dataset contains about 140,000 Tweets related to exoskeletons that were mined over a period of 5 years, from May 21, 2017, to May 21, 2022. The tweets contain diverse forms of communications and conversations which communicate user interests, user perspectives, public opinion, reviews, feedback, suggestions, etc., related to exoskeletons. The dataset contains only tweet identifiers (Tweet IDs), in line with Twitter's terms and conditions, which allow re-distribution of Twitter data only for research purposes. They need to be hydrated to be used. The process of retrieving a tweet's complete information (such as the text of the tweet, username, user ID, date and time, etc.) using its ID is known as the hydration of a tweet ID. The Hydrator application (link to download the application: https://github.com/DocNow/hydrator/releases and link to a step-by-step tutorial: https://towardsdatascience.com/learn-how-to-easily-hydrate-tweets-a0f393ed340e#:~:text=Hydrating%20Tweets) or any similar application may be used for hydrating this dataset.

    Data Description

    This dataset consists of 7 .txt files. The following shows the number of Tweet IDs and the date range (of the associated tweets) in each of these files.

    • Exoskeleton_TweetIDs_Set1.txt: 22945 Tweet IDs; date range July 20, 2021 to May 21, 2022
    • Exoskeleton_TweetIDs_Set2.txt: 19416 Tweet IDs; date range Dec 1, 2020 to July 19, 2021
    • Exoskeleton_TweetIDs_Set3.txt: 16673 Tweet IDs; date range April 29, 2020 to Nov 30, 2020
    • Exoskeleton_TweetIDs_Set4.txt: 16208 Tweet IDs; date range Oct 5, 2019 to Apr 28, 2020
    • Exoskeleton_TweetIDs_Set5.txt: 17983 Tweet IDs; date range Feb 13, 2019 to Oct 4, 2019
    • Exoskeleton_TweetIDs_Set6.txt: 34009 Tweet IDs; date range Nov 9, 2017 to Feb 12, 2019
    • Exoskeleton_TweetIDs_Set7.txt: 11351 Tweet IDs; date range May 21, 2017 to Nov 8, 2017

    Here, the last date for May is May 21, as it was the most recent date at the time of data collection. The dataset would be updated soon to incorporate more recent tweets.
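Before hydration, the ID files typically need to be read, de-duplicated, and split into batches. The sketch below shows one way to do that, assuming one Tweet ID per line in each .txt file (the description does not state the layout). The batch size of 100, the helper names, and the demo file are assumptions for illustration; the hydration call itself is left out because it depends on the tool or API used.

```python
import os
import tempfile

BATCH_SIZE = 100  # assumed batch size; check the limit of the tool or API you use


def read_tweet_ids(paths):
    """Read one Tweet ID per line from each file, skipping blanks and duplicates."""
    seen, ids = set(), []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                tweet_id = line.strip()
                if tweet_id and tweet_id not in seen:
                    seen.add(tweet_id)
                    ids.append(tweet_id)
    return ids


def batches(items, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Demo with a temporary stand-in file; the IDs below are made up.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "Exoskeleton_TweetIDs_demo.txt")
    with open(demo_path, "w", encoding="utf-8") as handle:
        # 250 distinct IDs followed by one repeated ID
        handle.write("\n".join(str(1000 + i) for i in range(250)) + "\n1000\n")
    tweet_ids = read_tweet_ids([demo_path])

chunks = list(batches(tweet_ids))
print(len(tweet_ids), [len(c) for c in chunks])
```

Passing all seven real filenames to `read_tweet_ids` would produce a single ordered list ready to hand to a hydration tool batch by batch.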

  9. Twitter Users Broken down By Country

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Twitter Users Broken down By Country [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
    Explore at:
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The US has historically been the target country for Twitter since its launch in 2006. This is the full breakdown of Twitter users by country.

  10. Twitter users in the United States 2019-2028

    • statista.com
    Updated Jul 30, 2025
    Cite
    Statista Research Department (2025). Twitter users in the United States 2019-2028 [Dataset]. https://www.statista.com/topics/3196/social-media-usage-in-the-united-states/
    Explore at:
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    United States
    Description

    The number of Twitter users in the United States was forecast to increase continuously between 2024 and 2028 by a total of 4.3 million users (+5.32 percent). After the ninth consecutive year of growth, the Twitter user base is estimated to reach 85.08 million users, a new peak, in 2028. Notably, the number of Twitter users has increased continuously over the past years.

    User figures, shown here for the platform Twitter, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

    Find more key insights for the number of Twitter users in countries like Canada and Mexico.

  11. Tweets Dataset - Top 20 most followed users in Twitter social platform

    • dataverse.harvard.edu
    Updated Aug 18, 2017
    Cite
    Raad Bin Tareaf (2017). Tweets Dataset - Top 20 most followed users in Twitter social platform [Dataset]. http://doi.org/10.7910/DVN/JBXKFD
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 18, 2017
    Dataset provided by
    Harvard Dataverse
    Authors
    Raad Bin Tareaf
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This dataset was gathered by crawling Twitter's REST API using the Python library tweepy 3. It contains the tweets of the 20 most popular Twitter users (those with the most followers); retweets are excluded. These accounts belong to public figures, such as Katy Perry and Barack Obama; platforms, such as YouTube and Instagram; and television channels and shows, e.g., CNN Breaking News and The Ellen Show.

    Consequently, the dataset contains a mix of relatively structured tweets, tweets written in a formal and informative manner, and completely unstructured tweets written in a colloquial style. Unfortunately, the geocoordinates were not available for those tweets.

    This dataset has been used to generate a research paper under the title "Machine Learning Techniques for Anomalies Detection in Post Arrays".

    Crawled attributes are: Author (Twitter User), Content (Tweet), Date_Time, id (Twitter User ID), language (Tweet Language), Number_of_Likes, Number_of_Shares.

    Overall: 52543 tweets of the top 20 users on Twitter.

    Screen_Name      #Tweets   Time span (in days)
    TheEllenShow     3,147     662
    jimmyfallon      3,123     1,231
    ArianaGrande     3,104     613
    YouTube          3,077     411
    KimKardashian    2,939     603
    katyperry        2,924     1,598
    selenagomez      2,913     2,266
    rihanna          2,877     1,557
    BarackObama      2,863     849
    britneyspears    2,776     1,548
    instagram        2,577     456
    shakira          2,530     1,850
    Cristiano        2,507     2,407
    jtimberlake      2,478     2,491
    ladygaga         2,329     894
    Twitter          2,290     2,593
    ddlovato         2,217     741
    taylorswift13    2,029     2,091
    justinbieber     2,000     664
    cnnbrk           1,842     183

  12. Social media profile growth, engagement rate, and reach

    • data.sugarlandtx.gov
    xlsx
    Updated Jan 3, 2024
    Cite
    Communications and Community Engagement (2024). Social media profile growth, engagement rate, and reach [Dataset]. https://data.sugarlandtx.gov/dataset/social-media-profile-growth-engagement-rate-and-reach
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jan 3, 2024
    Dataset authored and provided by
    Communications and Community Engagement
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    • Profile growth: the growth on our social platforms, showing where and when we're gaining followers.
    • Engagement rate: a ratio of how many people interacted with our posts, based on when users are usually online.
    • Reach: the number of feeds our posts appeared in (this doesn't mean people interacted with the post).

  13. Data from: Database of Indian Social Media Influencers on Twitter

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 11, 2023
    + more versions
    Cite
    Arya, Arshia; De, Soham; Mishra, Dibyendu; Shekhawat, Gazal; Sharma, Ankur; Panda, Anmol; M Lalani, Faisal; Singh, Parantak; Kommiya Mothilal, Ramaravind; Grover, Rynaa; Nishal, Sachita; Dash, Saloni; Rashid Shora, Shehla; Akbar, Syeda Zainab; Pal, Joyojeet (2023). Database of Indian Social Media Influencers on Twitter [Dataset]. http://doi.org/10.7910/DVN/T2CFHO
    Explore at:
    Dataset updated
    Nov 11, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Arya, Arshia; De, Soham; Mishra, Dibyendu; Shekhawat, Gazal; Sharma, Ankur; Panda, Anmol; M Lalani, Faisal; Singh, Parantak; Kommiya Mothilal, Ramaravind; Grover, Rynaa; Nishal, Sachita; Dash, Saloni; Rashid Shora, Shehla; Akbar, Syeda Zainab; Pal, Joyojeet
    Description

    Databases of highly networked individuals have been indispensable in studying narratives and influence on social media. To support studies on Twitter in India, we present a systematically categorized database of accounts of influence on Twitter in India, identified and annotated through an iterative process of friends, networks, and self-described profile information, verified manually. We built an initial set of accounts based on the friend network of a seed set of accounts chosen for real-world renown in various fields, then snowballed "friends of friends" multiple times, and rank-ordered individuals based on the number of in-group connections and overall followers. We then manually classified identified accounts under the categories of entertainment, sports, business, government, institutions, journalism, and civil society accounts that have independent standing outside of social media, as well as a category of "digital first", referring to accounts that derive their primary influence from online activity. Overall, we annotated 11580 unique accounts across all categories. The database is useful for studying various questions related to the role of influencers in polarisation, misinformation, extreme speech, political discourse, etc.
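The snowball-and-rank procedure described above can be sketched roughly as follows. This is an illustration under assumptions, not the authors' implementation: the friend graph, follower counts, and account names are invented, and "in-group connections" is interpreted here as the number of group members who list the account as a friend.

```python
from collections import deque

# Hypothetical friend graph (account -> accounts it follows) and follower
# counts; all names and numbers are invented for illustration.
FRIENDS = {
    "seed_actor": ["journalist_a", "minister_b"],
    "journalist_a": ["minister_b", "ngo_c"],
    "minister_b": ["journalist_a"],
    "ngo_c": ["journalist_a", "creator_d"],
    "creator_d": [],
}
FOLLOWERS = {"seed_actor": 900, "journalist_a": 500, "minister_b": 800,
             "ngo_c": 120, "creator_d": 300}


def snowball(seeds, friends, rounds):
    """Expand the seed set by following friend links for `rounds` hops."""
    group = set(seeds)
    frontier = deque(seeds)
    for _ in range(rounds):
        next_frontier = deque()
        while frontier:
            for friend in friends.get(frontier.popleft(), []):
                if friend not in group:
                    group.add(friend)
                    next_frontier.append(friend)
        frontier = next_frontier
    return group


def rank(group, friends, followers):
    """Order accounts by in-group connections received, then by followers."""
    in_group = {account: 0 for account in group}
    for source in group:
        for target in friends.get(source, []):
            if target in group:
                in_group[target] += 1
    return sorted(group, key=lambda a: (-in_group[a], -followers.get(a, 0)))


group = snowball(["seed_actor"], FRIENDS, rounds=2)
ranking = rank(group, FRIENDS, FOLLOWERS)
print(ranking)
```

With two rounds, `creator_d` stays outside the group because it is three hops from the seed; the manual classification step described above would follow the ranking.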

  14. Twitter Dataset Based on Depressive Words

    • kaggle.com
    zip
    Updated May 20, 2021
    + more versions
    Cite
    Saurabh Shahane (2021). Twitter Dataset Based on Depressive Words [Dataset]. https://www.kaggle.com/saurabhshahane/twitter-dataset-based-on-depressive-words
    Explore at:
    Available download formats: zip (119409375 bytes)
    Dataset updated
    May 20, 2021
    Authors
    Saurabh Shahane
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Context

    Right now we see that depression is one of the most common problems in our society. Most of the time, people die by suicide because of depression. To date, there is no proper laboratory test for detecting depression. Generally, doctors detect depression by asking knowledge-based questions. On the other hand, a good number of people now use social media platforms, where they share their daily experiences, emotions, and other activities with their friends. Twitter is one of the common social platforms and is also popular for data collection. I collected these datasets from Twitter based on some depressive words. I hope that these Twitter datasets will help researchers to detect depression more precisely.

    Content

    Raw data from twitter

    Acknowledgements

    Chowdhury, Sawrav (2020), “Raw Twitter Datasets Based on Depressive Words”, Mendeley Data, V1, doi: 10.17632/4rd637tddf.1

  15. sentiment140

    • tensorflow.org
    • opendatalab.com
    • +2 more
    Updated Dec 23, 2022
    Cite
    (2022). sentiment140 [Dataset]. https://www.tensorflow.org/datasets/catalog/sentiment140
    Explore at:
    Dataset updated
    Dec 23, 2022
    Description

    Sentiment140 allows you to discover the sentiment of a brand, product, or topic on Twitter.

    The data is a CSV with emoticons removed. Data file format has 6 fields:

    1. the polarity of the tweet (0 = negative, 2 = neutral, 4 = positive)
    2. the id of the tweet (2087)
    3. the date of the tweet (Sat May 16 23:58:44 UTC 2009)
    4. the query (lyx). If there is no query, then this value is NO_QUERY.
    5. the user that tweeted (robotickilldozr)
    6. the text of the tweet (Lyx is cool)

    For more information, refer to the paper Twitter Sentiment Classification with Distant Supervision at https://cs.stanford.edu/people/alecmgo/papers/TwitterDistantSupervision09.pdf
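The six-field layout above can be read with Python's standard csv module. The sketch below is illustrative only: it assumes a header-less, comma-separated file with quoted fields, and the sample row combines the example values from the field list with an assumed polarity of 4 (the list gives no example polarity). The names `FIELDS`, `POLARITY_LABELS`, and `parse_rows` are hypothetical.

```python
import csv
import io

FIELDS = ["polarity", "id", "date", "query", "user", "text"]
POLARITY_LABELS = {0: "negative", 2: "neutral", 4: "positive"}

# One illustrative row built from the example values in the field list;
# the polarity value 4 is an assumption made for this sketch.
SAMPLE = (
    '"4","2087","Sat May 16 23:58:44 UTC 2009",'
    '"lyx","robotickilldozr","Lyx is cool"\n'
)


def parse_rows(raw_csv):
    """Map each header-less CSV row onto the six documented fields."""
    rows = []
    for record in csv.reader(io.StringIO(raw_csv)):
        row = dict(zip(FIELDS, record))
        row["polarity"] = int(row["polarity"])
        row["sentiment"] = POLARITY_LABELS.get(row["polarity"], "unknown")
        # Treat the documented NO_QUERY marker as "no query".
        if row["query"] == "NO_QUERY":
            row["query"] = None
        rows.append(row)
    return rows


parsed = parse_rows(SAMPLE)
print(parsed[0]["user"], parsed[0]["sentiment"], parsed[0]["text"])
```

The tweet id is deliberately left as a string so that long identifiers are not altered by numeric conversion.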

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('sentiment140', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.

  16. Twitter vs. Newsletter Impact

    • kaggle.com
    Updated Sep 18, 2017
    Cite
    Rachael Tatman (2017). Twitter vs. Newsletter Impact [Dataset]. https://www.kaggle.com/rtatman/twitter-vs-newsletter/metadata
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 18, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rachael Tatman
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Context:

    There are lots of really cool datasets getting added to Kaggle every day, and as part of my job I want to help people find them. I’ve been tweeting about datasets on my personal Twitter account @rctatman and also releasing a weekly newsletter of interesting datasets.

    I wanted to know which method was more effective at getting the word out about new datasets: Twitter or the newsletter?

    Content:

    This dataset contains two .csv files. One has information on the impact of tweets with links to datasets, while the other has information on the impact of the newsletter.

    Twitter:

    The Twitter .csv has the following information:

    • month: The month of the tweet (1-12)
    • day: The day of the tweet (1-31)
    • hour: The hour of the tweet (1-24)
    • impressions: The number of impressions the tweet got
    • engagement: The number of total engagements
    • clicks: The number of URL clicks

    Fridata Newsletter:

    The Fridata .csv has the following information:

    • date: The Date the newsletter was sent out
    • month: The Month the newsletter was sent out (1-12)
    • day: The day the newsletter was sent out (1-31)
    • # of dataset links: How many links were in the newsletter
    • recipients: How many people received the email with the newsletter
    • total opens: How many times the newsletter was opened
    • unique opens: How many individuals opened the newsletter
    • total clicks: The total number of clicks on the newsletter
    • unique clicks: (unsure; provided by Tinyletter)
    • notes: notes on the newsletter

    Acknowledgements:

    This dataset was collected by the uploader, Rachael Tatman. It is released here under a CC-BY-SA license.

    Inspiration:

    • Which format receives more views?
    • Which format receives more clicks?
    • Which receives more clicks/view?
    • What’s the best time of day to send a tweet?
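
    As a minimal sketch of how the last question could be approached, the snippet below aggregates clicks per impression by hour of day. The column names are taken from the Twitter .csv description above; the sample rows and the helper name clicks_per_impression_by_hour are invented for illustration and are not part of the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in the shape of the Twitter .csv described above.
# The column names come from the description; every value is invented.
SAMPLE_CSV = """month,day,hour,impressions,engagement,clicks
9,1,9,1000,50,20
9,1,17,2000,80,60
9,2,9,500,10,5
9,2,17,1500,90,45
"""


def clicks_per_impression_by_hour(csv_text):
    """Return {hour: total clicks / total impressions} for the given CSV text."""
    clicks = defaultdict(int)
    impressions = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        hour = int(row["hour"])
        clicks[hour] += int(row["clicks"])
        impressions[hour] += int(row["impressions"])
    # Skip hours with no impressions to avoid dividing by zero.
    return {h: clicks[h] / impressions[h] for h in impressions if impressions[h] > 0}


rates = clicks_per_impression_by_hour(SAMPLE_CSV)
# The hour with the highest click rate in the sample data.
best_hour = max(rates, key=rates.get)
print(rates, best_hour)
```

    Summing clicks and impressions before dividing weights each hour by its traffic, which is one reasonable choice; averaging per-tweet rates instead would give low-impression tweets more influence.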
  17. A Twitter Dataset for Spatial Infectious Disease Surveillance

    • zenodo.org
    • data.niaid.nih.gov
    csv, txt, zip
    Updated Jan 6, 2021
    Cite
    Roberto C.S.N.P. Souza; Manoel Horta Ribeiro; Wagner Meira Jr.; Renato M. Assuncao; Walter dos Santos (2021). A Twitter Dataset for Spatial Infectious Disease Surveillance [Dataset]. http://doi.org/10.5281/zenodo.2541440
    Available download formats: csv, txt, zip
    Dataset updated
    Jan 6, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Roberto C.S.N.P. Souza; Manoel Horta Ribeiro; Wagner Meira Jr.; Renato M. Assuncao; Walter dos Santos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dengue is a mosquito-borne viral disease that infects millions of people every year, especially in developing countries. Some of the main challenges in fighting the disease are reporting risk indicators and rapidly detecting outbreaks. Traditional surveillance systems rely on passive reporting from health-care facilities, often ignoring human mobility and locating each individual by their home address. Yet geolocated data are becoming commonplace in social media, which is widely used as a means to discuss a wide variety of health topics, including users' health status. In this dataset paper, we make available two large collections of dengue-related labeled Twitter data. One is a set of tweets available through the Streaming API using the keywords dengue and aedes from 2010 to 2016. The other is the set of all geolocated tweets in Brazil during 2015 (also available through the Streaming API). We detail the process of collecting the tweets and labeling each tweet that contains dengue-related keywords into one of five categories: personal experience, information, opinion, campaign, and joke. This dataset can be useful for developing models for spatial disease surveillance, and also for scenarios such as understanding health-related content in a language other than English and studying human mobility.

  18. Why Do People Use Twitter?

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Why Do People Use Twitter? [Dataset]. https://www.searchlogistics.com/learn/statistics/twitter-user-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    One of the biggest advantages of Twitter is the speed at which information can be passed around. People use Twitter primarily to get news and for entertainment. This is the breakdown of why people use Twitter today.

  19. Data from: COVID-19 Twitter Dataset with Latent Topics, Sentiments and...

    • catalog.midasnetwork.us
    csv, zip
    + more versions
    Cite
    Ajay Vishwanath; Raj Gupta; Yinping Yang, COVID-19 Twitter Dataset with Latent Topics, Sentiments and Emotions Attributes [Dataset]. http://doi.org/10.3886/E120321
    Available download formats: csv, zip
    Dataset provided by
    MIDAS COORDINATION CENTER
    Authors
    Ajay Vishwanath; Raj Gupta; Yinping Yang
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    Country, Region
    Variables measured
    media, Viruses, disease, COVID-19, pathogen, Homo sapiens, social media, host organism, infectious disease, viral Infectious disease, and 3 more
    Dataset funded by
    National Institute of General Medical Sciences
    Description

    The dataset covers public conversation on Twitter surrounding the COVID-19 pandemic. The authors annotated seventeen latent semantic attributes for each public tweet using natural language processing techniques and machine-learning-based algorithms. The latent semantic attributes include: 1) ten attributes indicating the tweet’s relevance to ten detected topics, 2) five quantitative attributes indicating the intensity of valence (i.e., unpleasantness/pleasantness) and of four primary emotions: fear, anger, sadness and joy, and 3) two qualitative attributes indicating the sentiment category and the most dominant emotion category, respectively. The data is accessible to people who have an OPEN ICPSR account.

  20. Data from: Quantifying crowd size with mobile phone and Twitter data

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated May 29, 2022
    Cite
    Federico Botta; Helen Susannah Moat; Tobias Preis (2022). Data from: Quantifying crowd size with mobile phone and Twitter data [Dataset]. http://doi.org/10.5061/dryad.1rk60
    Available download formats: zip
    Dataset updated
    May 29, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Federico Botta; Helen Susannah Moat; Tobias Preis
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Being able to infer the number of people in a specific area is of extreme importance for the avoidance of crowd disasters and to facilitate emergency evacuations. Here, using a football stadium and an airport as case studies, we present evidence of a strong relationship between the number of people in restricted areas and activity recorded by mobile phone providers and the online service Twitter. Our findings suggest that data generated through our interactions with mobile phone networks and the Internet may allow us to gain valuable measurements of the current state of society.

Cite
Ayush Kumar Singh (2024). Twitter Dataset February 2024 [Dataset]. https://www.kaggle.com/datasets/fastcurious/twitter-dataset-february-2024
Twitter Dataset February 2024

3382 tweets scraped from Twitter.

Dataset updated
Mar 4, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ayush Kumar Singh
License

MIT License: https://opensource.org/licenses/MIT
License information was derived automatically

Description

Tweets scraped with all possible data points provided by Twitter in each tweet. For data extraction or scraping, contact me on Telegram: @akaseobhw

All datapoints present for each tweet.

Each entry in the dataset represents a tweet along with various attributes such as the tweet's ID, URL, text content, retweet count, reply count, like count, quote count, view count, creation date, language, and more. Additionally, there are details about the tweet's author, including their username, profile URL, follower count, following count, profile picture, cover picture, description, location, creation date, and more.

Here's a brief description of the key fields present in each tweet entry:

  • type: Indicates the type of data; in this case, a tweet.
  • id: Unique identifier for the tweet.
  • url: URL of the tweet.
  • twitterUrl: Twitter URL of the tweet.
  • text: Text content of the tweet.
  • retweetCount: Number of retweets.
  • replyCount: Number of replies.
  • likeCount: Number of likes (favorites).
  • quoteCount: Number of times the tweet has been quoted.
  • viewCount: Number of views.
  • createdAt: Date and time when the tweet was created.
  • lang: Language of the tweet.
  • quoteId: ID of the quoted tweet, if this tweet is a quote.
  • bookmarkCount: Number of times the tweet has been bookmarked.
  • isReply: Indicates whether the tweet is a reply to another tweet.
  • author: Information about the author of the tweet.
    • userName: Username of the author.
    • url: URL of the author's profile.
    • followers: Number of followers of the author.
    • following: Number of accounts the author is following.
    • profilePicture: URL of the author's profile picture.
    • coverPicture: URL of the author's cover picture.
    • description: Description or bio of the author.
    • location: Location of the author.
    • createdAt: Date and time when the author's account was created.
  • entities: Entities present in the tweet, such as hashtags, symbols, URLs, and user mentions.
  • isRetweet: Indicates whether the tweet is a retweet.
  • isQuote: Indicates whether the tweet is a quote.
  • quote: Information about the quoted tweet, if this tweet is a quote.
  • media: Information about any media (such as images or videos) attached to the tweet.

This dataset can be analyzed to gain insights into trends, sentiments, and user behavior on Twitter. You can use Python libraries like pandas to load this dataset and perform various analyses and visualizations.
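
A minimal sketch of that workflow is shown below. It assumes the entries are available as JSON-like records with the nested author object described above; the description does not state the file format, so the three sample records and all of their values are invented for illustration. Only the field names come from the list above.

```python
import pandas as pd

# Hypothetical tweet entries using field names from the description above.
# All values are invented; the real dataset may be shaped differently.
sample_tweets = [
    {"type": "tweet", "id": "1", "text": "first example", "lang": "en",
     "likeCount": 10, "retweetCount": 2, "viewCount": 100, "isReply": False,
     "author": {"userName": "alice", "followers": 500}},
    {"type": "tweet", "id": "2", "text": "second example", "lang": "en",
     "likeCount": 30, "retweetCount": 5, "viewCount": 600, "isReply": True,
     "author": {"userName": "bob", "followers": 1200}},
    {"type": "tweet", "id": "3", "text": "third example", "lang": "hi",
     "likeCount": 5, "retweetCount": 0, "viewCount": 50, "isReply": False,
     "author": {"userName": "carol", "followers": 80}},
]

# json_normalize flattens the nested author object into dotted columns
# such as "author.userName" and "author.followers".
df = pd.json_normalize(sample_tweets)

# Likes per view as a simple engagement measure for each tweet.
df["likes_per_view"] = df["likeCount"] / df["viewCount"]

# Average like count per language.
mean_likes_by_lang = df.groupby("lang")["likeCount"].mean()

print(df[["id", "author.userName", "likes_per_view"]])
print(mean_likes_by_lang)
```

Flattening first keeps the author fields available as ordinary columns, so they can be filtered or grouped in the same way as the tweet-level counts.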
