99 datasets found
  1. Number of global social network users 2017-2028

    • statista.com
    • de.statista.com
    + more versions
    Cite
    Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How many people use social media?

    Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

    Who uses social media?

    Social networking is one of the most popular digital activities worldwide, and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as less developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

    How much time do people spend on social media?

    Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. Internet users in Latin America spent the most time per day on social media.

    What are the most popular social media platforms?

    Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
    
  2. Global social network penetration 2019-2028

    • statista.com
    • de.statista.com
    Cite
    Stacy Jo Dixon, Global social network penetration 2019-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    The global social media penetration rate was forecast to increase continuously between 2024 and 2028 by a total of 11.6 percentage points (+18.19 percent). After the ninth consecutive year of growth, the penetration rate is estimated to reach 75.31 percent in 2028, a new peak. Notably, the social media penetration rate has increased continuously over the past years.
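
    The figures quoted in this description can be cross-checked with a little arithmetic. The sketch below assumes that 11.6 is an absolute increase in percentage points and that 75.31 is the 2028 rate in percent; the implied 2024 baseline is derived here and is not stated in the source.

```python
# Cross-check of the forecast figures quoted in the description.
# Assumption: 11.6 is an absolute change in percentage points and
# 75.31 is the 2028 penetration rate in percent.
rate_2028 = 75.31
absolute_increase = 11.6

# Implied 2024 baseline (derived, not stated in the source).
rate_2024 = rate_2028 - absolute_increase

# Relative growth over the baseline, in percent.
relative_increase = absolute_increase / rate_2024 * 100

print(f"implied 2024 rate: {rate_2024:.2f}")
print(f"relative increase: {relative_increase:.2f} percent")
```

    The relative increase computed from these rounded inputs lands close to the quoted +18.19 percent; a small difference is expected because the published figures are themselves rounded.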

  3. Leading social media usage reasons worldwide 2024

    • statista.com
    • de.statista.com
    Cite
    Stacy Jo Dixon, Leading social media usage reasons worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    A global survey conducted in the third quarter of 2024 found that the main reason for using social media was to keep in touch with friends and family, with just over half (50.8 percent) of social media users saying this was their main reason for using online networks. Overall, 39 percent of social media users said that filling spare time was their main reason for using social media platforms, whilst 34.5 percent of respondents said they used it to read news stories. Fewer than one in five users were on social platforms to follow celebrities and influencers.

    The most popular social network

    Facebook dominates the social media landscape. The world's most popular social media platform turned 20 in February 2024, and it continues to lead the way in terms of user numbers. As of February 2025, the social network had over three billion global users. YouTube, Instagram, and WhatsApp follow, but none of these well-known brands can surpass Facebook’s audience size. Moreover, as of the final quarter of 2023, there were almost four billion Meta product users.

    Ever-evolving social media usage

    Social media remains largely free to use; however, companies have been encouraging users to become paid subscribers to reduce dependence on advertising profits. Meta Verified entices users by offering a blue verification badge and proactive account protection, among other things. X (formerly Twitter), Snapchat, and Reddit also offer users the chance to upgrade their social media accounts for a monthly fee.
    
  4. Global social media subscriptions comparison 2023

    • statista.com
    • de.statista.com
    + more versions
    Cite
    Stacy Jo Dixon, Global social media subscriptions comparison 2023 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Social media companies are starting to offer users the option to subscribe to their platforms in exchange for monthly fees. Until recently, social media has been predominantly free to use, with tech companies relying on advertising as their main revenue generator. However, advertising revenues have been dropping following the COVID-induced boom. As of July 2023, Meta Verified is the most costly of the subscription services, setting users back almost 15 U.S. dollars per month on iOS or Android. Twitter Blue costs between eight and 11 U.S. dollars per month and gives users the blue check mark, the ability to edit tweets, and NFT profile pictures. Snapchat+, drawing in four million users as of the second quarter of 2023, boasts a Story re-watch function, custom app icons, and a Snapchat+ badge.

  5. Twitter user data

    • kaggle.com
    zip
    Updated Aug 23, 2020
    Cite
    BARKHA VERMA (2020). Twitter user data [Dataset]. https://www.kaggle.com/barkhaverma/twitter-user-data
    Available download formats: zip (3163744 bytes)
    Dataset updated
    Aug 23, 2020
    Authors
    BARKHA VERMA
    Description

    Context

    A Twitter dataset composed of 20,000 rows, Twitter User Data includes the following information: user name, random tweet, account profile, image, and location information.

    Content

    The dataset contains the following fields:

    unit_id: a unique id for user

    golden: whether the user was included in the gold standard for the model; TRUE or FALSE

    unit_state: state of the observation; one of finalized (for contributor-judged) or golden (for gold standard observations)

    trusted_judgments: number of trusted judgments (int); always 3 for non-golden, and what may be a unique id for gold standard observations

    last_judgment_at: date and time of last contributor judgment; blank for gold standard observations

    gender: one of male, female, or brand (for non-human profiles)

    gender:confidence: a float representing confidence in the provided gender

    profile_yn: "no" here seems to mean that the profile was meant to be part of the dataset but was not available when contributors went to judge it

    profile_yn:confidence: confidence in the existence/non-existence of the profile

    created: date and time when the profile was created

    description: the user's profile description

    fav_number: number of tweets the user has favorited

    gender_gold: if the profile is golden, what is the gender?

    link_color: the link color on the profile, as a hex value

    name: the user's name

    profile_yn_gold: whether the profile y/n value is golden

    profileimage: a link to the profile image

    retweet_count: number of times the user has retweeted (or possibly, been retweeted)

    sidebar_color: color of the profile sidebar, as a hex value

    text: text of a random one of the user's tweets

    tweet_coord: if the user has location turned on, the coordinates as a string with the format "[latitude, longitude]"

    tweet_count: number of tweets that the user has posted

    tweet_created: when the random tweet (in the text column) was created

    tweet_id: the tweet id of the random tweet

    tweet_location: location of the tweet; seems to not be particularly normalized

    user_timezone: the timezone of the user
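
    The tweet_coord field described above is stored as a string in the format "[latitude, longitude]", so it needs parsing before any geographic use. A minimal sketch, assuming the string is valid JSON when present and blank otherwise; the function name parse_tweet_coord and the sample values are illustrative and not part of the dataset:

```python
import json
from typing import Optional, Tuple


def parse_tweet_coord(raw: str) -> Optional[Tuple[float, float]]:
    """Parse a tweet_coord string such as "[40.7, -74.0]" into (lat, lon).

    Returns None for blank or malformed values.
    """
    if not raw or not raw.strip():
        return None
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Expect exactly two elements: latitude and longitude.
    if not isinstance(value, list) or len(value) != 2:
        return None
    try:
        lat, lon = float(value[0]), float(value[1])
    except (TypeError, ValueError):
        return None
    return lat, lon


# Illustrative inputs only; not rows from the dataset.
print(parse_tweet_coord("[40.7128, -74.006]"))
print(parse_tweet_coord(""))
```

    Returning None instead of raising keeps the parser usable on columns where most users have location turned off.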

    Acknowledgements

    https://data.world/data-society/twitter-user-data

  6. Facebook Complete Stock Data[2012 - 2020][Latest]

    • kaggle.com
    zip
    Updated Aug 19, 2020
    Cite
    Aayush Mishra (2020). Facebook Complete Stock Data[2012 - 2020][Latest] [Dataset]. https://www.kaggle.com/aayushmishra1512/facebook-complete-stock-data2012-2020latest
    Available download formats: zip (40052 bytes)
    Dataset updated
    Aug 19, 2020
    Authors
    Aayush Mishra
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Facebook is a company that literally every kid is aware of. It's a household name. People from various age groups are on this social media website. It has helped many people connect with others and has also earned some investors a good amount of money. This data set contains the details of the stock of Facebook Inc.

    Content

    This data set has 7 columns with all the necessary values, such as the opening price of the stock, its closing price, its daily high and much more. It has date-wise data of the stock from 2012 to August 2020.

  7. Social Media Extremism Detection Dataset

    • kaggle.com
    zip
    Updated Nov 23, 2025
    Cite
    Aditya Suresh (2025). Social Media Extremism Detection Dataset [Dataset]. https://www.kaggle.com/datasets/adityasureshgithub/digital-extremism-detection-curated-dataset
    Available download formats: zip (121048 bytes)
    Dataset updated
    Nov 23, 2025
    Authors
    Aditya Suresh
    License

    https://cdla.io/permissive-1-0/

    Description

    NEW UPDATE:

    Show your skills off in the Social Media Extremism Challenge @ https://www.kaggle.com/competitions/social-media-extremism-detection-challenge! Try your luck at tackling this challenging classification problem! After the competition is completed, we will be adding 200+ hand-labelled entries to this dataset so stay tuned!

    We would like to thank Assistant Professor Leilani H. Gilpin (UC Santa Cruz) and the AIEA Lab for their guidance and support in the development of this dataset. —*Aditya Suresh, Anthony Lu, Vishnu Iyer*

    About this data: Social media has seen a rise in the quantity and intensity of extremist content across many services. With cases such as the various white supremacist movements across the world, recruitment for terrorist organizations through affiliated accounts, and a general sense of hate emerging through the modern era of polarization, it is increasingly vital to recognize these patterns and adequately combat the harms of digital extremism on a global scale.

    Citations: Our dataset would not have been possible without the aid of an already preexisting dataset found on Kaggle, Version 1 of "Hate Speech Detection curated Dataset🤬" by Alban Nyantudre in 2023. The link can be found here: https://www.kaggle.com/datasets/waalbannyantudre/hate-speech-detection-curated-dataset/data. Accessed in 2025, it was truly essential to our work. With over 400,000 messages of real, cleaned posts, we would not have been able to source and label our data points without this crucial resource.

    Classification: Our team hand-labelled nearly 3,000 pieces of data from our sourced database of posts, sorting every one of them under one of two blanket tags, "EXTREMIST" or "NON_EXTREMIST." Because many messages rely on context to spread harmful rhetoric, we followed a general rule of classifying messages as extremist so long as they "provoked harm to a person or a group of people, whether it be through advocacy for violence, discrimination, or other hurtful sentiments, based off of a characteristic of the group."

    Value of the data: This dataset can be utilized to create extremist sentiment analysis systems and machine learning algorithms, as it reflects on current linguistics, as stated by the source material for the data points themselves. In addition, it can be used as a benchmark for comparing with other extremism datasets and other extremist sentiment analysis systems.

    Potential Errors: Although we feel very confident in our own labeling ability, some data points may be labelled incorrectly, because these data points lack quantifiable identifiers and human error is therefore possible. We do not believe this occurs often, but in the interest of full transparency, it is an issue that we endeavor to resolve in subsequent updates.

  8. Geolytica POIData.xyz Points of Interest (POI) Geo Data - Australia

    • datarade.ai
    .csv
    Updated Jul 5, 2021
    + more versions
    Cite
    Geolytica (2021). Geolytica POIData.xyz Points of Interest (POI) Geo Data - Australia [Dataset]. https://datarade.ai/data-products/geolytica-poidata-xyz-points-of-interest-poi-geo-data-aus-geolytica
    Available download formats: .csv
    Dataset updated
    Jul 5, 2021
    Dataset authored and provided by
    Geolytica
    Area covered
    Australia
    Description

    Point-of-interest (POI) is defined as a physical entity (such as a business) in a geo location (point) which may be (of interest).

    We strive to provide the most accurate, complete and up to date point of interest datasets for all countries of the world. The Australian POI Dataset is one of our worldwide POI datasets with over 98% coverage.

    This is our process flow:

    Our machine learning systems continuously crawl for new POI data
    Our geoparsing and geocoding calculates their geo locations
    Our categorization systems cleanup and standardize the datasets
    Our data pipeline API publishes the datasets on our data store
    

    POI Data is in a constant flux - especially so during times of drastic change such as the Covid-19 pandemic.

    Every minute worldwide on an average day over 200 businesses will move, over 600 new businesses will open their doors and over 400 businesses will cease to exist.

    In today's interconnected world, of the approximately 200 million POIs worldwide, over 94% have a public online presence. As a new POI comes into existence its information will appear very quickly in location based social networks (LBSNs), other social media, pictures, websites, blogs, press releases. Soon after that, our state-of-the-art POI Information retrieval system will pick it up.

    We offer our customers perpetual data licenses for any dataset representing this ever changing information, downloaded at any given point in time. This makes our company's licensing model unique in the current Data as a Service - DaaS Industry. Our customers don't have to delete our data after the expiration of a certain "Term", regardless of whether the data was purchased as a one time snapshot, or via a recurring payment plan on our data update pipeline.

    The main differentiators between us vs the competition are our flexible licensing terms and our data freshness.

    The core attribute coverage for Australia is as follows:

    POI field             Data coverage (%)
    poi_name              100
    brand                 13
    poi_tel               49
    formatted_address     100
    main_category         94
    latitude              100
    longitude             100
    neighborhood          3
    source_url            55
    email                 10
    opening_hours         41
    building_footprint    60

    The dataset may be viewed online at https://store.poidata.xyz/au and a data sample may be downloaded at https://store.poidata.xyz/datafiles/au_sample.csv

  9. Planned changes in use of selected social media for organic marketing...

    • statista.com
    • de.statista.com
    Cite
    Christopher Ross, Planned changes in use of selected social media for organic marketing worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Christopher Ross
    Description

    During a January 2024 global survey among marketers, nearly 60 percent reported plans to increase their organic use of YouTube for marketing purposes in the following 12 months. LinkedIn and Instagram followed, respectively mentioned by 57 and 56 percent of the respondents intending to use them more. According to the same survey, Facebook was the most important social media platform for marketers worldwide.

  10. Jacksepticeye Tweets

    • kaggle.com
    zip
    Updated Dec 27, 2022
    Cite
    The Devastator (2022). Jacksepticeye Tweets [Dataset]. https://www.kaggle.com/datasets/thedevastator/engagement-reach-and-popularity-of-jacksepticeye
    Available download formats: zip (1293508 bytes)
    Dataset updated
    Dec 27, 2022
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Engagement, Reach, and Popularity of Jacksepticeye Tweets

    An Insight into Social Media Interaction

    By Twitter [source]

    About this dataset

    This dataset provides an insight into the reach and impact of Jacksepticeye's tweets. With curated content covering everything from gaming to life reflections, these tweets offer a snapshot not only of his global popularity, but also his ability to engage with an audience and ignite conversation. From each tweet, you can learn data points like its content, the number of likes it received, which replies popped up in response, how many times it was retweeted or marked as a favorite, and the overall relevance of that particular tweet in terms of its contribution to conversations worldwide. This comprehensive dataset is a great opportunity to explore the power behind Jacksepticeye's social media presence!

    How to use the dataset

    This dataset is in csv format and contains information about different tweets such as their content and the response they received from audiences in terms of likes, retweets and other measures. The following columns are included:

    • Tweet ID: A unique identifier for each tweet
    • Tweet content: The text contained within a tweet
    • Likes: Number of times a user has interacted with a specific tweet by pressing the “like” button
    • Replies: Number of direct replies to the original tweet
    • Re-Tweets: Number of times users have shared/re-tweeted a specific tweet
    • Retweeted : Indicates whether or not it was retweeted by someone else
    • Relevance : A measure on how relevant this conversation was at that particular time

    This data can be used for an array of tasks such as sentiment analysis (measuring how people feel about certain topics) or network analysis (understanding who was most influential in spreading Jacksepticeye's message). You could also use this data to understand any changes in engagement metrics over time or measure which topics generate greater responses from audiences.

    To begin using this dataset, first import it into your scripting language. After importing, you can start exploring what insights could be gained from it by asking questions such as ‘Which types of posts perform better?’ or ‘What types of conversations does Jacksepticeye tend to have?’ By focusing on one question at a time, you can start looking for correlations between variables and gain a better understanding of why certain types of posts perform differently from others. With variable manipulation techniques like select/filter, you could group posts into ad hoc groups that answer your initial questions ('gaming', 'travel', etc.). Once you narrow down these fields of interest, the posts and their relevance indices become much easier to manage and interpret, since they now sit in meaningful contexts rather than as individual observations and associated figures (likes, etc.). Working from existing notebooks greatly increases efficiency when analysing datasets, so if one already exists (and updates don't occur so frequently as to make it stale), take advantage of it!
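
    The select/filter and grouping idea described above can be sketched with the standard library alone. This is only an illustration: the sample rows are made up, the CSV headers are assumed to match the column labels listed earlier (the real file may name them differently), and the keyword-based topic groups are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows only; not taken from the real dataset.
# Assumption: the CSV headers match the column labels listed in the description.
SAMPLE_CSV = """Tweet ID,Tweet content,Likes,Replies,Re-Tweets
1,New gaming video is up,1200,45,300
2,Travel day today,400,10,50
3,Streaming a horror game tonight,900,30,150
"""

# Hypothetical ad hoc topic groups, each defined by keywords to look for.
TOPIC_KEYWORDS = {"gaming": ("game", "gaming"), "travel": ("travel",)}


def topic_of(text: str) -> str:
    """Return the first topic whose keyword occurs in the text, else 'other'."""
    lowered = text.lower()
    for topic, keywords in TOPIC_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return topic
    return "other"


def average_likes_by_topic(csv_text: str) -> dict:
    """Group tweets by topic and average their like counts."""
    likes = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        likes[topic_of(row["Tweet content"])].append(int(row["Likes"]))
    return {topic: sum(vals) / len(vals) for topic, vals in likes.items()}


print(average_likes_by_topic(SAMPLE_CSV))
```

    Substring matching is a deliberately crude way to form groups; it is enough to compare engagement across a handful of topics before investing in a proper classifier.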

    Research Ideas

    • Identifying the types of content that performs best on the platform: By analyzing the engagement, reach, and popularity of tweets, marketers can determine which topics generate higher engagement and reach to inform their own strategies.

    • Assessing user interactions: Examining reply counts and retweet counts reveals how users interact with Jacksepticeye's posts, helping to inform a better understanding of user dynamics on Twitter.

    • Measuring influencer marketing ROI: Since this dataset contains the number of likes and retweets for each post, marketers can compare these values to assess the success of an influencer marketing campaign by determining whether it had a positive effect on followers' engagement with Jacksepticeye's content.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication. No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors and Twitter.

  11. (🌅 Sunset) Kaggle Users' Country + Regions Info

    • kaggle.com
    zip
    Updated Feb 14, 2024
    Cite
    BwandoWando (2024). (🌅 Sunset) Kaggle Users' Country + Regions Info [Dataset]. https://www.kaggle.com/datasets/bwandowando/kaggle-user-country-regions
    Available download formats: zip (2376511 bytes)
    Dataset updated
    Feb 14, 2024
    Authors
    BwandoWando
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    [Context]

    The official Meta-Kaggle dataset contains the Users.csv file which contains Username, DisplayName, RegisterDate, and PerformanceTier fields but doesn't contain location data of the Kaggle Users. This dataset augments that data with additional country and region information.

    [Note]

    I haven't included the username and displayname values on purpose, just the userid to be joined back to the Meta-Kaggle official Users.csv file.
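
    Joining this dataset back to the Meta-Kaggle Users.csv file might look like the sketch below. The rows are made up, and the key column names ("Id" in Users.csv, "UserId" here) as well as the "Country" and "Region" headers are assumptions; check the real headers before relying on them.

```python
import csv
import io

# Illustrative stand-ins for the two files; the rows are made up.
# Assumption: Users.csv keys users by "Id" and this dataset by "UserId".
USERS_CSV = """Id,RegisterDate,PerformanceTier
1,01/15/2019,2
2,03/02/2020,0
3,07/21/2021,1
"""
LOCATIONS_CSV = """UserId,Country,Region
1,Philippines,Asia
3,Germany,Europe
"""


def left_join_locations(users_text: str, locations_text: str) -> list:
    """Attach country/region to each user row; unmatched users get empty strings."""
    locations = {
        row["UserId"]: row for row in csv.DictReader(io.StringIO(locations_text))
    }
    joined = []
    for user in csv.DictReader(io.StringIO(users_text)):
        match = locations.get(user["Id"], {})
        joined.append(
            {
                **user,
                "Country": match.get("Country", ""),
                "Region": match.get("Region", ""),
            }
        )
    return joined


for row in left_join_locations(USERS_CSV, LOCATIONS_CSV):
    print(row)
```

    A left join is used so that users without scraped location details stay in the result with blank fields rather than being dropped.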

    [Limitations]

    It is possible that some users haven't inputted their details when the scraper went through their accounts and thus have missing data. Another possibility is that users may have updated their info after the scraper went through their accounts, thus resulting in inconsistencies.

    [How I defined active in this dataset]

    • Users that have received an upvote in the forums, datasets, or notebooks
    • Users that have given an upvote in the forums, datasets, or notebooks
    • Users that have created a thread, a forum post, a notebook, or a dataset
    • Users that made a competition submission
    • Users that exist in the Meta-Kaggle Users dataset
    • Date cut-off of Jan 01, 2019

    [Update]

    • 15-Feb-2024 - Since the Kaggle members' profile page update, the scrapers aren't working anymore, as the UI layout has changed. Will fix this when we get the time.

  12. Top 50 trending topics (trends) of Twitter for 2018 (one hour interval)

    • data.mendeley.com
    Updated Feb 16, 2019
    Cite
    Issa Annamoradnejad (2019). Top 50 trending topics (trends) of Twitter for 2018 (one hour interval) [Dataset]. http://doi.org/10.17632/d4ccnh588k.1
    Dataset updated
    Feb 16, 2019
    Authors
    Issa Annamoradnejad
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the top 50 trending topics (trends) of Twitter, obtained from the Twitter Trends API at an hourly rate. For each hour, there is a row in the dataset that contains the date, time, trending topic and the related tweet count (if available). The data covers more than 97% of 2018, the period during which our script was available.

  13. Data from: Analysis of the Quantitative Impact of Social Networks General...

    • figshare.com
    • produccioncientifica.ucm.es
    doc
    Updated Oct 14, 2022
    Cite
    David Parra; Santiago Martínez Arias; Sergio Mena Muñoz (2022). Analysis of the Quantitative Impact of Social Networks General Data.doc [Dataset]. http://doi.org/10.6084/m9.figshare.21329421.v1
    Available download formats: doc
    Dataset updated
    Oct 14, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    David Parra; Santiago Martínez Arias; Sergio Mena Muñoz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    General data collected for the study "Analysis of the Quantitative Impact of Social Networks on Web Traffic of Cybermedia in the 27 Countries of the European Union". Four research questions are posed: What percentage of the total web traffic generated by cybermedia in the European Union comes from social networks? Is this percentage higher or lower than that provided by direct traffic and by search engines via SEO positioning? Which social networks have a greater impact? And is there any relationship between the specific weight of social networks in the web traffic of a cybermedia outlet and circumstances such as the average duration of the user's visit, the number of page views, or the bounce rate, understood in its formal sense of not performing any interaction on the visited page beyond reading its content? To answer these questions, we first selected the cybermedia with the highest web traffic in the 27 countries that are currently part of the European Union after the United Kingdom left on December 31, 2020. In each nation we selected five media outlets using a combination of the global web traffic metrics provided by the tools Alexa (https://www.alexa.com/), which ceased to be operational on May 1, 2022, and SimilarWeb (https://www.similarweb.com/). We did not use local metrics by country, since the results obtained with these two tools were sufficiently significant and our objective is not to establish a ranking of cybermedia by nation but to examine the relevance of social networks in their web traffic. In all cases, we selected cybermedia owned by a journalistic company, ruling out those belonging to telecommunications portals or service providers; in some cases they correspond to classic information companies (both newspapers and television channels), while in others they are digital natives, without this circumstance affecting the nature of the proposed research.
    Next, we examined the web traffic data of these cybermedia. The period selected covers October, November and December 2021 and January, February and March 2022. We believe that this six-month stretch smooths out possible one-off variations in a single month, reinforcing the precision of the data obtained. To obtain this data, we used the SimilarWeb tool, currently the most precise tool available for examining the web traffic of a portal, although it is limited to traffic coming from desktops and laptops and does not take into account traffic from mobile devices, which is currently impossible to determine with the measurement tools on the market. It includes:

    • Web traffic general data: average visit duration, pages per visit and bounce rate
    • Web traffic origin by country
    • Percentage of traffic generated from social media over total web traffic
    • Distribution of web traffic generated from social networks
    • Comparison of web traffic generated from social networks with direct and search procedures

  14. Monitoring sozialer Medien im Bundestagswahlkampf 2017

    • search.gesis.org
    Updated Feb 28, 2018
    Cite
    Stier, Sebastian; Bleier, Arnim; Bonart, Malte; Mörsheim, Fabian; Bohlouli, Mahdi; Nizhegorodov, Margarita; Posch, Lisa; Maier, Jürgen; Rothmund, Tobias; Staab, Steffen (2018). Monitoring sozialer Medien im Bundestagswahlkampf 2017 [Dataset]. http://doi.org/10.4232/1.12992
    Available download formats: (123119233), (122704753), (138827671)
    Dataset updated
    Feb 28, 2018
    Dataset provided by
    GESIS search
    GESIS Data Archive
    Authors
    Stier, Sebastian; Bleier, Arnim; Bonart, Malte; Mörsheim, Fabian; Bohlouli, Mahdi; Nizhegorodov, Margarita; Posch, Lisa; Maier, Jürgen; Rothmund, Tobias; Staab, Steffen
    License

    https://www.gesis.org/en/institute/data-usage-terms

    Time period covered
    Jul 5, 2017 - Sep 30, 2017
    Description

    Social Media Monitoring of the German Federal Election Campaign 2017

    This dataset contains results from the social media monitoring of Facebook and Twitter for the German federal election campaign 2017. The project collected the tweets and Facebook posts of political candidates and organizations and the engagement of users with these contents – retweets and @-mentions on Twitter, comments, shares and likes on Facebook. Finally, all messages on Twitter containing at least one keyword denoting central political topics were collected. All data was publicly available at the time of data collection. The collected data is proprietary and owned by Facebook and Twitter. Due to this and with respect to privacy restrictions, only the following aspects of the data can be shared:

    (1) A list of all candidates that were considered in the project, their key attributes and the identification of their respective Twitter accounts and Facebook pages.

    Candidate dataset: Full surname, all first names of the candidate; academic title and name pre- or suffixes (if they exist); URL of the first Facebook account; URL of the second Facebook account; URL of the Twitter account; candidate is placed on a party list; candidate’s place on the party list; candidate is a direct candidate in one of the constituencies; official number and official name of the constituency in which the candidate is running for a direct mandate; state; candidate is a member of the federal parliament (Bundestag); party of the candidate; sex, age (year of birth); place of residence; place of birth; profession.

    Additionally coded was: unique ID.

    (2) Lists of organizations relevant during an election campaign, i.e. political parties and important gatekeepers, along with their respective Twitter and Facebook accounts.

    (3) A list of tweet IDs which can be used to retrieve the tweets we collected during our research period.

  15. Geolytica POIData.xyz Points of Interest (POI) Geo Data - Sweden

    • datarade.ai
    .csv
    Updated Apr 2, 2022
    Cite
    Geolytica (2022). Geolytica POIData.xyz Points of Interest (POI) Geo Data - Sweden [Dataset]. https://datarade.ai/data-products/geolytica-poidata-xyz-points-of-interest-poi-geo-data-sweden-geolytica
    Explore at:
    Available download formats: .csv
    Dataset updated
    Apr 2, 2022
    Dataset authored and provided by
    Geolytica
    Area covered
    Sweden
    Description

    https://store.poidata.xyz/se

    Point-of-interest (POI) is defined as a physical entity (such as a business) in a geo location (point) which may be (of interest).

    We strive to provide the most accurate, complete and up-to-date point of interest datasets for all countries of the world. The Sweden POI Dataset is one of our worldwide POI datasets with over 98% coverage.

    This is our process flow:

    Our machine learning systems continuously crawl for new POI data
    Our geoparsing and geocoding calculate their geo locations
    Our categorization systems clean up and standardize the datasets
    Our data pipeline API publishes the datasets on our data store

    POI Data is in a constant flux - especially so during times of drastic change such as the Covid-19 pandemic.

    Every minute worldwide on an average day over 200 businesses will move, over 600 new businesses will open their doors and over 400 businesses will cease to exist.

    In today's interconnected world, of the approximately 200 million POIs worldwide, over 94% have a public online presence. As a new POI comes into existence its information will appear very quickly in location based social networks (LBSNs), other social media, pictures, websites, blogs, press releases. Soon after that, our state-of-the-art POI Information retrieval system will pick it up.

    We offer our customers perpetual data licenses for any dataset representing this ever changing information, downloaded at any given point in time. This makes our company's licensing model unique in the current Data as a Service - DaaS Industry. Our customers don't have to delete our data after the expiration of a certain "Term", regardless of whether the data was purchased as a one time snapshot, or via a recurring payment plan on our data update pipeline.

    The main differentiators between us vs the competition are our flexible licensing terms and our data freshness.

    The core attribute coverage is as follows:

    POI field            Data coverage (%)
    poi_name             100
    brand                9
    poi_tel              46
    formatted_address    100
    main_category        97
    latitude             100
    longitude            100
    neighborhood         5
    source_url           60
    email                12
    opening_hours        38

    The dataset may be viewed online at https://store.poidata.xyz/se and a data sample may be downloaded at https://store.poidata.xyz/datafiles/se_sample.csv

  16. Tweets With Emoji

    • kaggle.com
    zip
    Updated Apr 12, 2023
    Cite
    ericwang2001 (2023). Tweets With Emoji [Dataset]. https://www.kaggle.com/datasets/ericwang1011/tweets-with-emoji/discussion
    Explore at:
    Available download formats: zip (48238750 bytes)
    Dataset updated
    Apr 12, 2023
    Authors
    ericwang2001
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The data was collected with snscrape, using individual emojis as search queries. Retrieved entries were then checked for the presence of emojis and for whether the text is in English. Language detection was done using pycld3, which was inspired by the paper "The WiLI benchmark dataset for written language identification." Each CSV file contains 20,000 distinct entries. File names are derived from the emoji package (emoji.EMOJI_DATA) in Python.

    Because pycld3 can make small errors and a single entry may contain multiple emojis, some non-English tweets, and some tweets duplicated across different CSV files, may remain.
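
    Duplicates across files can be dropped when the CSV files are combined. The following is a minimal sketch, not the dataset author's method: the column name `Text` is a hypothetical placeholder (the listing does not state the column names), and two small in-memory strings stand in for the real files.

```python
import csv
import io


def dedupe_rows(csv_sources, text_column="Text"):
    """Yield rows whose text has not appeared in any earlier row or source.

    `text_column` is an assumed column name; adjust it to the real header.
    """
    seen = set()
    for source in csv_sources:
        for row in csv.DictReader(source):
            key = row[text_column].strip()
            if key in seen:
                continue  # same tweet already emitted from another file
            seen.add(key)
            yield row


# Two tiny in-memory stand-ins for per-emoji CSV files (made-up rows).
file_a = io.StringIO("Text\nI love this 😀\nGood morning 🌞\n")
file_b = io.StringIO("Text\nGood morning 🌞\nSo tired 😴\n")

unique_rows = list(dedupe_rows([file_a, file_b]))
unique_texts = [row["Text"] for row in unique_rows]
```

    With real files, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="", encoding="utf-8")`.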

  17. Geolytica POIData.xyz Points of Interest (POI) Geo Data - Turkey

    • datarade.ai
    .csv
    Updated Nov 13, 2021
    Cite
    Geolytica (2021). Geolytica POIData.xyz Points of Interest (POI) Geo Data - Turkey [Dataset]. https://datarade.ai/data-products/geolytica-poidata-xyz-points-of-interest-poi-geo-data-turkey-geolytica
    Explore at:
    Available download formats: .csv
    Dataset updated
    Nov 13, 2021
    Dataset authored and provided by
    Geolytica
    Area covered
    Turkey
    Description

    https://store.poidata.xyz/tr

    Point-of-interest (POI) is defined as a physical entity (such as a business) in a geo location (point) which may be (of interest).

    We strive to provide the most accurate, complete and up-to-date point of interest datasets for all countries of the world. The Turkey POI Dataset is one of our worldwide POI datasets with over 98% coverage.

    This is our process flow:

    Our machine learning systems continuously crawl for new POI data
    Our geoparsing and geocoding calculate their geo locations
    Our categorization systems clean up and standardize the datasets
    Our data pipeline API publishes the datasets on our data store

    POI Data is in a constant flux - especially so during times of drastic change such as the Covid-19 pandemic.

    Every minute worldwide on an average day over 200 businesses will move, over 600 new businesses will open their doors and over 400 businesses will cease to exist.

    In today's interconnected world, of the approximately 200 million POIs worldwide, over 94% have a public online presence. As a new POI comes into existence its information will appear very quickly in location based social networks (LBSNs), other social media, pictures, websites, blogs, press releases. Soon after that, our state-of-the-art POI Information retrieval system will pick it up.

    We offer our customers perpetual data licenses for any dataset representing this ever changing information, downloaded at any given point in time. This makes our company's licensing model unique in the current Data as a Service - DaaS Industry. Our customers don't have to delete our data after the expiration of a certain "Term", regardless of whether the data was purchased as a one time snapshot, or via a recurring payment plan on our data update pipeline.

    The main differentiators between us vs the competition are our flexible licensing terms and our data freshness.

    The core attribute coverage is as follows:

    POI field            Data coverage (%)
    poi_name             100
    brand                7
    poi_tel              49
    formatted_address    100
    main_category        98
    latitude             100
    longitude            100
    neighborhood         90
    source_url           35
    email                4
    opening_hours        48

    The dataset may be viewed online at https://store.poidata.xyz/tr and a data sample may be downloaded at https://store.poidata.xyz/datafiles/tr_sample.csv

  18. Average daily time spent on social media worldwide 2012-2024

    • statista.com
    • de.statista.com
    Cite
    Stacy Jo Dixon, Average daily time spent on social media worldwide 2012-2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    How much time do people spend on social media?

                  As of 2024, internet users worldwide spent an average of 143 minutes per day on social media, down from 151 minutes in the previous year. The country with the most time spent on social media per day is currently Brazil, where online users spend an average of three hours and 49 minutes each day. In comparison, daily time spent on social media in the U.S. was just two hours and 16 minutes.

                  Global social media usage
                  Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively. People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.

                  Global impact of social media
                  Social media has a wide-reaching and significant impact not only on online activities but also on offline behavior and life in general. During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics and heightened everyday distractions.
    
  19. Competition between Homophily and Information Entropy Maximization in Social Networks

    • plos.figshare.com
    pdf
    Updated Jun 3, 2023
    Cite
    Jichang Zhao; Xiao Liang; Ke Xu (2023). Competition between Homophily and Information Entropy Maximization in Social Networks [Dataset]. http://doi.org/10.1371/journal.pone.0136896
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jichang Zhao; Xiao Liang; Ke Xu
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In social networks, it is conventionally thought that two individuals with more overlapping friends tend to establish a new friendship, which can be stated as homophily breeding new connections. Meanwhile, the recent hypothesis of maximum information entropy has been presented as a possible origin of effective navigation in small-world networks. Through both theoretical and experimental analysis, we find that there is a competition between information entropy maximization and homophily in local structure. This competition suggests that a newly built relationship between two individuals with more common friends leads to less information entropy gain for them. We demonstrate that both assumptions coexist in the evolution of the social network. The rule of maximum information entropy produces weak ties in the network, while the law of homophily makes the network highly clustered locally and gives individuals strong, trusted ties. A toy model is also presented to demonstrate the competition and evaluate the roles of the different rules in the evolution of real networks. Our findings could shed light on social network modeling from a new perspective.

  20. Sundanese Twitter Dataset emotions classification

    • kaggle.com
    zip
    Updated Apr 24, 2024
    Cite
    Rabie El Kharoua (2024). Sundanese Twitter Dataset emotions classification [Dataset]. https://www.kaggle.com/datasets/rabieelkharoua/sundanese-twitter-dataset
    Explore at:
    Available download formats: zip (104015 bytes)
    Dataset updated
    Apr 24, 2024
    Authors
    Rabie El Kharoua
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains tweets in Sundanese, the second-largest local language in Indonesia, and is used for emotion classification.

    Dataset Characteristics: Tabular

    Subject Area: Computer Science

    Associated Tasks: Classification

    Instances: 2510

    Dataset Information

    For what purpose was the dataset created?

    This dataset was created as a contribution to NLP research, particularly in Indonesia.

    Who funded the creation of the dataset?

    This dataset is self-funded

    What do the instances in this dataset represent?

    tweet

    Are there recommended data splits?

    No

    Was there any data preprocessing performed?

    tokenization, stopword removal, stemming

    Has Missing Values?

    No

    Introductory Paper

    Title: Sundanese Twitter Dataset for Emotion Classification

    Authors: Oddy Virgantara Putra; Fathin Muhammad Wasmanson; Triana Harmini; Shoffin Nahwa Utama. 2020

    Journal: Published in Conference

    Link: https://ieeexplore.ieee.org/abstract/document/9297929

    Abstract of Introductory Paper

    Sundanese is the second-largest tribe in Indonesia and possesses many dialects. This has drawn the attention of many researchers to emotion analysis, especially on social media. However, because Sundanese datasets are barely available, understanding emotion in Sundanese text is a challenging task. In this research, we propose a dataset for emotion classification of Sundanese text. The preprocessing includes case folding, stopword removal, stemming, tokenizing, and text representation. Prior to classification, we use term frequency-inverse document frequency (TF-IDF) for feature generation. We evaluated our dataset using k-fold cross-validation. Our experiments with the proposed method show effective results for machine learning classification. Furthermore, as far as we know, this is the first publicly available Sundanese emotion dataset.
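
    The abstract names TF-IDF features and k-fold cross-validation without giving details. The following is a minimal pure-Python sketch of both ideas under stated assumptions, not the authors' implementation: the weighting (relative term frequency times log(N / document frequency)), the contiguous fold layout, and the function names `tfidf` and `k_fold_indices` are all illustrative choices, and the tokens are placeholders rather than real tweets.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        weights.append(
            {
                term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return weights


def k_fold_indices(n_items, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        yield train, test
        start += size


# Placeholder tokens stand in for preprocessed tweets.
docs = [["a", "b"], ["a", "c"], ["a", "a"]]
weights = tfidf(docs)
folds = list(k_fold_indices(5, 2))
```

    A term that occurs in every document (here `"a"`) gets weight zero under this variant, which is why very common words contribute little to classification.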

    Cite

    Citation: Putra, Oddy Virgantara. (2021). Sundanese Twitter Dataset. UCI Machine Learning Repository. https://doi.org/10.24432/C5MK8C.

    BibTeX: @misc{misc_sundanese_twitter_dataset_695, author = {Putra, Oddy Virgantara}, title = {{Sundanese Twitter Dataset}}, year = {2021}, howpublished = {UCI Machine Learning Repository}, note = {{DOI}: https://doi.org/10.24432/C5MK8C} }
