*** Fake News on Twitter ***
These 5 datasets are the results of an empirical study on the spreading process of newly emerged fake news on Twitter. In particular, we focused on fake news stories that gave rise to a truth spreading simultaneously against them. The story of each fake news item is as follows:
1- FN1: A Muslim waitress refused to seat a church group at a restaurant, claiming "religious freedom" allowed her to do so.
2- FN2: Actor Denzel Washington said electing President Trump saved the U.S. from becoming an "Orwellian police state."
3- FN3: Joy Behar of "The View" sent a crass tweet about a fatal fire in Trump Tower.
4- FN4: The animated children's program 'VeggieTales' introduced a cannabis character in August 2018.
5- FN5: In September 2018, the University of Alabama football program ended its uniform contract with Nike, in response to Nike's endorsement deal with Colin Kaepernick.
The data collection was done in two stages, each of which produced a new dataset: 1- obtaining the Dataset of Diffusion (DD), which includes information on fake news/truth tweets and retweets; 2- querying the neighbors of the tweet spreaders, which provides the Dataset of Graph (DG).
DD
DD for each fake news story is an Excel file named FNx_DD, where x is the number of the fake news story, with the following structure:
Each row belongs to one captured tweet/retweet related to the rumor, and each column presents a specific piece of information about the tweet/retweet. From left to right, the columns are:
User ID (user who has posted the current tweet/retweet)
Description text in the profile of the user who published the tweet/retweet
Number of tweets/retweets published by the user at the time of posting the current tweet/retweet
Date and time of creation of the account from which the current tweet/retweet was posted
Language of the tweet/retweet
Number of followers
Number of followings (friends)
Date and time of posting the current tweet/retweet
Number of likes (favorites) the current tweet had received before it was crawled
Number of times the current tweet had been retweeted before it was crawled
Whether another tweet is embedded in the current tweet/retweet (for example, when the current tweet is a quote, reply, or retweet)
Source (OS) of the device from which the current tweet/retweet was posted
Tweet/Retweet ID
Retweet ID (if the post is a retweet then this feature gives the ID of the tweet that is retweeted by the current post)
Quote ID (if the post is a quote then this feature gives the ID of the tweet that is quoted by the current post)
Reply ID (if the post is a reply, this feature gives the ID of the tweet that the current post replies to)
Frequency of tweet occurrences, that is, the number of times the current tweet is repeated in the dataset (for example, the number of times the tweet appears in the dataset as a retweet posted by others)
State of the tweet which can be one of the following forms (achieved by an agreement between the annotators):
r : The tweet/retweet is a fake news post
a : The tweet/retweet is a truth post
q : The tweet/retweet is a question about the fake news, but neither confirms nor denies it
n : The tweet/retweet is not related to the fake news (it contains the query terms related to the rumor but does not refer to the given fake news)
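As a minimal sketch of working with the state column, the snippet below tallies the state codes for one story. It assumes the FNx_DD sheet has been exported to CSV without a header row and that the state is the last column, following the column order listed above. The function name `count_states` and the sample rows are hypothetical and not part of the dataset.

```python
import csv
import io
from collections import Counter

# Meaning of each state code, as listed in the dataset description.
STATE_MEANINGS = {
    "r": "fake news post",
    "a": "truth post",
    "q": "question about the fake news",
    "n": "not related to the fake news",
}


def count_states(csv_text):
    """Count state codes, assuming the state is the last column of each row."""
    counts = Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        state = row[-1].strip().lower()
        if state in STATE_MEANINGS:
            counts[state] += 1
    return counts


# Hypothetical two-column excerpt (user ID, state) standing in for a full export.
sample = "111,r\n222,a\n333,r\n444,q\n"
for state, total in sorted(count_states(sample).items()):
    print(state, STATE_MEANINGS[state], total)
```

Rows whose last column is not one of the four codes are ignored, so a header row, if present, would simply be skipped.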
DG
DG for each fake news contains two files:
A file in graph format (.graph), which includes the graph information, such as who is linked to whom. (This file is named FNx_DG.graph, where x is the number of the fake news story.)
A file in JSONL format (.jsonl), which includes the real user IDs of the nodes in the graph file. (This file is named FNx_Labels.jsonl, where x is the number of the fake news story.)
The second file is needed because, in the graph file, the label of each node is its order of entry into the graph. For example, if the node with user ID 12345637 is the first node entered into the graph file, then its label in the graph is 0 and its real ID (12345637) is at row number 1 of the jsonl file (row number 0 holds the column labels). The other node IDs follow in the next rows of the file, one user ID per row. Therefore, to find the user ID of node 200 (labeled 200 in the graph), look at row number 201 of the jsonl file, which is the 202nd line of the file.
The user IDs of spreaders in DG (those who have had a post in DD) would be available in DD to get extra information about them and their tweet/retweet. The other user IDs in DG are the neighbors of these spreaders and might not exist in DD.
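The node-label lookup described above can be sketched as follows. This is an illustration under stated assumptions: row 0 of the jsonl file holds the column labels, each later row is a single JSON value holding one user ID, and the function name `user_id_for_node` is hypothetical. The actual row layout of FNx_Labels.jsonl should be checked before relying on it.

```python
import json


def user_id_for_node(jsonl_lines, node_label):
    """Return the parsed row for a graph node label.

    Row 0 is assumed to hold the column labels, so node k sits at row k + 1.
    """
    row_index = node_label + 1
    if node_label < 0 or row_index >= len(jsonl_lines):
        raise IndexError(f"no row for node label {node_label}")
    return json.loads(jsonl_lines[row_index])


# Hypothetical file contents: a header row followed by one user ID per row.
lines = ['"user_id"', "12345637", "87654321"]
print(user_id_for_node(lines, 0))
```

With a real file, the list of rows could be obtained by reading the file line by line before calling the function.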
Which country has the most Facebook users?
There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country then it would be ranked third in terms of largest population worldwide. Apart from India, there are several other markets with more than 100 million Facebook users each: The United States, Indonesia, and Brazil with 193.8 million, 119.05 million, and 112.55 million Facebook users respectively.
Facebook – the most used social media
Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide: WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3.5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media platform worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.
Facebook usage by device
As of July 2021, it was found that 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, almost 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is not only available through mobile browser as the company has published several mobile apps for users to access their products and services. As of the third quarter 2021, the four core Meta products were leading the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.
Cristiano Ronaldo has one of the most popular Instagram accounts as of April 2024.
The Portuguese footballer is the most-followed person on the photo-sharing platform, with 628 million followers. Among all accounts, Instagram's own account ranked first with roughly 672 million followers.
How popular is Instagram?
Instagram is a photo-sharing social networking service that enables users to take pictures and edit them with filters. The platform allows users to post and share their images online and directly with their friends and followers on the social network. The cross-platform app reached one billion monthly active users in mid-2018. In 2020, there were over 114 million Instagram users in the United States and experts project this figure to surpass 127 million users in 2023.
Who uses Instagram?
Instagram audiences are predominantly young – recent data states that almost 60 percent of U.S. Instagram users are aged 34 years or younger. Fall 2020 data reveals that Instagram is also one of the most popular social media for teens and one of the social networks with the biggest reach among teens in the United States.
Celebrity influencers on Instagram
Many celebrities and athletes are brand spokespeople and generate additional income with social media advertising and sponsored content. Unsurprisingly, Ronaldo ranked first again, as the average media value of one of his Instagram posts was 985,441 U.S. dollars.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
The estimated median age gives an idea of the age distribution of the population in a given area. A greater median age would suggest that the area of interest has a relatively large number of older residents, while a lower median age suggests that the area has a relatively large number of younger residents.
Champaign County’s estimated median age has risen for over a decade, but has always stayed between 28 and 31. Year-to-year changes from 2017 to 2019 were statistically significant, but not from 2019 to 2023. The Champaign County estimated median age has been consistently younger than the estimated median ages of the United States and State of Illinois. Champaign County’s figure is likely impacted to some degree by the large student population associated with the University of Illinois.
The estimated median age does not provide a significant amount of detail, and it does not provide any information on why the estimated median age is what it is. However, when placed in the context of other pieces of data and other indicators, it is a valuable starting point in understanding county demographics.
Estimated median age data was sourced from the U.S. Census Bureau’s American Community Survey (ACS) 1-Year Estimates, which are released annually.
As with any datasets that are estimates rather than exact counts, it is important to take into account the margins of error (listed in the column beside each figure) when drawing conclusions from the data.
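As an illustration of how margins of error feed into a conclusion, the sketch below tests whether two estimates differ significantly. It relies on the fact that ACS margins of error are published at the 90 percent confidence level (z = 1.645) and uses the standard comparison of two independent estimates. The function name and the numbers are hypothetical and are not Champaign County figures.

```python
import math

# ACS margins of error are published at the 90 percent confidence level.
Z_90 = 1.645


def is_significant_change(est_a, moe_a, est_b, moe_b):
    """Return True if two estimates differ at the 90 percent confidence level."""
    # Convert each margin of error to a standard error.
    se_a = moe_a / Z_90
    se_b = moe_b / Z_90
    # z-score of the difference between two independent estimates.
    z = (est_b - est_a) / math.sqrt(se_a**2 + se_b**2)
    return abs(z) > Z_90


# Hypothetical median ages with their margins of error.
print(is_significant_change(29.0, 0.3, 30.0, 0.3))
print(is_significant_change(29.0, 0.5, 29.3, 0.5))
```

In these made-up cases, the one-year gap is large relative to the margins in the first call but not in the second, which is the kind of distinction the year-to-year significance statements rest on.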
Due to the impact of the COVID-19 pandemic, instead of providing the standard 1-year data products, the Census Bureau released experimental estimates from the 1-year data in 2020. This includes a limited number of data tables for the nation, states, and the District of Columbia. The Census Bureau states that the 2020 ACS 1-year experimental tables use an experimental estimation methodology and should not be compared with other ACS data. For these reasons, and because data is not available for Champaign County, no data for 2020 is included in this Indicator.
For interested data users, the 2020 ACS 1-Year Experimental data release includes a dataset on Median Age by Sex.
Sources: U.S. Census Bureau; American Community Survey, 2023 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using data.census.gov; (8 October 2024).; U.S. Census Bureau; American Community Survey, 2022 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using data.census.gov; (6 October 2023).; U.S. Census Bureau; American Community Survey, 2021 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using data.census.gov; (13 October 2022).; U.S. Census Bureau; American Community Survey, 2019 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using data.census.gov; (7 April 2021).; U.S. Census Bureau; American Community Survey, 2018 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using data.census.gov; (7 April 2021).; U.S. Census Bureau; American Community Survey, 2017 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (13 September 2018).; U.S. Census Bureau; American Community Survey, 2016 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (14 September 2017).; U.S. Census Bureau; American Community Survey, 2015 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (19 September 2016).; U.S. Census Bureau; American Community Survey, 2014 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (16 March 2016).; U.S. Census Bureau; American Community Survey, 2013 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (16 March 2016).; U.S. Census Bureau; American Community Survey, 2012 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (16 March 2016).; U.S. 
Census Bureau; American Community Survey, 2011 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (16 March 2016).; U.S. Census Bureau; American Community Survey, 2010 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (16 March 2016).; U.S. Census Bureau; American Community Survey, 2009 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (16 March 2016).; U.S. Census Bureau; American Community Survey, 2008 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (16 March 2016).; U.S. Census Bureau; American Community Survey, 2007 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (16 March 2016).; U.S. Census Bureau; American Community Survey, 2006 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (16 March 2016).; U.S. Census Bureau; American Community Survey, 2005 American Community Survey 1-Year Estimates, Table S0101; generated by CCRPC staff; using American FactFinder; (16 March 2016).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Star dataset to predict star types’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/deepu1109/star-dataset on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This is a dataset consisting of several features of stars.
Some of them are:
- Absolute Temperature (in K)
- Relative Luminosity (L/Lo)
- Relative Radius (R/Ro)
- Absolute Magnitude (Mv)
- Star Color (white,Red,Blue,Yellow,yellow-orange etc)
- Spectral Class (O,B,A,F,G,K,M)
- Star Type **(Red Dwarf, Brown Dwarf, White Dwarf, Main Sequence , SuperGiants, HyperGiants)**
Lo = 3.828 x 10^26 Watts (Avg Luminosity of Sun)
Ro = 6.9551 x 10^8 m (Avg Radius of Sun)
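As a small worked example of the two solar constants above, the sketch below converts the relative luminosity and relative radius columns into absolute SI values. The function name `to_absolute` and the sample star are hypothetical; only the constants come from the description.

```python
# Solar reference values quoted in the dataset description.
L_SUN_WATTS = 3.828e26   # Lo, average luminosity of the Sun
R_SUN_METERS = 6.9551e8  # Ro, average radius of the Sun


def to_absolute(relative_luminosity, relative_radius):
    """Convert L/Lo and R/Ro into watts and meters."""
    luminosity_watts = relative_luminosity * L_SUN_WATTS
    radius_meters = relative_radius * R_SUN_METERS
    return luminosity_watts, radius_meters


# Hypothetical star: twice the Sun's luminosity, half its radius.
luminosity, radius = to_absolute(2.0, 0.5)
print(f"{luminosity:.3e} W, {radius:.3e} m")
```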
The purpose of the dataset is to show that stars follow a certain pattern, known as the Hertzsprung-Russell Diagram (or simply HR-Diagram), so that stars can be classified by plotting their features on that diagram.
Image hr-1: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3791628%2F14338bbebf77d18e1faef582bccdbdd6%2Fhr.jpg?generation=1597349509841965&alt=media
Image hr-2: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3791628%2F9fc57334a9b9fafbc71aacdd6e5cd69c%2F310px-Hertzsprung-Russel_StarData.png?generation=1597349661801284&alt=media
Image hr-3: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F3791628%2Ffe9436bf4e2d23b5b18fb3db1f1fcbcb%2FHRDiagram.png?generation=1597348809674507&alt=media
The dataset is created based on several equations in astrophysics. They are given below:
It took 3 weeks to collect the data for the 240 stars, most of which come from the web. Missing values were calculated manually using the astrophysics equations referred to above.
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset consists of structural vibration data (vertical velocity of floor structure) induced by 10 people’s footsteps as they walk around with 8 different walking speeds, sensed by 5 geophone sensors.
The footstep-induced structural vibration data is stored as footstep traces, each consisting of a series of consecutive footsteps (see the sample plot). The dataset is stored in a MAT-file named People.mat. The dataset includes three layers of labels: 1) person identity i (i = 1, 2, ..., 10), 2) sensor number j (j = 1, 2, ..., 5), and 3) walking speed k (k = 1, 2, ..., 8). The speeds k = 1, ..., 7 correspond to the walking speeds \(\mu\), \(\mu+\sigma\), \(\mu+2\sigma\), \(\mu+3\sigma\), \(\mu-\sigma\), \(\mu-2\sigma\), and \(\mu-3\sigma\), respectively, and k = 8 is the speed self-selected by each person. Here \(\mu\) and \(\sigma\) refer to the mean and standard deviation of the step frequencies. To access the footstep traces from person i, sensor j, with walking speed k, use the MATLAB syntax People{i}.Sen{j}.S{k}. This gives an \(m\times n\) cell structure, where \(m\) denotes the individual trace number (the number of traces varies from 10 to 12) and \(n\) represents the level of amplification: 2000X, 4000X, and 6000X correspond to n = 1, n = 2, and n = 3, respectively. To read and plot a sample trace of footstep-induced floor vibration, use the script read_data.m. For more details, please refer to the original FootprintID paper at the following link: https://dl-acm-org.stanford.idm.oclc.org/doi/10.1145/3130954
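The index conventions above can be summarized in a small lookup. This is a sketch only: it does not read People.mat, and the function name `describe_trace` and the dictionary names are hypothetical. The mappings for k and n are taken from the description above.

```python
# Walking-speed index k -> speed description (mu and sigma are the mean and
# standard deviation of the step frequencies).
SPEED_LABELS = {
    1: "mu",
    2: "mu + sigma",
    3: "mu + 2*sigma",
    4: "mu + 3*sigma",
    5: "mu - sigma",
    6: "mu - 2*sigma",
    7: "mu - 3*sigma",
    8: "self-selected",
}

# Amplification index n -> gain factor.
AMPLIFICATION = {1: 2000, 2: 4000, 3: 6000}


def describe_trace(person, sensor, speed, amp_level):
    """Validate the 1-based indices and describe the selected trace."""
    if not 1 <= person <= 10:
        raise ValueError("person index i must be in 1..10")
    if not 1 <= sensor <= 5:
        raise ValueError("sensor index j must be in 1..5")
    if speed not in SPEED_LABELS:
        raise ValueError("speed index k must be in 1..8")
    if amp_level not in AMPLIFICATION:
        raise ValueError("amplification index n must be in 1..3")
    return {
        "matlab_path": f"People{{{person}}}.Sen{{{sensor}}}.S{{{speed}}}",
        "speed": SPEED_LABELS[speed],
        "amplification": AMPLIFICATION[amp_level],
    }


print(describe_trace(3, 2, 8, 1))
```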
The human walking experiment involves 10 participants aged between 20 and 29 years, of whom 8 are male and 2 are female. Their walking area is 30ft X 6ft along a hallway with a concrete floor. Each participant wears flat-bottomed shoes.
The sensing unit consists of 5 components: 1) the geophone (SM-24), 2) the amplification module, 3) the processor board, 4) the communication module (XBee radio), and 5) the batteries. The sensing unit converts the structural vibration velocity into voltage records. The sampling frequency is 1000Hz.
The hardware unit, experiment setup, and a sample data plot can be found in Experiment Introduction.pdf. Further implementation details can be found in the original FootprintID paper at the link above.
Please cite this dataset as:
Yiwen Dong, Shijia Pan, Tong Yu, Mostafa Mirshekari, Jonathon Fagert, Amelie Bonde, Ole J. Mengshoel, Pei Zhang, and Hae Young Noh. 2021. The FootprintID Dataset: Footstep-Induced Structural Vibration Data for Person Identification with 8 Different Walking Speeds. Zenodo, DOI: https://doi.org/10.5281/zenodo.4691144
Shijia Pan, Tong Yu, Mostafa Mirshekari, Jonathon Fagert, Amelie Bonde, Ole J. Mengshoel, Hae Young Noh, and Pei Zhang. 2017. FootprintID: Indoor Pedestrian Identification through Ambient Structural Vibration Sensing. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 1, 3, Article 89 (September 2017), 31 pages. DOI: https://doi.org/10.1145/3130954
As of April 2024, men between the ages of 25 and 34 years made up Facebook's largest audience, accounting for 18.4 percent of global users. Facebook's second-largest audience group was men aged 18 to 24 years.
Facebook connects the world
Founded in 2004 and going public in 2012, Facebook is one of the biggest internet companies in the world with influence that goes beyond social media. It is widely considered as one of the Big Four tech companies, along with Google, Apple, and Amazon (all together known under the acronym GAFA). Facebook is the most popular social network worldwide and the company also owns three other billion-user properties: mobile messaging apps WhatsApp and Facebook Messenger, as well as photo-sharing app Instagram.
Facebook users
The vast majority of Facebook users connect to the social network via mobile devices. This is unsurprising, as Facebook has many users in mobile-first online markets. Currently, India ranks first in terms of Facebook audience size with 378 million users. The United States, Brazil, and Indonesia also all have more than 100 million Facebook users each.
The number of LinkedIn users in the United Kingdom was forecast to increase continuously between 2024 and 2028 by a total of 1.5 million users (+4.51 percent). After the eighth consecutive year of growth, the LinkedIn user base is estimated to reach 34.7 million users, a new peak, in 2028. User figures, shown here for the platform LinkedIn, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
As of April 2024, almost 32 percent of global Instagram audiences were aged between 18 and 24 years, and 30.6 percent of users were aged between 25 and 34 years. Overall, 16 percent of users belonged to the 35 to 44 year age group.
Instagram users
With roughly one billion monthly active users, Instagram is among the most popular social networks worldwide. The social photo-sharing app is especially popular in India and in the United States, which have 362.9 million and 169.7 million Instagram users, respectively.
Instagram features
One of the most popular features of Instagram is Stories. Users can post photos and videos to their Stories stream and the content is live for others to view for 24 hours before it disappears. In January 2019, the company reported that there were 500 million daily active Instagram Stories users. Instagram Stories directly competes with Snapchat, another photo-sharing app that initially became famous due to its “vanishing photos” feature.
As of the second quarter of 2021, Snapchat had 293 million daily active users.
The number of Instagram users in the United Kingdom was forecast to increase continuously between 2024 and 2028 by a total of 2.1 million users (+7.02 percent). After the ninth consecutive year of growth, the Instagram user base is estimated to reach 32 million users, a new peak, in 2028. Notably, the number of Instagram users has increased continuously over the past years. User figures, shown here for the platform Instagram, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
How much time do people spend on social media?
As of 2024, the average daily social media usage of internet users worldwide amounted to 143 minutes per day, down from 151 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of three hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just two hours and 16 minutes.
Global social media usage
Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively.
People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.
Global impact of social media
Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general.
During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics, and heightened everyday distractions.
The number of Pinterest users in the United Kingdom was forecast to increase continuously between 2024 and 2028 by a total of 0.3 million users (+3.14 percent). After the ninth consecutive year of growth, the Pinterest user base is estimated to reach 9.88 million users, a new peak, in 2028. Notably, the number of Pinterest users has increased continuously over the past years. User figures, shown here for the platform Pinterest, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
As of January 2024, Instagram was slightly more popular with men than women, with men accounting for 50.6 percent of the platform’s global users. Additionally, the social media app was most popular amongst younger audiences, with almost 32 percent of users aged between 18 and 24 years.
Instagram’s Global Audience
As of January 2024, Instagram was the fourth most popular social media platform globally, reaching two billion monthly active users (MAU). This number is projected to keep growing with no signs of slowing down, which is not a surprise as the global online social penetration rate across all regions is constantly increasing.
As of January 2024, the country with the largest Instagram audience was India with 362.9 million users, followed by the United States with 169.7 million users.
Who is winning over the generations?
Even though Instagram’s audience is almost twice the size of TikTok’s on a global scale, TikTok has shown itself to be a fierce competitor, particularly amongst younger audiences. TikTok was the most downloaded mobile app globally in 2022, generating 672 million downloads. As of 2022, Generation Z in the United States spent more time on TikTok than on Instagram monthly.
As of April 2024, around 16.5 percent of global active Instagram users were men between the ages of 18 and 24 years. More than half of the global Instagram population worldwide was aged 34 years or younger.
Teens and social media
As one of the biggest social networks worldwide, Instagram is especially popular with teenagers. As of fall 2020, the photo-sharing app ranked third in terms of preferred social network among teenagers in the United States, behind Snapchat and TikTok. Instagram was one of the most influential advertising channels among female Gen Z users when making purchasing decisions. Teens report feeling more confident, popular, and better about themselves when using social media, and less lonely, depressed and anxious.
Social media can also have negative effects on teens, and these are much more pronounced among those with low emotional well-being. It was found that 35 percent of teenagers with low social-emotional well-being reported having experienced cyberbullying when using social media, while only five percent of teenagers with high social-emotional well-being stated the same. As such, social media can have a big impact on already fragile states of mind.
Instagram’s most popular post
As of April 2024, the most popular post on Instagram was a photo of Lionel Messi and his teammates after winning the 2022 FIFA World Cup with Argentina, posted by the account @leomessi. Messi's post, which racked up over 61 million likes within a day, displaced the previous record holder, 'Photo of an Egg'. Originally posted in January 2019, 'Photo of an Egg' had surpassed the most popular Instagram post at that time, a photo of Kylie Jenner’s daughter that had received 18 million likes.
After several cryptic posts published by the account, World Record Egg revealed itself to be a part of a mental health campaign aimed at the pressures of social media use.
Instagram’s most popular accounts
As of April 2024, the official Instagram account @instagram had the most followers of any account on the platform, with 672 million followers. Portuguese footballer Cristiano Ronaldo (@cristiano) was the most followed individual with 628 million followers, while Selena Gomez (@selenagomez) was the most followed woman on the platform with 429 million. Additionally, Inter Miami CF striker Lionel Messi (@leomessi) had a total of 502 million. Celebrities such as The Rock, Kylie Jenner, and Ariana Grande all had over 380 million followers each.
Instagram influencers
In the United States, the leading content category of Instagram influencers was lifestyle, with 15.25 percent of influencers creating lifestyle content in 2021. Music ranked in second place with 10.96 percent, followed by family with 8.24 percent. Having a large audience can be very lucrative: Instagram influencers in the United States, Canada and the United Kingdom with over 90,000 followers made around 1,221 US dollars per post.
Instagram around the globe
Instagram’s worldwide popularity continues to grow, and India is the leading country in terms of number of users, with over 362.9 million users as of January 2024. The United States had 169.65 million Instagram users and Brazil had 134.6 million users. The social media platform was also very popular in Indonesia and Turkey, with 100.9 million and 57.1 million users, respectively. As of January 2024, Instagram was the fourth most popular social network in the world, behind Facebook, YouTube and WhatsApp.
As of January 2024, #love was the most used hashtag on Instagram, being included in over two billion posts on the social media platform. #Instagood and #instagram were used over one billion times as of early 2024.
In 2023, Meta Platforms had a total annual revenue of over 134 billion U.S. dollars, up from 116 billion in 2022. LinkedIn reported its highest annual revenue to date, generating over 15 billion USD, whilst Snapchat reported an annual revenue of 4.6 billion USD.
As of April 2024, Bahrain was the country with the highest Instagram audience reach with 95.6 percent. Kazakhstan also had a high Instagram audience penetration rate, with 90.8 percent of the population using the social network. In the United Arab Emirates, Turkey, and Brunei, the photo-sharing platform was used by more than 85 percent of each country's population.
As of April 2024, Facebook had an addressable ad audience reach of 131.1 percent in Libya, followed by the United Arab Emirates with 120.5 percent and Mongolia with 116 percent. Additionally, the Philippines and Qatar had addressable ad audiences of 114.5 percent and 111.7 percent, respectively.
*** Fake News on Twitter ***
These five datasets are the results of an empirical study on the spreading process of newly emerged fake news on Twitter. In particular, we focused on fake news stories that gave rise to a truth spreading simultaneously against them. The story of each fake news item is as follows:
1- FN1: A Muslim waitress refused to seat a church group at a restaurant, claiming "religious freedom" allowed her to do so.
2- FN2: Actor Denzel Washington said electing President Trump saved the U.S. from becoming an "Orwellian police state."
3- FN3: Joy Behar of "The View" sent a crass tweet about a fatal fire in Trump Tower.
4- FN4: The animated children's program 'VeggieTales' introduced a cannabis character in August 2018.
5- FN5: In September 2018, the University of Alabama football program ended its uniform contract with Nike, in response to Nike's endorsement deal with Colin Kaepernick.
The data collection was done in two stages, each of which produced a new dataset: 1- obtaining the Dataset of Diffusion (DD), which includes information about the fake news/truth tweets and retweets; 2- querying the neighbors of the tweet spreaders, which provides the Dataset of Graph (DG).
DD
DD for each fake news story is an Excel file named FNx_DD, where x is the number of the fake news story, with the following structure:
Each row belongs to one captured tweet/retweet related to the rumor, and each column presents a specific piece of information about that tweet/retweet. From left to right, the columns are:
User ID (user who has posted the current tweet/retweet)
The description sentence in the profile of the user who has published the tweet/retweet
The number of tweets/retweets published by the user at the time of posting the current tweet/retweet
Date and time of creation of the account from which the current tweet/retweet was posted
Language of the tweet/retweet
Number of followers
Number of followings (friends)
Date and time of posting the current tweet/retweet
Number of likes (favorites) the current tweet had acquired before it was crawled
Number of times the current tweet had been retweeted before it was crawled
Whether another tweet is embedded in the current tweet/retweet (for example, when the current tweet is a quote, reply, or retweet)
The source (OS) of device by which the current tweet/retweet was posted
Tweet/Retweet ID
Retweet ID (if the post is a retweet then this feature gives the ID of the tweet that is retweeted by the current post)
Quote ID (if the post is a quote then this feature gives the ID of the tweet that is quoted by the current post)
Reply ID (if the post is a reply then this feature gives the ID of the tweet to which the current post replies)
Frequency of tweet occurrences, i.e. the number of times the current tweet is repeated in the dataset (for example, the number of times the tweet appears in the dataset as a retweet posted by others)
State of the tweet, which takes one of the following values (assigned by agreement between the annotators):
r : The tweet/retweet is a fake news post
a : The tweet/retweet is a truth post
q : The tweet/retweet is a question about the fake news that neither confirms nor denies it
n : The tweet/retweet is not related to the fake news (it matches the queries related to the rumor but does not refer to the given fake news)
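The column layout and state codes above can be put to work with a small helper. The following is a minimal sketch, not the authors' tooling: the column names in DD_COLUMNS are hypothetical labels chosen here to mirror the left-to-right order described above, and the rows are assumed to have already been read from the Excel file as plain sequences of 18 cell values.

```python
from collections import Counter

# Hypothetical column names (not from the dataset), in the documented
# left-to-right order of the 18 DD columns.
DD_COLUMNS = [
    "user_id", "user_description", "user_statuses_count", "account_created_at",
    "lang", "followers_count", "friends_count", "created_at",
    "favorite_count", "retweet_count", "has_embedded_tweet", "source",
    "tweet_id", "retweet_id", "quote_id", "reply_id",
    "frequency", "state",
]

# State codes exactly as listed in the dataset description.
STATE_MEANINGS = {
    "r": "fake news post",
    "a": "truth post",
    "q": "question about the fake news",
    "n": "not related to the fake news",
}


def label_rows(rows):
    """Turn raw rows (sequences of 18 cell values) into dicts keyed by column name."""
    records = []
    for row in rows:
        if len(row) != len(DD_COLUMNS):
            raise ValueError(f"expected {len(DD_COLUMNS)} cells, got {len(row)}")
        records.append(dict(zip(DD_COLUMNS, row)))
    return records


def count_states(records):
    """Count how many posts fall into each annotated state (case and spaces ignored)."""
    return Counter(str(rec["state"]).strip().lower() for rec in records)
```

The rows could come from any spreadsheet reader, for example `pandas.read_excel`. Whether the files carry a header row is not stated here, so that should be checked before labeling the rows.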
DG
DG for each fake news story contains two files:
A file in graph format (.graph), which records the structure of the graph, i.e. who is linked to whom. (This file is named FNx_DG.graph, where x is the number of the fake news story.)
A file in JSONL format (.jsonl), which holds the real user IDs of the nodes in the graph file. (This file is named FNx_Labels.jsonl, where x is the number of the fake news story.)
The second file is needed because, in the graph file, each node is labeled with the order in which it entered the graph. For example, if the node with user ID 12345637 was the first node entered into the graph file, then its label in the graph is 0 and its real ID (12345637) is at row number 1 of the jsonl file (row number 0 holds the column labels). The other node IDs follow on the next rows of the file, one user ID per row, so the node labeled k is at row number k + 1. Therefore, to find the user ID of node 200 (labeled 200 in the graph), we should look at row number 201 of the jsonl file, which is the 202nd line when lines are counted from 1.
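The lookup from graph label to real user ID can be sketched as follows. This is an illustration under stated assumptions rather than a definitive reader: it assumes the first line of the jsonl file holds the column labels and that the node labeled k sits on line k + 1 when lines are counted from zero, as in the 12345637 example. The function name parse_node_labels is hypothetical, and each record is returned exactly as parsed, because the precise JSON shape of a line is not specified here.

```python
import json


def parse_node_labels(lines):
    """Return a list in which index k holds the parsed record for graph node k.

    `lines` is any iterable of text lines, such as an open file object.
    """
    # Drop blank lines (for example, a trailing newline at the end of the file).
    non_empty = [line for line in lines if line.strip()]
    # The first line is assumed to hold the column labels, so it is skipped;
    # node k is then the k-th remaining record.
    return [json.loads(line) for line in non_empty[1:]]
```

With a file on disk, `with open("FN1_Labels.jsonl", encoding="utf-8") as fh: labels = parse_node_labels(fh)` would make `labels[200]` the record for the node labeled 200.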
The user IDs of spreaders in DG (those who have a post in DD) are also available in DD, where extra information about them and their tweets/retweets can be found. The other user IDs in DG are the neighbors of these spreaders and might not exist in DD.
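Separating spreaders from neighbors then reduces to a set membership check. The sketch below assumes the DG user IDs and the DD user IDs have already been loaded into Python sequences. The function name split_spreaders is hypothetical, and comparing IDs as strings is an assumption made here to tolerate IDs stored as numbers in one file and as text in the other.

```python
def split_spreaders(graph_user_ids, dd_user_ids):
    """Split DG user IDs into spreaders (present in DD) and neighbors only."""
    # Normalize to strings so that 123 and "123" are treated as the same ID.
    dd_ids = {str(uid) for uid in dd_user_ids}
    spreaders = [uid for uid in graph_user_ids if str(uid) in dd_ids]
    neighbors_only = [uid for uid in graph_user_ids if str(uid) not in dd_ids]
    return spreaders, neighbors_only
```

Both returned lists preserve the order of the graph file, so positions can still be related back to node labels if needed.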