62 datasets found

Instagram Dataset
brightdata.com
.json, .csv, .xlsx
Updated Apr 26, 2022
Cite
Bright Data (2022). Instagram Dataset [Dataset]. https://brightdata.com/products/datasets/instagram
Available download formats
.json, .csv, .xlsx
Dataset updated
Apr 26, 2022
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Use our Instagram dataset (public data) to extract business and non-business information from complete public profiles and filter by hashtags, followers, account type, or engagement score. Depending on your needs, you may purchase the entire dataset or a customized subset. Popular use cases include sentiment analysis, brand monitoring, influencer marketing, and more. The dataset includes all major data points: # of followers, verified status, account type (business / non-business), links, posts, comments, location, engagement score, hashtags, and much more.
Instagram: number of global users 2020-2025
statista.com
Updated May 22, 2024
Cite
Statista (2024). Instagram: number of global users 2020-2025 [Dataset]. https://www.statista.com/statistics/183585/instagram-number-of-global-users/
Dataset updated
May 22, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
In 2021, there were 1.21 billion monthly active users of Meta's Instagram, making up over 28 percent of the world's internet users. By 2025, it has been forecast that there will be 1.44 billion monthly active users of the social media platform, which would account for 31.2 percent of global internet users.

How popular is Instagram?

Instagram, as of January 2022, was the fourth most popular social media platform in the world in terms of user numbers. YouTube and WhatsApp ranked in second and third place, respectively, whilst Facebook remained the most popular, with almost three billion monthly active users worldwide.

India had the largest number of Instagram users as of January 2022, with a total of over 230 million users in the country. The second-largest Instagram audience could be found in the United States, with almost 160 million people subscribing to the photo and video sharing app.

Gen Z and Instagram

As of September 2021, Gen Z users in the United States spent an average of five hours per week on Instagram. Although Instagram ranked third in terms of hours per week spent on the platform, Gen Z users spent considerably more time on TikTok, amounting to a weekly average of over 10 hours being spent on the mobile-first video app.

Most followed accounts on Instagram

As of May 2022, Instagram’s own account had 504.37 million followers. Among celebrities, Portuguese footballer Cristiano Ronaldo (@cristiano) had over 440.41 million followers on the social network. Moreover, the average media value of an Instagram post by Ronaldo was over 985,000 U.S. dollars.

The most liked post on Instagram as of May 2022 was Photo of an Egg, which was posted in 2019 by the account @world_record_egg. Photo of an Egg has not only exceeded 55 million likes on the platform, but it also has nearly 3.5 million comments, and the account itself has over 4.5 million Instagram followers. After mysterious posts published by the account, World Record Egg revealed itself as part of a mental health campaign aimed at the difficulties and demands of using social media.
Top Instagram Accounts Data (Cleaned)
kaggle.com
Updated Feb 24, 2023
Cite
Muhammad Faisal Ali (2023). Top Instagram Accounts Data (Cleaned) [Dataset]. https://www.kaggle.com/datasets/faisaljanjua0555/top-200-most-followed-instagram-accounts-2023/data
Dataset updated
Feb 24, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Muhammad Faisal Ali
Description
The Top Instagram Accounts Dataset is a collection of 200 rows of data that provides valuable insights into the most popular Instagram accounts across different categories. The dataset contains several columns that provide comprehensive information on each account's performance, engagement rate, and audience size.

1. The "rank" column lists the accounts in order of their popularity on Instagram, starting from the most followed account.

2. The "name" column displays the Instagram handle of the account, which can be used to locate and follow the account on Instagram.

3. The "channel_info" column provides a brief description of the account, such as the type of content it features or the products and services it offers.

4. The "Category" column categorizes the account based on its primary theme or subject matter, such as fashion, sports, entertainment, or food.

5. The "posts" column displays the total number of posts on the account. This column helps to understand the account's level of activity and the amount of content it has produced over time.

6. The "followers" column indicates the number of people who follow the account on Instagram.

7. The "avg likes" column displays the average number of likes that the account's posts receive per post.

8. The "eng rate" column gives the account's engagement rate, calculated by dividing the total number of likes and comments received by the total number of followers and expressed as a percentage.
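
The engagement-rate definition in item 8 can be written out as a small calculation. This is a minimal sketch: the function name and the sample numbers are illustrative assumptions, not values taken from the dataset.

```python
def engagement_rate(total_likes, total_comments, followers):
    """Likes plus comments divided by followers, expressed as a percentage."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return (total_likes + total_comments) / followers * 100


# Hypothetical account: 45,000 likes and 5,000 comments against 1,000,000 followers.
rate = engagement_rate(45_000, 5_000, 1_000_000)
print(f"{rate:.2f}%")
```

How the dataset author aggregated likes and comments (per post or across all posts) is not stated, so the inputs here should be treated as placeholders.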

How can you use this dataset?

The Top Instagram Accounts Dataset can be used in a variety of ways to gain insights into the performance and engagement levels of popular Instagram accounts. Here are a few examples of what you can do with this dataset:

1. Conduct category analysis: The dataset provides information on the category of each Instagram account. You can use this information to conduct a category analysis and identify the most popular categories on Instagram.

2. Identify top influencers: The dataset ranks Instagram accounts based on their follower count. You can use this information to identify the top influencers in different categories and use them for influencer marketing campaigns.

3. Analyze engagement levels: The dataset includes columns such as "avg likes" and "eng rate" that provide insights into the engagement levels of Instagram accounts. You can use this information to understand what type of content resonates with Instagram users and create more engaging content for your own account.
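
The category analysis and top-influencer ideas above could be sketched as follows. The rows are hypothetical stand-ins that reuse the column names from the description; the account names and numbers are invented for illustration.

```python
from collections import Counter

# Hypothetical rows using the column names described above; real values will differ.
rows = [
    {"rank": 1, "name": "account_a", "Category": "sports", "followers": 500},
    {"rank": 2, "name": "account_b", "Category": "entertainment", "followers": 400},
    {"rank": 3, "name": "account_c", "Category": "sports", "followers": 300},
]

# Category analysis: count how many accounts fall into each category.
category_counts = Counter(row["Category"] for row in rows)

# Top influencers: keep the best-ranked (lowest rank number) account per category.
top_by_category = {}
for row in sorted(rows, key=lambda r: r["rank"]):
    top_by_category.setdefault(row["Category"], row["name"])

print(category_counts.most_common())
print(top_by_category)
```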
A set of generated Instagram Data Download Packages (DDPs) to investigate...
data.niaid.nih.gov
Updated Jan 28, 2021
Cite
Laura Boeschoten (2021). A set of generated Instagram Data Download Packages (DDPs) to investigate their structure and content [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4472605
Dataset updated
Jan 28, 2021
Dataset provided by
Ruben van den Goorbergh
Laura Boeschoten
Daniel Oberski
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Instagram data-download example dataset

In this repository you can find a data-set consisting of 11 personal Instagram archives, or Data-Download Packages (DDPs).

How the data was generated

These Instagram accounts were all new and were created by a group of researchers who wanted to examine in detail the structure, and the variety in structure, of these Instagram DDPs. The participants used the Instagram accounts extensively for approximately a week. The participants also communicated intensively with each other so that the data can be used as an example of a network.

The data was primarily generated to evaluate the performance of de-identification software. Therefore, the text in the DDPs contains many randomly chosen (Dutch) first names, phone numbers, e-mail addresses and URLs. In addition, the images in the DDPs contain many faces and text as well. The DDPs contain faces and text (usernames) of third parties. However, only content of so-called 'professional accounts' is shared, such as accounts of famous individuals or institutions who self-consciously and actively seek publicity, and these sources are easily publicly available. Furthermore, the DDPs do not contain sensitive personal data of these individuals.

Obtaining your Instagram DDP

After using the Instagram accounts intensively for approximately a week, the participants requested their personal Instagram DDPs by using the following steps. You can follow these steps yourself if you are interested in your personal Instagram DDP.

1. Go to www.instagram.com and log in

2. Click on your profile picture, go to Settings and Privacy and Security

3. Scroll to Data download and click Request download

4. Enter your email address and click Next

5. Enter your password and click Request download

Instagram then delivered the data in a compressed zip folder with the format username_YYYYMMDD.zip (i.e., Instagram handle and date of download) to the participant, and the participants shared these DDPs with us.
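
The archive naming scheme described above (username_YYYYMMDD.zip) can be parsed mechanically. The sketch below is one way to do it; the function name and the sample file name are hypothetical, and it assumes the date is always the last eight digits before the extension.

```python
import re
from datetime import date, datetime

# Pattern for username_YYYYMMDD.zip. The username itself may contain underscores,
# so the date is anchored to the final eight digits before ".zip".
DDP_NAME = re.compile(r"^(?P<username>.+)_(?P<stamp>\d{8})\.zip$")


def parse_ddp_filename(filename):
    """Return (username, download_date); raise ValueError if the name does not match."""
    match = DDP_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a DDP archive name: {filename!r}")
    downloaded = datetime.strptime(match.group("stamp"), "%Y%m%d").date()
    return match.group("username"), downloaded


# Hypothetical file name, for illustration only.
print(parse_ddp_filename("some_user_20210115.zip"))
```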

Data cleaning

To comply with the Instagram user agreement, participants shared their full name, phone number and e-mail address. In addition, Instagram logged the IP addresses the participants used during their active period on Instagram. After collecting the DDPs, we manually replaced such information with random replacements so that the DDPs shared here do not contain any personal data of the participants.

How this data-set can be used

This data-set was generated with the intention of evaluating the performance of de-identification software. We invite other researchers to use this data-set, for example to investigate what type of data can be found in Instagram DDPs or to investigate the structure of Instagram DDPs. The packages can also be used for example data analyses, although no substantive research questions can be answered using this data, as the data does not reflect how research subjects behave 'in the wild'.

Authors

The data collection is executed by Laura Boeschoten, Ruben van den Goorbergh and Daniel Oberski of Utrecht University. For questions, please contact l.boeschoten@uu.nl.

Acknowledgments

The researchers would like to thank everyone who participated in this data-generation project.
Instagram: distribution of global audiences 2024, by age group
statista.com
ai-chatbox.pro
Updated Jul 16, 2024
Cite
Stacy Jo Dixon (2024). Instagram: distribution of global audiences 2024, by age group [Dataset]. https://www.statista.com/topics/1164/social-networks/
Dataset updated
Jul 16, 2024
Dataset provided by
Statista (http://statista.com/)
Authors
Stacy Jo Dixon
Description
As of April 2024, almost 32 percent of global Instagram audiences were aged between 18 and 24 years, and 30.6 percent of users were aged between 25 and 34 years. Overall, 16 percent of users belonged to the 35 to 44 year age group.

Instagram users

With roughly one billion monthly active users, Instagram is among the most popular social networks worldwide. The social photo sharing app is especially popular in India and in the United States, which have 362.9 million and 169.7 million Instagram users, respectively.

Instagram features

One of the most popular features of Instagram is Stories. Users can post photos and videos to their Stories stream and the content is live for others to view for 24 hours before it disappears. In January 2019, the company reported that there were 500 million daily active Instagram Stories users. Instagram Stories directly competes with Snapchat, another photo sharing app that initially became famous due to its “vanishing photos” feature. As of the second quarter of 2021, Snapchat had 293 million daily active users.
Instagram User Analysis Project
kaggle.com
Updated Jul 6, 2024
Cite
Sanjana Murthy (2024). Instagram User Analysis Project [Dataset]. https://www.kaggle.com/datasets/sanjanamurthy392/instagram-user-analysis
Dataset updated
Jul 6, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sanjana Murthy
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
About Datasets:
- Domain: Social Media Analytics (User Engagement)
- Project: Instagram User Analysis Project
- Dataset: insta user analysis dataset
- Dataset Type: docx

KPIs:
1. Find the 5 oldest users of Instagram from the database provided
2. Find the users who have never posted a single photo on Instagram
3. Identify the winner of the contest and provide their details to the team
4. Identify and suggest the top 5 most commonly used hashtags on the platform
5. What day of the week do most users register on? Provide insights on when to schedule an ad campaign
6. Provide how many times the average user posts on Instagram. Also provide the total number of photos on Instagram divided by the total number of users
7. Provide data on users (bots) who have liked every single photo on the site (since any normal user would not be able to do this)

Process:
1. Understanding the problem
2. Data Collection
3. Data Cleaning
4. Exploring and analyzing the data
5. Interpreting the results

This data contains the following SQL statements, clauses, and functions: create database, use, create table, int auto_increment unique primary key, varchar not null, timestamp default now, foreign key, references, insert into, select, count *, from group by, having, delete, where, and, describe, select distinct, left join, is null, order by, inner join, order by desc limit, date_format, sum, count.
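
A few of the KPIs above can be sketched against a toy schema. Everything in this example is an assumption made for illustration: the table and column names (users, photos, created_at) and the sample rows are hypothetical, and SQLite is used through Python's sqlite3 module only so that the sketch runs without a database server, whereas the keywords listed above (auto_increment, date_format) suggest the original project used MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        username TEXT NOT NULL UNIQUE,
        created_at TEXT NOT NULL
    );
    CREATE TABLE photos (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id INTEGER NOT NULL REFERENCES users(id)
    );
    INSERT INTO users (username, created_at) VALUES
        ('alice', '2016-05-01 10:00:00'),
        ('bob',   '2016-07-12 09:30:00'),
        ('carol', '2017-01-03 18:45:00');
    INSERT INTO photos (user_id) VALUES (1), (1), (3);
    """
)

# KPI 1: the five oldest users, ordered by registration time.
oldest = conn.execute(
    "SELECT username FROM users ORDER BY created_at LIMIT 5"
).fetchall()

# KPI 2: users who never posted a photo, via LEFT JOIN ... IS NULL.
never_posted = conn.execute(
    """
    SELECT users.username
    FROM users
    LEFT JOIN photos ON photos.user_id = users.id
    WHERE photos.id IS NULL
    """
).fetchall()

# KPI 6: total number of photos divided by total number of users.
avg_posts = conn.execute(
    "SELECT (SELECT COUNT(*) FROM photos) * 1.0 / (SELECT COUNT(*) FROM users)"
).fetchone()[0]

print(oldest, never_posted, avg_posts)
```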
Dataset for the Instagram and TikTok problematic use
data.niaid.nih.gov
Updated Jul 19, 2023
Cite
Limniou, Maria (2023). Dataset for the Instagram and TikTok problematic use [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_8159159
Dataset updated
Jul 19, 2023
Dataset provided by
Hendrikse, Calanthe
Limniou, Maria
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset supports research on how engagement with social media (Instagram and TikTok) was related to problematic social media use (PSMU) and mental well-being. There are three different files. The SPSS and Excel spreadsheet files include the same dataset but in a different format. The SPSS output presents the data analysis in regard to the difference between Instagram and TikTok users.
Time and Dynamics of Instagram Users
zenodo.org
data.niaid.nih.gov
bin
Updated Jan 21, 2020
Cite
Amirhosein Bodaghi; Sama Goliaei (2020). Time and Dynamics of Instagram Users [Dataset]. http://doi.org/10.5281/zenodo.1439178
Available download formats
bin
Unique identifier
https://doi.org/10.5281/zenodo.1439178
Dataset updated
Jan 21, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Amirhosein Bodaghi; Sama Goliaei
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These four datasets are gathered from Instagram users who were chosen randomly.

The MainDataset encompasses data for 818 users. The TestDataset encompasses data for 78 users.

Data gathered for each user includes :

1- number of posts

2- number of followers

3- number of followings

4- number of likes for the tenth previous post

5- number of likes for the eleventh previous post

6- number of likes for the twelfth previous post

7- number of self-presenting posts from nine previous posts

8- gender

The MainDataset_after_150_days and TestDataset_after_150_days encompass data of the users of the Main data set and the Test data set, respectively, for after 150 days. For example, User_1 in the MainDataset has 486 posts and in the MainDataset_after_150_days has 562 posts, which means over the course of 150 days he had published 76 posts.
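
The User_1 example above is a simple difference between the two snapshots. As a minimal sketch, the same calculation in code, with a derived posts-per-day figure added (the variable names are illustrative):

```python
# Post counts for User_1 as quoted in the description above.
posts_at_start = 486
posts_after_150_days = 562
days = 150

# Posts published during the observation window.
new_posts = posts_after_150_days - posts_at_start

# Average posting rate over the window.
posts_per_day = new_posts / days

print(new_posts)                # 76
print(round(posts_per_day, 3))  # 0.507
```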
Instagram user worldwide 2024, by country
statista.com
Updated Mar 10, 2025
Cite
Statista (2025). Instagram user worldwide 2024, by country [Dataset]. https://www.statista.com/forecasts/1174700/instagram-user-by-country
Dataset updated
Mar 10, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 1, 2024 - Dec 31, 2024
Area covered
Albania
Description
The ranking by number of Instagram users is led by India with 331.94 million users, while the United States follows with 156.08 million users. In contrast, Seychelles is at the bottom of the ranking with 0.02 million users, a difference of 331.92 million users from India. User figures, shown here for the platform Instagram, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by the same person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
Instagram users in Indonesia 2019-2028
statista.com
Updated Mar 27, 2025
Cite
Statista Research Department (2025). Instagram users in Indonesia 2019-2028 [Dataset]. https://www.statista.com/topics/8306/social-media-in-indonesia/
Dataset updated
Mar 27, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
Indonesia
Description
The number of Instagram users in Indonesia was forecast to increase continuously between 2024 and 2028 by a total of 5.3 million users (+4.25 percent). After the ninth consecutive year of growth, the Instagram user base is estimated to reach 129.83 million users, and therefore a new peak, in 2028. Notably, the number of Instagram users had been increasing continuously over the past years. User figures, shown here for the platform Instagram, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by the same person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Instagram users in countries like the Philippines and Thailand.
Threads, an Instagram app Reviews
kaggle.com
Updated Jul 26, 2023
Cite
Saloni Jhalani (2023). Threads, an Instagram app Reviews [Dataset]. https://www.kaggle.com/datasets/saloni1712/threads-an-instagram-app-reviews
Dataset updated
Jul 26, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Saloni Jhalani
License
Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
Description
The Threads, an Instagram App Reviews dataset is a comprehensive collection of user reviews from the Threads mobile app on Google Play Store & App Store, capturing valuable insights and sentiments. The dataset enables the understanding of user satisfaction, evaluation of app performance, and identification of emerging patterns.

The way data was collected

Scraping Threads App reviews on Google Play Store & App Store

Ideas for using this dataset

Sentiment analysis

What makes the application receive 1-star and 5-star reviews
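
One rough way to approach the 1-star versus 5-star question is to compare the most frequent words in each group. The sketch below assumes each review has a numeric rating and a text field; the field names "rating" and "text", the stopword list, and the sample reviews are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical reviews; the field names "rating" and "text" are assumptions.
reviews = [
    {"rating": 1, "text": "app keeps crashing on login"},
    {"rating": 1, "text": "crashing again after update"},
    {"rating": 5, "text": "clean design and easy login"},
    {"rating": 5, "text": "love the clean feed"},
]

# Tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "and", "on", "after", "app"}

# Count words separately for the lowest and highest ratings.
words_by_rating = defaultdict(Counter)
for review in reviews:
    if review["rating"] in (1, 5):
        tokens = [w for w in review["text"].lower().split() if w not in STOPWORDS]
        words_by_rating[review["rating"]].update(tokens)

print(words_by_rating[1].most_common(1))
print(words_by_rating[5].most_common(1))
```

Word counts are only a first pass; they say nothing about negation or context, so a proper sentiment model would be the next step.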

Note - It was last updated on July 26th 2023
Instagram users in Pakistan
napoleoncat.com
png
Updated Apr 15, 2024
Cite
NapoleonCat (2024). Instagram users in Pakistan [Dataset]. https://napoleoncat.com/stats/instagram-users-in-pakistan/2024/04
Available download formats
png
Dataset updated
Apr 15, 2024
Dataset authored and provided by
NapoleonCat
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Apr 2024
Area covered
Pakistan
Description
There were 18 490 001 Instagram users in Pakistan in April 2024, which accounted for 7.9% of its entire population. The majority of them were men - 65.3%. People aged 18 to 24 were the largest user group (8 900 000). The highest difference between men and women occurs within people aged 18 to 24, where men lead by 5 700 000.
Instagram: distribution of global audiences 2024, by age and gender
statista.com
ai-chatbox.pro
Updated Jul 16, 2024
Cite
Stacy Jo Dixon (2024). Instagram: distribution of global audiences 2024, by age and gender [Dataset]. https://www.statista.com/topics/1164/social-networks/
Dataset updated
Jul 16, 2024
Dataset provided by
Statista (http://statista.com/)
Authors
Stacy Jo Dixon
Description
As of April 2024, around 16.5 percent of global active Instagram users were men between the ages of 18 and 24 years. More than half of the global Instagram population was aged 34 years or younger.

Teens and social media

As one of the biggest social networks worldwide, Instagram is especially popular with teenagers. As of fall 2020, the photo-sharing app ranked third in terms of preferred social network among teenagers in the United States, behind Snapchat and TikTok. Instagram was one of the most influential advertising channels among female Gen Z users when making purchasing decisions. Teens report feeling more confident, popular, and better about themselves when using social media, and less lonely, depressed and anxious. Social media can also have negative effects on teens, and these are much more pronounced among those with low emotional well-being. It was found that 35 percent of teenagers with low social-emotional well-being reported having experienced cyberbullying when using social media, while only five percent of teenagers with high social-emotional well-being stated the same. As such, social media can have a big impact on already fragile states of mind.

im_instagram_70k

kaggle.com

Updated Nov 24, 2020

Cite

Kristo Radion Purba (2020). im_instagram_70k [Dataset]. https://www.kaggle.com/krpurba/im-instagram-70k/metadata

Dataset updated

Nov 24, 2020

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Kristo Radion Purba

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This is Instagram social network data for the Influence Maximization (IM) task. It was collected from Instagram from April to May 2020, from the followers of 24 private universities in Malaysia, using the Instagram API and various third-party Instagram websites. It consists mostly of Malaysian users, with 70,409 nodes/users and 1,007,107 edges/connections (followees and followers). Please kindly cite the following paper when using this dataset:

Authors: Kristo Radion Purba; David Asirvatham; Raja Kumar Murugesan
Title: Influence Maximization Diffusion Models Based On Engagement and Activeness on Instagram
Journal: Journal of King Saud University - Computer and Information Sciences
Year: 2020
Link: https://doi.org/10.1016/j.jksuci.2020.09.012

Kindly address further questions to : kristoradion@live.com

Data File	Description
Instagram User Stats.csv	pos = Number of posts; flr = Number of followers; flg = Number of following; eg = Engagement grade, a scale of 1 to 12 that reflects the strength of engagement rate relative to the number of followers; er = Engagement rate, i.e. (likes + comments) / followers; fg = Followers growth % in a month; op = Outsiders percentage %, i.e. non-followers that liked a post, divided by all unique likers
Network for IC LT.txt	source, target, weight (weight = 1/followees)
Network for IC-u LT-u.txt	source, target, weight (weight = 1/followees * source's EG * target's FG)
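
The derived fields in the table above follow short formulas, which can be written out as below. This is a sketch under stated assumptions: the function names and sample values are hypothetical, and the description does not say which node's followee count enters the edge weight, so the caller is left to supply it.

```python
def engagement_rate(likes, comments, followers):
    """er = (likes + comments) / followers, per the field description."""
    return (likes + comments) / followers


def outsiders_percentage(liker_ids, follower_ids):
    """op = unique likers who are not followers, divided by all unique likers, in percent."""
    likers = set(liker_ids)
    if not likers:
        return 0.0
    outsiders = likers - set(follower_ids)
    return len(outsiders) / len(likers) * 100


def edge_weight(followee_count):
    """weight = 1 / followees, as given for the plain IC/LT network file."""
    return 1 / followee_count


# Hypothetical values for illustration only.
print(engagement_rate(90, 10, 1000))
print(outsiders_percentage(["a", "b", "c", "d"], ["a", "b", "c"]))
print(edge_weight(4))
```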

Dataset for Instagram influencers and females' consumer behaviour
zenodo.org
data.niaid.nih.gov
Updated Jan 7, 2024
Cite
Maria Limniou; Ellen Lovatt; Harriet Graham (2024). Dataset for Instagram influencers and females' consumer behaviour [Dataset]. http://doi.org/10.5281/zenodo.10467062
Unique identifier
https://doi.org/10.5281/zenodo.10467062
Dataset updated
Jan 7, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Maria Limniou; Ellen Lovatt; Harriet Graham
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset supports research on how Instagram influencers affect female consumers' behaviour in purchasing products, and on the role of factors such as envy, scepticism towards advertising, satisfaction with life, social comparison and maternalism in consumer behaviour. There are two different files. The SPSS and CSV spreadsheet files include the same dataset but in a different format.

Data from: Mpox Narrative on Instagram: A Labeled Multilingual Dataset of...

zenodo.org
data.niaid.nih.gov

bin

Updated Sep 20, 2024

Cite

Nirmalya Thakur (2024). Mpox Narrative on Instagram: A Labeled Multilingual Dataset of Instagram Posts on Mpox for Sentiment, Hate Speech, and Anxiety Analysis [Dataset]. http://doi.org/10.5281/zenodo.13738598

Available download formats

bin

Unique identifier

https://doi.org/10.5281/zenodo.13738598

Dataset updated

Sep 20, 2024

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Nirmalya Thakur

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Time period covered

Sep 9, 2024

Description

Please cite the following paper when using this dataset:

N. Thakur, “Mpox narrative on Instagram: A labeled multilingual dataset of Instagram posts on mpox for sentiment, hate speech, and anxiety analysis,” arXiv [cs.LG], 2024, URL: https://arxiv.org/abs/2409.05292

Abstract

The world is currently experiencing an outbreak of mpox, which has been declared a Public Health Emergency of International Concern by WHO. During recent virus outbreaks, social media platforms have played a crucial role in keeping the global population informed and updated regarding various aspects of the outbreaks. As a result, in the last few years, researchers from different disciplines have focused on the development of social media datasets focusing on different virus outbreaks. No prior work in this field has focused on the development of a dataset of Instagram posts about the mpox outbreak. The work presented in this paper (stated above) aims to address this research gap. It presents this multilingual dataset of 60,127 Instagram posts about mpox, published between July 23, 2022, and September 5, 2024. This dataset contains Instagram posts about mpox in 52 languages. For each of these posts, the Post ID, Post Description, Date of publication, language, and translated version of the post (translation to English was performed using the Google Translate API) are presented as separate attributes in the dataset.

After developing this dataset, sentiment analysis, hate speech detection, and anxiety or stress detection were also performed. This process included classifying each post into:

1. one of the fine-grain sentiment classes, i.e., fear, surprise, joy, sadness, anger, disgust, or neutral,
2. hate or not hate,
3. anxiety/stress detected or no anxiety/stress detected.

These results are presented as separate attributes in the dataset for the training and testing of machine learning algorithms for sentiment, hate speech, and anxiety or stress detection, as well as for other applications.

The 52 distinct languages in which Instagram posts are present in the dataset are English, Portuguese, Indonesian, Spanish, Korean, French, Hindi, Finnish, Turkish, Italian, German, Tamil, Urdu, Thai, Arabic, Persian, Tagalog, Dutch, Catalan, Bengali, Marathi, Malayalam, Swahili, Afrikaans, Panjabi, Gujarati, Somali, Lithuanian, Norwegian, Estonian, Swedish, Telugu, Russian, Danish, Slovak, Japanese, Kannada, Polish, Vietnamese, Hebrew, Romanian, Nepali, Czech, Modern Greek, Albanian, Croatian, Slovenian, Bulgarian, Ukrainian, Welsh, Hungarian, and Latvian.

The following table represents the data description for this dataset

Attribute Name	Attribute Description
Post ID	Unique ID of each Instagram post
Post Description	Complete description of each post in the language in which it was originally published
Date	Date of publication in MM/DD/YYYY format
Language	Language of the post as detected using the Google Translate API
Translated Post Description	Translated version of the post description. All posts which were not in English were translated into English using the Google Translate API. No language translation was performed for English posts.
Sentiment	Results of sentiment analysis (using translated Post Description) where each post was classified into one of the sentiment classes: fear, surprise, joy, sadness, anger, disgust, and neutral
Hate	Results of hate speech detection (using translated Post Description) where each post was classified as hate or not hate
Anxiety or Stress	Results of anxiety or stress detection (using translated Post Description) where each post was classified as stress/anxiety detected or no stress/anxiety detected.
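
The attribute table above maps naturally onto a small loading routine. The sketch below assumes a comma-separated export whose header matches the attribute names; the actual file layout is not described in the listing, and the sample row, including the exact label strings in the Hate and Anxiety or Stress columns, is hypothetical.

```python
import csv
import io
from datetime import datetime

# Sentiment classes as listed in the data description.
SENTIMENTS = {"fear", "surprise", "joy", "sadness", "anger", "disgust", "neutral"}

# Hypothetical one-row sample; the real file format and label spellings may differ.
SAMPLE = (
    "Post ID,Post Description,Date,Language,Translated Post Description,"
    "Sentiment,Hate,Anxiety or Stress\n"
    "abc123,Texto de ejemplo,08/15/2024,Spanish,Example text,neutral,"
    "not hate,no stress/anxiety detected\n"
)


def load_posts(handle):
    """Read rows, parse the MM/DD/YYYY date, and check the sentiment label."""
    posts = []
    for row in csv.DictReader(handle):
        row["Date"] = datetime.strptime(row["Date"], "%m/%d/%Y").date()
        if row["Sentiment"].lower() not in SENTIMENTS:
            raise ValueError(f"unexpected sentiment: {row['Sentiment']!r}")
        posts.append(row)
    return posts


posts = load_posts(io.StringIO(SAMPLE))
print(posts[0]["Date"].isoformat())
```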

Dataset of "A diary study investigating the differential impacts of...
ssh.datastations.nl
datacatalogue.cessda.eu
tsv
Updated Sep 27, 2023
Cite
DANS Data Station Social Sciences and Humanities (2023). Dataset of "A diary study investigating the differential impacts of Instagram content on youths' body image" [Dataset]. http://doi.org/10.17026/SS/7M90LJ
Available download formats
tsv (2535)
Unique identifier
https://doi.org/10.17026/SS/7M90LJ
Dataset updated
Sep 27, 2023
Dataset provided by
DANS Data Station Social Sciences and Humanities
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Through social media like Instagram, users are constantly exposed to “perfect” lives and bodies. Research in this field has predominantly focused on the mere time youth spend on Instagram and the effects on their body image, oftentimes uncovering negative effects. Little research has been done on the root of the influence: the consumed content itself. Hence, this study aims to qualitatively uncover the types of content that trigger youths’ body image. Using a diary study, 28 youth (Mage = 21.86; 79% female) reported 140 influential body image Instagram posts over five days, uncovering trigger points and providing their motivations, emotions, and impacts on body image. Based on these posts, four content categories were distinguished: Thin Ideal, Body Positivity, Fitness, and Lifestyle. These different content types triggered different emotions regarding body image, and clear gender distinctions in content could be noticed. The study increased youths’ awareness of Instagram's influence on their mood and body perception. The findings imply that the discussion about the effects of social media on body image should be nuanced, taking into account different types of content and users. Using this information, future interventions could focus on conscious use of social media rather than merely limiting its use.
SPSS Dataset
figshare.com
bin
Updated Nov 1, 2023
bankole temitope (2023). SPSS Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.23118368.v1
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.23118368.v1
Dataset updated
Nov 1, 2023
Dataset provided by
figshare
Authors
bankole temitope
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Quantitative analysis of adolescent exposure to fast food marketing on Instagram. Descriptive statistics were calculated and the total frequency of each marketing strategy was obtained. For the continuous variables mean and standard deviation values were obtained. Mann-Whitney U tests were conducted to examine the association between the marketing strategies and user engagement, while the Kruskal-Wallis H test was completed to test for associations between brand name and engagement.
Fake/Authentic User Instagram
kaggle.com
zip
Updated Feb 11, 2021
Kristo Radion Purba (2021). Fake/Authentic User Instagram [Dataset]. https://www.kaggle.com/krpurba/fakeauthentic-user-instagram
Available download formats: zip (3451107 bytes)
Dataset updated
Feb 11, 2021
Authors
Kristo Radion Purba
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Kindly refer to my paper for more information. Please cite my work if you use this dataset: K. R. Purba, D. Asirvatham and R. K. Murugesan, "Classification of instagram fake users using supervised machine learning algorithms," International Journal of Electrical and Computer Engineering (IJECE), vol. 10, no. 3, pp. 2763-2772, 2020.

The dataset was collected by web scraping third-party Instagram websites to capture each user's metadata and up to 12 of their latest media posts. The collection ran from September 1st, 2019, until September 20th, 2019. The dataset contains authentic users and fake users, which were filtered using human annotators. The authentic users were taken from followers of 24 private university pages (8 Indonesian, 8 Malaysian, 8 Australian) on Instagram. To reduce the number of users, they were picked using proportional random sampling based on their source university. All private users, 31,335 of the 63,795 users (49.11%), were removed. The final number of public users used in this research was 32,460.

Var name | Feature name | Description
pos | Num posts | Total number of posts the user has ever posted
flg | Num following | Number of accounts the user follows
flr | Num followers | Number of followers
bl | Biography length | Length (number of characters) of the user's biography
pic | Picture availability | 0 if the user has no profile picture, 1 if they have one
lin | Link availability | 0 if the user has no external URL, 1 if they have one
cl | Average caption length | Average number of characters in media captions
cz | Caption zero | Percentage (0.0 to 1.0) of captions with almost zero (<=3) length
ni | Non image percentage | Percentage (0.0 to 1.0) of non-image media. There are three types of media in an Instagram post: image, video, and carousel
erl | Engagement rate (Like) | Engagement rate (ER), commonly defined as (num likes) divided by (num media) divided by (num followers)
erc | Engagement rate (Comm.) | Same as ER (Like), but computed for comments
lt | Location tag percentage | Percentage (0.0 to 1.0) of posts tagged with a location
hc | Average hashtag count | Average number of hashtags used in a post
pr | Promotional keywords | Average use of promotional keywords in hashtags, i.e. {regrann, contest, repost, giveaway, mention, share, give away, quiz}
fo | Followers keywords | Average use of follower-hunting keywords in hashtags, i.e. {follow, like, folback, follback, f4f}
cs | Cosine similarity | Average cosine similarity between all pairs of posts a user has
pi | Post interval | Average interval between posts (in hours)
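The engagement-rate definition given for erl and erc can be written out as a short function. This is a minimal sketch, not the authors' extraction code: the function name and the zero-handling are assumptions, since the description does not say how users with no media or no followers are treated.

```python
def engagement_rate(total_interactions: int, num_media: int, num_followers: int) -> float:
    """Engagement rate as (interactions) / (num media) / (num followers).

    Pass total likes for the erl feature or total comments for erc.
    Assumption: users with no media or no followers get a rate of 0.0,
    because the source does not specify this edge case.
    """
    if num_media == 0 or num_followers == 0:
        return 0.0
    return total_interactions / num_media / num_followers
```

For example, 1,200 likes across 12 posts for a user with 1,000 followers gives 100 likes per post, or an engagement rate of 0.1.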

Output:
2-class user classes: r (real/authentic user), f (fake user / bought followers)
4-class user classes: r (authentic/real user), a (active fake user), i (inactive fake user), s (spammer fake user)
Note that the 3 fake user classes (a, i, s) were judged by human annotators.

Myket Android Application Install Dataset

paperswithcode.com

Updated Aug 12, 2023


Erfan Loghmani; Mohammadamin Fazli (2023). Myket Android Application Install Dataset [Dataset]. https://paperswithcode.com/dataset/myket-android-application-install


Dataset updated

Aug 12, 2023

Authors

Erfan Loghmani; Mohammadamin Fazli

Description

This dataset contains information on application install interactions of users in the Myket android application market. The dataset was created for the purpose of evaluating interaction prediction models, requiring user and item identifiers along with timestamps of the interactions. Hence, the dataset can be used for interaction prediction and building a recommendation system. Furthermore, the data forms a dynamic network of interactions, and we can also perform network representation learning on the nodes in the network, which are users and applications.

Data Creation

The dataset was initially generated by the Myket data team and later cleaned and subsampled by Erfan Loghmani, a master's student at Sharif University of Technology at the time. The data team focused on a two-week period and randomly sampled 1/3 of the users with interactions during that period. They then selected install and update interactions for three months before and after the two-week period, resulting in interactions spanning about 6 months and two weeks.

We further subsampled and cleaned the data to focus on application download interactions. We identified the top 8000 most installed applications and selected interactions related to them. We retained users with more than 32 interactions, resulting in 280,391 users. From this group, we randomly selected 10,000 users, and the data was filtered to include only interactions for these users. The detailed procedure can be found here.
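The filtering steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' actual procedure: the function name, the parameter names, and the fixed seed are hypothetical, and it assumes that per-user interaction counts are taken after restricting to the top applications.

```python
import random
from collections import Counter


def subsample(interactions, top_n_apps=8000, min_interactions=32, n_users=10_000, seed=0):
    """Filter (user_id, app_name, timestamp) triplets in three steps."""
    # Step 1: keep only interactions with the most installed applications.
    app_counts = Counter(app for _, app, _ in interactions)
    top_apps = {app for app, _ in app_counts.most_common(top_n_apps)}
    kept = [row for row in interactions if row[1] in top_apps]

    # Step 2: retain users with more than `min_interactions` interactions.
    # Assumption: counts are computed on the already-filtered interactions.
    user_counts = Counter(user for user, _, _ in kept)
    eligible = sorted(u for u, c in user_counts.items() if c > min_interactions)

    # Step 3: draw a random sample of users and keep only their interactions.
    rng = random.Random(seed)
    chosen = set(rng.sample(eligible, min(n_users, len(eligible))))
    return [row for row in kept if row[0] in chosen]
```

Sorting the eligible users before sampling keeps the result reproducible for a given seed, regardless of the order in which the interactions were read.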

Data Structure

The dataset has two main files.

myket.csv: This file contains the interaction information and follows the same format as the datasets used in the "JODIE: Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks" (ACM SIGKDD 2019) project. However, this data does not contain state labels and interaction features, resulting in associated columns being all zero.

app_info_sample.csv: This file comprises features associated with applications present in the sample. For each individual application, information such as the approximate number of installs, average rating, count of ratings, and category are included. These features provide insights into the applications present in the dataset.

Dataset Details

Total Instances: 694,121 install interaction instances
Instances Format: Triplets of user_id, app_name, timestamp
10,000 users and 7,988 android applications
Item features for 7,606 applications

For a detailed summary of the data's statistics, including information on users, applications, and interactions, please refer to the Python notebook available at summary-stats.ipynb. The notebook provides an overview of the dataset's characteristics and can be helpful for understanding the data's structure before using it for research or analysis.

Top 20 Most Installed Applications

| Package Name | Count of Interactions |
| ---------------------------------- | --------------------- |
| com.instagram.android | 15292 |
| ir.resaneh1.iptv | 12143 |
| com.tencent.ig | 7919 |
| com.ForgeGames.SpecialForcesGroup2 | 7797 |
| ir.nomogame.ClutchGame | 6193 |
| com.dts.freefireth | 6041 |
| com.whatsapp | 5876 |
| com.supercell.clashofclans | 5817 |
| com.mojang.minecraftpe | 5649 |
| com.lenovo.anyshare.gps | 5076 |
| ir.medu.shad | 4673 |
| com.firsttouchgames.dls3 | 4641 |
| com.activision.callofduty.shooter | 4357 |
| com.tencent.iglite | 4126 |
| com.aparat | 3598 |
| com.kiloo.subwaysurf | 3135 |
| com.supercell.clashroyale | 2793 |
| co.palang.QuizOfKings | 2589 |
| com.nazdika.app | 2436 |
| com.digikala | 2413 |

Comparison with SNAP Datasets

The Myket dataset introduced in this repository exhibits distinct characteristics compared to the real-world datasets used by the JODIE project. The table below provides a comparative overview of the key dataset characteristics:

Dataset	#Users	#Items	#Interactions	Average Interactions per User	Average Unique Items per User
Myket	10,000	7,988	694,121	69.4	54.6
LastFM	980	1,000	1,293,103	1,319.5	158.2
Reddit	10,000	984	672,447	67.2	7.9
Wikipedia	8,227	1,000	157,474	19.1	2.2
MOOC	7,047	97	411,749	58.4	25.3
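The "Average Interactions per User" column follows directly from the user and interaction counts in the table. A minimal sketch that recomputes it, using only the figures listed above (the function and variable names are illustrative):

```python
# (number of users, number of interactions), copied from the comparison table.
DATASETS = {
    "Myket": (10_000, 694_121),
    "LastFM": (980, 1_293_103),
    "Reddit": (10_000, 672_447),
    "Wikipedia": (8_227, 157_474),
    "MOOC": (7_047, 411_749),
}


def avg_interactions_per_user(users: int, interactions: int) -> float:
    """Mean number of interactions per user, rounded to one decimal place."""
    return round(interactions / users, 1)


averages = {name: avg_interactions_per_user(u, n) for name, (u, n) in DATASETS.items()}
```

The average number of unique items per user cannot be recomputed this way, because it requires the raw interaction triplets rather than the totals.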

The Myket dataset stands out by having an ample number of both users and items, highlighting its relevance for real-world, large-scale applications. Unlike LastFM, Reddit, and Wikipedia datasets, where users exhibit repetitive item interactions, the Myket dataset contains a comparatively lower amount of repetitive interactions. This unique characteristic reflects the diverse nature of user behaviors in the Android application market environment.

Citation

If you use this dataset in your research, please cite the following preprint:

@misc{loghmani2023effect,
  title={Effect of Choosing Loss Function when Using T-batching for Representation Learning on Dynamic Networks},
  author={Erfan Loghmani and MohammadAmin Fazli},
  year={2023},
  eprint={2308.06862},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

