62 datasets found
  1. Instagram Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Apr 26, 2022
    + more versions
    Cite
    Bright Data (2022). Instagram Dataset [Dataset]. https://brightdata.com/products/datasets/instagram
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Apr 26, 2022
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Use our Instagram dataset (public data) to extract business and non-business information from complete public profiles and filter by hashtags, followers, account type, or engagement score. Depending on your needs, you may purchase the entire dataset or a customized subset. Popular use cases include sentiment analysis, brand monitoring, influencer marketing, and more. The dataset includes all major data points: # of followers, verified status, account type (business / non-business), links, posts, comments, location, engagement score, hashtags, and much more.

  2. Instagram: number of global users 2020-2025

    • statista.com
    Updated May 22, 2024
    Cite
    Statista (2024). Instagram: number of global users 2020-2025 [Dataset]. https://www.statista.com/statistics/183585/instagram-number-of-global-users/
    Dataset updated
    May 22, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2021, there were 1.21 billion monthly active users of Meta's Instagram, making up over 28 percent of the world's internet users. By 2025, it has been forecast that there will be 1.44 billion monthly active users of the social media platform, which would account for 31.2 percent of global internet users.

    How popular is Instagram?

    Instagram, as of January 2022, was the fourth most popular social media platform in the world in terms of user numbers. YouTube and WhatsApp ranked in second and third place, respectively, whilst Facebook remained the most popular, with almost three billion monthly active users worldwide.

    India had the largest number of Instagram users as of January 2022, with a total of over 230 million users in the country. The second-largest Instagram audience could be found in the United States, with almost 160 million people subscribing to the photo and video sharing app.

    Gen Z and Instagram

    As of September 2021, Gen Z users in the United States spent an average of five hours per week on Instagram. Although Instagram ranked third in terms of hours per week spent on the platform, Gen Z users spent considerably more time on TikTok, amounting to a weekly average of over 10 hours being spent on the mobile-first video app.

    Most followed accounts on Instagram

    As of May 2022, Instagram’s own account had 504.37 million followers. Among celebrities, Portuguese footballer Cristiano Ronaldo (@cristiano) had over 440.41 million followers on the social network. Moreover, the average media value of an Instagram post by Ronaldo was over 985,000 U.S. dollars.

    The most liked post on Instagram as of May 2022 was Photo of an Egg, which was posted in 2019 by the account @world_record_egg. Photo of an Egg has not only exceeded 55 million likes on the platform, but it also has nearly 3.5 million comments, and the account itself has over 4.5 million Instagram followers. After mysterious posts published by the account, World Record Egg revealed itself as part of a mental health campaign aimed at the difficulties and demands of using social media.

  3. Top Instagram Accounts Data (Cleaned)

    • kaggle.com
    Updated Feb 24, 2023
    Cite
    Muhammad Faisal Ali (2023). Top Instagram Accounts Data (Cleaned) [Dataset]. https://www.kaggle.com/datasets/faisaljanjua0555/top-200-most-followed-instagram-accounts-2023/data
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 24, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Muhammad Faisal Ali
    Description

    The Top Instagram Accounts Dataset is a collection of 200 rows of data that provides valuable insights into the most popular Instagram accounts across different categories. The dataset contains several columns that provide comprehensive information on each account's performance, engagement rate, and audience size.

    1. The "rank" column lists the accounts in order of their popularity on Instagram, starting from the most followed account.

    2. The "name" column displays the Instagram handle of the account, which can be used to locate and follow the account on Instagram.

    3. The "channel_info" column provides a brief description of the account, such as the type of content it features or the products and services it offers.

    4. The "Category" column categorizes the account based on its primary theme or subject matter, such as fashion, sports, entertainment, or food.

    5. The "posts" column displays the total number of posts on the account. This column helps to understand the account's level of activity and the amount of content it has produced over time.

    6. The "followers" column indicates the number of people who follow the account on Instagram.

    7. The "avg likes" column displays the average number of likes per post on the account.

    8. The "eng rate" column gives the account's engagement rate: the total number of likes and comments received divided by the total number of followers, expressed as a percentage.
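
    The "eng rate" definition above can be sketched in a few lines of Python. This is only an illustration of the stated formula; the function name and the sample figures are hypothetical and do not come from the dataset.

```python
def engagement_rate(total_likes: int, total_comments: int, followers: int) -> float:
    """Engagement rate in percent: (likes + comments) / followers * 100."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return (total_likes + total_comments) / followers * 100


# Hypothetical account: 45,000 likes and 5,000 comments, 1,000,000 followers.
rate = engagement_rate(45_000, 5_000, 1_000_000)
print(f"{rate:.2f}%")
```

    Whether the dataset author computed the rate over all posts or over a recent sample is not stated in the description, so treat the inputs as an assumption.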

    How can you use this dataset?

    The Top Instagram Accounts Dataset can be used in a variety of ways to gain insights into the performance and engagement levels of popular Instagram accounts. Here are a few examples of what you can do with this dataset:

    1. Conduct category analysis: The dataset provides information on the category of each Instagram account. You can use this information to conduct a category analysis and identify the most popular categories on Instagram.

    2. Identify top influencers: The dataset ranks Instagram accounts based on their follower count. You can use this information to identify the top influencers in different categories and use them for influencer marketing campaigns.

    3. Analyze engagement levels: The dataset includes columns such as "avg likes" and "eng rate" that provide insights into the engagement levels of Instagram accounts. You can use this information to understand what type of content resonates with Instagram users and create more engaging content for your own account.

  4. A set of generated Instagram Data Download Packages (DDPs) to investigate...

    • data.niaid.nih.gov
    Updated Jan 28, 2021
    Cite
    Laura Boeschoten (2021). A set of generated Instagram Data Download Packages (DDPs) to investigate their structure and content [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4472605
    Dataset updated
    Jan 28, 2021
    Dataset provided by
    Ruben van den Goorbergh
    Laura Boeschoten
    Daniel Oberski
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Instagram data-download example dataset

    In this repository you can find a data-set consisting of 11 personal Instagram archives, or Data-Download Packages (DDPs).

    How the data was generated

    These Instagram accounts were all new and were created by a group of researchers who wanted to examine in detail the structure, and the variety in structure, of these Instagram DDPs. The participants used the Instagram accounts extensively for approximately a week. The participants also communicated intensively with each other so that the data can be used as an example of a network.

    The data was primarily generated to evaluate the performance of de-identification software. Therefore, the text in the DDPs contains many randomly chosen (Dutch) first names, phone numbers, e-mail addresses and URLs. In addition, the images in the DDPs contain many faces and text as well. The DDPs contain faces and text (usernames) of third parties. However, only content of so-called "professional accounts" is shared, such as accounts of famous individuals or institutions who self-consciously and actively seek publicity, and these sources are easily publicly available. Furthermore, the DDPs do not contain sensitive personal data of these individuals.

    Obtaining your Instagram DDP

    After using the Instagram accounts intensively for approximately a week, the participants requested their personal Instagram DDPs by using the following steps. You can follow these steps yourself if you are interested in your personal Instagram DDP.

    1. Go to www.instagram.com and log in
    2. Click on your profile picture, go to Settings and Privacy and Security
    3. Scroll to Data download and click Request download
    4. Enter your email address and click Next
    5. Enter your password and click Request download

    Instagram then delivered the data in a compressed zip folder with the format username_YYYYMMDD.zip (i.e., Instagram handle and date of download) to the participant, and the participants shared these DDPs with us.
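
    The archive naming scheme described above (username_YYYYMMDD.zip) can be parsed as follows. This is a minimal sketch: the function name and the example file name are hypothetical, and the pattern assumes the date is always the last underscore-separated part of the name.

```python
import re
from datetime import date, datetime
from typing import Tuple

# "username_YYYYMMDD.zip": the username may itself contain underscores,
# so the date is taken from the final "_" followed by eight digits.
DDP_NAME = re.compile(r"^(?P<username>.+)_(?P<stamp>\d{8})\.zip$")


def parse_ddp_filename(filename: str) -> Tuple[str, date]:
    """Split a DDP archive name into (Instagram handle, download date)."""
    match = DDP_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a DDP archive name: {filename!r}")
    downloaded = datetime.strptime(match["stamp"], "%Y%m%d").date()
    return match["username"], downloaded


# Hypothetical archive name, not one of the shared DDPs.
print(parse_ddp_filename("example_user_20201112.zip"))
```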

    Data cleaning

    To comply with the Instagram user agreement, participants shared their full name, phone number and e-mail address. In addition, Instagram logged the IP addresses the participants used during their active period on Instagram. After collecting the DDPs, we manually replaced this information with random replacements so that the DDPs shared here do not contain any personal data of the participants.

    How this data-set can be used

    This data-set was generated to evaluate the performance of de-identification software. We invite other researchers to use it, for example, to investigate what type of data can be found in Instagram DDPs or to investigate the structure of Instagram DDPs. The packages can also be used for example data analyses, although no substantive research questions can be answered with this data because it does not reflect how research subjects behave "in the wild".

    Authors

    The data collection is executed by Laura Boeschoten, Ruben van den Goorbergh and Daniel Oberski of Utrecht University. For questions, please contact l.boeschoten@uu.nl.

    Acknowledgments

    The researchers would like to thank everyone who participated in this data-generation project.

  5. Instagram: distribution of global audiences 2024, by age group

    • statista.com
    • ai-chatbox.pro
    Updated Jul 16, 2024
    + more versions
    Cite
    Stacy Jo Dixon (2024). Instagram: distribution of global audiences 2024, by age group [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    As of April 2024, almost 32 percent of global Instagram audiences were aged between 18 and 24 years, and 30.6 percent of users were aged between 25 and 34 years. Overall, 16 percent of users belonged to the 35 to 44 year age group.

    Instagram users

    With roughly one billion monthly active users, Instagram is one of the most popular social networks worldwide. The social photo sharing app is especially popular in India and in the United States, which have 362.9 million and 169.7 million Instagram users, respectively.

    Instagram features

    One of the most popular features of Instagram is Stories. Users can post photos and videos to their Stories stream and the content is live for others to view for 24 hours before it disappears. In January 2019, the company reported that there were 500 million daily active Instagram Stories users. Instagram Stories directly competes with Snapchat, another photo sharing app that initially became famous due to its “vanishing photos” feature. As of the second quarter of 2021, Snapchat had 293 million daily active users.

  6. Instagram User Analysis Project

    • kaggle.com
    Updated Jul 6, 2024
    Cite
    Sanjana Murthy (2024). Instagram User Analysis Project [Dataset]. https://www.kaggle.com/datasets/sanjanamurthy392/instagram-user-analysis
    Explore at: Croissant
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sanjana Murthy
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    About the dataset:
    • Domain: Social Media Analytics (User Engagement)
    • Project: Instagram User Analysis Project
    • Dataset: insta user analysis dataset
    • Dataset type: docx

    KPIs:
    1. Find the 5 oldest users of Instagram from the database provided
    2. Find the users who have never posted a single photo on Instagram
    3. Identify the winner of the contest and provide their details to the team
    4. Identify and suggest the top 5 most commonly used hashtags on the platform
    5. What day of the week do most users register on? Provide insights on when to schedule an ad campaign
    6. Provide how many times the average user posts on Instagram, and the total number of photos on Instagram divided by the total number of users
    7. Provide data on users (bots) who have liked every single photo on the site (since any normal user would not be able to do this)

    Process:
    1. Understanding the problem
    2. Data collection
    3. Data cleaning
    4. Exploring and analyzing the data
    5. Interpreting the results

    The SQL in this dataset uses the following statements, clauses, and functions: create database, use, create table, int auto_increment unique primary key, varchar not null, timestamp default now, foreign key, references, insert into, select, count *, from group by, having, delete, where, and, describe, select distinct, left join, is null, order by, inner join, order by desc limit, date_format, sum, count.
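
    As an illustration of how two of the KPIs map onto the listed SQL constructs, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are hypothetical, and SQLite is swapped in for convenience; the keywords in the description (auto_increment, date_format) suggest the original was written for MySQL.

```python
import sqlite3

# In-memory database with a hypothetical two-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL UNIQUE,
        created_at TEXT NOT NULL
    );
    CREATE TABLE photos (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id)
    );
    INSERT INTO users (id, username, created_at) VALUES
        (1, 'ana', '2016-05-01'),
        (2, 'ben', '2017-02-14'),
        (3, 'cleo', '2016-01-20');
    INSERT INTO photos (user_id) VALUES (1), (1), (3);
""")

# KPI 1: oldest accounts first (ORDER BY ... LIMIT 5).
oldest = conn.execute(
    "SELECT username FROM users ORDER BY created_at ASC LIMIT 5"
).fetchall()

# KPI 2: users who never posted (LEFT JOIN ... IS NULL).
never_posted = conn.execute(
    """
    SELECT u.username
    FROM users AS u
    LEFT JOIN photos AS p ON p.user_id = u.id
    WHERE p.id IS NULL
    """
).fetchall()

print(oldest)
print(never_posted)
```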

  7. Dataset for the Instagram and TikTok problematic use

    • data.niaid.nih.gov
    Updated Jul 19, 2023
    Cite
    Limniou, Maria (2023). Dataset for the Instagram and TikTok problematic use [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_8159159
    Dataset updated
    Jul 19, 2023
    Dataset provided by
    Hendrikse, Calanthe
    Limniou, Maria
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset supports research on how engagement with social media (Instagram and TikTok) was related to problematic social media use (PSMU) and mental well-being. There are three different files. The SPSS and Excel spreadsheet files include the same dataset but in a different format. The SPSS output presents the data analysis in regard to the difference between Instagram and TikTok users.

  8. Time and Dynamics of Instagram Users

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Jan 21, 2020
    Cite
    Amirhosein Bodaghi; Sama Goliaei (2020). Time and Dynamics of Instagram Users [Dataset]. http://doi.org/10.5281/zenodo.1439178
    Available download formats: bin
    Dataset updated
    Jan 21, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Amirhosein Bodaghi; Sama Goliaei
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These four datasets are gathered from Instagram users who were chosen randomly.

    The MainDataset encompasses data for 818 users. The TestDataset encompasses data for 78 users.

    Data gathered for each user includes :

    1- number of posts

    2- number of followers

    3- number of followings

    4- number of likes for the tenth previous post

    5- number of likes for the eleventh previous post

    6- number of likes for the twelfth previous post

    7- number of self-presenting posts from nine previous posts

    8- gender


    The MainDataset_after_150_days and TestDataset_after_150_days contain data for the same users as the MainDataset and the TestDataset, respectively, collected 150 days later. For example, User_1 has 486 posts in the MainDataset and 562 posts in the MainDataset_after_150_days, which means that the user published 76 posts over the course of 150 days.
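
    The User_1 example can be reproduced by subtracting the two snapshots. The sketch below assumes each dataset has been loaded into a mapping from user label to post count; that layout and the variable names are hypothetical, and only the User_1 figures come from the description.

```python
# Post counts per user in the two snapshots (User_1 values from the description).
main_dataset = {"User_1": 486}
after_150_days = {"User_1": 562}

# Posts published during the 150 days, for users present in both snapshots.
posts_published = {
    user: after_150_days[user] - main_dataset[user]
    for user in main_dataset
    if user in after_150_days
}

print(posts_published)
```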

  9. Instagram user worldwide 2024, by country

    • statista.com
    Updated Mar 10, 2025
    Cite
    Statista (2025). Instagram user worldwide 2024, by country [Dataset]. https://www.statista.com/forecasts/1174700/instagram-user-by-country
    Dataset updated
    Mar 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 1, 2024 - Dec 31, 2024
    Area covered
    Albania
    Description

    The ranking by number of Instagram users is led by India with 331.94 million users, followed by the United States with 156.08 million users. In contrast, Seychelles is at the bottom of the ranking with 0.02 million users, a difference of 331.92 million users compared to India. User figures, shown here for the platform Instagram, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).

  10. Instagram users in Indonesia 2019-2028

    • statista.com
    Updated Mar 27, 2025
    Cite
    Statista Research Department (2025). Instagram users in Indonesia 2019-2028 [Dataset]. https://www.statista.com/topics/8306/social-media-in-indonesia/
    Dataset updated
    Mar 27, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    Indonesia
    Description

    The number of Instagram users in Indonesia was forecast to increase continuously between 2024 and 2028 by a total of 5.3 million users (+4.25 percent). After the ninth consecutive year of increase, the Instagram user base is estimated to reach 129.83 million users and therefore a new peak in 2028. Notably, the number of Instagram users has been continuously increasing over the past years. User figures, shown here for the platform Instagram, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts by persons only once. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Instagram users in countries like Philippines and Thailand.

  11. Threads, an Instagram app Reviews

    • kaggle.com
    Updated Jul 26, 2023
    Cite
    Saloni Jhalani (2023). Threads, an Instagram app Reviews [Dataset]. https://www.kaggle.com/datasets/saloni1712/threads-an-instagram-app-reviews
    Explore at: Croissant
    Dataset updated
    Jul 26, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Saloni Jhalani
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    The Threads, an Instagram App Reviews dataset is a comprehensive collection of user reviews from the Threads mobile app on Google Play Store & App Store, capturing valuable insights and sentiments. The dataset enables the understanding of user satisfaction, evaluation of app performance, and identification of emerging patterns.

    The way data was collected

    Scraping Threads App reviews on Google Play Store & App Store

    Ideas for using this dataset
    • Sentiment analysis
    • What makes the application receive 1-star and 5-star reviews

    Note - It was last updated on July 26th 2023

  12. Instagram users in Pakistan

    • napoleoncat.com
    png
    Updated Apr 15, 2024
    + more versions
    Cite
    NapoleonCat (2024). Instagram users in Pakistan [Dataset]. https://napoleoncat.com/stats/instagram-users-in-pakistan/2024/04
    Available download formats: png
    Dataset updated
    Apr 15, 2024
    Dataset authored and provided by
    NapoleonCat
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 2024
    Area covered
    Pakistan
    Description

    There were 18 490 001 Instagram users in Pakistan in April 2024, which accounted for 7.9% of its entire population. The majority of them were men - 65.3%. People aged 18 to 24 were the largest user group (8 900 000). The highest difference between men and women occurs within people aged 18 to 24, where men lead by 5 700 000.

  13. Instagram: distribution of global audiences 2024, by age and gender

    • statista.com
    • ai-chatbox.pro
    Updated Jul 16, 2024
    + more versions
    Cite
    Stacy Jo Dixon (2024). Instagram: distribution of global audiences 2024, by age and gender [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    As of April 2024, around 16.5 percent of global active Instagram users were men between the ages of 18 and 24 years. More than half of the global Instagram population was aged 34 years or younger.

    Teens and social media

    As one of the biggest social networks worldwide, Instagram is especially popular with teenagers. As of fall 2020, the photo-sharing app ranked third in terms of preferred social network among teenagers in the United States, behind Snapchat and TikTok. Instagram was one of the most influential advertising channels among female Gen Z users when making purchasing decisions. Teens report feeling more confident, popular, and better about themselves when using social media, and less lonely, depressed and anxious. Social media can also have negative effects on teens, and these are much more pronounced among those with low emotional well-being. It was found that 35 percent of teenagers with low social-emotional well-being reported having experienced cyber bullying when using social media, while only five percent of teenagers with high social-emotional well-being stated the same. As such, social media can have a big impact on already fragile states of mind.

  14. im_instagram_70k

    • kaggle.com
    Updated Nov 24, 2020
    Cite
    Kristo Radion Purba (2020). im_instagram_70k [Dataset]. https://www.kaggle.com/krpurba/im-instagram-70k/metadata
    Explore at: Croissant
    Dataset updated
    Nov 24, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kristo Radion Purba
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is Instagram social network data for the Influence Maximization (IM) task. It was collected from Instagram from April to May 2020, from the followers of 24 private universities in Malaysia, using the Instagram API and various third-party Instagram websites. It consists mostly of Malaysian users, with 70,409 nodes/users and 1,007,107 edges/connections (followees and followers). Please kindly cite the following paper upon usage of this dataset:

    Authors: Kristo Radion Purba; David Asirvatham; Raja Kumar Murugesan
    Title: Influence Maximization Diffusion Models Based On Engagement and Activeness on Instagram
    Journal: Journal of King Saud University - Computer and Information Sciences
    Year: 2020
    Link: https://doi.org/10.1016/j.jksuci.2020.09.012
    
    Kindly address further questions to : kristoradion@live.com
    Data File: Description

    Instagram User Stats.csv
      pos = Number of posts
      flr = Number of followers
      flg = Number of following
      eg = Engagement grade, a scale of 1 to 12 that reflects the strength of engagement rate relative to the number of followers
      er = Engagement rate, i.e. (likes + comments) / followers
      fg = Followers growth % in a month
      op = Outsiders percentage %, i.e. non-followers that liked a post, divided by all unique likers

    Network for IC LT.txt
      source, target, weight (weight = 1/followees)

    Network for IC-u LT-u.txt
      source, target, weight (weight = 1/followees * source's EG * target's FG)
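
    The formulas in the file descriptions can be written out as small functions. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the listing does not say whose followee count enters the weight, and the second weight is read with standard operator precedence as (1/followees) multiplied by the source's EG and the target's FG.

```python
def engagement_ratio(likes: int, comments: int, followers: int) -> float:
    """er = (likes + comments) / followers."""
    return (likes + comments) / followers


def ic_lt_weight(followees: int) -> float:
    """Edge weight in 'Network for IC LT.txt': 1 / followees."""
    return 1 / followees


def ic_u_lt_u_weight(followees: int, source_eg: float, target_fg: float) -> float:
    """Edge weight in 'Network for IC-u LT-u.txt': (1 / followees) * EG * FG."""
    return (1 / followees) * source_eg * target_fg


# Hypothetical values, not taken from the dataset.
print(engagement_ratio(90, 10, 1000))
print(ic_lt_weight(4))
print(ic_u_lt_u_weight(4, source_eg=6, target_fg=2.0))
```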

  15. Dataset for Instagram influencers and females' consumer behaviour

    • zenodo.org
    • data.niaid.nih.gov
    Updated Jan 7, 2024
    Cite
    Maria Limniou; Ellen Lovatt; Harriet Graham (2024). Dataset for Instagram influencers and females' consumer behaviour [Dataset]. http://doi.org/10.5281/zenodo.10467062
    Dataset updated
    Jan 7, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Maria Limniou; Ellen Lovatt; Harriet Graham
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset supports research on how Instagram influencers affect female consumers' behaviour when purchasing products, and on the role of factors such as envy, scepticism towards advertising, satisfaction with life, social comparison and maternalism in consumer behaviour. There are two different files. The SPSS and CSV spreadsheet files include the same dataset but in a different format.

  16. Data from: Mpox Narrative on Instagram: A Labeled Multilingual Dataset of...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Sep 20, 2024
    Cite
    Nirmalya Thakur (2024). Mpox Narrative on Instagram: A Labeled Multilingual Dataset of Instagram Posts on Mpox for Sentiment, Hate Speech, and Anxiety Analysis [Dataset]. http://doi.org/10.5281/zenodo.13738598
    Available download formats: bin
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nirmalya Thakur
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Sep 9, 2024
    Description

    Please cite the following paper when using this dataset:

    N. Thakur, “Mpox narrative on Instagram: A labeled multilingual dataset of Instagram posts on mpox for sentiment, hate speech, and anxiety analysis,” arXiv [cs.LG], 2024, URL: https://arxiv.org/abs/2409.05292

    Abstract

    The world is currently experiencing an outbreak of mpox, which has been declared a Public Health Emergency of International Concern by WHO. During recent virus outbreaks, social media platforms have played a crucial role in keeping the global population informed and updated regarding various aspects of the outbreaks. As a result, in the last few years, researchers from different disciplines have focused on the development of social media datasets focusing on different virus outbreaks. No prior work in this field has focused on the development of a dataset of Instagram posts about the mpox outbreak. The work presented in this paper (stated above) aims to address this research gap. It presents this multilingual dataset of 60,127 Instagram posts about mpox, published between July 23, 2022, and September 5, 2024. This dataset contains Instagram posts about mpox in 52 languages. For each of these posts, the Post ID, Post Description, Date of publication, language, and translated version of the post (translation to English was performed using the Google Translate API) are presented as separate attributes in the dataset.

    After developing this dataset, sentiment analysis, hate speech detection, and anxiety or stress detection were also performed. This process included classifying each post into

    • one of the fine-grain sentiment classes, i.e., fear, surprise, joy, sadness, anger, disgust, or neutral,
    • hate or not hate
    • anxiety/stress detected or no anxiety/stress detected.

    These results are presented as separate attributes in the dataset for the training and testing of machine learning algorithms for sentiment, hate speech, and anxiety or stress detection, as well as for other applications.

    The 52 distinct languages in which Instagram posts are present in the dataset are English, Portuguese, Indonesian, Spanish, Korean, French, Hindi, Finnish, Turkish, Italian, German, Tamil, Urdu, Thai, Arabic, Persian, Tagalog, Dutch, Catalan, Bengali, Marathi, Malayalam, Swahili, Afrikaans, Panjabi, Gujarati, Somali, Lithuanian, Norwegian, Estonian, Swedish, Telugu, Russian, Danish, Slovak, Japanese, Kannada, Polish, Vietnamese, Hebrew, Romanian, Nepali, Czech, Modern Greek, Albanian, Croatian, Slovenian, Bulgarian, Ukrainian, Welsh, Hungarian, and Latvian.

    The following table represents the data description for this dataset:

    | Attribute Name | Attribute Description |
    | --- | --- |
    | Post ID | Unique ID of each Instagram post |
    | Post Description | Complete description of each post in the language in which it was originally published |
    | Date | Date of publication in MM/DD/YYYY format |
    | Language | Language of the post as detected using the Google Translate API |
    | Translated Post Description | Translated version of the post description. All posts which were not in English were translated into English using the Google Translate API. No language translation was performed for English posts. |
    | Sentiment | Results of sentiment analysis (using translated Post Description) where each post was classified into one of the sentiment classes: fear, surprise, joy, sadness, anger, disgust, and neutral |
    | Hate | Results of hate speech detection (using translated Post Description) where each post was classified as hate or not hate |
    | Anxiety or Stress | Results of anxiety or stress detection (using translated Post Description) where each post was classified as stress/anxiety detected or no stress/anxiety detected. |
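As a rough illustration of working with these attributes, the sketch below parses rows whose column names follow the attribute table and converts the Date field from MM/DD/YYYY. The CSV layout, the exact label strings, and the two sample rows are assumptions made for the example; they are invented placeholders, not records from the dataset.

```python
import csv
import io
from datetime import datetime

# Column names follow the attribute table; the two rows are invented
# placeholders, not records from the dataset.
SAMPLE_CSV = """Post ID,Post Description,Date,Language,Translated Post Description,Sentiment,Hate,Anxiety or Stress
id_001,Example text,07/23/2022,English,Example text,neutral,not hate,no stress/anxiety detected
id_002,Texto de ejemplo,09/05/2024,Spanish,Example text,fear,not hate,stress/anxiety detected
"""

# The seven sentiment classes listed in the description.
SENTIMENT_CLASSES = {"fear", "surprise", "joy", "sadness", "anger", "disgust", "neutral"}


def load_posts(handle):
    """Parse rows, converting Date from MM/DD/YYYY and checking the sentiment label."""
    posts = []
    for row in csv.DictReader(handle):
        row["Date"] = datetime.strptime(row["Date"], "%m/%d/%Y").date()
        if row["Sentiment"] not in SENTIMENT_CLASSES:
            raise ValueError(f"unexpected sentiment label: {row['Sentiment']!r}")
        posts.append(row)
    return posts


posts = load_posts(io.StringIO(SAMPLE_CSV))
# Posts that were translated, i.e. those not originally in English.
non_english = [p for p in posts if p["Language"] != "English"]
```

For the real file, the `io.StringIO` handle would be replaced by an opened file; the actual file name and delimiter should be checked against the published dataset.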

  17. Dataset of "A diary study investigating the differential impacts of...

    • ssh.datastations.nl
    • datacatalogue.cessda.eu
    tsv
    Updated Sep 27, 2023
    Cite
    DANS Data Station Social Sciences and Humanities (2023). Dataset of "A diary study investigating the differential impacts of Instagram content on youths' body image" [Dataset]. http://doi.org/10.17026/SS/7M90LJ
    Explore at:
    tsv(2535)Available download formats
    Dataset updated
    Sep 27, 2023
    Dataset provided by
    DANS Data Station Social Sciences and Humanities
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Through social media like Instagram, users are constantly exposed to “perfect” lives and bodies. Research in this field has predominantly focused on the mere time youth spend on Instagram and the effects on their body image, oftentimes uncovering negative effects. Little research has been done on the root of the influence: the consumed content itself. Hence, this study aims to qualitatively uncover the types of content that trigger youths’ body image. Using a diary study, 28 youth (Mage = 21.86; 79% female) reported 140 influential body image Instagram posts over five days, uncovering trigger points and providing their motivations, emotions, and impacts on body image. Based on these posts, four content categories were distinguished: Thin Ideal, Body Positivity, Fitness, and Lifestyle. These different content types triggered different emotions regarding body image, and clear gender distinctions in content could be noticed. The study increased youths’ awareness of Instagram's influence on their mood and body perception. The findings imply that the discussion about the effects of social media on body image should be nuanced, taking into account different types of content and users. Using this information, future interventions could focus on conscious use of social media rather than merely limiting its use.

  18. SPSS Dataset

    • figshare.com
    bin
    Updated Nov 1, 2023
    Cite
    bankole temitope (2023). SPSS Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.23118368.v1
    Explore at:
    binAvailable download formats
    Dataset updated
    Nov 1, 2023
    Dataset provided by
    figshare
    Authors
    bankole temitope
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Quantitative analysis of adolescent exposure to fast food marketing on Instagram. Descriptive statistics were calculated and the total frequency of each marketing strategy was obtained. For the continuous variables mean and standard deviation values were obtained. Mann-Whitney U tests were conducted to examine the association between the marketing strategies and user engagement, while the Kruskal-Wallis H test was completed to test for associations between brand name and engagement.
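To make the Mann-Whitney U test mentioned above more concrete, here is a minimal pure-Python sketch of the U statistic computed from rank sums, with tied values sharing their average rank. It does not compute a p-value, it is not the authors' analysis, and the engagement numbers are invented for illustration; in practice a library routine such as `scipy.stats.mannwhitneyu` would normally be used.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) computed from the rank sum of sample_a."""
    ranks = average_ranks(list(sample_a) + list(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


# Invented engagement counts, for illustration only.
with_strategy = [120, 340, 560]
without_strategy = [90, 110, 130, 200]
u_with, u_without = mann_whitney_u(with_strategy, without_strategy)  # (10.0, 2.0)
```

Here `u_with` counts the pairs in which a post using the strategy has higher engagement than a post without it, with ties counted as one half.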

  19. Fake/Authentic User Instagram

    • kaggle.com
    zip
    Updated Feb 11, 2021
    Cite
    Kristo Radion Purba (2021). Fake/Authentic User Instagram [Dataset]. https://www.kaggle.com/krpurba/fakeauthentic-user-instagram
    Explore at:
    zip(3451107 bytes)Available download formats
    Dataset updated
    Feb 11, 2021
    Authors
    Kristo Radion Purba
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Kindly refer to my paper for more information. Please cite my work if you use my dataset in any work : K. R. Purba, D. Asirvatham and R. K. Murugesan, "Classification of instagram fake users using supervised machine learning algorithms," International Journal of Electrical and Computer Engineering (IJECE), vol. 10, no. 3, pp. 2763-2772, 2020.

    The dataset was collected by web scraping third-party Instagram websites to capture each user's metadata and up to 12 of their latest media posts. The collection process ran from September 1st, 2019, until September 20th, 2019. The dataset contains authentic users and fake users, which were filtered using human annotators. The authentic users were taken from followers of 24 private university pages (8 Indonesian, 8 Malaysian, 8 Australian) on Instagram. To reduce the number of users, they were picked using proportional random sampling based on their source university. All private users were removed, a total of 31,335 out of 63,795 users (49.11%). The final number of public users used in this research was 32,460.

    Var name | Feature name | Description
    pos | Num posts | Number of total posts that the user has ever posted
    flg | Num following | Number of accounts the user is following
    flr | Num followers | Number of followers
    bl | Biography length | Length (number of characters) of the user's biography
    pic | Picture availability | Value 0 if the user has no profile picture, or 1 if the user has one
    lin | Link availability | Value 0 if the user has no external URL, or 1 if the user has one
    cl | Average caption length | The average number of characters in media captions
    cz | Caption zero | Percentage (0.0 to 1.0) of captions that have almost zero (<=3) length
    ni | Non image percentage | Percentage (0.0 to 1.0) of non-image media. There are three types of media on an Instagram post, i.e. image, video, carousel
    erl | Engagement rate (Like) | Engagement rate (ER) is commonly defined as (num likes) divided by (num media) divided by (num followers)
    erc | Engagement rate (Comm.) | Similar to ER like, but for comments
    lt | Location tag percentage | Percentage (0.0 to 1.0) of posts tagged with a location
    hc | Average hashtag count | Average number of hashtags used in a post
    pr | Promotional keywords | Average use of promotional keywords in hashtags, i.e. {regrann, contest, repost, giveaway, mention, share, give away, quiz}
    fo | Followers keywords | Average use of follower-hunting keywords in hashtags, i.e. {follow, like, folback, follback, f4f}
    cs | Cosine similarity | Average cosine similarity between all pairs of posts a user has
    pi | Post interval | Average interval between posts (in hours)
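A few of these features are simple enough to sketch directly. The functions below follow the descriptions of erl/erc, cz, and hc; the function names, the handling of empty inputs, and the whitespace-based hashtag tokenization are assumptions for illustration and are not taken from the paper.

```python
def engagement_rate(total_interactions, num_media, num_followers):
    """(interactions / media) / followers, following the erl/erc description."""
    if num_media == 0 or num_followers == 0:
        return 0.0  # assumption: an undefined rate is reported as 0
    return total_interactions / num_media / num_followers


def caption_zero(captions):
    """Fraction of captions with at most 3 characters (the cz feature)."""
    if not captions:
        return 0.0
    return sum(len(c) <= 3 for c in captions) / len(captions)


def average_hashtag_count(captions):
    """Mean number of '#'-prefixed tokens per caption (the hc feature)."""
    if not captions:
        return 0.0
    total = sum(sum(tok.startswith("#") for tok in c.split()) for c in captions)
    return total / len(captions)


# Invented example account: 12 media, 600 likes in total, 1000 followers.
captions = ["", "ok", "New post #travel #food", "hello world"]
erl = engagement_rate(600, 12, 1000)
cz = caption_zero(captions)
hc = average_hashtag_count(captions)
```

With these invented numbers, each post averages 50 likes, so the like engagement rate is 50 / 1000 = 0.05; two of the four captions are at most 3 characters long, and the four captions contain two hashtags in total.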

    Output:

    • 2-class user classes: r (real/authentic user), f (fake user / bought followers)
    • 4-class user classes: r (authentic/real user), a (active fake user), i (inactive fake user), s (spammer fake user)

    Note that the 3 fake user classes (a, i, s) were judged by human annotators.
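The two labeling schemes can be related by collapsing the three fake subtypes into one class. The mapping below is a small sketch under the assumption that the 2-class label f corresponds to the union of a, i, and s; the dataset description does not state this explicitly, and `to_two_class` is a hypothetical helper name.

```python
# Class codes as listed in the description: r = authentic, a/i/s = fake subtypes.
FOUR_TO_TWO = {"r": "r", "a": "f", "i": "f", "s": "f"}


def to_two_class(label):
    """Map a 4-class label (r, a, i, s) to the 2-class scheme (r, f)."""
    try:
        return FOUR_TO_TWO[label]
    except KeyError:
        raise ValueError(f"unknown 4-class label: {label!r}") from None


two_class = [to_two_class(label) for label in ["r", "a", "i", "s"]]
```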

  20. Myket Android Application Install Dataset

    • paperswithcode.com
    Updated Aug 12, 2023
    Cite
    Erfan Loghmani; Mohammadamin Fazli (2023). Myket Android Application Install Dataset [Dataset]. https://paperswithcode.com/dataset/myket-android-application-install
    Explore at:
    Dataset updated
    Aug 12, 2023
    Authors
    Erfan Loghmani; Mohammadamin Fazli
    Description

    This dataset contains information on application install interactions of users in the Myket android application market. The dataset was created for the purpose of evaluating interaction prediction models, requiring user and item identifiers along with timestamps of the interactions. Hence, the dataset can be used for interaction prediction and building a recommendation system. Furthermore, the data forms a dynamic network of interactions, and we can also perform network representation learning on the nodes in the network, which are users and applications.

    Data Creation

    The dataset was initially generated by the Myket data team, and later cleaned and subsampled by Erfan Loghmani, a master's student at Sharif University of Technology at the time. The data team focused on a two-week period and randomly sampled 1/3 of the users with interactions during that period. They then selected install and update interactions for three months before and after the two-week period, resulting in interactions spanning about 6 months and two weeks.

    We further subsampled and cleaned the data to focus on application download interactions. We identified the top 8000 most installed applications and selected interactions related to them. We retained users with more than 32 interactions, resulting in 280,391 users. From this group, we randomly selected 10,000 users, and the data was filtered to include only interactions for these users. The detailed procedure can be found in here.
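The subsampling steps described above can be sketched roughly as follows. This is not the authors' code: the function name, the tuple layout, the fixed seed, and the choice to count a user's interactions after restricting to the top applications are all assumptions made for illustration.

```python
import random
from collections import Counter


def subsample(interactions, top_apps=8000, min_interactions=32, n_users=10000, seed=0):
    """Filter a list of (user_id, app_name, timestamp) tuples in three steps."""
    # 1. Keep interactions with the most installed applications.
    app_counts = Counter(app for _, app, _ in interactions)
    kept_apps = {app for app, _ in app_counts.most_common(top_apps)}
    filtered = [row for row in interactions if row[1] in kept_apps]

    # 2. Retain users with more than `min_interactions` remaining interactions.
    user_counts = Counter(user for user, _, _ in filtered)
    eligible = sorted(u for u, c in user_counts.items() if c > min_interactions)

    # 3. Randomly select users and keep only their interactions.
    rng = random.Random(seed)
    chosen = set(rng.sample(eligible, min(n_users, len(eligible))))
    return [row for row in filtered if row[0] in chosen]


# Tiny invented example with scaled-down thresholds.
toy = [
    ("u1", "a", 1), ("u1", "a", 2), ("u1", "b", 3),
    ("u2", "a", 4), ("u2", "c", 5),
    ("u3", "a", 6), ("u3", "b", 7), ("u3", "b", 8),
]
sample = subsample(toy, top_apps=2, min_interactions=2, n_users=1)
```

In the toy run, app c falls outside the top two applications, user u2 is left with a single interaction and is dropped, and one of the two remaining users is selected at random.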

    Data Structure

    The dataset has two main files.

    • myket.csv: This file contains the interaction information and follows the same format as the datasets used in the "JODIE: Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks" (ACM SIGKDD 2019) project. However, this data does not contain state labels and interaction features, resulting in associated columns being all zero.
    • app_info_sample.csv: This file comprises features associated with applications present in the sample. For each individual application, information such as the approximate number of installs, average rating, count of ratings, and category are included. These features provide insights into the applications present in the dataset.

    Dataset Details

    • Total Instances: 694,121 install interaction instances
    • Instances Format: Triplets of user_id, app_name, timestamp
    • 10,000 users and 7,988 android applications
    • Item features for 7,606 applications

    For a detailed summary of the data's statistics, including information on users, applications, and interactions, please refer to the Python notebook available at summary-stats.ipynb. The notebook provides an overview of the dataset's characteristics and can be helpful for understanding the data's structure before using it for research or analysis.

    Top 20 Most Installed Applications

    | Package Name | Count of Interactions |
    | ---------------------------------- | --------------------- |
    | com.instagram.android | 15292 |
    | ir.resaneh1.iptv | 12143 |
    | com.tencent.ig | 7919 |
    | com.ForgeGames.SpecialForcesGroup2 | 7797 |
    | ir.nomogame.ClutchGame | 6193 |
    | com.dts.freefireth | 6041 |
    | com.whatsapp | 5876 |
    | com.supercell.clashofclans | 5817 |
    | com.mojang.minecraftpe | 5649 |
    | com.lenovo.anyshare.gps | 5076 |
    | ir.medu.shad | 4673 |
    | com.firsttouchgames.dls3 | 4641 |
    | com.activision.callofduty.shooter | 4357 |
    | com.tencent.iglite | 4126 |
    | com.aparat | 3598 |
    | com.kiloo.subwaysurf | 3135 |
    | com.supercell.clashroyale | 2793 |
    | co.palang.QuizOfKings | 2589 |
    | com.nazdika.app | 2436 |
    | com.digikala | 2413 |

    Comparison with SNAP Datasets

    The Myket dataset introduced in this repository exhibits distinct characteristics compared to the real-world datasets used by the JODIE project. The table below provides a comparative overview of the key dataset characteristics:

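    | Dataset | #Users | #Items | #Interactions | Average Interactions per User | Average Unique Items per User |
    | --- | --- | --- | --- | --- | --- |
    | Myket | 10,000 | 7,988 | 694,121 | 69.4 | 54.6 |
    | LastFM | 980 | 1,000 | 1,293,103 | 1,319.5 | 158.2 |
    | Reddit | 10,000 | 984 | 672,447 | 67.2 | 7.9 |
    | Wikipedia | 8,227 | 1,000 | 157,474 | 19.1 | 2.2 |
    | MOOC | 7,047 | 97 | 411,749 | 58.4 | 25.3 |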

    The Myket dataset stands out by having an ample number of both users and items, highlighting its relevance for real-world, large-scale applications. Unlike LastFM, Reddit, and Wikipedia datasets, where users exhibit repetitive item interactions, the Myket dataset contains a comparatively lower amount of repetitive interactions. This unique characteristic reflects the diverse nature of user behaviors in the Android application market environment.
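The two per-user averages in the comparison can be computed from the (user_id, app_name, timestamp) triplets as in the sketch below. The function name and the toy rows are invented for illustration, and the real myket.csv column layout should be checked before adapting this.

```python
from collections import defaultdict


def per_user_averages(interactions):
    """Return (average interactions per user, average unique items per user)."""
    counts = defaultdict(int)
    items = defaultdict(set)
    for user_id, app_name, _timestamp in interactions:
        counts[user_id] += 1
        items[user_id].add(app_name)
    n_users = len(counts)
    avg_interactions = sum(counts.values()) / n_users
    avg_unique = sum(len(apps) for apps in items.values()) / n_users
    return avg_interactions, avg_unique


# Invented rows: u1 installs app "a" twice and "b" once, u2 installs "a" once.
rows = [("u1", "a", 1), ("u1", "a", 2), ("u1", "b", 3), ("u2", "a", 4)]
avg_interactions, avg_unique = per_user_averages(rows)  # (2.0, 1.5)
```

The gap between the two averages reflects repeated interactions with the same item. For Myket, the reported 694,121 interactions over 10,000 users give about 69.4 interactions per user against 54.6 unique items, a much smaller gap than for Reddit (67.2 against 7.9).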

    Citation

    If you use this dataset in your research, please cite the following preprint:

    @misc{loghmani2023effect,
      title={Effect of Choosing Loss Function when Using T-batching for Representation Learning on Dynamic Networks},
      author={Erfan Loghmani and MohammadAmin Fazli},
      year={2023},
      eprint={2308.06862},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
    }

