54 datasets found
  1. Instagram Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Apr 26, 2022
    Cite
    Bright Data (2022). Instagram Dataset [Dataset]. https://brightdata.com/products/datasets/instagram
    Available download formats
    .json, .csv, .xlsx
    Dataset updated
    Apr 26, 2022
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Access detailed insights with our Instagram datasets, featuring follower counts, verified status, account types, and engagement scores. Explore post information including URLs, descriptions, hashtags, comments, likes, media, posting dates, locations, and reel URLs. Perfect for understanding user engagement and content trends to drive informed decisions and optimize your social media strategies.

    • Over 750M records available
    • Price starts at $250/100K records
    • Data formats are available in JSON, NDJSON, CSV, XLSX and Parquet
    • 100% ethical and compliant data collection

    Included datapoints:

    Account Fbid Id Followers Posts Count Is Business Account Is Professional Account Is Verified Avg Engagement External Url Biography Business Category Name Category Name Post Hashtags Following Posts Profile Image Link Profile URL Profile Name Highlights Count Highlights Full Name Is Private Bio Hashtags URL Is Joined Recently And much more

  2. Instagram user worldwide 2024, by country

    • statista.com
    Updated Jul 11, 2025
    Cite
    Statista (2025). Instagram user worldwide 2024, by country [Dataset]. https://www.statista.com/forecasts/1174700/instagram-user-by-country
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 1, 2024 - Dec 31, 2024
    Area covered
    Albania
    Description

    India leads the ranking of Instagram users with ****** million users, followed by the United States with ****** million users. In contrast, Seychelles is at the bottom of the ranking with **** million users, a difference of ****** million users to India. User figures, shown here for the platform Instagram, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).

  3. Top 1000 instagrammers - world (cleaned)

    • kaggle.com
    Updated Jul 12, 2022
    Cite
    Syed Jafer (2022). Top 1000 instagrammers - world (cleaned) [Dataset]. https://www.kaggle.com/datasets/syedjaferk/top-1000-instagrammers-world-cleaned
    Dataset updated
    Jul 12, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Syed Jafer
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    World
    Description

    Instagram is a photo and video sharing social networking service founded in 2010 by Kevin Systrom and Mike Krieger, and later acquired by American company Facebook Inc. The app allows users to upload media that can be edited with filters and organized by hashtags and geographical tagging. Posts can be shared publicly or with preapproved followers. Users can browse other users' content by tag and location, view trending content, like photos, and follow other users to add their content to a personal feed.

    Instagram was originally distinguished by allowing content to be framed only in a square (1:1) aspect ratio of 640 pixels to match the display width of the iPhone at the time. In 2015, this restriction was eased with an increase to 1080 pixels. It also added messaging features, the ability to include multiple images or videos in a single post, and a Stories feature, similar to its main competitor Snapchat, which allowed users to post their content to a sequential feed, with each post accessible to others for 24 hours. As of January 2019, Stories is used by 500 million people daily.

    This dataset comprises the details of the top 1000 influencers on Instagram.

  4. Instagram: countries with the highest audience reach 2024

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Instagram: countries with the highest audience reach 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    As of April 2024, Bahrain was the country with the highest Instagram audience reach with 95.6 percent. Kazakhstan also had a high Instagram audience penetration rate, with 90.8 percent of the population using the social network. In the United Arab Emirates, Turkey, and Brunei, the photo-sharing platform was used by more than 85 percent of each country's population.

  5. Countries with the most Facebook users 2024

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Countries with the most Facebook users 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Which country has the most Facebook users?

    There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country, it would rank third worldwide by population. Apart from India, several other markets have more than 100 million Facebook users each: the United States, Indonesia, and Brazil, with 193.8 million, 119.05 million, and 112.55 million Facebook users respectively.

    Facebook – the most used social media

    Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide: WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3.5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media platform worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.

    Facebook usage by device

    As of July 2021, 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, almost 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is available not only through mobile browsers; the company has also published several mobile apps for users to access its products and services. As of the third quarter of 2021, the four core Meta products led the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.
    
  6. Pakistan's Top 25 Instagram Accounts

    • kaggle.com
    Updated May 12, 2021
    Cite
    Zeeshan-ul-hassan Usmani (2021). Pakistan's Top 25 Instagram Accounts [Dataset]. http://doi.org/10.34740/kaggle/dsv/2223368
    Dataset updated
    May 12, 2021
    Dataset provided by
    Kaggle
    Authors
    Zeeshan-ul-hassan Usmani
    Area covered
    Pakistan
    Description

    Context

    Who is leading Pakistan on Instagram?

    Content

    The dataset contains the top 25 Instagram accounts from Pakistan (plus 2 additional accounts because of a tie), with category and follower count. All accounts have more than 2 million followers.

    Inspiration

    Can you find out what kind of content Pakistanis are interested in on Instagram?

  7. Instagram accounts with the most followers worldwide 2024

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Instagram accounts with the most followers worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Cristiano Ronaldo has one of the most popular Instagram accounts as of April 2024.

    The Portuguese footballer is the most-followed person on the photo-sharing platform with 628 million followers. Instagram's own account was ranked first with roughly 672 million followers.

    How popular is Instagram?

    Instagram is a photo-sharing social networking service that enables users to take pictures and edit them with filters. The platform allows users to post and share their images online and directly with their friends and followers on the social network. The cross-platform app reached one billion monthly active users in mid-2018. In 2020, there were over 114 million Instagram users in the United States, and experts projected this figure to surpass 127 million users in 2023.

    Who uses Instagram?

    Instagram audiences are predominantly young: recent data states that almost 60 percent of U.S. Instagram users are aged 34 years or younger. Fall 2020 data reveals that Instagram is also one of the most popular social media platforms for teens and one of the social networks with the biggest reach among teens in the United States.

    Celebrity influencers on Instagram

    Many celebrities and athletes are brand spokespeople and generate additional income with social media advertising and sponsored content. Unsurprisingly, Ronaldo ranked first again, as the average media value of one of his Instagram posts was 985,441 U.S. dollars.
    
  8. Social Media Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Sep 7, 2022
    Cite
    Bright Data (2022). Social Media Datasets [Dataset]. https://brightdata.com/products/datasets/social-media
    Available download formats
    .json, .csv, .xlsx
    Dataset updated
    Sep 7, 2022
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Gain valuable insights with our comprehensive Social Media Dataset, designed to help businesses, marketers, and analysts track trends, monitor engagement, and optimize strategies. This dataset provides structured and reliable social media data from multiple platforms.

    Dataset Features

    • User Profiles: Access public social media profiles, including usernames, bios, follower counts, engagement metrics, and more. Ideal for audience analysis, influencer marketing, and competitive research.
    • Posts & Content: Extract posts, captions, hashtags, media (images/videos), timestamps, and engagement metrics such as likes, shares, and comments. Useful for trend analysis, sentiment tracking, and content strategy optimization.
    • Comments & Interactions: Analyze user interactions, including replies, mentions, and discussions. This data helps brands understand audience sentiment and engagement patterns.
    • Hashtag & Trend Tracking: Monitor trending hashtags, topics, and viral content across platforms to stay ahead of industry trends and consumer interests.

    Customizable Subsets for Specific Needs

    Our Social Media Dataset is fully customizable, allowing you to filter data based on platform, region, keywords, engagement levels, or specific user profiles. Whether you need a broad dataset for market research or a focused subset for brand monitoring, we tailor the dataset to your needs.

    Popular Use Cases

    • Brand Monitoring & Reputation Management: Track brand mentions, customer feedback, and sentiment analysis to manage online reputation effectively.
    • Influencer Marketing & Audience Analysis: Identify key influencers, analyze engagement metrics, and optimize influencer partnerships.
    • Competitive Intelligence: Monitor competitor activity, content performance, and audience engagement to refine marketing strategies.
    • Market Research & Consumer Insights: Analyze social media trends, customer preferences, and emerging topics to inform business decisions.
    • AI & Predictive Analytics: Leverage structured social media data for AI-driven trend forecasting, sentiment analysis, and automated content recommendations.

    Whether you're tracking brand sentiment, analyzing audience engagement, or monitoring industry trends, our Social Media Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

  9. Social Media Usage Dataset(Applications)

    • kaggle.com
    Updated Oct 23, 2024
    Cite
    Bhadra Mohit (2024). Social Media Usage Dataset(Applications) [Dataset]. https://www.kaggle.com/datasets/bhadramohit/social-media-usage-datasetapplications/suggestions?status=pending&yourSuggestions=true
    Dataset updated
    Oct 23, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Bhadra Mohit
    License

    https://cdla.io/sharing-1-0/

    Description

    Context: This dataset offers insights into the usage patterns of social media apps for 1,000 users across seven popular platforms: Facebook, Instagram, Twitter, Snapchat, TikTok, LinkedIn, and Pinterest. It tracks various metrics such as daily time spent on the app, number of posts made, likes received, and new followers gained.

    Dataset Features:

    • User_ID: Unique identifier for each user.
    • App: The social media platform being used.
    • Daily_Minutes_Spent: Total time a user spends on the app each day, ranging from 5 to 500 minutes.
    • Posts_Per_Day: Number of posts a user creates per day, ranging from 0 to 20.
    • Likes_Per_Day: Total number of likes a user receives on their posts each day, ranging from 0 to 200.
    • Follows_Per_Day: The number of new followers a user gains daily, ranging from 0 to 50.

    Context & Use Cases: This dataset could be particularly useful for social media analysts, digital marketers, or researchers interested in understanding user engagement trends across different platforms. It provides insights into how much time users spend, how actively they post, and the level of engagement they receive (in terms of likes and followers).

    Conclusion & Outcome: Analyzing this dataset could yield several outcomes:

    • Engagement Patterns: Identifying which platforms have higher engagement in terms of time spent or likes received.
    • Active Users: Determining which users are the most active across various platforms based on the number of posts and followers gained.
    • User Retention: Studying the correlation between time spent and follower growth, providing insight into user retention strategies for different platforms.

    Overall, the dataset allows for exploration of social media usage trends and helps drive decision-making for marketing strategies, content creation, and platform engagement.
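
    The "Engagement Patterns" outcome above can be sketched as a per-platform average. This is a minimal illustration using only the Python standard library; the column names come from the feature list, while the sample rows, their values, and the helper name mean_by_app are made up for the example and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows; only the column names come from the dataset description.
SAMPLE_CSV = """User_ID,App,Daily_Minutes_Spent,Posts_Per_Day,Likes_Per_Day,Follows_Per_Day
U_1,Instagram,120,3,80,10
U_2,Instagram,60,1,40,4
U_3,TikTok,200,5,150,20
"""


def mean_by_app(csv_text, column):
    """Return {app: mean of `column`} over all rows of the CSV text."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["App"]] += float(row[column])
        counts[row["App"]] += 1
    return {app: totals[app] / counts[app] for app in totals}


# Average daily minutes per platform for the sample rows.
minutes = mean_by_app(SAMPLE_CSV, "Daily_Minutes_Spent")
print(minutes)
```

    To run this on the real file, the text of the downloaded CSV would be passed in place of SAMPLE_CSV; the file name is not stated in the listing.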

  10. Social Media vs Productivity

    • kaggle.com
    Updated May 15, 2025
    Cite
    Mahdi Mashayekhi (2025). Social Media vs Productivity [Dataset]. https://www.kaggle.com/datasets/mahdimashayekhi/social-media-vs-productivity/code
    Dataset updated
    May 15, 2025
    Dataset provided by
    Kaggle
    Authors
    Mahdi Mashayekhi
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    📊 Social Media vs Productivity — Realistic Behavioral Dataset (30,000 Users)

    This dataset explores how daily digital habits — including social media usage, screen time, and notification exposure — relate to individual productivity, stress, and well-being.

    🔍 What’s Inside?

    The dataset contains 30,000 real-world-style records simulating behavioral patterns of people with various jobs, social habits, and lifestyle choices. The goal is to understand how different digital behaviors correlate with perceived and actual productivity.

    🧠 Why This Dataset is Valuable

    • Designed for real-world ML workflows
      Includes missing values, noise, and outliers — ideal for practicing data cleaning and preprocessing.

    • 🔗 High correlation between target features
      The perceived_productivity_score and actual_productivity_score are strongly correlated, making this dataset suitable for experiments in feature selection and multicollinearity.

    • 🛠️ Feature Engineering playground
      Use this dataset to practice feature scaling, encoding, binning, interaction terms, and more.

    • 🧪 Perfect for EDA, regression & classification
      You can model productivity, stress, or satisfaction based on behavior patterns and digital exposure.

    🧾 Columns & Feature Info

    Column Name                     Description
    age                             Age of the individual (18–65 years)
    gender                          Gender identity: Male, Female, or Other
    job_type                        Employment sector or status (IT, Education, Student, etc.)
    daily_social_media_time         Average daily time spent on social media (hours)
    social_platform_preference      Most-used social platform (Instagram, TikTok, Telegram, etc.)
    number_of_notifications         Number of mobile/social notifications per day
    work_hours_per_day              Average hours worked each day
    perceived_productivity_score    Self-rated productivity score (scale: 0–10)
    actual_productivity_score       Simulated ground-truth productivity score (scale: 0–10)
    stress_level                    Current stress level (scale: 1–10)
    sleep_hours                     Average hours of sleep per night
    screen_time_before_sleep        Time spent on screens before sleeping (hours)
    breaks_during_work              Number of breaks taken during work hours
    uses_focus_apps                 Whether the user uses digital focus apps (True/False)
    has_digital_wellbeing_enabled   Whether Digital Wellbeing is activated (True/False)
    coffee_consumption_per_day      Number of coffee cups consumed per day
    days_feeling_burnout_per_month  Number of burnout days reported per month
    weekly_offline_hours            Total hours spent offline each week (excluding sleep)
    job_satisfaction_score          Satisfaction with job/life responsibilities (scale: 0–10)

    📌 Notes

    • Contains NaN values in critical columns (productivity, sleep, stress) for data imputation tasks
    • Includes outliers in media usage, coffee intake, and notification count
    • Target columns are strongly correlated for multicollinearity testing
    • Multi-purpose: regression, classification, clustering, visualization
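
    The notes above mention NaN values and strongly correlated target columns. A minimal sketch of how that correlation could be checked while skipping incomplete rows is given below. It uses only the standard library; the score values are invented for illustration and the helper name pearson is an assumption, not something shipped with the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation over the pairs where neither value is NaN."""
    pairs = [(x, y) for x, y in zip(xs, ys)
             if not math.isnan(x) and not math.isnan(y)]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Invented values standing in for perceived_productivity_score and
# actual_productivity_score; NaN marks a missing entry.
perceived = [3.0, 5.0, float("nan"), 8.0, 6.0]
actual = [2.5, 5.5, 4.0, 7.5, float("nan")]

r = pearson(perceived, actual)
print(round(r, 3))
```

    Dropping incomplete pairs is only one option; imputing the missing scores first is another, and the two can give different correlation estimates.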

    💡 Use Cases

    • Exploratory Data Analysis (EDA)
    • Feature engineering pipelines
    • Machine learning model benchmarking
    • Statistical hypothesis testing
    • Burnout and mental health prediction projects

    📥 Bonus

    👉 Sample notebook coming soon with data cleaning, visualization, and productivity prediction!

  11. Social Media Usage Dataset(Applications)

    • cubig.ai
    Updated May 28, 2025
    Cite
    CUBIG (2025). Social Media Usage Dataset(Applications) [Dataset]. https://cubig.ai/store/products/321/social-media-usage-datasetapplications
    Dataset updated
    May 28, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Social Media Usage Dataset(Applications) captures usage patterns and activity indicators for 1,000 users across seven major social media platforms, including Facebook, Instagram, and Twitter.

    2) Data Utilization
    (1) Characteristics of the Social Media Usage Dataset(Applications):
    • The dataset provides per-user social media activity data, including daily usage time, number of posts, number of likes received, and number of new followers.
    (2) The Social Media Usage Dataset(Applications) can be used for:
    • Analysis of user participation by platform: Compare usage time and activity across platforms to analyze participation and popular trends.
    • Marketing strategy: Use the user activity data for targeted marketing, content production, and user retention strategies.

  12. Instagram: most popular posts as of 2024

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Instagram: most popular posts as of 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Instagram’s most popular post

    As of April 2024, the most popular post on Instagram was Lionel Messi and his teammates after winning the 2022 FIFA World Cup with Argentina, posted by the account @leomessi. Messi's post, which racked up over 61 million likes within a day, knocked off the reigning post, which was 'Photo of an Egg'. Originally posted in January 2019, 'Photo of an Egg' surpassed the world’s most popular Instagram post at that time, which was a photo of Kylie Jenner’s daughter totaling 18 million likes. After several cryptic posts published by the account, World Record Egg revealed itself to be a part of a mental health campaign aimed at the pressures of social media use.

    Instagram’s most popular accounts

    As of April 2024, the official Instagram account @instagram had the most followers of any account on the platform, with 672 million followers. Portuguese footballer Cristiano Ronaldo (@cristiano) was the most followed individual with 628 million followers, while Selena Gomez (@selenagomez) was the most followed woman on the platform with 429 million. Additionally, Inter Miami CF striker Lionel Messi (@leomessi) had a total of 502 million. Celebrities such as The Rock, Kylie Jenner, and Ariana Grande all had over 380 million followers each.

    Instagram influencers

    In the United States, the leading content category of Instagram influencers was lifestyle, with 15.25 percent of influencers creating lifestyle content in 2021. Music ranked in second place with 10.96 percent, followed by family with 8.24 percent. Having a large audience can be very lucrative: Instagram influencers in the United States, Canada and the United Kingdom with over 90,000 followers made around 1,221 US dollars per post.

    Instagram around the globe

    Instagram’s worldwide popularity continues to grow, and India is the leading country in terms of number of users, with over 362.9 million users as of January 2024. The United States had 169.65 million Instagram users and Brazil had 134.6 million users. The social media platform was also very popular in Indonesia and Turkey, with 100.9 million and 57.1 million users, respectively. As of January 2024, Instagram was the fourth most popular social network in the world, behind Facebook, YouTube and WhatsApp.
    
  13. U.S. Army vs. British Army Instagram Engagement Metrics Dataset

    • figshare.com
    xlsx
    Updated Jun 18, 2024
    Cite
    Abby Stover (2024). U.S. Army vs. British Army Instagram Engagement Metrics Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.26060866.v1
    Available download formats
    xlsx
    Dataset updated
    Jun 18, 2024
    Dataset provided by
    figshare
    Authors
    Abby Stover
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United Kingdom, United States
    Description

    This dataset investigates the Instagram engagement metrics (likes and comments) of the U.S. and British Armies to understand the strengths and weaknesses of their marketing. For the quantitative data collection, a random number generator was used to compile a 20% data sample (73 posts) from a total of 365 posts from each account. For instance, a number 1 in the random generator corresponded to the most recent post from the start date of data collection (May 23rd, 2024). By picking from 365 posts, the data collection was meant to represent roughly a year of Instagram content, assuming the accounts posted every day. This method ensured an unbiased representation of which content was included in the 20% data sample.

    However, the U.S. Army posted almost once a day while the British Army posted only a few days a week. In the end, data was collected across 365 U.S. Army posts from May 23rd, 2024, back to October 28th, 2023. For the British Army’s Instagram, the data collection spanned from May 23rd, 2024, back to November 25th, 2021. By engaging with recent posts, the purpose was to understand how effectively these Armies responded to their recruitment crisis (which started in 2022).

    For the data collection, variables for each post included the following:

    • Date of post
    • Number of likes
    • Percentage of likes by follower population
    • Number of comments
    • Percentage of comments by follower population

    To understand which Instagram posts were successful, the content with the highest number of likes and comments was defined as the most engaged. But, to accurately compare the British Army’s Instagram engagement to the U.S., the number of likes/comments was divided by the number of their followers. As of May 23, 2024, the U.S. Army had 2.9 million followers on Instagram whereas the British Army had 594,000 followers. While social media users outside of the Armies’ followers engaged with the posts, these ratios provided a basis to fairly compare their engagement metrics.
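
    The sampling and normalization described above can be sketched as follows. The post counts, sample size, and follower figures are taken from the description; the function names, the fixed seed, and the like counts in the usage lines are hypothetical, and the author's actual random number generator is not specified.

```python
import random

TOTAL_POSTS = 365         # posts considered per account
SAMPLE_SIZE = 73          # 20% of 365
US_FOLLOWERS = 2_900_000  # U.S. Army followers as of May 23, 2024
UK_FOLLOWERS = 594_000    # British Army followers as of May 23, 2024


def draw_sample(seed=None):
    """Pick 73 distinct post positions, where 1 is the most recent post."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, TOTAL_POSTS + 1), SAMPLE_SIZE))


def engagement_pct(count, followers):
    """Likes or comments expressed as a percentage of the follower count."""
    return 100.0 * count / followers


sample = draw_sample(seed=0)  # seed chosen only to make the sketch repeatable

# Hypothetical like counts, used only to show the normalization.
us_pct = engagement_pct(29_000, US_FOLLOWERS)
uk_pct = engagement_pct(5_940, UK_FOLLOWERS)
print(len(sample), us_pct, uk_pct)
```

    Dividing by followers puts the two accounts on the same scale: in this made-up case both posts reach 1 percent of their followers even though the raw like counts differ by almost a factor of five.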

  14. Data from: #PraCegoVer dataset

    • zenodo.org
    Updated Jan 20, 2023
    + more versions
    Cite
    Gabriel Oliveira dos Santos; Esther Luna Colombini; Sandra Avila (2023). #PraCegoVer dataset [Dataset]. http://doi.org/10.5281/zenodo.7548638
    Dataset updated
    Jan 20, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gabriel Oliveira dos Santos; Esther Luna Colombini; Sandra Avila
    Description

    Automatically describing images using natural sentences is an essential task for visually impaired people's inclusion on the Internet. Although there are many datasets in the literature, most of them contain only English captions, whereas datasets with captions described in other languages are scarce.

    The PraCegoVer movement arose on the Internet, encouraging social media users to publish images, tag them #PraCegoVer, and add a short description of their content. Inspired by this movement, we propose #PraCegoVer, a multi-modal dataset with Portuguese captions based on posts from Instagram. It is the first large dataset for image captioning in Portuguese with freely annotated images.

    #PraCegoVer has 533,523 pairs with images and captions described in Portuguese collected from more than 14 thousand different profiles. Also, the average caption length in #PraCegoVer is 39.3 words and the standard deviation is 29.7.

    New Release

    We release pracegover_400k.json which contains 403,337 examples from the original dataset.json after preprocessing and duplication removal. It is split into train, validation, and test with 242036, 80628, and 80673 examples, respectively.
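
    As a quick sanity check on the figures above, the three split sizes can be summed and expressed as fractions of the release. The numbers are the ones stated in the description; the variable names are illustrative.

```python
# Split sizes as stated for pracegover_400k.json.
splits = {"train": 242_036, "validation": 80_628, "test": 80_673}

total = sum(splits.values())
fractions = {name: count / total for name, count in splits.items()}

print(total)
for name, share in fractions.items():
    print(f"{name}: {share:.4f}")
```

    The sizes add up to the stated 403,337 examples and correspond to roughly a 60/20/20 split.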

    Dataset Structure

    The #PraCegoVer dataset comprises a main file dataset.json and a collection of compressed files named images.tar.gz.partX containing the images. The file dataset.json contains a list of JSON objects with the attributes:

    • user: anonymized user that made the post;
    • filename: image file name;
    • raw_caption: raw caption;
    • caption: clean caption;
    • date: post date.

    Each instance in dataset.json is associated with exactly one image in the images directory, whose file name is given by the attribute filename. Also, we provide a sample with five instances, so users can download the sample to get an overview of the dataset before downloading it completely.
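
    A minimal sketch of reading such a file and pairing each caption with its image path is shown below. The attribute names follow the list above; the function name load_instances, the directory name "images", and the demo record (including its date format) are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def load_instances(dataset_path, images_dir):
    """Yield (image_path, caption) for each object in a dataset.json-style file."""
    with open(dataset_path, encoding="utf-8") as fh:
        records = json.load(fh)  # a list of JSON objects, per the description
    for rec in records:
        yield Path(images_dir) / rec["filename"], rec["caption"]


# Demo with one made-up record written to a temporary file.
demo_record = {
    "user": "user_0001",
    "filename": "img_001.jpg",
    "raw_caption": "#PraCegoVer Foto de um gato.",
    "caption": "Foto de um gato.",
    "date": "2020-01-01",
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "dataset.json"
    demo_path.write_text(json.dumps([demo_record]), encoding="utf-8")
    pairs = list(load_instances(demo_path, "images"))

print(pairs)
```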

    Download Instructions

    If you just want to have an overview of the dataset structure, you can download sample.tar.gz. But, if you want to use the dataset, or any of its subsets (63k, 173k, and 400k), you must download all the files and run the following commands to uncompress and join the files:

    cat images.tar.gz.part* > images.tar.gz
    tar -xzvf images.tar.gz
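
    For environments without cat and tar, the same two steps can be approximated in Python with the standard library. This is a sketch, not part of the PraCegoVer tooling: the function name join_and_extract and its defaults are assumptions, and the parts are joined in lexicographic order, which is the order the shell glob above would use.

```python
import glob
import shutil
import tarfile


def join_and_extract(part_glob="images.tar.gz.part*",
                     joined="images.tar.gz", dest="."):
    """Concatenate the archive parts into one file, then unpack it."""
    parts = sorted(glob.glob(part_glob))
    if not parts:
        raise FileNotFoundError(f"no files match {part_glob!r}")
    # Step 1: equivalent of `cat images.tar.gz.part* > images.tar.gz`.
    with open(joined, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)
    # Step 2: equivalent of `tar -xzf images.tar.gz`.
    # Only extract archives from a source you trust.
    with tarfile.open(joined, "r:gz") as tar:
        tar.extractall(dest)
    return parts
```

    Calling join_and_extract() in the download directory mirrors the two shell commands; the returned list shows which part files were joined and in what order.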

    Alternatively, you can download the entire dataset from the terminal using the python script download_dataset.py available in the PraCegoVer repository. In this case, first, you have to download the script and create an access token here. Then, you can run the following command to download and uncompress the image files:

    python download_dataset.py --access_token=

  15. Instagram: number of global users 2020-2025

    • statista.com
    Updated May 22, 2024
    Cite
    Statista (2024). Instagram: number of global users 2020-2025 [Dataset]. https://www.statista.com/statistics/183585/instagram-number-of-global-users/
    Dataset updated
    May 22, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2021, there were 1.21 billion monthly active users of Meta's Instagram, making up over 28 percent of the world's internet users. By 2025, it has been forecast that there will be 1.44 billion monthly active users of the social media platform, which would account for 31.2 percent of global internet users.

    How popular is Instagram?

    Instagram, as of January 2022, was the fourth most popular social media platform in the world in terms of user numbers. YouTube and WhatsApp ranked in second and third place, respectively, whilst Facebook remained the most popular, with almost three billion monthly active users worldwide.

    India had the largest number of Instagram users as of January 2022, with a total of over 230 million users in the country. The second-largest Instagram audience could be found in the United States, with almost 160 million people subscribing to the photo and video sharing app.

    Gen Z and Instagram

    As of September 2021, Gen Z users in the United States spent an average of five hours per week on Instagram. Although Instagram ranked third in terms of hours per week spent on the platform, Gen Z users spent considerably more time on TikTok, amounting to a weekly average of over 10 hours being spent on the mobile-first video app.

    Most followed accounts on Instagram

    As of May 2022, Instagram’s own account had 504.37 million followers. In terms of celebrities, Portuguese footballer Cristiano Ronaldo (@cristiano) had over 440.41 million followers on the social network. Moreover, the average media value of an Instagram post by Ronaldo was over 985,000 U.S. dollars.

    The most liked post on Instagram as of May 2022 was Photo of an Egg, which was posted in 2019 by the account @world_record_egg. Photo of an Egg has not only exceeded 55 million likes on the platform, but it also has nearly 3.5 million comments, and the account itself has over 4.5 million Instagram followers. After mysterious posts published by the account, World Record Egg revealed itself as part of a mental health campaign aimed at the difficulties and demands of using social media.

  16. Instagram Reach Analysis: Case Study

    • kaggle.com
    Updated Jun 14, 2023
    Cite
    Bhanupratap Biswas☑️ (2023). Instagram Reach Analysis: Case Study [Dataset]. http://doi.org/10.34740/kaggle/dsv/5929375
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    Kaggle
    Authors
    Bhanupratap Biswas☑️
    Description

    Data source: https://statso.io/instagram-reach-analysis-case-study/

    This case study on Instagram reach analysis imagines a scenario in which a fashion brand called "Fashionista" wants to analyze the reach of its Instagram account over the past six months.

    Objective: Analyze the reach of Fashionista's Instagram account and identify trends, patterns, and insights that can help improve their reach and engagement.

    Steps for the Instagram Reach Analysis:

    1. Data Collection:

      • Gather data from Fashionista's Instagram account for the past six months.
      • Collect metrics such as follower count, post reach, impressions, likes, comments, and engagement rate.
      • Use Instagram's built-in analytics or third-party tools like Iconosquare or Sprout Social to retrieve the necessary data.
    2. Define Key Metrics:

      • Identify the key metrics that will help assess the reach of Fashionista's Instagram account.
      • Key metrics may include follower growth rate, average reach per post, total impressions, engagement rate, and engagement per post.
    3. Analyze Follower Growth:

      • Plot the follower count over the past six months to observe any trends.
      • Calculate the follower growth rate to understand the rate at which the account is gaining or losing followers.
      • Look for any significant changes in follower count and investigate potential reasons behind those changes.
    4. Evaluate Post Reach and Impressions:

      • Analyze the average reach per post and total impressions to understand the reach of Fashionista's content.
      • Identify posts with the highest and lowest reach and compare their characteristics.
      • Look for patterns or themes that resonate well with the audience and those that underperform.
    5. Assess Engagement:

      • Calculate the average engagement rate and compare it across different types of content (e.g., images, videos, stories, reels).
      • Identify posts with the highest engagement rate and analyze their content, captions, and hashtags.
      • Look for patterns or elements that encourage higher engagement from the audience.
    6. Identify Optimal Posting Times:

      • Analyze the data to identify the days and times when Fashionista's posts receive the highest reach and engagement.
      • Experiment with posting at different times and measure the impact on reach and engagement.
    7. Monitor Competitors:

      • Analyze the reach and engagement of Fashionista's competitors' Instagram accounts.
      • Identify strategies or content types that work well for competitors and consider adopting similar approaches if relevant.
    8. Generate Insights and Recommendations:

      • Summarize the findings from the analysis and identify key insights and trends.
      • Recommend strategies to improve Fashionista's Instagram reach based on the insights obtained.
      • Provide actionable recommendations such as optimizing content, using relevant hashtags, collaborating with influencers, or running Instagram ads.

    By conducting a thorough analysis of Fashionista's Instagram reach, you'll gain valuable insights into their audience's behavior, content performance, and engagement patterns. These insights can help guide future content strategies and optimize reach and engagement on Instagram.
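
    Steps 3, 5, and 6 above can be sketched in a few lines of Python. All numbers below are hypothetical and not taken from any real account. The formulas are common conventions rather than definitions given in the case study: growth rate as the relative change between the first and last follower count, and engagement rate as (likes + comments) / reach.

```python
from collections import defaultdict

# Hypothetical monthly follower counts for the six-month window.
follower_counts = [10_000, 10_400, 10_900, 11_200, 11_800, 12_500]

# Hypothetical posts; "hour" is the hour of day the post went live.
posts = [
    {"hour": 9, "reach": 2_000, "likes": 150, "comments": 10},
    {"hour": 18, "reach": 5_000, "likes": 520, "comments": 30},
    {"hour": 18, "reach": 4_000, "likes": 380, "comments": 20},
    {"hour": 21, "reach": 3_000, "likes": 200, "comments": 10},
]

def growth_rate(counts):
    """Relative change between the first and last follower count."""
    return (counts[-1] - counts[0]) / counts[0]

def engagement_rate(post):
    """Interactions per account reached (one of several common definitions)."""
    return (post["likes"] + post["comments"]) / post["reach"]

def average_reach_by_hour(items):
    """Mean reach of the posts published in each hour of the day."""
    by_hour = defaultdict(list)
    for post in items:
        by_hour[post["hour"]].append(post["reach"])
    return {hour: sum(v) / len(v) for hour, v in by_hour.items()}

growth = growth_rate(follower_counts)
avg_engagement = sum(engagement_rate(p) for p in posts) / len(posts)
reach_by_hour = average_reach_by_hour(posts)
best_hour = max(reach_by_hour, key=reach_by_hour.get)

print(f"growth: {growth:.1%}, engagement: {avg_engagement:.1%}, best hour: {best_hour}")
```

    Dividing engagement by follower count instead of reach is an equally common choice; whichever definition is used should stay fixed across the whole analysis so posts remain comparable.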

  17. OTT consumption profile - Unicauca dataset

    • kaggle.com
    Updated Apr 15, 2019
    Cite
    Juan Sebastián Rojas (2019). OTT consumption profile - Unicauca dataset [Dataset]. https://www.kaggle.com/jsrojas/ott-consumption-profile-dataset/discussion
    Dataset updated
    Apr 15, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Juan Sebastián Rojas
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    Context

    Network monitoring and analysis of consumption behavior are important for network operators: they yield vital information about consumption trends, which operators can use to offer new data plans aimed at specific users and to obtain an adequate perspective of the network. Over-the-top (OTT) media and communications services and applications are shifting Internet consumption by increasing the traffic generated over the different available networks. OTT refers to applications that deliver audio, video, and other media over the Internet by leveraging the infrastructure deployed by network operators, but without the operators' involvement in the control or distribution of the content; these applications are known for their large consumption of network resources.

    Content

    This dataset contains 1581 instances and 131 attributes in a single file. Each instance represents a user’s consumption profile, which summarizes the user's consumption behavior across the 29 OTT applications identified in the IP flows captured to create the dataset.

    The OTT applications that the users interacted with during the capture experiment and were stored on the dataset are: Amazon, Apple store, Apple Icloud, Apple Itunes, Deezer, Dropbox, EasyTaxi, Ebay, Facebook, Gmail, Google suite, Google Maps, Browsing (HTTP, HTTP_Connect, HTTP_Download, HTTP_Proxy), Instagram, LastFM, Microsoft One Drive (MS_One_Drive), Facebook Messenger (MSN), Netflix, Skype, Spotify, Teamspeak, Teamviewer, Twitch, Twitter, Waze, Whatsapp, Wikipedia, Yahoo and Youtube.

    Each application has 4 different types of attributes (quantity of generated flows, mean duration of the flows, average size of the packets exchanged on the flows, and mean bytes per second on the flows). These attributes summarize the interaction that the user had with the respective OTT application in terms of consumption. Furthermore, the dataset contains the user’s IP address in network and decimal format, both of which are used as user identifiers. Finally, the User Group attribute represents the objective class (high consumption, medium consumption, or low consumption) into which a user is classified according to his/her OTT consumption behavior. All of this information gives a total of 131 attributes.

    For further information you can read and please cite the following papers:

    Research Gate: https://www.researchgate.net/publication/326150046_Personalized_Service_Degradation_Policies_on_OTT_Applications_Based_on_the_Consumption_Behavior_of_Users

    Springer: https://link.springer.com/chapter/10.1007/978-3-319-95168-3_37

    Research Gate: https://www.researchgate.net/publication/335954240_Consumption_Behavior_Analysis_of_Over_The_Top_Services_Incremental_Learning_or_Traditional_Methods

    IEEExplore: https://ieeexplore.ieee.org/document/8845576

    Attribute Description

    The attributes and their definitions are presented below:

    • Source.Decimal: This attribute holds the user’s IP address in decimal format and is mainly used as a user identifier.

    • Source.IP: This attribute holds the user’s IP address in network format (e.g., 192.168.14.35); as in the previous case, its main function is to serve as a user identifier.

    • Application-Name.Flows: These attributes hold the quantity of IP flows that a user generated toward an OTT application. As mentioned before, each application has a group of 4 attributes that describe the interaction of the user with that OTT application (examples for this case are Netflix.Flows and Facebook.Flows).

    • Application-Name.Flow.Duration.Mean: These attributes hold the mean duration of the flows generated by the user toward a specific OTT application, measured in microseconds. Examples of how these attributes are stored in the dataset are Amazon.Flow.Duration.Mean and Instagram.Flow.Duration.Mean.

    • Application-Name.AVG.Packet.Size: These attributes hold the average size of the IP packets that were exchanged in all the flows generated by the user toward a specific OTT application, measured in bytes. Note that this size covers the packet’s header only. Examples of how these attributes are presented in the dataset are Google_Maps.AVG.Packet.Size and Spotify.AVG.Packet.Size.

    • Application-Name.Flow.Bytes.Per.Sec: These attributes hold the mean number of bytes per second that were exchanged in the flows generated by the user toward a specific OTT application. Examples of this kind of attribute in the dataset are Deezer.Flow.Bytes.Per.Sec and Skype.Flow.Bytes.Per.Sec.

    • User.Group: This attribute represents the objective class of the dataset, i.e., the different groups into which the users are classified according to their OTT consumption behavior...
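
    The naming scheme above lends itself to a small parsing sketch. Two assumptions are made here that the description does not confirm: that "decimal format" means the 32-bit integer form of an IPv4 address, and that every per-application column is the application name followed by a dot and one of the four suffixes. The helper names and the example header are illustrative.

```python
import ipaddress

# The four per-application suffixes named in the attribute description.
SUFFIXES = ("Flows", "Flow.Duration.Mean", "AVG.Packet.Size", "Flow.Bytes.Per.Sec")

def decimal_to_dotted(value):
    """Convert an integer IPv4 address (assumed Source.Decimal form) to dotted notation."""
    return str(ipaddress.IPv4Address(int(value)))

def split_column(name):
    """Return (application, metric) for a per-application column, else None."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if name.endswith("." + suffix):
            return name[: -len(suffix) - 1], suffix
    return None  # identifier or class columns such as Source.IP or User.Group

# Example header built from column names quoted in the description.
header = [
    "Source.Decimal", "Source.IP", "Netflix.Flows", "Amazon.Flow.Duration.Mean",
    "Google_Maps.AVG.Packet.Size", "Skype.Flow.Bytes.Per.Sec", "User.Group",
]
parsed = {name: split_column(name) for name in header}
dotted = decimal_to_dotted(3232239139)

print(dotted)
print(parsed["Google_Maps.AVG.Packet.Size"])
```

    Note that 29 applications with 4 attributes each, plus the two IP columns and User.Group, would give 119 attributes rather than 131. One reading that reproduces 131 is that the four Browsing sub-categories (HTTP, HTTP_Connect, HTTP_Download, HTTP_Proxy) each carry their own group of 4 attributes, giving 32 × 4 + 3 = 131; this is an inference from the text, not something the description states.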

  18. Instagram Reach Analysis - Excel Project

    • kaggle.com
    Updated Jun 14, 2025
    Cite
    Raghad Al-marshadi (2025). Instagram Reach Analysis - Excel Project [Dataset]. https://www.kaggle.com/datasets/raghadalmarshadi/instagram-reach-analysis-excel-project/code
    Dataset updated
    Jun 14, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Raghad Al-marshadi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    📊 Instagram Reach Analysis

    An exploratory data analysis project using Excel to understand what influences Instagram post reach and engagement.

    📁 Project Description

    This project uses an Instagram dataset imported from Kaggle to explore how factors such as hashtags, saves, shares, and caption length influence impressions and engagement.

    🛠️ Tools Used

    • Microsoft Excel
    • Pivot Tables
    • TRIM, WRAP, and other Excel formulas

    🧹 Data Cleaning

    • Removed unnecessary spaces using TRIM
    • Removed 17 duplicate rows → 103 unique rows remained
    • Standardized formatting: freeze top row, wrap text, center align
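
    The two cleaning steps can be mirrored outside Excel. The sketch below is an approximation, not the project's actual workbook logic: excel_trim imitates Excel's TRIM (strip leading and trailing spaces, collapse runs of spaces to one), and duplicates are assumed to be rows that are identical after trimming. The sample rows are made up.

```python
def excel_trim(text):
    """Approximate Excel's TRIM: drop outer spaces, collapse inner runs of spaces."""
    return " ".join(piece for piece in text.split(" ") if piece)

def clean_rows(rows):
    """Trim every cell, then keep only the first occurrence of each row."""
    seen = set()
    unique = []
    for row in rows:
        cleaned = tuple(excel_trim(cell) for cell in row)
        if cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)
    return unique

# Made-up rows: the first two differ only in spacing, so one of them is dropped.
rows = [
    ("  Data  science tips ", "#python"),
    ("Data science tips", "#python"),
    ("Charts in Excel", "#excel "),
]
unique_rows = clean_rows(rows)
removed = len(rows) - len(unique_rows)

print(unique_rows)
print(removed)
```

    Trimming before deduplicating matters: rows that differ only in stray spaces would otherwise survive as distinct entries. The figures reported above (17 removed, 103 remaining) imply 120 rows before cleaning.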

    🔍 Key Analysis Highlights

    1. Impressions by Source

    • Highest reach: Home > Hashtags > Explore > Other
    • Some totals exceed 100% due to overlapping

    2. Engagement Insights

    • Saves strongly correlate with higher impressions
    • Caption length is inversely related to likes
    • Shares have weak correlation with impressions
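
    Statements like these are typically backed by a correlation coefficient. The sketch below computes the Pearson coefficient by hand, which is the quantity Excel's CORREL function returns. The four columns are invented numbers chosen to show a strong positive and a strong negative relationship; they are not values from the project's dataset.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented example columns, one value per post.
saves = [10, 20, 30, 40, 50]
impressions = [1100, 1900, 3200, 3900, 5100]
caption_length = [300, 250, 200, 150, 100]
likes = [80, 95, 120, 130, 160]

r_saves = pearson(saves, impressions)      # close to +1: more saves, more impressions
r_caption = pearson(caption_length, likes)  # close to -1: longer captions, fewer likes

print(round(r_saves, 2), round(r_caption, 2))
```

    A coefficient near zero would correspond to the "weak correlation" reported for shares. Correlation alone does not show which variable drives the other.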

    3. Hashtag Patterns

    • Most used: #Thecleverprogrammer, #Amankharwal, #Python
    • Repeating hashtags does not guarantee higher reach

    ✅ Conclusion

    Shorter captions and higher save counts contribute more to reach than repeated hashtags. Profile visits are often linked to new followers.

    👩‍💻 Author

    Raghad's LinkedIn

    🧠 Inspiration

    Inspired by content from TheCleverProgrammer, Aman Kharwal, and Kaggle datasets.

    💬 Feedback

    Feel free to open an issue or share suggestions!

  19. Database- Egyptian Physician-influencers on Social Media

    • figshare.com
    xlsx
    Updated May 30, 2023
    Cite
    Noha Atef (2023). Database- Egyptian Physician-influencers on Social Media [Dataset]. http://doi.org/10.6084/m9.figshare.23170403.v1
    Available download formats: xlsx
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    Noha Atef
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Details of 42 popular Egyptian physicians on social media, including their professional email addresses and telephone numbers.

    The dataset includes the following details for each physician-influencer: Name, Gender, Specialization, YouTube Channel, Instagram Account, Facebook Page, Email Address, Telephone, Most Popular Social Network and the Highest Number of followers they had in April 2021.

  20. Instagram: distribution of global audiences 2024, by gender

    • statista.com
    • es.statista.com
    Cite
    Stacy Jo Dixon, Instagram: distribution of global audiences 2024, by gender [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    As of January 2024, Instagram was slightly more popular with men than women, with men accounting for 50.6 percent of the platform’s global users. Additionally, the social media app was most popular amongst younger audiences, with almost 32 percent of users aged between 18 and 24 years.

    Instagram’s Global Audience

    As of January 2024, Instagram was the fourth most popular social media platform globally, reaching two billion monthly active users (MAU). This number is projected to keep growing with no signs of slowing down, which is not a surprise as the global online social penetration rate across all regions is constantly increasing.
    As of January 2024, the country with the largest Instagram audience was India with 362.9 million users, followed by the United States with 169.7 million users.

    Who is winning over the generations?

    Even though Instagram’s audience is almost twice the size of TikTok’s on a global scale, TikTok has shown itself to be a fierce competitor, particularly amongst younger audiences. TikTok was the most downloaded mobile app globally in 2022, generating 672 million downloads. As of 2022, Generation Z in the United States spent more time on TikTok than on Instagram monthly.