47 datasets found
  1. Instagram accounts with the most followers worldwide 2024

    • statista.com
    • davegsmith.com
    Updated Jun 17, 2025
    Cite
    Stacy Jo Dixon (2025). Instagram accounts with the most followers worldwide 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Cristiano Ronaldo has one of the most popular Instagram accounts as of April 2024.

    The Portuguese footballer is the most-followed person on the photo-sharing platform, with 628 million followers, while Instagram's own account ranked first overall with roughly 672 million followers.

    How popular is Instagram?

    Instagram is a photo-sharing social networking service that enables users to take pictures and edit them with filters. The platform allows users to post and share their images online and directly with their friends and followers on the social network. The cross-platform app reached one billion monthly active users in mid-2018. In 2020, there were over 114 million Instagram users in the United States, and experts projected this figure to surpass 127 million users in 2023.

    Who uses Instagram?

    Instagram audiences are predominantly young: recent data states that almost 60 percent of U.S. Instagram users are aged 34 years or younger. Fall 2020 data reveals that Instagram is also one of the most popular social media platforms for teens and one of the social networks with the biggest reach among teens in the United States.

    Celebrity influencers on Instagram

    Many celebrities and athletes are brand spokespeople and generate additional income with social media advertising and sponsored content. Unsurprisingly, Ronaldo ranked first again, as the average media value of one of his Instagram posts was 985,441 U.S. dollars.
    
  2. 25M+ Images | AI Training Data | Annotated imagery data for AI | Object & Scene Detection | Global Coverage

    • datarade.ai
    Cite
    Data Seeds, 25M+ Images | AI Training Data | Annotated imagery data for AI | Object & Scene Detection | Global Coverage [Dataset]. https://datarade.ai/data-products/15m-images-ai-training-data-annotated-imagery-data-for-a-data-seeds
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset authored and provided by
    Data Seeds
    Area covered
    Barbados, United Arab Emirates, Yemen, Saint Lucia, French Polynesia, Nepal, Morocco, Virgin Islands (U.S.), Liberia, Iceland
    Description

    This dataset features over 25,000,000 high-quality general-purpose images sourced from photographers worldwide. Designed to support a wide range of AI and machine learning applications, it offers a richly diverse and extensively annotated collection of everyday visual content.

    Key Features:

    1. Comprehensive Metadata: the dataset includes full EXIF data, detailing camera settings such as aperture, ISO, shutter speed, and focal length. Additionally, each image is pre-annotated with object and scene detection metadata, making it ideal for tasks like classification, detection, and segmentation. Popularity metrics, derived from engagement on our proprietary platform, are also included.

    2. Unique Sourcing Capabilities: the images are collected through a proprietary gamified platform for photographers. Competitions spanning various themes ensure a steady influx of diverse, high-quality submissions. Custom datasets can be sourced on-demand within 72 hours, allowing for specific requirements, such as themes, subjects, or scenarios, to be met efficiently.

    3. Global Diversity: photographs have been sourced from contributors in over 100 countries, covering a wide range of human experiences, cultures, environments, and activities. The dataset includes images of people, nature, objects, animals, urban and rural life, and more, captured across different times of day, seasons, and lighting conditions.

    4. High-Quality Imagery: the dataset includes images with resolutions ranging from standard to high-definition to meet the needs of various projects. Both professional and amateur photography styles are represented, offering a balance of realism and creativity across visual domains.

    5. Popularity Scores: each image is assigned a popularity score based on its performance in GuruShots competitions. This unique metric reflects how well the image resonates with a global audience, offering an additional layer of insight for AI models focused on aesthetics, engagement, or content curation.

    6. AI-Ready Design: this dataset is optimized for AI applications, making it ideal for training models in general image recognition, multi-label classification, content filtering, and scene understanding. It integrates easily with leading machine learning frameworks and pipelines.

    7. Licensing & Compliance: the dataset complies fully with data privacy regulations and offers transparent licensing for both commercial and academic use.

    Use Cases: 1. Training AI models for general-purpose image classification and tagging. 2. Enhancing content moderation and visual search systems. 3. Building foundational datasets for large-scale vision-language models. 4. Supporting research in computer vision, multimodal AI, and generative modeling.
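As a concrete illustration of how such annotation metadata might be consumed, here is a minimal sketch; the record schema (`id`, `scenes`, `exif.iso`) is a hypothetical stand-in, since the listing does not specify the actual field names:

```python
# Hypothetical annotation records; the real dataset's field names may differ.
records = [
    {"id": "img_001", "scenes": ["beach", "people"], "exif": {"iso": 100}},
    {"id": "img_002", "scenes": ["urban"],           "exif": {"iso": 3200}},
    {"id": "img_003", "scenes": ["beach"],           "exif": {"iso": 800}},
]

def filter_images(records, scene, max_iso):
    """Select image ids that carry a scene tag and stay under an ISO ceiling."""
    return [r["id"] for r in records
            if scene in r["scenes"] and r["exif"]["iso"] <= max_iso]

print(filter_images(records, "beach", 1000))  # -> ['img_001', 'img_003']
```

The same pattern extends to any of the EXIF fields (aperture, shutter speed, focal length) mentioned above.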

    This dataset offers a comprehensive, diverse, and high-quality resource for training AI and ML models across a wide array of domains. Customizations are available to suit specific project needs. Contact us to learn more!

  3. Context Ad Clicks Dataset

    • kaggle.com
    Updated Feb 9, 2021
    Cite
    Möbius (2021). Context Ad Clicks Dataset [Dataset]. https://www.kaggle.com/arashnic/ctrtest/code
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Feb 9, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Möbius
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    The dataset was generated by an e-commerce website that sells a variety of products on its online platform. The website records the behaviour of its users and stores it as a log. Most of the time, however, users do not buy a product instantly: there is a time gap during which the customer might surf the internet and perhaps visit competitor websites. To improve sales, the website owner has hired an adtech company, which built a system that shows ads for the owner's products on its partner websites. If a user comes to the owner's website and searches for a product, and then visits these partner websites or apps, their previously viewed items or similar items are shown as ads. If the user clicks such an ad, they are redirected to the owner's website and might buy the product.

    The task is to predict the probability that a user clicks the ad shown to them on the partner websites during the next 7 days, on the basis of historical view log data, ad impression data and user data.

    Content

    You are provided with the view log of users (2018/10/15 - 2018/12/11) and the product descriptions collected from the owner's website. We also provide training and test data containing details of ad impressions at the partner websites (Train + Test). The train data contains the impression logs during 2018/11/15 - 2018/12/13, along with a label specifying whether the ad was clicked or not. Your model will be evaluated on the test data, which contains impression logs during 2018/12/12 - 2018/12/18 without labels. You are provided with the following files:

    • train.zip: contains the following 3 files:
      • train.csv
      • view_log.csv
      • item_data.csv
    • test.csv: contains the impressions for which participants need to predict the click rate.
    • sample_submission.csv: contains the format in which you have to submit your predictions.

    Inspiration

    • Predict the probability that a user clicks the ad shown to them on the partner websites during the next 7 days, on the basis of historical view log data, ad impression data and user data.

    The evaluation metric is the area under the ROC curve (AUC) between the predicted probability and the observed target.
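AUC can be computed with libraries such as scikit-learn (`roc_auc_score`), or directly as the fraction of correctly ordered (positive, negative) pairs; a minimal dependency-free sketch (the function name is ours):

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) score pairs ranked
    correctly. Equivalent to the Mann-Whitney U statistic; ties count as
    half a win."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # -> 0.75
```

A perfect ranking yields 1.0 and a random one hovers around 0.5, which is why AUC is a natural choice for a click-probability task with heavily imbalanced classes.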

  4. Search Engines in Germany - Market Research Report (2015-2030)

    • ibisworld.com
    Updated Jun 19, 2024
    Cite
    IBISWorld (2024). Search Engines in Germany - Market Research Report (2015-2030) [Dataset]. https://www.ibisworld.com/germany/industry/search-engines/935/
    Explore at:
    Dataset updated
    Jun 19, 2024
    Dataset authored and provided by
    IBISWorld
    License

    https://www.ibisworld.com/about/termsofuse/

    Time period covered
    2014 - 2029
    Area covered
    Germany
    Description

    In the last five years, the web portal industry has recorded significant revenue growth. Industry revenue increased by an average of 3.8% per year between 2019 and 2024 and is expected to reach 12.6 billion euros in the current year. The web portal industry comprises a variety of platforms, such as social networks, search engines, video platforms and email services, that are used by millions of users every day. These portals enable the exchange of information, communication and entertainment. Web portals generate their revenue mainly through advertising, premium services and commission payments. User numbers are rising steadily as more and more people go online and everyday processes are increasingly digitalised.

    In 2024, industry revenue is expected to increase by 3.2%. Although the industry is growing, it is also facing challenges, particularly in terms of data protection. Web portals are constantly collecting user data, which can lead to misuse of the collected data. The General Data Protection Regulation (GDPR), introduced in the European Union in 2018, has prompted web portal operators to review their data protection practices and amend their terms and conditions in order to avoid fines. The aim of this regulation is to improve the protection of personal data and prevent data misuse.

    The industry's turnover is expected to increase by an average of 3.6% per year to 15 billion euros over the next five years. Video platforms such as YouTube often generate losses despite high user numbers. The reasons for this are the high costs of operation and infrastructure as well as expenses for copyright issues and compliance. Advertising on video platforms is perceived negatively by users, but is successful when it comes to attracting attention. Politicians are debating the taxation of revenues generated by internationally operating web portals based in tax havens. Another challenge is the copying of concepts, which inhibits innovation in the industry and can lead to legal problems.

  5. Data from: A large synthetic dataset for machine learning applications in power transmission grids

    • zenodo.org
    csv, json, png, zip
    Updated Mar 25, 2025
    Cite
    Marc Gillioz; Guillaume Dubuis; Philippe Jacquod (2025). A large synthetic dataset for machine learning applications in power transmission grids [Dataset]. http://doi.org/10.5281/zenodo.13378476
    Explore at:
    zip, png, csv, jsonAvailable download formats
    Dataset updated
    Mar 25, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Marc Gillioz; Guillaume Dubuis; Philippe Jacquod
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    With the ongoing energy transition, power grids are evolving fast. They operate more and more often close to their technical limit, under more and more volatile conditions. Fast, essentially real-time computational approaches to evaluate their operational safety, stability and reliability are therefore highly desirable. Machine Learning methods have been advocated to solve this challenge, however they are heavy consumers of training and testing data, while historical operational data for real-world power grids are hard if not impossible to access.

    This dataset contains long time series for production, consumption, and line flows, amounting to 20 years of data with a time resolution of one hour, for several thousand loads and several hundred generators of various types representing the ultra-high-voltage transmission grid of continental Europe. The synthetic time series have been statistically validated against real-world data.

    Data generation algorithm

    The algorithm is described in a Nature Scientific Data paper. It relies on the PanTaGruEl model of the European transmission network -- the admittance of its lines as well as the location, type and capacity of its power generators -- and aggregated data gathered from the ENTSO-E transparency platform, such as power consumption aggregated at the national level.

    Network

    The network information is encoded in the file europe_network.json. It is given in PowerModels format, which is itself derived from MATPOWER and compatible with pandapower. The network features 7822 power lines and 553 transformers connecting 4097 buses, to which are attached 815 generators of various types.

    Time series

    The time series forming the core of this dataset are given in CSV format. Each CSV file is a table with 8736 rows, one for each hourly time step of a 364-day year. All years are truncated to exactly 52 weeks of 7 days, and start on a Monday (the load profiles are typically different during weekdays and weekends). The number of columns depends on the type of table: there are 4097 columns in load files, 815 for generators, and 8375 for lines (including transformers). Each column is described by a header corresponding to the element identifier in the network file. All values are given in per-unit, both in the model file and in the tables, i.e. they are multiples of a base unit taken to be 100 MW.
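Since all values are in per-unit with a 100 MW base, converting a loaded table to physical units is a single scaling. A minimal sketch in pandas, using a small stand-in DataFrame in place of one of the dataset's CSV files (the column names here are invented):

```python
import pandas as pd

# Stand-in for pd.read_csv('loads_2016_1.csv'): two loads over three hourly
# steps, in per-unit (1 p.u. = 100 MW, the base power stated above).
loads_pu = pd.DataFrame({"load_1": [0.42, 0.55, 0.61],
                         "load_2": [1.10, 1.05, 0.98]})

BASE_MW = 100.0            # base unit of the dataset
loads_mw = loads_pu * BASE_MW

print(loads_mw["load_1"].tolist())  # hourly values of the first load, in MW
```

The same scaling applies to the generator and line tables, since the model file and all tables share the 100 MW base.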

    There are 20 tables of each type, labeled with a reference year (2016 to 2020) and an index (1 to 4), zipped into archive files arranged by year. This amounts to a total of 20 years of synthetic data. When using load, generator, and line profiles together, it is important to use the same label: for instance, the files loads_2020_1.csv, gens_2020_1.csv, and lines_2020_1.csv represent the same year of the dataset, whereas gens_2020_2.csv is unrelated (it actually shares some features, such as nuclear profiles, but it is based on a dispatch with distinct loads).

    Usage

    The time series can be used without reference to the network file, simply using all or a selection of columns of the CSV files, depending on the needs. We show below how to select series from a particular country, and how to aggregate hourly time steps into days or weeks. These examples use Python and the data analysis library pandas, but other frameworks (Matlab, Julia) can be used as well. Since all the yearly time series are periodic, it is always possible to define a coherent time window modulo the length of the series.

    Selecting a particular country

    This example illustrates how to select generation data for Switzerland in Python. This can be done without parsing the network file, but using instead gens_by_country.csv, which contains a list of all generators for any country in the network. We start by importing the pandas library, and read the column of the file corresponding to Switzerland (country code CH):

    import pandas as pd
    CH_gens = pd.read_csv('gens_by_country.csv', usecols=['CH'], dtype=str)

    The object created in this way is a DataFrame with some null values (not all countries have the same number of generators). It can be turned into a list with:

    CH_gens_list = CH_gens.dropna().squeeze().to_list()

    Finally, we can import all the time series of Swiss generators from a given data table with

    pd.read_csv('gens_2016_1.csv', usecols=CH_gens_list)

    The same procedure can be applied to loads using the list contained in the file loads_by_country.csv.

    Averaging over time

    This second example shows how to change the time resolution of the series. Suppose that we are interested in all the loads from a given table, which are given by default with a one-hour resolution:

    hourly_loads = pd.read_csv('loads_2018_3.csv')

    To get a daily average of the loads, we can use:

    daily_loads = hourly_loads.groupby([t // 24 for t in range(24 * 364)]).mean()

    This results in series of length 364. To average further over entire weeks and get series of length 52, we use:

    weekly_loads = hourly_loads.groupby([t // (24 * 7) for t in range(24 * 364)]).mean()

    Source code

    The code used to generate the dataset is freely available at https://github.com/GeeeHesso/PowerData. It consists of two packages and several documentation notebooks. The first package, written in Python, provides functions to handle the data and to generate synthetic series based on historical data. The second package, written in Julia, is used to perform the optimal power flow. The documentation, in the form of Jupyter notebooks, contains numerous examples of how to use both packages. The entire workflow used to create this dataset is also provided, starting from raw ENTSO-E data files and ending with the synthetic dataset given in the repository.

    Funding

    This work was supported by the Cyber-Defence Campus of armasuisse and by an internal research grant of the Engineering and Architecture domain of HES-SO.

  6. How to Login DuckDuckGo Account? | A Step-By-Step Guide Dataset

    • paperswithcode.com
    Updated Jun 17, 2025
    Cite
    (2025). How to Login DuckDuckGo Account? | A Step-By-Step Guide Dataset [Dataset]. https://paperswithcode.com/dataset/how-to-login-duckduckgo-account-a-step-by
    Explore at:
    Dataset updated
    Jun 17, 2025
    Description

    In today’s digital age, privacy has become one of the most valued aspects of online activity. With increasing concerns over data tracking, surveillance, and targeted advertising, users are turning to privacy-first alternatives for everyday browsing. One of the most recognized names in private search is DuckDuckGo. Unlike mainstream search engines, DuckDuckGo emphasizes anonymity and transparency. However, many people wonder: is there such a thing as a "DuckDuckGo login account"?

    In this comprehensive guide, we’ll explore everything you need to know about the DuckDuckGo login account, what it offers (or doesn’t), and how to get the most out of DuckDuckGo’s privacy features.

    Does DuckDuckGo Offer a Login Account? To clarify up front: DuckDuckGo does not require or offer a traditional login account like Google or Yahoo. The concept of a DuckDuckGo login account is somewhat misleading if interpreted through the lens of typical internet services.

    DuckDuckGo's entire business model is built around privacy. The company does not track users, store personal information, or create user profiles. As a result, there’s no need—or intention—to implement a system that asks users to log in. This stands in stark contrast to other search engines that rely on login-based ecosystems to collect and use personal data for targeted ads.

    That said, some users still search for the term DuckDuckGo login account, usually because they’re trying to save settings, sync devices, or use features that may suggest a form of account system. Let’s break down what’s possible and what alternatives exist within DuckDuckGo’s platform.

    Saving Settings Without a DuckDuckGo Login Account Even without a traditional DuckDuckGo login account, users can still save their preferences. DuckDuckGo provides two primary ways to retain search settings:

    Local Storage (Cookies) When you customize your settings on the DuckDuckGo account homepage, such as theme, region, or safe search options, those preferences are stored in your browser’s local storage. As long as you don’t clear cookies or use incognito mode, these settings will persist.

    Cloud Save Feature To cater to users who want to retain settings across multiple devices without a DuckDuckGo login account, DuckDuckGo offers a feature called "Cloud Save." Instead of creating an account with a username or password, you generate a passphrase or unique key. This key can be used to retrieve your saved settings on another device or browser.

    While it’s not a conventional login system, it’s the closest DuckDuckGo comes to offering account-like functionality—without compromising privacy.

    Why DuckDuckGo Avoids Login Accounts Understanding why there is no DuckDuckGo login account comes down to the company’s core mission: to offer a private, non-tracking search experience. Introducing login accounts would:

    Require collecting some user data (e.g., email, password)

    Introduce potential tracking mechanisms

    Undermine their commitment to full anonymity

    By avoiding a login system, DuckDuckGo keeps user trust intact and continues to deliver on its promise of complete privacy. For users who value anonymity, the absence of a DuckDuckGo login account is actually a feature, not a flaw.

    DuckDuckGo and Device Syncing One of the most commonly searched reasons behind the term DuckDuckGo login account is the desire to sync settings or preferences across multiple devices. Although DuckDuckGo doesn’t use accounts, the Cloud Save feature mentioned earlier serves this purpose without compromising security or anonymity.

    You simply export your settings using a unique passphrase on one device, then import them using the same phrase on another. This offers similar benefits to a synced account—without the need for usernames, passwords, or emails.

    DuckDuckGo Privacy Tools Without a Login DuckDuckGo is more than just a search engine. It also offers a range of privacy tools—all without needing a DuckDuckGo login account:

    DuckDuckGo Privacy Browser (Mobile): Available for iOS and Android, this browser includes tracking protection, forced HTTPS, and built-in private search.

    DuckDuckGo Privacy Essentials (Desktop Extension): For Chrome, Firefox, and Edge, this extension blocks trackers, grades websites on privacy, and enhances encryption.

    Email Protection: DuckDuckGo recently launched a service that allows users to create "@duck.com" email addresses that forward to their real email—removing trackers in the process. Users sign up for this using a token or limited identifier, but it still doesn’t constitute a full DuckDuckGo login account.

    Is a DuckDuckGo Login Account Needed? For most users, the absence of a DuckDuckGo login account is not only acceptable—it’s ideal. You can:

    Use the search engine privately

    Customize and save settings

    Sync preferences across devices

    Block trackers and protect email

    —all without an account.

    While some people may find the lack of a traditional login unfamiliar at first, it quickly becomes a refreshing break from constant credential requests, data tracking, and login fatigue.

    The Future of DuckDuckGo Accounts As of now, DuckDuckGo maintains its position against traditional account systems. However, it’s clear the company is exploring privacy-preserving ways to offer more user features—like Email Protection and Cloud Save. These features may continue to evolve, but the core commitment remains: no tracking, no personal data storage, and no typical DuckDuckGo login account.

    Final Thoughts While the term DuckDuckGo login account is frequently searched, it represents a misunderstanding of how the platform operates. Unlike other tech companies that monetize personal data, DuckDuckGo has stayed true to its promise of privacy.

  7. COVID-19: Dataset of Global Research by Dimensions

    • console.cloud.google.com
    Updated Jan 5, 2023
    Cite
    Digital Science & Research Solutions Inc (2023). COVID-19: Dataset of Global Research by Dimensions [Dataset]. https://console.cloud.google.com/marketplace/product/digitalscience-public/covid-19-dataset-dimensions
    Explore at:
    Dataset updated
    Jan 5, 2023
    Dataset provided by
    Google (http://google.com/)
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset from Dimensions.ai contains all published articles, preprints, clinical trials, grants and research datasets that are related to COVID-19. This growing collection of research information now amounts to hundreds of thousands of items, and it is the only dataset of its kind. You can find an overview of the content in this interactive Data Studio dashboard: https://reports.dimensions.ai/covid-19/

    The full metadata includes the researchers and organizations involved in the research, as well as abstracts, open access status, research categories and much more. You may wish to use the Dimensions web application to explore the dataset: https://covid-19.dimensions.ai/.

    This dataset is for researchers, universities, pharmaceutical & biotech companies, politicians, clinicians, journalists, and anyone else who wishes to explore the impact of the current COVID-19 pandemic. It is updated daily, and free for anyone to access. Please share this information with anyone you think would benefit from it. If you have any suggestions as to how we can improve our search terms to maximise the volume of research related to COVID-19, please contact us at support@dimensions.ai.

    About Dimensions: Dimensions is the largest database of research insight in the world. It contains a comprehensive collection of linked data related to the global research and innovation ecosystem, all in a single platform. This includes hundreds of millions of publications, preprints, grants, patents, clinical trials, datasets, researchers and organizations. Because Dimensions maps the entire research lifecycle, you can follow academic and industry research from early stage funding, through to output and on to social and economic impact. This COVID-19 dataset is a subset of the full database. The full Dimensions database is also available on BigQuery, via subscription. Please visit www.dimensions.ai/bigquery to gain access.

  8. Covid-19 Testing by Geography and Date

    • data.amerigeoss.org
    csv, json, rdf, xml
    Updated Jul 27, 2022
    + more versions
    Cite
    United States (2022). Covid-19 Testing by Geography and Date [Dataset]. https://data.amerigeoss.org/dataset/covid-19-testing-by-geography-and-date
    Explore at:
    csv, json, xml, rdfAvailable download formats
    Dataset updated
    Jul 27, 2022
    Dataset provided by
    United States
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    Note: As of April 16, 2021, this dataset will update daily with a five-day data lag.

    A. SUMMARY This dataset includes COVID-19 tests by resident neighborhood and specimen collection date (the day the test was collected). Specifically, this dataset includes tests of San Francisco residents who listed a San Francisco home address at the time of testing. These resident addresses were then geo-located and mapped to neighborhoods. The resident address associated with each test is hand-entered and susceptible to errors, therefore neighborhood data should be interpreted as an approximation, not a precise nor comprehensive total.

    In recent months, about 5% of tests are missing addresses and therefore cannot be included in any neighborhood totals. In earlier months, more tests were missing address data. Because of this high percentage of tests missing resident address data, the neighborhood testing data for March, April, and May should be interpreted with caution (see below).

    Percentage of tests missing address information, by month in 2020:
    • Mar - 33.6%
    • Apr - 25.9%
    • May - 11.1%
    • Jun - 7.2%
    • Jul - 5.8%
    • Aug - 5.4%
    • Sep - 5.1%
    • Oct (Oct 1-12) - 5.1%

    To protect the privacy of residents, the City does not disclose the number of tests in neighborhoods with resident populations of fewer than 1,000 people. These neighborhoods are omitted from the data (they include Golden Gate Park, John McLaren Park, and Lands End).

    Tests for residents that listed a Skilled Nursing Facility as their home address are not included in this neighborhood-level testing data. Skilled Nursing Facilities have required and repeated testing of residents, which would change neighborhood trends and not reflect the broader neighborhood's testing data.

    This data was de-duplicated by individual and date, so if a person was tested multiple times on different dates, all of those tests will be included in this dataset (each on the day it was collected).

    The total number of positive test results is not equal to the total number of COVID-19 cases in San Francisco. During this investigation, some test results are found to be for persons living outside of San Francisco and some people in San Francisco may be tested multiple times (which is common). To see the number of new confirmed cases by neighborhood, reference this map: https://data.sfgov.org/stories/s/Map-of-Cumulative-Cases/adm5-wq8i#new-cases-map

    B. HOW THE DATASET IS CREATED COVID-19 laboratory test data is based on electronic laboratory test reports. Deduplication, quality assurance measures and other data verification processes maximize accuracy of laboratory test information. All testing data is then geo-coded by resident address. Then data is aggregated by analysis neighborhood (https://data.sfgov.org/Geographic-Locations-and-Boundaries/Analysis-Neighborhoods/p5b7-5n3h) and specimen collection date.

    Data are prepared by close of business Monday through Saturday for public display.

    C. UPDATE PROCESS Updates automatically at 05:00 Pacific Time each day. Redundant runs are scheduled at 07:00 and 09:00 in case of pipeline failure.

    D. HOW TO USE THIS DATASET Due to the high degree of variation in the time needed to complete tests by different labs there is a delay in this reporting. On March 24 the Health Officer ordered all labs in the City to report complete COVID-19 testing information to the local and state health departments.

    In order to track trends over time, a data user can analyze this data by "specimen_collection_date".

    Calculating Percent Positivity: The positivity rate is the percentage of tests that return a positive result for COVID-19 (positive tests divided by the sum of positive and negative tests). Indeterminate results, which could not conclusively determine whether the COVID-19 virus was present, are not included in the calculation of percent positivity. Percent positivity indicates how widespread infection is among the population being tested.
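The positivity-rate formula above can be written as a small helper, with indeterminate results simply left out of both inputs:

```python
def percent_positive(positive, negative):
    """Positivity rate: positive tests divided by (positive + negative),
    as a percentage. Indeterminate results are excluded, per the notes."""
    total = positive + negative
    return 100.0 * positive / total if total else None
```

Returning None when no tests were collected avoids a division-by-zero for dates or neighborhoods with no data.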

  9. Volunteer task execution events in Galaxy Zoo and The Milky Way citizen...

    • paperswithcode.com
    Updated Feb 18, 2022
    (2022). Volunteer task execution events in Galaxy Zoo and The Milky Way citizen science projects Dataset [Dataset]. https://paperswithcode.com/dataset/data-sets-of-volunteer-task-execution-events
    Explore at:
    Dataset updated
    Feb 18, 2022
    Description

    Context of the data sets The Zooniverse platform (www.zooniverse.org) has successfully built a large community of volunteers contributing to citizen science projects. Galaxy Zoo and the Milky Way Project were hosted there.

    The original Galaxy Zoo project was launched in July 2007, but has since been redesigned and relaunched three times, building each time on the success of its predecessor. In 2010, the Zooniverse launched the third iteration of Galaxy Zoo, called Galaxy Zoo: Hubble, but for simplicity, we use the term Galaxy Zoo throughout this text to refer to this project. Each volunteer classifying on Galaxy Zoo is presented with a galaxy from the Sloan Digital Sky Survey (SDSS) or the Hubble Space Telescope as well as a decision tree of questions with answers represented by a fairly simple icon. The task is straightforward, and no specialist knowledge is required to execute it.

    Tasks in the Milky Way Project impose a greater cognitive load than those in Galaxy Zoo. Volunteers are asked to draw ellipses onto the image to mark the locations of bubbles. A short online tutorial shows how to use the tool, along with examples of prominent bubbles. As a secondary task, users can also mark rectangular areas of interest, which can be labeled as small bubbles, green knots, dark nebulae, star clusters, galaxies, fuzzy red objects, or “other.” Users can add as many annotations as they wish before submitting the image, at which point they are given another image to annotate.

    Description of the raw data In this repository, each file corresponds to one project, and each line in a file is one classification record. Each line contains three fields separated by commas (","). The first is the classification id, which uniquely identifies the classification in the data set. The second is the volunteer id, which uniquely identifies, within the data set, the volunteer who carried out the classification. The third is the date and time at which the classification was carried out.
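A file in this layout can be read with a few lines of Python. The timestamp format below is an assumption; adjust the parser to match the actual files:

```python
import csv
from datetime import datetime

def read_classifications(path):
    """Read one project file: each line holds a classification id,
    a volunteer id, and a timestamp, comma-separated.
    The ISO-like timestamp format is an assumption for illustration."""
    records = []
    with open(path, newline="") as f:
        for classification_id, volunteer_id, when in csv.reader(f):
            records.append((classification_id, volunteer_id,
                            datetime.fromisoformat(when)))
    return records
```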

    The data set from the Galaxy Zoo project consists of records of 9,667,586 tasks executed by 86,413 volunteers over 840 days, starting on April 17th, 2010. The data set from the Milky Way Project consists of records from 643,408 tasks executed by 23,889 volunteers over 670 days, starting on December 3rd, 2010.

    These datasets were provided by Arfon Smith and Robert Simpson, from the Zooniverse platform, in October 2012. To understand how volunteers make their contributions in these citizen science projects, Ponciano, Brasileiro, Simpson and Smith (2014) analyzed both data sets from a volunteer engagement perspective.

    Metrics derived from the data set Ponciano, Brasileiro, Simpson and Smith (2014) proposed and computed the following metrics on the data set: Frequency, or the number of days in which the volunteer was actively executing tasks in the project. Daily productivity, or the average number of tasks the volunteer executed per day in which he or she was active. Typical session duration, or the short, continuous period of time the volunteer devoted to executing tasks on the project. A session begins when a volunteer starts a task execution, but it may end for a variety of reasons, such as the volunteer reaching the time he or she wanted to devote to the project, or getting tired or bored with the task. The typical session duration is the median of the duration of all the volunteer’s contribution sessions. Devoted time, or the total time the volunteer has spent executing tasks on the project, calculated as the sum of the duration of all the volunteer’s contribution sessions.
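The four metrics above can be sketched from a volunteer's sorted task timestamps. The 30-minute session-break threshold below is an assumption for illustration, not a value taken from the papers:

```python
from datetime import timedelta

def engagement_metrics(timestamps, gap=timedelta(minutes=30)):
    """Compute the four per-volunteer metrics described above.
    The session-break gap threshold is an illustrative assumption."""
    ts = sorted(timestamps)
    active_days = {t.date() for t in ts}
    frequency = len(active_days)              # days with activity
    daily_productivity = len(ts) / frequency  # tasks per active day
    # A session ends whenever the gap between two tasks exceeds `gap`.
    sessions, start = [], ts[0]
    for prev, cur in zip(ts, ts[1:]):
        if cur - prev > gap:
            sessions.append(prev - start)
            start = cur
    sessions.append(ts[-1] - start)
    durations = sorted(sessions)
    typical_session = durations[len(durations) // 2]  # median duration
    devoted_time = sum(sessions, timedelta())         # total time on task
    return frequency, daily_productivity, typical_session, devoted_time
```

A single isolated task yields a zero-length session here; a real analysis would need a convention for the duration of the last task in a session.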

    They fit statistical probability distributions (Zipf and log-normal) to the volunteer engagement characteristics. The results reported in the study reveal many characteristics of the distributions of volunteer participation in the projects. For example, they show that the majority of the volunteers perform tasks on just one day and do not come back, but those who do come back contribute the larger share of the tasks executed. For more information about the methods and results of the first study that analyzed the data set, please see Ponciano, Brasileiro, Simpson and Smith (2014).

    In a subsequent study, Ponciano and Brasileiro (2014) deepened the analysis with a new framework for studying volunteer engagement. In this study, new metrics and a clustering approach were used to identify groups of volunteers who exhibit a similar engagement profile. The new metrics measure the engagement of participants who show an ongoing contribution and have contributed on at least two different days, so the focus is on participants who are more likely to fit the definition of voluntarism. In this perspective, they formalized the following metrics: Activity ratio, Daily devoted time, Relative activity duration, and Variation in periodicity. Their results show that the volunteers in such projects can be grouped into five distinct engagement profiles, labeled as follows: hardworking, spasmodic, persistent, lasting, and moderate. For more information about the method and results on the engagement profiles, see Ponciano and Brasileiro (2014).

    Reporting the use of the data set The data sets stored in this repository are freely available under the Creative Commons Attribution licence. If you use the data set, please include in your work a citation of the previous studies, Ponciano, Brasileiro, Simpson and Smith (2014) and Ponciano and Brasileiro (2014), which were the first to characterize the data from Galaxy Zoo and the Milky Way Project from a volunteer engagement perspective. You may also inform the Zooniverse platform that you have used data from Galaxy Zoo and the Milky Way Project; to do so, use the form indicated by the platform on the publication page.

    References Lesandro Ponciano, Francisco Brasileiro, Robert Simpson and Arfon Smith. "Volunteers' Engagement in Human Computation Astronomy Projects". Computing in Science and Engineering vol. 16, no. 6, pp. 52-59 (2014) DOI: 10.1109/MCSE.2014.4

    Lesandro Ponciano and Francisco Brasileiro. "Finding Volunteers' Engagement Profiles in Human Computation for Citizen Science Projects". Human Computation vol. 1, no. 2, pp. 245-264 (2014). DOI: 10.15346/hc.v1i2.12

  10. 600K+ Household Object Images | AI Training Data | Object Detection Data |...

    • datarade.ai
    Updated Aug 1, 2024
    Data Seeds (2024). 600K+ Household Object Images | AI Training Data | Object Detection Data | Annotated imagery data | Global Coverage [Dataset]. https://datarade.ai/data-products/500k-household-object-images-ai-training-data-object-det-data-seeds
    Explore at:
    .bin, .json, .xml, .csv, .xls, .sql, .txt (available download formats)
    Dataset updated
    Aug 1, 2024
    Dataset authored and provided by
    Data Seeds
    Area covered
    Ecuador, United Republic of, Kiribati, Brunei Darussalam, Serbia, Austria, New Caledonia, Ukraine, Congo, Saint Kitts and Nevis
    Description

    This dataset features over 600,000 high-quality images of household objects sourced from photographers worldwide. Designed to support AI and machine learning applications, it offers an extensively annotated and highly diverse collection of everyday indoor items across cultural and functional contexts.

    Key Features: 1. Comprehensive Metadata: the dataset includes full EXIF data such as aperture, ISO, shutter speed, and focal length. Each image is annotated with object labels, room context, material types, and functional categories—ideal for training models in object detection, classification, and scene understanding. Popularity metrics based on platform engagement are also included.

    2. Unique Sourcing Capabilities: images are gathered through a proprietary gamified platform featuring competitions focused on home environments and still life. This ensures a rich flow of authentic, high-quality submissions. Custom datasets can be created on-demand within 72 hours, targeting specific object categories, use-cases (e.g., kitchenware, electronics, decor), or room types.

    3. Global Diversity: contributions from over 100 countries showcase household items from a wide range of cultures, economic settings, and design aesthetics. The dataset includes everything from modern appliances and utensils to traditional tools and furnishings, captured in kitchens, bedrooms, bathrooms, living rooms, and utility spaces.

    4. High-Quality Imagery: includes images from standard to ultra-high-definition, covering both staged product-like photos and natural usage contexts. This variety supports robust training for real-world applications in cluttered or dynamic environments.

    5. Popularity Scores: each image has a popularity score based on its performance in GuruShots competitions. These scores provide valuable input for training models focused on product appeal, consumer trend detection, or aesthetic evaluation.

    6. AI-Ready Design: optimized for use in smart home applications, inventory systems, assistive technologies, and robotics. Fully compatible with major machine learning frameworks and annotation workflows.

    7. Licensing & Compliance: all data is compliant with global privacy and content use regulations, with transparent licensing for both commercial and academic applications.

    Use Cases: 1. Training AI for home inventory and recognition in smart devices and AR tools. 2. Powering assistive technologies for accessibility and elder care. 3. Enhancing e-commerce recommendation and visual search systems. 4. Supporting robotics for home navigation, object grasping, and task automation.

    This dataset provides a comprehensive, high-quality resource for training AI across smart living, retail, and assistive domains. Custom requests are welcome. Contact us to learn more!

  11. Covid19_ChineseSocialMedia_Hotspots

    • kaggle.com
    Updated Apr 21, 2020
    Hirsch (2020). Covid19_ChineseSocialMedia_Hotspots [Dataset]. https://www.kaggle.com/hirschsun/covid19-chinesesocialmedia-hotspots/metadata
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 21, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Hirsch
    Description

    Context

    This dataset summarizes the social media hotspots and what people in mainland China focused on from the beginning of 2020 to April 8th (the day Wuhan reopened), as well as the epidemic development trend during this period. The dataset contains four .csv files and covers the major social media platforms in the mainland: Sina Weibo, TikTok (Douyin), Toutiao and Douban.

    Sina Weibo

    a platform based on fostering user relationships to share, disseminate and receive information. Through either the website or the mobile app, users can upload pictures and videos publicly for instant sharing, with other users being able to comment with text, pictures and videos, or use a multimedia instant messaging service. The company initially invited a large number of celebrities to join the platform, and has since invited many media personalities, government departments, businesses and non-governmental organizations to open accounts as well for the purpose of publishing and communicating information. To avoid the impersonation of celebrities, Sina Weibo uses verification symbols: celebrity accounts have an orange letter "V" and organizations' accounts have a blue letter "V". Sina Weibo has more than 500 million registered users; out of these, 313 million are monthly active users, 85% use the Weibo mobile app, 70% are college-aged, 50.10% are male and 49.90% are female. There are over 100 million messages posted by users each day. With 90 million followers, actress Xie Na holds the record for the most followers on the platform. Despite fierce competition among Chinese social media platforms, Sina Weibo has proven to be the most popular; part of this success may be attributable to the wider use of mobile technologies in China. [https://en.wikipedia.org/wiki/Sina_Weibo]

    Douyin

    Douyin (known internationally as TikTok) is a short-video social application for mobile phones. Users can record 15-second clips, easily add lip-sync audio and built-in special effects, and leave comments on videos. Launched by Toutiao (ByteDance) in September 2016, it is positioned as a short music-video community for young Chinese users. The application centers on vertical music UGC short videos, and its user base has grown rapidly since 2017. In June 2018, Douyin reached 500 million monthly active users worldwide and 150 million daily active users in China. [https://zh.wikipedia.org/wiki/%E6%8A%96%E9%9F%B3]

    Toutiao

    Toutiao or Jinri Toutiao is a Chinese news and information content platform, a core product of the Beijing-based company ByteDance. By analyzing the features of content, users and users’ interaction with content, the company's algorithm models generate a tailored feed list of content for each user. Toutiao is one of China's largest mobile platforms of content creation, aggregation and distribution underpinned by machine learning techniques, with 120 million daily active users as of September 2017. [https://en.wikipedia.org/wiki/Toutiao]

    Douban

    Douban.com (Chinese: 豆瓣; pinyin: Dòubàn), launched on March 6, 2005, is a Chinese social networking service website that allows registered users to record information and create content related to film, books, music, recent events, and activities in Chinese cities. It can be seen as one of the most influential Web 2.0 websites in China. Douban also owns an internet radio station, which ranked No. 1 in the iOS App Store in 2012. Douban was formerly open to both registered and unregistered users. For registered users, the site recommends potentially interesting books, movies, and music, in addition to serving as a social networking site (akin to WeChat or Weibo) and a record keeper; for unregistered users, the site is a place to find ratings and reviews of media. Douban had about 200 million registered users as of 2013. The site serves pan-Chinese users, and its contents are in Chinese. It covers works and media in Chinese and in foreign languages. Some Chinese authors and critics register their official personal pages on the site. [https://en.wikipedia.org/wiki/Douban]

    Content

    The Weibo real-time hot search list can be regarded as a gathering place for celebrity gossip, social life and major news. In this document, I collected the top 50 topics of the hot search list every 12 hours, so there are 100 hot topics for each day. These topics were translated into English with Google Translate, although the translation quality is not ideal due to sentence segmentation and missing language context. I created a new column ['Coron-Related ( 1 yes, 0 not )'] to mark topics related to the novel coronavirus: a relevant topic is marked 1, and an irrelevant one is left empty or marked 0. Because the Google translation can be quite inaccurate, searching the original Chinese title is the best way to confirm a topic's meaning.
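Filtering on the flag column described above takes only a few lines. The exact column spelling (including the spaces) should be checked against the actual file, and the 'topic' column name is an assumption:

```python
import csv

# Column name as described above; verify the exact spelling, spaces
# included, against the actual file header.
FLAG = "Coron-Related ( 1 yes, 0 not )"

def covid_topics(path):
    """Return hot-search rows flagged as coronavirus-related (1);
    a blank or 0 value means unrelated."""
    with open(path, newline="", encoding="utf-8") as f:
        return [row for row in csv.DictReader(f)
                if row.get(FLAG, "").strip() == "1"]
```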

  12. Next Generation Search Engines Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Jun 30, 2025
    Growth Market Reports (2025). Next Generation Search Engines Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/next-generation-search-engines-market-global-industry-analysis
    Explore at:
    pdf, csv, pptx (available download formats)
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Next Generation Search Engines Market Outlook




    According to our latest research, the global Next Generation Search Engines market size reached USD 16.2 billion in 2024, with robust year-on-year growth driven by rapid technological advancements and escalating demand for intelligent search solutions across industries. The market is expected to witness a CAGR of 18.7% during the forecast period from 2025 to 2033, propelling it to a projected value of USD 82.3 billion by 2033. The accelerating adoption of artificial intelligence (AI), machine learning (ML), and natural language processing (NLP) within search technologies is a key growth factor, as organizations seek more accurate, context-aware, and personalized information retrieval solutions.
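A quick consistency check on these figures: the compound annual growth rate implied by the two endpoint values can be computed directly.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoint values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# The report's endpoints: USD 16.2 billion (2024) to USD 82.3 billion (2033).
rate = implied_cagr(16.2, 82.3, 9)
```

The implied rate works out to roughly 19.8 percent, a bit above the stated 18.7 percent CAGR; gaps like this usually reflect rounded endpoint values or a different base year in the report's own model.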




    One of the most significant growth drivers for the Next Generation Search Engines market is the exponential increase in digital content and data generation worldwide. Enterprises and consumers alike are producing vast amounts of unstructured data daily, from documents and emails to social media posts and multimedia files. Traditional search engines often struggle to deliver relevant results from such complex datasets. Next generation search engines, powered by AI and ML algorithms, are uniquely positioned to address this challenge by providing semantic understanding, contextual relevance, and intent-driven results. This capability is especially critical for industries like healthcare, BFSI, and e-commerce, where timely and precise information retrieval can directly impact decision-making, operational efficiency, and customer satisfaction.




    Another major factor fueling the growth of the Next Generation Search Engines market is the proliferation of mobile devices and the evolution of user interaction paradigms. As consumers increasingly rely on smartphones, tablets, and voice assistants, there is a growing demand for search solutions that support voice and visual queries, in addition to traditional text-based searches. Technologies such as voice search and visual search are gaining traction, enabling users to interact with search engines more naturally and intuitively. This shift is prompting enterprises to invest in advanced search platforms that can seamlessly integrate with diverse devices and channels, enhancing user engagement and accessibility. The integration of NLP further empowers these platforms to understand complex queries, colloquial language, and regional dialects, making search experiences more inclusive and effective.




    Furthermore, the rise of enterprise digital transformation initiatives is accelerating the adoption of next generation search technologies across various sectors. Organizations are increasingly seeking to unlock the value of their internal data assets by deploying enterprise search solutions that can index, analyze, and retrieve information from multiple sources, including databases, intranets, cloud storage, and third-party applications. These advanced search engines not only improve knowledge management and collaboration but also support compliance, security, and data governance requirements. As businesses continue to embrace hybrid and remote work models, the need for efficient, secure, and scalable search capabilities becomes even more pronounced, driving sustained investment in this market.




    Regionally, North America currently dominates the Next Generation Search Engines market, owing to the early adoption of AI-driven technologies, strong presence of leading technology vendors, and high digital literacy rates. However, Asia Pacific is emerging as the fastest-growing region, fueled by rapid digitalization, expanding internet penetration, and increasing investments in AI research and development. Europe is also witnessing steady growth, supported by robust regulatory frameworks and growing demand for advanced search solutions in sectors such as BFSI, healthcare, and education. Latin America and the Middle East & Africa are gradually catching up, as enterprises in these regions recognize the value of next generation search engines in enhancing operational efficiency and customer experience.




  13. Harvard CGA Streaming Billion Geotweet Dataset

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    CGA, Harvard (2023). Harvard CGA Streaming Billion Geotweet Dataset [Dataset]. http://doi.org/10.7910/DVN/3FDVCA
    Explore at:
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    CGA, Harvard
    Description

    Funded by a grant from the Sloan Foundation, and with support from Massachusetts Open Cloud, the Center for Geographic Analysis (CGA) at Harvard developed a remotely hosted, real-time-updated “big geodata” set, a prototype for a new data type hosted outside Dataverse that supports streaming updates and is accessed via an API. The CGA developed 1) the software and hardware platform to support interactive exploration of a billion spatio-temporal objects, nicknamed the "BOP" (billion object platform); 2) an API to provide query access to the archive from Dataverse; 3) client-side tools for querying/visualizing the contents of the archive and extracting data subsets. This project is no longer active. For more information please see: http://gis.harvard.edu/services/project-consultation/project-resume/billion-object-platform-bop. “Geotweets” are tweets containing a GPS coordinate from the originating device. Currently 1-2% of tweets are geotweets, about 8 million per day. The CGA has been harvesting geotweets since 2012.

  14. cta platform

    • data.cityofchicago.org
    Updated Jul 11, 2025
    Chicago Police Department (2025). cta platform [Dataset]. https://data.cityofchicago.org/Public-Safety/cta-platform/3sa5-ne2h
    Explore at:
    csv, application/rssxml, kml, tsv, application/rdfxml, xml, application/geo+json, kmz (available download formats)
    Dataset updated
    Jul 11, 2025
    Authors
    Chicago Police Department
    Description

    This dataset reflects reported incidents of crime (with the exception of murders where data exists for each victim) that occurred in the City of Chicago from 2001 to present, minus the most recent seven days. Data is extracted from the Chicago Police Department's CLEAR (Citizen Law Enforcement Analysis and Reporting) system. In order to protect the privacy of crime victims, addresses are shown at the block level only and specific locations are not identified. Should you have questions about this dataset, you may contact the Research & Development Division of the Chicago Police Department at 312.745.6071 or RandD@chicagopolice.org. Disclaimer: These crimes may be based upon preliminary information supplied to the Police Department by the reporting parties that have not been verified. The preliminary crime classifications may be changed at a later date based upon additional investigation and there is always the possibility of mechanical or human error. Therefore, the Chicago Police Department does not guarantee (either expressed or implied) the accuracy, completeness, timeliness, or correct sequencing of the information and the information should not be used for comparison purposes over time. The Chicago Police Department will not be responsible for any error or omission, or for the use of, or the results obtained from the use of this information. All data visualizations on maps should be considered approximate and attempts to derive specific addresses are strictly prohibited. The Chicago Police Department is not responsible for the content of any off-site pages that are referenced by or that reference this web page other than an official City of Chicago or Chicago Police Department web page. The user specifically acknowledges that the Chicago Police Department is not responsible for any defamatory, offensive, misleading, or illegal conduct of other users, links, or third parties and that the risk of injury from the foregoing rests entirely with the user. 
The unauthorized use of the words "Chicago Police Department," "Chicago Police," or any colorable imitation of these words or the unauthorized use of the Chicago Police Department logo is unlawful. This web page does not, in any way, authorize such use. Data is updated daily Tuesday through Sunday. The dataset contains more than 65,000 records/rows of data and cannot be viewed in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Wordpad, to view and search. To access a list of Chicago Police Department - Illinois Uniform Crime Reporting (IUCR) codes, go to http://bit.ly/rk5Tpc.
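Rather than fighting the spreadsheet row limit described above, the data can be fetched programmatically. The resource id below comes from the dataset URL above, and the query parameters are Socrata's standard SODA export interface ($limit, $where); treat the specifics as a sketch to verify against the portal's documentation:

```python
from urllib.parse import urlencode

# Resource id taken from the dataset URL above; $limit and $where are
# Socrata's standard SODA query parameters.
BASE = "https://data.cityofchicago.org/resource/3sa5-ne2h.csv"

def export_url(limit=100000, where=None):
    """Build a CSV export URL, sidestepping the spreadsheet row limit
    by fetching the data programmatically instead."""
    params = {"$limit": limit}
    if where:
        params["$where"] = where
    return BASE + "?" + urlencode(params)
```

The resulting URL can then be passed to any HTTP client or CSV reader.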

  15. 1M+ Footwear Images | AI Training Data | Object Detection Data | Annotated...

    • datarade.ai
    Updated Mar 26, 2020
    Data Seeds (2020). 1M+ Footwear Images | AI Training Data | Object Detection Data | Annotated imagery data | Global Coverage [Dataset]. https://datarade.ai/data-products/650k-footwear-images-ai-training-data-object-detection-d-data-seeds
    Explore at:
    .bin, .json, .xml, .csv, .xls, .sql, .txt (available download formats)
    Dataset updated
    Mar 26, 2020
    Dataset authored and provided by
    Data Seeds
    Area covered
    Cyprus, Jamaica, Sweden, Azerbaijan, Åland Islands, Botswana, Bhutan, Egypt, Iceland, Ecuador
    Description

    This dataset features over 1,000,000 high-quality images of footwear sourced from photographers, fashion creators, and enthusiasts worldwide. Designed to meet the demands of AI and machine learning applications, it provides a richly annotated, diverse, and scalable dataset covering a wide range of shoe types, styles, and use contexts.

    Key Features: 1. Comprehensive Metadata: each image includes full EXIF data and detailed annotations for shoe category (e.g., sneakers, boots, heels, sandals), brand visibility, usage context (e.g., worn, shelf, outdoor), and visual orientation (top view, side view, close-up). Ideal for training in classification, detection, segmentation, and style matching.

    2. Unique Sourcing Capabilities: images are sourced via a proprietary gamified photography platform, with fashion- and product-focused competitions generating high-quality, stylistically rich content. Custom datasets can be delivered within 72 hours for specific categories, demographics (e.g., men's, women's, kids'), or settings (studio, streetwear, retail).

    3. Global Diversity: contributors from over 100 countries provide visual access to a wide variety of footwear traditions, fashion trends, climates, and consumer segments. This ensures inclusivity across seasons, cultures, and economic tiers—from designer pieces to everyday wear.

    4. High-Quality Imagery: images range from standard to ultra-HD, captured in diverse lighting and backgrounds. Both professional studio shots and in-use lifestyle photography are included, supporting robust AI training in realistic and commercial scenarios.

    5. Popularity Scores: each image includes a popularity score based on performance in GuruShots competitions, offering valuable input for models analyzing visual appeal, trend prediction, or consumer preference.

    6. AI-Ready Design: formatted for seamless use in machine learning workflows, including fashion recognition, virtual try-on systems, inventory management, and visual search. Integrates easily with retail and recommendation platforms.

    7. Licensing & Compliance: fully compliant with commercial use and intellectual property standards. Licensing is transparent and flexible for fashion tech, retail AI, and academic use cases.

    Use Cases: 1. Training AI for footwear classification, tagging, and visual search in e-commerce. 2. Powering virtual try-on applications and personalized recommendation engines. 3. Supporting trend analysis and fashion forecasting tools. 4. Enhancing inventory intelligence, style comparison, and social commerce platforms.

    This dataset offers a powerful, high-resolution foundation for AI innovation across the footwear, fashion, and retail technology sectors. Custom filtering, formats, and metadata enrichment available. Contact us to learn more!

  16. Instagram: most popular posts as of 2024

    • statista.com
    • davegsmith.com
    Updated Jun 17, 2025
    Stacy Jo Dixon (2025). Instagram: most popular posts as of 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Instagram’s most popular post

                  As of April 2024, the most popular post on Instagram was Lionel Messi and his teammates celebrating the 2022 FIFA World Cup win with Argentina, posted by the account @leomessi. Messi's post, which racked up over 61 million likes within a day, knocked off the previous record holder, 'Photo of an Egg'. Originally posted in January 2019, 'Photo of an Egg' surpassed the world’s most popular Instagram post at that time, a photo of Kylie Jenner’s daughter, which totaled 18 million likes.
                  After several cryptic posts published by the account, World Record Egg revealed itself to be a part of a mental health campaign aimed at the pressures of social media use.
    
                  Instagram’s most popular accounts
    
                  As of April 2024, the official Instagram account @instagram had the most followers of any account on the platform, with 672 million followers. Portuguese footballer Cristiano Ronaldo (@cristiano) was the most followed individual with 628 million followers, while Selena Gomez (@selenagomez) was the most followed woman on the platform with 429 million. Additionally, Inter Miami CF striker Lionel Messi (@leomessi) had a total of 502 million followers. Celebrities such as The Rock, Kylie Jenner, and Ariana Grande all had over 380 million followers each.
    
                  Instagram influencers
    
                  In the United States, the leading content category of Instagram influencers was lifestyle, with 15.25 percent of influencers creating lifestyle content in 2021. Music ranked in second place with 10.96 percent, followed by family with 8.24 percent. Having a large audience can be very lucrative: Instagram influencers in the United States, Canada and the United Kingdom with over 90,000 followers made around 1,221 US dollars per post.
    
                  Instagram around the globe
    
                  Instagram’s worldwide popularity continues to grow, and India is the leading country in terms of number of users, with over 362.9 million users as of January 2024. The United States had 169.65 million Instagram users and Brazil had 134.6 million users. The social media platform was also very popular in Indonesia and Turkey, with 100.9 million and 57.1 million users, respectively. As of January 2024, Instagram was the fourth most popular social network in the world, behind Facebook, YouTube and WhatsApp.
    
  17. GHRSST Level 3P Global Subskin Sea Surface Temperature from the Advanced...

    • catalog.data.gov
    • data.cnra.ca.gov
    • +6more
    Updated Jul 1, 2025
    + more versions
    Cite
    (Point of Contact) (2025). GHRSST Level 3P Global Subskin Sea Surface Temperature from the Advanced Very High Resolution Radiometer (AVHRR) on the MetOp-A satellite (GDS version 1) [Dataset]. https://catalog.data.gov/dataset/ghrsst-level-3p-global-subskin-sea-surface-temperature-from-the-advanced-very-high-resolution-r
    Explore at:
    Dataset updated
    Jul 1, 2025
    Dataset provided by
    (Point of Contact)
    Description

    A global Level 3 Group for High Resolution Sea Surface Temperature (GHRSST) dataset from the Advanced Very High Resolution Radiometer (AVHRR) on the MetOp-A platform (launched on 19 Oct 2006). This particular dataset is produced by the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) Ocean and Sea Ice Satellite Application Facility (OSI SAF) in France.

    The AVHRR is a space-borne scanning sensor on the National Oceanic and Atmospheric Administration (NOAA) family of Polar Orbiting Environmental Satellites (POES), having an operational legacy that traces back to the Television Infrared Observation Satellite-N (TIROS-N) launched in 1978. AVHRR instruments measure the radiance of the Earth in 5 (or 6) relatively wide spectral bands. The first two are centered around the red (0.6 micrometer) and near-infrared (0.9 micrometer) regions, the third one is located around 3.5 micrometers, and the last two sample the emitted thermal radiation, around 11 and 12 micrometers, respectively. The legacy 5-band instrument is known as AVHRR/2, while the more recent version, the AVHRR/3 (first carried on the NOAA-15 platform), acquires data in a 6th channel located at 1.6 micrometers. Typically the 11 and 12 micron channels are used to derive sea surface temperature (SST), sometimes in combination with the 3.5 micron channel. The highest ground resolution that can be obtained from the current AVHRR instruments is 1.1 km at nadir.

    The MetOp-A platform is sun-synchronous, generally viewing the same earth location twice a day (latitude dependent) due to the relatively large AVHRR swath of approximately 2400 km. The SST fields are derived from 1 km AVHRR data that are re-mapped onto a 0.02 degree equal angle grid. In the processing chain, global AVHRR level 1b data are acquired at Centre de Meteorologie Spatiale (CMS) through the EUMETSAT/EUMETCAST system. A cloud mask is applied and SST is retrieved from the AVHRR infrared (IR) channels by using a multispectral technique. The MetOp-A SST L3P data are compliant with the Group for High Resolution SST (GHRSST) Data Specification (GDS) version 1.7.
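    The multispectral retrieval from the 11 and 12 micron channels is commonly a "split-window" formula, in which the difference between the two brightness temperatures corrects for atmospheric water-vapor absorption. A minimal sketch follows; the coefficients are hypothetical placeholders, not the operational OSI SAF values, which are derived by regression against in-situ buoy matchups:

    ```python
    def split_window_sst(t11, t12, a=1.0, b=2.4, c=0.5):
        """Estimate SST (kelvin) from 11 and 12 micron brightness temperatures.

        Classic split-window form: the (t11 - t12) term corrects for
        water-vapor absorption. Coefficients a, b, c are hypothetical
        illustrations, not the values used in the actual product.
        """
        return a * t11 + b * (t11 - t12) + c

    # Example brightness temperatures (kelvin) from the two thermal channels
    sst_k = split_window_sst(290.0, 288.5)
    sst_c = sst_k - 273.15  # convert to degrees Celsius
    ```

    The same form, with channel- and view-angle-dependent coefficients, underlies most operational AVHRR SST retrievals.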

  18. Repository Analytics and Metrics Portal (RAMP) 2020 data

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Jul 23, 2021
    Cite
    Jonathan Wheeler; Kenning Arlitsch (2021). Repository Analytics and Metrics Portal (RAMP) 2020 data [Dataset]. http://doi.org/10.5061/dryad.dv41ns1z4
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 23, 2021
    Dataset provided by
    University of New Mexico
    Montana State University
    Authors
    Jonathan Wheeler; Kenning Arlitsch
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Version update: The originally uploaded versions of the CSV files in this dataset included an extra column, "Unnamed: 0," which is not RAMP data and was an artifact of the process used to export the data to CSV format. This column has been removed from the revised dataset. The data are otherwise the same as in the first version.

    The Repository Analytics and Metrics Portal (RAMP) is a web service that aggregates use and performance data of institutional repositories. The data are a subset of data from RAMP, the Repository Analytics and Metrics Portal (http://rampanalytics.org), consisting of data from all participating repositories for the calendar year 2020. For a description of the data collection, processing, and output methods, please see the "methods" section below.

    Methods Data Collection

    RAMP data are downloaded for participating IR from Google Search Console (GSC) via the Search Console API. The data consist of aggregated information about IR pages which appeared in search result pages (SERP) within Google properties (including web search and Google Scholar).

    Data are downloaded in two sets per participating IR. The first set includes page level statistics about URLs pointing to IR pages and content files. The following fields are downloaded for each URL, with one row per URL:

    url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    

    Following the data processing described below, on ingest into RAMP an additional field, citableContent, is added to the page level data.

    The second set includes similar information, but instead of being aggregated at the page level, the data are grouped based on the country from which the user submitted the corresponding search, and the type of device used. The following fields are downloaded for each combination of country and device, with one row per country/device combination:

    country: The country from which the corresponding search originated.
    device: The device used for the search.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    

    Note that no personally identifiable information is downloaded by RAMP. Google does not make such information available.

    More information about click-through rates, impressions, and position is available from Google's Search Console API documentation: https://developers.google.com/webmaster-tools/search-console-api-original/v3/searchanalytics/query and https://support.google.com/webmasters/answer/7042828?hl=en

    Data Processing

    Upon download from GSC, the page level data described above are processed to identify URLs that point to citable content. Citable content is defined within RAMP as any URL which points to any type of non-HTML content file (PDF, CSV, etc.). As part of the daily download of page level statistics from Google Search Console (GSC), URLs are analyzed to determine whether they point to HTML pages or actual content files. URLs that point to content files are flagged as "citable content." In addition to the fields downloaded from GSC described above, following this brief analysis one more field, citableContent, is added to the page level data which records whether each page/URL in the GSC data points to citable content. Possible values for the citableContent field are "Yes" and "No."
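    The flagging step described above can be sketched as an extension check on the URL path. RAMP's actual rule is not published beyond "non-HTML content file", so the extension list and helper below are illustrative assumptions:

    ```python
    from urllib.parse import urlparse

    # Extensions treated as citable content files; an illustrative subset,
    # not RAMP's actual (unpublished) list.
    CONTENT_EXTENSIONS = {".pdf", ".csv", ".xlsx", ".zip", ".docx", ".txt"}

    def citable_content(url):
        """Return 'Yes' if the URL path ends in a content-file extension,
        'No' for HTML wrapper pages, mirroring the field's Yes/No values."""
        path = urlparse(url).path.lower()
        return "Yes" if any(path.endswith(ext) for ext in CONTENT_EXTENSIONS) else "No"

    # Hypothetical repository URLs: an HTML landing page and a content file
    flags = [citable_content(u) for u in [
        "https://repo.example.edu/handle/123",
        "https://repo.example.edu/bitstream/thesis.pdf",
    ]]
    # flags == ["No", "Yes"]
    ```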

    The data aggregated by the search country of origin and device type do not include URLs. No additional processing is done on these data. Harvested data are passed directly into Elasticsearch.

    Processed data are then saved in a series of Elasticsearch indices. Currently, RAMP stores data in two indices per participating IR. One index includes the page level data, the second index includes the country of origin and device type data.

    About Citable Content Downloads

    Data visualizations and aggregations in RAMP dashboards present information about citable content downloads, or CCD. As a measure of use of institutional repository content, CCD represent click activity on IR content that may correspond to research use.

    CCD information is summary data calculated on the fly within the RAMP web application. As noted above, data provided by GSC include whether and how many times a URL was clicked by users. Within RAMP, a "click" is counted as a potential download, so a CCD is calculated as the sum of clicks on pages/URLs that are determined to point to citable content (as defined above).

    For any specified date range, the steps to calculate CCD are:

    Filter data to only include rows where "citableContent" is set to "Yes."
    Sum the value of the "clicks" field on these rows.
    
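    The two steps above amount to a filter-and-sum over the page-level rows. A minimal sketch, using hypothetical rows shaped like the page-clicks data described in this document:

    ```python
    # Each row mirrors a subset of the page-clicks fields described above.
    rows = [
        {"url": "https://repo.example.edu/item/1",      "clicks": 12, "citableContent": "Yes"},
        {"url": "https://repo.example.edu/item/2",      "clicks": 7,  "citableContent": "No"},
        {"url": "https://repo.example.edu/files/a.pdf", "clicks": 5,  "citableContent": "Yes"},
    ]

    # CCD for the date range covered by `rows`: keep citable-content rows,
    # then sum their clicks.
    ccd = sum(r["clicks"] for r in rows if r["citableContent"] == "Yes")
    # ccd == 17
    ```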

    Output to CSV

    Published RAMP data are exported from the production Elasticsearch instance and converted to CSV format. The CSV data consist of one "row" for each page or URL from a specific IR which appeared in search result pages (SERP) within Google properties as described above. Also as noted above, daily data are downloaded for each IR in two sets which cannot be combined. One dataset includes the URLs of items that appear in SERP. The second dataset is aggregated by combination of the country from which a search was conducted and the device used.

    As a result, two CSV datasets are provided for each month of published data:

    page-clicks:

    The data in these CSV files correspond to the page-level data, and include the following fields:

    url: This is returned as a 'page' by the GSC API, and is the URL of the page which was included in an SERP for a Google property.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    citableContent: Whether or not the URL points to a content file (ending with pdf, csv, etc.) rather than HTML wrapper pages. Possible values are Yes or No.
    index: The Elasticsearch index corresponding to page click data for a single IR.
    repository_id: This is a human readable alias for the index and identifies the participating repository corresponding to each row. As RAMP has undergone platform and version migrations over time, index names as defined for the previous field have not remained consistent. That is, a single participating repository may have multiple corresponding Elasticsearch index names over time. The repository_id is a canonical identifier that has been added to the data to provide an identifier that can be used to reference a single participating repository across all datasets. Filtering and aggregation for individual repositories or groups of repositories should be done using this field.
    

    Filenames for files containing these data end with “page-clicks”. For example, the file named 2020-01_RAMP_all_page-clicks.csv contains page level click data for all RAMP participating IR for the month of January, 2020.

    country-device-info:

    The data in these CSV files correspond to the data aggregated by country from which a search was conducted and the device used. These include the following fields:

    country: The country from which the corresponding search originated.
    device: The device used for the search.
    impressions: The number of times the URL appears within the SERP.
    clicks: The number of clicks on a URL which took users to a page outside of the SERP.
    clickThrough: Calculated as the number of clicks divided by the number of impressions.
    position: The position of the URL within the SERP.
    date: The date of the search.
    index: The Elasticsearch index corresponding to country and device access information data for a single IR.
    repository_id: This is a human readable alias for the index and identifies the participating repository corresponding to each row. As RAMP has undergone platform and version migrations over time, index names as defined for the previous field have not remained consistent. That is, a single participating repository may have multiple corresponding Elasticsearch index names over time. The repository_id is a canonical identifier that has been added to the data to provide an identifier that can be used to reference a single participating repository across all datasets. Filtering and aggregation for individual repositories or groups of repositories should be done using this field.
    

    Filenames for files containing these data end with “country-device-info”. For example, the file named 2020-01_RAMP_all_country-device-info.csv contains country and device data for all participating IR for the month of January, 2020.

    References

    Google, Inc. (2021). Search Console APIs. Retrieved from https://developers.google.com/webmaster-tools/search-console-api-original.

  19. Full Range Heat Anomalies - USA 2021

    • heat.gov
    • hub.arcgis.com
    Updated Jan 6, 2022
    + more versions
    Cite
    The Trust for Public Land (2022). Full Range Heat Anomalies - USA 2021 [Dataset]. https://www.heat.gov/datasets/ec2cc72c3de04c9aa9fd467f4e2cd378
    Explore at:
    Dataset updated
    Jan 6, 2022
    Dataset authored and provided by
    The Trust for Public Land
    Area covered
    Description

    Notice: this is not the latest Heat Island Anomalies image service. For 2023 data visit https://tpl.maps.arcgis.com/home/item.html?id=e89a556263e04cb9b0b4638253ca8d10.

    This layer contains the relative degrees Fahrenheit difference between any given pixel and the mean heat value for the city in which it is located, for every city in the contiguous United States. This 30-meter raster was derived from Landsat 8 imagery band 10 (ground-level thermal sensor) from the summer of 2021, with patching from summer of 2020 where necessary.

    Federal statistics over a 30-year period show extreme heat is the leading cause of weather-related deaths in the United States. Extreme heat exacerbated by urban heat islands can lead to increased respiratory difficulties, heat exhaustion, and heat stroke. These heat impacts significantly affect the most vulnerable: children, the elderly, and those with preexisting conditions.

    The purpose of this layer is to show where certain areas of cities are hotter or cooler than the average temperature for that same city as a whole. This dataset represents a snapshot in time. It will be updated yearly, but is static between updates. It does not take into account changes in heat during a single day, for example, from building shadows moving. The thermal readings detected by the Landsat 8 sensor are surface-level, whether that surface is the ground or the top of a building. Although there is strong correlation between surface temperature and air temperature, they are not the same. We believe that this is useful at the national level, and for cities that don't have the ability to conduct their own hyper-local temperature survey. Where local data is available, it may be more accurate than this dataset.

    Dataset Summary

    This dataset was developed using proprietary Python code developed at The Trust for Public Land, running on the Descartes Labs platform through the Descartes Labs API for Python. The Descartes Labs platform allows for extremely fast retrieval and processing of imagery, which makes it possible to produce heat island data for all cities in the United States in a relatively short amount of time.

    In order to click on the image service and see the raw pixel values in a map viewer, you must be signed in to ArcGIS Online, then Enable Pop-Ups and Configure Pop-Ups.

    Using the Urban Heat Island (UHI) Image Services

    The data is made available as an image service. There is a processing template applied that supplies the yellow-to-red or blue-to-red color ramp, but once this processing template is removed (you can do this in ArcGIS Pro or ArcGIS Desktop, or in QGIS), the actual data values come through the service and can be used directly in a geoprocessing tool (for example, to extract an area of interest). Following are instructions for doing this in Pro.

    In ArcGIS Pro, in a Map view, in the Catalog window, click on Portal. In the Portal window, click on the far-right icon representing Living Atlas. Search on the acronyms "tpl" and "uhi". The results returned will be the UHI image services. Right-click on a result and select "Add to current map" from the context menu. When the image service is added to the map, right-click on it in the map view, and select Properties. In the Properties window, select Processing Templates. On the drop-down menu at the top of the window, the default Processing Template is either a yellow-to-red ramp or a blue-to-red ramp. Click the drop-down, and select "None", then "OK". Now you will have the actual pixel values displayed in the map, and available to any geoprocessing tool that takes a raster as input. Below is a screenshot of ArcGIS Pro with a UHI image service loaded, color ramp removed, and symbology changed back to a yellow-to-red ramp (a classified renderer can also be used).

    Other Sources of Heat Island Information

    Please see these websites for valuable information on heat islands and to learn about exciting new heat island research being led by scientists across the country:

    EPA's Heat Island Resource Center
    Dr. Ladd Keith, University of Arizona
    Dr. Ben McMahan, University of Arizona
    Dr. Jeremy Hoffman, Science Museum of Virginia
    Dr. Hunter Jones, NOAA
    Daphne Lundi, Senior Policy Advisor, NYC Mayor's Office of Recovery and Resiliency

    Disclaimer/Feedback

    With nearly 14,000 cities represented, checking each city's heat island raster for quality assurance would be prohibitively time-consuming, so The Trust for Public Land checked a statistically significant sample size for data quality. The sample passed all quality checks, with about 98.5% of the output cities error-free, but there could be instances where the user finds errors in the data. These errors will most likely take the form of a line of discontinuity where there is no city boundary; this type of error is caused by large temperature differences in two adjacent Landsat scenes, so the discontinuity occurs along scene boundaries (see figure below). The Trust for Public Land would appreciate feedback on these errors so that version 2 of the national UHI dataset can be improved. Contact Dale.Watt@tpl.org with feedback.

  20. Data from: SNAPSHOT USA 2019-2023: The first five years of data from a...

    • data.niaid.nih.gov
    • dataone.org
    • +2more
    zip
    Updated Apr 10, 2025
    Cite
    Brigit Rooney; William McShea; Roland Kays; Michael Cove (2025). SNAPSHOT USA 2019-2023: The first five years of data from a coordinated camera trap survey of the United States [Dataset]. http://doi.org/10.5061/dryad.k0p2ngfhn
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    North Carolina State University
    North Carolina Museum of Natural Sciences
    Smithsonian Conservation Biology Institute
    Authors
    Brigit Rooney; William McShea; Roland Kays; Michael Cove
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    United States
    Description

    SNAPSHOT USA is an annual, multi-contributor camera trap survey of mammals across the United States. The growing SNAPSHOT USA dataset is intended for tracking the spatial and temporal responses of mammal populations to changes in land use, land cover, and climate. These data will be useful for exploring the drivers of spatial and temporal changes in relative abundance and distribution, as well as the impacts of species interactions on daily activity patterns. SNAPSHOT USA 2019–2023 contains 987,979 records of camera trap image sequence data and 9,694 records of camera trap deployment metadata. Data were collected across the United States of America in all 50 states, 12 ecoregions, and many ecosystems. Data were collected between August 1st and December 29th each year from 2019 to 2023. The dataset includes a wide range of taxa but is primarily focused on medium to large mammals. SNAPSHOT USA 2019–2023 comprises two .csv files. The original data can be found within the SNAPSHOT USA Initiative in the Wildlife Insights platform.

    Methods

    The first three annual SNAPSHOT USA surveys were coordinated by Roland Kays, Michael Cove, and William McShea. The 2019, 2020, and 2021 datasets are accessible for public use through the Supporting Information of their respective publications. Although the 2019 and 2020 surveys were originally processed and stored in eMammal (https://www.emammal.si.edu), all data are now housed in Wildlife Insights (WI) within the SNAPSHOT USA Initiative. The two most recent surveys, 2022 and 2023, were coordinated by the SNAPSHOT USA Survey Coordinator Brigit Rooney. This dataset represents the first publication of 2022 and 2023 SNAPSHOT USA data.

    The SNAPSHOT USA project developed a standard protocol in 2019 to survey mammals >100 g and large identifiable birds. Cameras are unbaited and set at approximately 50 cm height across an array of at least 7 cameras, with a minimum distance of 200 m and a maximum of 5 km between them. The collection period for SNAPSHOT USA data is between September and October, and the target minimum of camera trap-nights per array is 400. Some contributors to SNAPSHOT USA 2019–2023 started collecting data earlier or deployed cameras later based on locations or logistics, and we chose to include data from August 1st through December 29th each year in this dataset.

    The first two years of SNAPSHOT USA data incorporated an Expert Review Tool to verify the accuracy of every identification, as that was built into the eMammal repository. This tool required SNAPSHOT USA project managers (Cove and Kays in 2019, with more taxon-specific reviewers in 2020) to review and confirm all species identifications, in an effort to minimize identification errors. As eMammal automatically grouped all uploaded images into "sequences" of images taken within 60 seconds of each other, by using the image timestamps, species identifications were made for individual sequences rather than images. These data have since been transferred to WI, where they underwent opportunistic review and correction by the SNAPSHOT USA Survey Coordinator.

    In contrast, SNAPSHOT USA 2021, 2022, and 2023 were managed and identified entirely in WI. All SNAPSHOT USA projects in this repository were created as "Sequence" projects, to enable the identification of sequences in the same manner as eMammal. Each 60-second sequence of images was classified to the narrowest taxonomic level possible by three iterations of validation. First, WI's Artificial Intelligence algorithm suggested a taxonomic identification. This algorithm consists of a multiclass classification deep convolutional neural network model that uses pre-trained image embedding from Inception, a model used to identify objects. Second, each array's Principal Investigator was responsible for validating the data, fixing Artificial Intelligence identification mistakes, and approving the data they contributed to the survey. Lastly, the SNAPSHOT USA Survey Coordinator quality-checked the deployment data and as many identified sequences as possible. This was a multistep process that began with checking the sequence metadata for obvious timestamp errors by organizing them chronologically in Microsoft Excel, and the deployment metadata for location errors by mapping their coordinates and looking for outliers. Next, the coordinator checked the sequence metadata for unlikely identifications, including species detections in places outside their known range, and verified their accuracy by viewing the images in WI. Finally, identifications for the most common species were verified by using the "Species" filter on WI to look for mistakes, one species at a time.

    When combining the five years of SNAPSHOT USA data to create SNAPSHOT USA 2019–2023, several aspects of the data were standardized to ensure consistency across all years. These were camera array names, camera location names, and taxonomy classifications. To match protocol requirements, all camera locations less than 5 km apart were classified as one array. This resulted in combining several arrays that were originally recorded under different names and ensuring that arrays in the same place maintained the same name each year. The camera location names were standardized by ensuring that all locations with geographic coordinates that were the same to four decimal places, in Decimal Degrees notation, had the same name. However, the original coordinates were retained in the dataset. Finally, all species taxonomy classifications for the 2019 and 2020 datasets (identified in eMammal) were standardized to match those used by WI. As part of this process, all subspecies of mammals in the dataset were changed to species level (e.g., Florida black bear (Ursus americanus floridanus) became American black bear (Ursus americanus)). For mammal taxonomy classifications, WI uses a combination of the International Union for Conservation of Nature (IUCN) Red List of Threatened Species (2023; https://iucnredlist.org) and the American Society of Mammalogists Mammal Diversity Database (2024; https://www.mammaldiversity.org). For bird species, WI uses Birdlife International's taxonomy classifications (2024; https://datazone.birdlife.org/species/search). The WI taxonomy is continually updated in response to public user suggestions, and the taxonomy used in the SNAPSHOT USA 2019–2023 dataset reflects the WI taxonomy used in June 2024.
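    The 60-second sequence grouping described above can be sketched as a single pass over sorted timestamps. Whether an image joins a sequence based on its gap from the previous image (chaining, as below) or from the sequence's first image is an implementation detail not specified here, so this is an assumption:

    ```python
    from datetime import datetime, timedelta

    def group_sequences(timestamps, gap=timedelta(seconds=60)):
        """Group image timestamps into sequences: an image joins the current
        sequence if it falls within `gap` of the previous image, otherwise
        it starts a new sequence."""
        sequences = []
        for ts in sorted(timestamps):
            if sequences and ts - sequences[-1][-1] <= gap:
                sequences[-1].append(ts)
            else:
                sequences.append([ts])
        return sequences

    # Hypothetical camera trap timestamps
    stamps = [
        datetime(2022, 9, 15, 6, 0, 0),
        datetime(2022, 9, 15, 6, 0, 45),  # within 60 s of previous: same sequence
        datetime(2022, 9, 15, 6, 5, 0),   # gap > 60 s: new sequence
    ]
    seqs = group_sequences(stamps)
    # len(seqs) == 2
    ```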
