60 datasets found
  1. Market share of mobile operating systems worldwide 2009-2025, by quarter

    • statista.com
    Updated Jun 23, 2025
    Cite
    Statista (2025). Market share of mobile operating systems worldwide 2009-2025, by quarter [Dataset]. https://www.statista.com/statistics/272698/global-market-share-held-by-mobile-operating-systems-since-2009/
    Explore at:
    Dataset updated
    Jun 23, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    Android maintained its position as the leading mobile operating system worldwide in the first quarter of 2025 with a market share of about ***** percent. Android's closest rival, Apple's iOS, had a market share of approximately ***** percent during the same period.

    The leading mobile operating systems

    Both unveiled in 2007, Google’s Android and Apple’s iOS have evolved through incremental updates introducing new features and capabilities. The latest version of iOS, iOS 18, was released in September 2024, while the most recent Android iteration, Android 15, was made available in September 2023. A key difference between the two systems concerns hardware: iOS is only available on Apple devices, whereas Android ships with devices from a range of manufacturers such as Samsung, Google and OnePlus. In addition, Apple has had far greater success in bringing its users up to date. As of February 2024, ** percent of iOS users had iOS 17 installed, while in the same month only ** percent of Android users ran the latest version.

    The rise of the smartphone

    From around 2010, the touchscreen smartphone revolution had a major impact on sales of basic feature phones, as the sales of smartphones increased from *** million units in 2008 to **** billion units in 2023. In 2020, smartphone sales decreased to **** billion units due to the coronavirus (COVID-19) pandemic. Apple, Samsung, and lately also Xiaomi, were the big winners in this shift towards smartphones, with BlackBerry and Nokia among those unable to capitalize.

  2. Number of smartphone users in the United States 2014-2029

    • statista.com
    Updated May 5, 2025
    Cite
    Statista Research Department (2025). Number of smartphone users in the United States 2014-2029 [Dataset]. https://www.statista.com/topics/2711/us-smartphone-market/
    Explore at:
    Dataset updated
    May 5, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Statista Research Department
    Area covered
    United States
    Description

    The number of smartphone users in the United States was forecast to increase continuously between 2024 and 2029 by a total of 17.4 million users (+5.61 percent). After the fifteenth consecutive year of growth, the smartphone user base is estimated to reach 327.54 million users in 2029, a new peak. Notably, the number of smartphone users has increased continuously over the past years.

    Smartphone users here are limited to internet users of any age using a smartphone. The figures shown have been derived from survey data that has been processed to estimate missing demographics.

    The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).

    Find more key insights for the number of smartphone users in countries like Mexico and Canada.

  3. Mobile App Usage | 1st Party | 3B+ events verified, US consumers |...

    • omnitrafficdata.mfour.com
    • datarade.ai
    Updated Dec 13, 2021
    + more versions
    Cite
    MFour (2021). Mobile App Usage | 1st Party | 3B+ events verified, US consumers | Event-level iOS & Android [Dataset]. https://omnitrafficdata.mfour.com/products/mobile-app-usage-1st-party-3b-events-verified-us-consum-mfour
    Explore at:
    Dataset updated
    Dec 13, 2021
    Dataset authored and provided by
    MFour
    Area covered
    United States
    Description

    This dataset encompasses mobile smartphone application (app) usage, collected from over 150,000 triple-opt-in first-party US Daily Active Users (DAU). Use it for measurement, attribution or surveying to understand the why. iOS and Android operating system coverage.

  4. Data from: Mobile App Analytics

    • paradoxintelligence.com
    json/csv
    Updated Apr 18, 2025
    Cite
    Paradox Intelligence (2025). Mobile App Analytics [Dataset]. https://www.paradoxintelligence.com/datasets
    Explore at:
    Available download formats: json/csv
    Dataset updated
    Apr 18, 2025
    Dataset authored and provided by
    Paradox Intelligence
    License

    https://www.paradoxintelligence.com/terms

    Time period covered
    2015 - Present
    Area covered
    Global
    Description

    App download rankings, usage metrics, and user engagement data (iOS/Android)

  5. Mobile Web Clickstream | 1st Party | 3B+ events verified, US consumers |...

    • omnitrafficdata.mfour.com
    • datarade.ai
    Updated Aug 1, 2021
    + more versions
    Cite
    MFour (2021). Mobile Web Clickstream | 1st Party | 3B+ events verified, US consumers | Safari, Chrome, any iOS or Android [Dataset]. https://omnitrafficdata.mfour.com/products/mobile-web-clickstream-1st-party-3b-events-verified-us-mfour
    Explore at:
    Dataset updated
    Aug 1, 2021
    Dataset authored and provided by
    MFour
    Area covered
    United States
    Description

    This dataset encompasses mobile web clickstream behavior on any browser, collected from over 150,000 triple-opt-in first-party US Daily Active Users (DAU). Use it for measurement, attribution or path to purchase and consumer journey understanding. Full URL deliverable available including searches.

  6. Sample Beiwe Dataset

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Apr 20, 2022
    + more versions
    Cite
    Patrick Emedom-Nnamdi; Kenzie W. Carlson; Zachary Clement; Marta Karas; Marcin Straczkiewicz; Jukka-Pekka Onnela (2022). Sample Beiwe Dataset [Dataset]. http://doi.org/10.5281/zenodo.6471045
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 20, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Patrick Emedom-Nnamdi; Kenzie W. Carlson; Zachary Clement; Marta Karas; Marcin Straczkiewicz; Jukka-Pekka Onnela
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a public release of Beiwe-generated data. The Beiwe Research Platform collects high-density data from a variety of smartphone sensors such as GPS, WiFi, Bluetooth, gyroscope, and accelerometer, in addition to metadata from active surveys. A description of the passive and active data streams and documentation on the use of Beiwe can be found here. This data was collected from an internal test study and is made available solely for educational purposes. It contains no identifying information; subject locations are de-identified using the noise GPS feature of Beiwe.

    As part of the internal test study, data from 6 participants were collected from the start of March 21, 2022 to the end of March 28, 2022. The local time zone of this study is Eastern Standard Time. Each participant was notified to complete a survey at 9am EST on Monday, Thursday, and Saturday of the study week. An additional survey was administered on Tuesday at 5:15pm EST. For each survey, subjects were asked to respond to the prompt "How much time (in hours) do you think you spent at home?".

  7. Question Classification: Android or iOS?

    • kaggle.com
    zip
    Updated Oct 29, 2020
    Cite
    xhlulu (2020). Question Classification: Android or iOS? [Dataset]. https://www.kaggle.com/xhlulu/question-classification-android-or-ios
    Explore at:
    Available download formats: zip (19598168 bytes)
    Dataset updated
    Oct 29, 2020
    Authors
    xhlulu
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Context

    Imagine you have to process bug reports about an application your company is developing, which is available for both Android and iOS. Could you find a way to automatically classify them so you can send them to the right support team?

    Content

    The dataset contains data from two StackExchange forums: Android Enthusiasts and Ask Differently (Apple). I pre-processed both datasets from the raw XML files retrieved from Internet Archive in order to only contain useful information for building Machine Learning classifiers. In the case of the Apple forum, I narrowed down to the subset of questions that have one of the following tags: "iOS", "iPhone", "iPad".

    Think of this as a fun way to learn to build ML classifiers! The training, validation and test sets are all available, but in order to build robust models please try to use the test set as little as possible (only as a last validation for your models).

    Acknowledgements

    The image was retrieved from unsplash and made by @thenewmalcolm. Link to image here.

    The data was made available for free under a CC-BY-SA 4.0 license by StackExchange and hosted by Internet Archive. Find it here.

  8. LEx Loudoun Express Request

    • data.virginia.gov
    • datasets.ai
    • +2more
    Updated Oct 7, 2022
    Cite
    Loudoun County (2022). LEx Loudoun Express Request [Dataset]. https://data.virginia.gov/dataset/lex-loudoun-express-request
    Explore at:
    Available download formats: arcgis geoservices rest api
    Dataset updated
    Oct 7, 2022
    Dataset provided by
    Loudoun County GIS
    Authors
    Loudoun County
    Description

    Loudoun Express Request (LEx) is a citizen request system for members of the public to submit requests for service and report concerns to the county government via the internet and a mobile application. Our goal is to increase the efficiency, security, and accountability in responding to citizen concerns and questions.

    LEx is available as a mobile app for iOS and Android users!

  9. Omnichannel Consumer Behaviors | 1st Party | 3B+ events verified, US...

    • omnitrafficdata.mfour.com
    • datarade.ai
    + more versions
    Cite
    MFour, Omnichannel Consumer Behaviors | 1st Party | 3B+ events verified, US consumers | Path to purchase across app, web and point of interest locations [Dataset]. https://omnitrafficdata.mfour.com/products/omnichannel-consumer-journeys-1st-party-3b-events-verifi-mfour
    Explore at:
    Dataset authored and provided by
    MFour
    Area covered
    United States
    Description

    This dataset encompasses mobile app usage, web clickstream and location visitation behavior, collected from over 150,000 triple-opt-in first-party US Daily Active Users (DAU). The only omnichannel meter at scale representing iOS and Android platforms.

  10. COVID-19 Contact Tracing: COVID Alert CT Summary by Week - ARCHIVE

    • catalog.data.gov
    • data.ct.gov
    Updated Jun 28, 2025
    + more versions
    Cite
    data.ct.gov (2025). COVID-19 Contact Tracing: COVID Alert CT Summary by Week - ARCHIVE [Dataset]. https://catalog.data.gov/dataset/covid-19-contact-tracing-covid-alert-ct-summary-by-week
    Explore at:
    Dataset updated
    Jun 28, 2025
    Dataset provided by
    data.ct.gov
    Area covered
    Connecticut
    Description

    Note: This dataset has been archived and is no longer being updated. COVID Alert CT is Connecticut's voluntary, anonymous, exposure-notification smartphone app. If downloaded, the app will alert users if they have come into close contact with somebody who tests positive for COVID-19. This dataset includes the cumulative and weekly activations for COVID Alert CT for iOS and Android smartphones. The location of app users is not tracked; the app uses Bluetooth technology to detect when another person with the same app comes within 6 feet. The phones exchange a secure code with each other to record that they were near. The number of codes issued and claimed is also included in this dataset. Data presented are based on a weekly reporting period (Sunday - Saturday). All data are preliminary and are subject to change. Additional information on COVID-19 Contact Tracing can be found here: https://portal.ct.gov/coronavirus/covidalertCT/homepage

  11. Forensic Toolkit Dataset

    • kaggle.com
    Updated May 26, 2025
    + more versions
    Cite
    SUNNY THAKUR (2025). Forensic Toolkit Dataset [Dataset]. https://www.kaggle.com/datasets/cyberprince/forensic-toolkit-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more about this at mlcommons.org/croissant)
    Dataset updated
    May 26, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    SUNNY THAKUR
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Forensic Toolkit Dataset

    Overview

    The Forensic Toolkit Dataset is a comprehensive collection of 300 digital forensics and incident response (DFIR) tools, designed for training AI models, supporting forensic investigations, and enhancing cybersecurity workflows. The dataset includes both mainstream and unconventional tools, covering disk imaging, memory analysis, network forensics, mobile forensics, cloud forensics, blockchain analysis, and AI-driven forensic techniques. Each entry provides detailed information about the tool's name, commands, usage, description, supported platforms, and official links, making it a valuable resource for forensic analysts, data scientists, and machine learning engineers.

    Dataset Description

    The dataset is provided in JSON Lines (JSONL) format, with each line representing a single tool as a JSON object. It is optimized for AI training, data analysis, and integration into forensic workflows.

    Schema

    Each entry contains the following fields:

    id: Sequential integer identifier (1–300).
    tool_name: Name of the forensic tool.
    commands: List of primary commands or usage syntax (if applicable; GUI-based tools noted).
    usage: Brief description of how the tool is used in forensic or incident response tasks.
    description: Detailed explanation of the tool’s purpose, capabilities, and forensic applications.
    link: URL to the tool’s official website or documentation (verified as of May 26, 2025).
    system: List of supported platforms (e.g., Linux, Windows, macOS, Android, iOS, Cloud).
    
    
    Sample Entry
    {
     "id": 1,
     "tool_name": "The Sleuth Kit (TSK)",
     "commands": ["fls -r -m / image.dd > bodyfile", "ils -e image.dd", "icat image.dd 12345 > output.file", "istat image.dd 12345"],
     "usage": "Analyze disk images to recover files, list file metadata, and create timelines.",
     "description": "Open-source collection of command-line tools for analyzing disk images and file systems (NTFS, FAT, ext). Enables recovery of deleted files, metadata examination, and timeline generation.",
     "link": "https://www.sleuthkit.org/sleuthkit/",
     "system": ["Linux", "Windows", "macOS"]
    }
    

    Dataset Structure

    Total Entries: 300

    Content Focus:
    Mainstream tools (e.g., The Sleuth Kit, FTK Imager).
    Unconventional tools (e.g., IoTSeeker, Chainalysis Reactor, DeepCase).
    Specialized areas: IoT, blockchain, cloud, mobile, and AI-driven forensics.

    Purpose

    The dataset is designed for:

    AI Training: Fine-tuning machine learning models for forensic tool recommendation, command generation, or artifact analysis.
    Forensic Analysis: Reference for forensic analysts to identify tools for specific investigative tasks.
    Cybersecurity Research: Supporting incident response, threat hunting, and vulnerability analysis.
    Education: Providing a structured resource for learning about DFIR tools and their applications.

    Usage

    Accessing the Dataset

    Download the JSONL files from the repository. Each file can be parsed using standard JSONL libraries (e.g., jsonlines in Python, jq in Linux). Combine files for a complete dataset or use individual segments as needed.

    Example: Parsing with Python

    ```python
    import json

    # Each line of the JSONL file holds one tool entry as a JSON object.
    with open('forensic_toolkit_dataset_1_50.jsonl', 'r') as file:
        for line in file:
            tool = json.loads(line)
            print(f"Tool: {tool['tool_name']}, Supported Systems: {tool['system']}")
    ```

    Applications
    
    AI Model Training: Use the dataset to train models for predicting tool usage based on forensic tasks or generating command sequences.
    Forensic Workflows: Query the dataset to select tools for specific platforms (e.g., Cloud, Android) or tasks (e.g., memory analysis).
    Data Analysis: Analyze tool distribution across platforms or forensic categories using data science tools (e.g., Pandas, R).
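
    The platform query mentioned under Forensic Workflows could look roughly like the sketch below. It relies only on the `tool_name` and `system` fields from the schema; the function name `tools_for_platform` and the second sample entry ("Example Mobile Tool") are hypothetical and are not taken from the dataset.

```python
import json


def tools_for_platform(lines, platform):
    """Return the names of tools whose 'system' list contains the platform."""
    names = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between JSONL records
        tool = json.loads(line)
        if platform in tool.get("system", []):
            names.append(tool["tool_name"])
    return names


# In-memory stand-in for a JSONL file. The first entry mirrors the sample
# entry from the description; the second is a made-up placeholder.
SAMPLE_LINES = [
    '{"id": 1, "tool_name": "The Sleuth Kit (TSK)", "system": ["Linux", "Windows", "macOS"]}',
    '{"id": 2, "tool_name": "Example Mobile Tool", "system": ["Android", "iOS"]}',
]

print(tools_for_platform(SAMPLE_LINES, "Android"))
```

    The same function accepts an open file object in place of `SAMPLE_LINES`, since iterating over a file also yields one line at a time.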
    
    Contribution Guidelines
    We welcome contributions to expand or refine the dataset. To contribute:
    
    Fork the repository.
    Add new tools or update existing entries in JSONL format, ensuring adherence to the schema.
    Verify links and platform compatibility as of the contribution date.
    Submit a pull request with a clear description of changes.
    Avoid duplicating tools from existing entries (check IDs 1–300).
    
    Contribution Notes
    
    Ensure tools are forensically sound (preserve evidence integrity, court-admissible where applicable).
    Include unconventional or niche tools to maintain dataset diversity.
    Validate links and commands against official documentation.
    
    License
    This dataset is licensed under the MIT License. See the LICENSE file for details.
    Acknowledgments
    
    Inspired by forensic toolkits and resources from ForensicArtifacts.com, SANS, and open-source communities.
    Thanks to contributors for identifying unique and unconventional DFIR tools.
    
    Contact
    For issues, suggestions, or inquiries, please open an issue on the repository or contact the maintainers at sunny48445@gmail.com.
    
  12. Global social media subscriptions comparison 2023

    • statista.com
    • es.statista.com
    • +1more
    + more versions
    Cite
    Stacy Jo Dixon, Global social media subscriptions comparison 2023 [Dataset]. https://www.statista.com/topics/1164/social-networks/
    Explore at:
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Stacy Jo Dixon
    Description

    Social media companies are starting to offer users the option to subscribe to their platforms in exchange for monthly fees. Until recently, social media has been predominantly free to use, with tech companies relying on advertising as their main revenue generator. However, advertising revenues have been dropping following the COVID-induced boom. As of July 2023, Meta Verified is the most costly of the subscription services, setting users back almost 15 U.S. dollars per month on iOS or Android. Twitter Blue costs between eight and 11 U.S. dollars per month and ensures users will receive the blue check mark, and have the ability to edit tweets and have NFT profile pictures. Snapchat+, drawing in four million users as of the second quarter of 2023, boasts a Story re-watch function, custom app icons, and a Snapchat+ badge.

  13. An inertial and positioning dataset for the walking activity

    • data.niaid.nih.gov
    • search.dataone.org
    • +1more
    zip
    Updated Nov 1, 2024
    Cite
    Sara Caramaschi; Carl Magnus Olsson; Elizabeth Orchard; Jackson Molloy; Dario Salvi (2024). An inertial and positioning dataset for the walking activity [Dataset]. http://doi.org/10.5061/dryad.n2z34tn5q
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 1, 2024
    Dataset provided by
    Oxford University Hospitals NHS Trust
    Malmö University
    Authors
    Sara Caramaschi; Carl Magnus Olsson; Elizabeth Orchard; Jackson Molloy; Dario Salvi
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    We are publishing a walking activity dataset including inertial and positioning information from 19 volunteers, including reference distance measured using a trundle wheel. The dataset includes a total of 96.7 Km walked by the volunteers, split into 203 separate tracks. The trundle wheel is of two types: it is either an analogue trundle wheel, which provides the total amount of meters walked in a single track, or it is a sensorized trundle wheel, which measures every revolution of the wheel, therefore recording a continuous incremental distance.
    Each track has data from the accelerometer and gyroscope embedded in the phones, location information from the Global Navigation Satellite System (GNSS), and the step count obtained by the device. The dataset can be used to implement walking distance estimation algorithms and to explore data quality in the context of walking activity and physical capacity tests, fitness, and pedestrian navigation.

    Methods

    The proposed dataset is a collection of walks where participants used their own smartphones to capture inertial and positioning information. The participants involved in the data collection come from two sites. The first site is the Oxford University Hospitals NHS Foundation Trust, United Kingdom, where 10 participants (7 affected by cardiovascular diseases and 3 healthy individuals) performed unsupervised six-minute walk tests (6MWTs) in an outdoor environment of their choice (ethical approval obtained by the UK National Health Service Health Research Authority protocol reference numbers: 17/WM/0355). All participants involved provided informed consent. The second site is at Malmö University, in Sweden, where a group of 9 healthy researchers collected data. This dataset can be used by researchers to develop distance estimation algorithms and to study how data quality impacts the estimation.

    All walks were performed by holding a smartphone in one hand, with an app collecting inertial data, the GNSS signal, and the step count. In the other, free hand, participants held a trundle wheel to obtain the ground truth distance. Two different trundle wheels were used: an analogue trundle wheel that allowed the registration of a total single value of walked distance, and a sensorized trundle wheel which collected timestamps and distance at every 1-meter revolution, resulting in continuous incremental distance information. The latter configuration is innovative and allows the use of temporal windows of the IMU data as input to machine learning algorithms to estimate walked distance. In the case of data collected by researchers, if the walks were done simultaneously and at a close distance from each other, only one person used the trundle wheel, and the reference distance was associated with all walks that were collected at the same time.

    The walked paths are of variable length, duration, and shape. Participants were instructed to walk paths of increasing curvature, from straight to rounded. Irregular paths are particularly useful in determining limitations in the accuracy of walked distance algorithms.

    Two smartphone applications were developed for collecting the information of interest from the participants' devices, both available for Android and iOS operating systems. The first is a web-application that retrieves inertial data (acceleration, rotation rate, orientation) while connecting to the sensorized trundle wheel to record incremental reference distance [1]. The second app is the Timed Walk app [2], which guides the user in performing a walking test by signalling when to start and when to stop the walk while collecting both inertial and positioning data. All participants in the UK used the Timed Walk app.

    The data collected during the walk is from the Inertial Measurement Unit (IMU) of the phone and, when available, the Global Navigation Satellite System (GNSS). In addition, the step count information is retrieved by the sensors embedded in each participant’s smartphone. With the dataset, we provide a descriptive table with the characteristics of each recording, including brand and model of the smartphone, duration, reference total distance, types of signals included and additionally scoring some relevant parameters related to the quality of the various signals. The path curvature is one of the most relevant parameters. Previous literature from our team, in fact, confirmed the negative impact of curved-shaped paths with the use of multiple distance estimation algorithms [3]. We visually inspected the walked paths and clustered them in three groups, a) straight path, i.e. no turns wider than 90 degrees, b) gently curved path, i.e. between one and five turns wider than 90 degrees, and c) curved path, i.e. more than five turns wider than 90 degrees. Other features relevant to the quality of collected signals are the total amount of time above a threshold (0.05s and 6s) where, respectively, inertial and GNSS data were missing due to technical issues or due to the app going in the background thus losing access to the sensors, sampling frequency of different data streams, average walking speed and the smartphone position. The start of each walk is set as 0 ms, thus not reporting time-related information. Walks locations collected in the UK are anonymized using the following approach: the first position is fixed to a central location of the city of Oxford (latitude: 51.7520, longitude: -1.2577) and all other positions are reassigned by applying a translation along the longitudinal and latitudinal axes which maintains the original distance and angle between samples. 
This way, the exact geographical location is lost, but the path shape and distances between samples are maintained. The difference between consecutive points “as the crow flies” and path curvature was numerically and visually inspected to obtain the same results as the original walks. Computations were made possible by using the Haversine Python library.
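
    One way such an anonymization could work is sketched below. This is an illustration under stated assumptions, not the authors' code: it rebuilds the path from the fixed Oxford origin segment by segment, reusing each original segment's great-circle distance and initial bearing. The Haversine formula is written out by hand with the standard library instead of using the Haversine Python library the authors mention, and the function names and the sample track are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in metres


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_rad(p, q):
    """Initial bearing in radians when travelling from p to q."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    y = math.sin(lon2 - lon1) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    return math.atan2(y, x)


def destination(p, distance_m, bearing):
    """Point reached from p after distance_m metres along the given bearing."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    d = distance_m / EARTH_RADIUS_M
    lat2 = math.asin(math.sin(lat1) * math.cos(d)
                     + math.cos(lat1) * math.sin(d) * math.cos(bearing))
    lon2 = lon1 + math.atan2(math.sin(bearing) * math.sin(d) * math.cos(lat1),
                             math.cos(d) - math.sin(lat1) * math.sin(lat2))
    return (math.degrees(lat2), math.degrees(lon2))


def anonymize(track, origin=(51.7520, -1.2577)):
    """Re-anchor a track at origin, keeping each segment's length and bearing."""
    shifted = [origin]
    for prev, cur in zip(track, track[1:]):
        step = haversine_m(prev, cur)
        heading = bearing_rad(prev, cur)
        shifted.append(destination(shifted[-1], step, heading))
    return shifted


# Hypothetical three-point track, used only to exercise the functions.
track = [(55.6000, 13.0000), (55.6005, 13.0008), (55.6011, 13.0003)]
for point in anonymize(track):
    print(point)
```

    Comparing `haversine_m` over consecutive points before and after the transformation gives the kind of numerical check on "as the crow flies" distances that the description refers to.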

    Multiple datasets are available regarding walking activity recognition among other daily living tasks. However, few studies are published with datasets that focus on the distance for both indoor and outdoor environments and that provide relevant ground truth information for it. Yan et al. [4] introduced an inertial walking dataset within indoor scenarios using a smartphone placed in 4 positions (on the leg, in a bag, in the hand, and on the body) by six healthy participants. The reference measurement used in this study is a Visual Odometry System embedded in a smartphone that has to be worn at the chest level, using a strap to hold it. While interesting and detailed, this dataset lacks GNSS data, which is likely to be used in outdoor scenarios, and the reference used for localization also suffers from accuracy issues, especially outdoors. Vezovcnik et al. [5] analysed estimation models for step length and provided an open-source dataset for a total of 22 km of only inertial walking data from 15 healthy adults. While relevant, their dataset focuses on steps rather than total distance and was acquired on a treadmill, which limits the validity in real-world scenarios. Kang et al. [6] proposed a way to estimate travelled distance by using an Android app that uses outdoor walking patterns to match them in indoor contexts for each participant. They collect data outdoors by including both inertial and positioning information and they use average values of speed obtained by the GPS data as reference labels. Afterwards, they use deep learning models to estimate walked distance obtaining high performances. Their results share that 3% to 11% of the data for each participant was discarded due to low quality. Unfortunately, the name of the used app is not reported and the paper does not mention if the dataset can be made available.

    This dataset is heterogeneous under multiple aspects. It includes a majority of healthy participants, therefore, it is not possible to generalize the outcomes from this dataset to all walking styles or physical conditions. The dataset is heterogeneous also from a technical perspective, given the difference in devices, acquired data, and used smartphone apps (i.e. some tests lack IMU or GNSS, sampling frequency in iPhone was particularly low). We suggest selecting the appropriate track based on desired characteristics to obtain reliable and consistent outcomes.

    This dataset allows researchers to develop algorithms to compute walked distance and to explore data quality and reliability in the context of the walking activity. This dataset was initiated to investigate the digitalization of the 6MWT, however, the collected information can also be useful for other physical capacity tests that involve walking (distance- or duration-based), or for other purposes such as fitness, and pedestrian navigation.

    The article related to this dataset will be published in the proceedings of the IEEE MetroXRAINE 2024 conference, held in St. Albans, UK, 21-23 October.

    This research is partially funded by the Swedish Knowledge Foundation and the Internet of Things and People research center through the Synergy project Intelligent and Trustworthy IoT Systems.

  14. Multilingual Mobile App Review Dataset August 2025

    • kaggle.com
    Updated Jul 31, 2025
    Cite
    Pratyush Puri (2025). Multilingual Mobile App Review Dataset August 2025 [Dataset]. https://www.kaggle.com/datasets/pratyushpuri/multilingual-mobile-app-reviews-dataset-2025
    Explore at:
    Croissant (a format for machine-learning datasets; learn more about this at mlcommons.org/croissant)
    Dataset updated
    Jul 31, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Pratyush Puri
    License

    Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Multilingual Mobile App Reviews Dataset 2025

    Overview

    This synthetic dataset contains 2,514 mobile app reviews spanning 40+ popular applications across 24 different languages, making it suitable for multilingual NLP, sentiment analysis, and cross-cultural user behavior research.

    Dataset Statistics

    • Total Records: 2,514 reviews
    • Columns: 15 features
    • Languages Covered: 24 international languages
    • Apps Included: 40+ popular mobile applications
    • Time Range: 2023-2025 (2-year span)
    • File Format: CSV
    • Data Quality: Intentionally includes missing values and mixed data types for data cleaning practice

    Column Specifications

    Column Name | Data Type | Description | Sample Values | Null Count
    --- | --- | --- | --- | ---
    review_id | Integer | Unique identifier for each review | 1, 2, 3, ... | 0
    user_id | String* | User identifier (should be integer) | "1967825", "9242600" | 0
    app_name | String | Name of the mobile application | WhatsApp, Instagram, TikTok | 0
    app_category | String | Application category | Social Networking, Entertainment | 0
    review_text | String | Multilingual review content | "This app is amazing!" | 63
    review_language | String | ISO language code | en, es, fr, zh, hi, ar | 0
    rating | Mixed* | App rating (1.0-5.0, some as strings) | 4.5, "3.2", 1.1 | 38
    review_date | DateTime | Timestamp of review submission | 2024-10-09 19:26:40 | 0
    verified_purchase | Boolean | Purchase verification status | True, False | 0
    device_type | String | Device platform | Android, iOS, iPad, Windows Phone | 0
    num_helpful_votes | Mixed* | Helpfulness votes (some as strings) | 65, "209", 163 | 0
    user_age | Float* | User age (should be integer) | 14.0, 18.0, 67.0 | 0
    user_country | String | User's country | China, Germany, Nigeria | 50
    user_gender | String | User gender | Male, Female, Non-binary, Prefer not to say | 88
    app_version | String | Application version number | 1.4, v8.9, 2.8.37.5926 | 25

    Note: Data types marked with asterisk require cleaning/conversion

    Language Distribution

    The dataset includes reviews in 24 languages:
    • European: English (en), Spanish (es), French (fr), German (de), Italian (it), Russian (ru), Polish (pl), Dutch (nl), Swedish (sv), Danish (da), Norwegian (no), Finnish (fi)
    • Asian: Chinese (zh), Hindi (hi), Japanese (ja), Korean (ko), Thai (th), Vietnamese (vi), Indonesian (id), Malay (ms)
    • Other: Arabic (ar), Turkish (tr), Filipino (tl)

    Application Categories

    Reviews cover 18 distinct categories:
    • Social Networking
    • Entertainment
    • Productivity
    • Travel & Local
    • Music & Audio
    • Video Players & Editors
    • Shopping
    • Navigation
    • Finance
    • Communication
    • Education
    • Photography
    • Dating
    • Business
    • Utilities
    • Health & Fitness
    • Games
    • News & Magazines

    Popular Apps Included

    40+ applications including:
    • Social: WhatsApp, Instagram, Facebook, Snapchat, TikTok, LinkedIn, Twitter, Reddit, Pinterest
    • Entertainment: YouTube, Netflix, Spotify
    • Productivity: Microsoft Office, Google Drive, Dropbox, OneDrive, Zoom, Discord
    • Travel: Uber, Lyft, Airbnb, Booking.com, Google Maps, Waze
    • Finance: PayPal, Venmo
    • Education: Duolingo, Khan Academy, Coursera, Udemy
    • Tools: Grammarly, Canva, Adobe Photoshop, VLC, MX Player

    Geographic Distribution

    Reviews from 24 countries across all continents:
    • Asia: China, India, Japan, South Korea, Thailand, Vietnam, Indonesia, Malaysia, Philippines, Pakistan, Bangladesh
    • Europe: Germany, United Kingdom, France, Italy, Spain, Russia, Turkey, Poland
    • Americas: United States, Canada, Brazil, Mexico
    • Oceania: Australia
    • Africa: Nigeria

    Data Quality Features

    Intentional data challenges for learning:
    • Missing Values: Strategic nulls in review_text (63), rating (38), user_country (50), user_gender (88), app_version (25)
    • Data Type Issues:
      - user_id stored as strings (should be integers)
      - user_age as floats (should be integers)
      - Some ratings as strings (should be floats)
      - Some helpful_votes as strings (should be integers)
    • Mixed Version Formats: "1.4", "v8.9", "2.8.37.5926", "14.1.60.318-beta"
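The type issues listed above could be handled with a few small conversion helpers. The following is a minimal sketch using only the Python standard library; the column names come from the column specification, while the helper names (`to_int`, `to_float`, `clean_review`) and the choice to map missing or unparseable values to `None` are assumptions made for illustration, not part of the dataset.

```python
import math


def to_int(value):
    """Convert ints, integer-valued floats, or numeric strings to int; otherwise None."""
    text = "" if value is None else str(value).strip()
    if not text:
        return None
    try:
        return int(text)
    except ValueError:
        try:
            number = float(text)
        except ValueError:
            return None
        return int(number) if number.is_integer() else None


def to_float(value):
    """Convert numbers or numeric strings to float; missing or NaN becomes None."""
    text = "" if value is None else str(value).strip()
    if not text:
        return None
    try:
        number = float(text)
    except ValueError:
        return None
    return None if math.isnan(number) else number


def clean_review(row):
    """Return a copy of one review record with the flagged columns converted."""
    cleaned = dict(row)
    cleaned["user_id"] = to_int(row.get("user_id"))
    cleaned["user_age"] = to_int(row.get("user_age"))
    cleaned["rating"] = to_float(row.get("rating"))
    cleaned["num_helpful_votes"] = to_int(row.get("num_helpful_votes"))
    return cleaned
```

Each record would be passed through `clean_review` after reading the CSV, for example with `csv.DictReader`. Whether missing values should stay as `None` or be imputed depends on the analysis.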

    Use Cases

    This dataset is well suited for:
    • Multilingual NLP projects and sentiment analysis
    • Cross-cultural user behavior analysis
    • App store analytics and rating prediction
    • Data cleaning and preprocessing practice
    • Text classification across multiple languages
    • Time series analysis of app reviews
    • Geographic sentiment analysis
    • Data engineering pipeline development

    Data Cleaning Opportunities

    • Convert string IDs to integers
    • Standardize rating values to float
    • Han...
  15. MSCardio Seismocardiography (SCG) Dataset

    • zenodo.org
    zip
    Updated Mar 5, 2025
    Cite
    Amirtahà Taebi; Mohammad Muntasir Rahman (2025). MSCardio Seismocardiography (SCG) Dataset [Dataset]. http://doi.org/10.5281/zenodo.14975878
    Available download formats: zip
    Dataset updated
    Mar 5, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Amirtahà Taebi; Mohammad Muntasir Rahman
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Overview

    The MSCardio Seismocardiography Dataset is an open-access dataset collected as part of the Mississippi State Remote Cardiovascular Monitoring (MSCardio) study. This dataset includes seismocardiogram (SCG) signals recorded from participants using smartphone sensors, enabling scalable, real-world cardiovascular monitoring without requiring specialized equipment. The dataset aims to support research in SCG signal processing, machine learning applications in health monitoring, and cardiovascular assessment.

    See the GitHub repository of this dataset for the latest updates: https://github.com/TaebiLab/MSCardio

    Background

    Cardiovascular diseases remain the leading cause of morbidity and mortality worldwide. SCG is a non-invasive technique that captures chest vibrations induced by cardiac activity and respiration, providing valuable insights into cardiac function. However, the scarcity of open-access SCG datasets has been a significant limitation for research in this field. The MSCardio dataset addresses this gap by providing real-world SCG signals collected via smartphone sensors from a diverse population.

    Data Description

    Study Population

    • Total participants enrolled: 123
    • Participants who uploaded data: 108 (46 males, 61 females, 1 unspecified)
    • Age range: 18 to 62 years
    • Total recordings uploaded: 515
    • Unique recordings after duplicate removal: 502
    • Platforms used: iOS and Android smartphones

    Signal Data

    • Axial vibrations in three directions (SCG) recorded using smartphone sensors
    • Sampling frequency varies depending on the device capabilities
    • Data synchronization is ensured for temporal accuracy
    • Missing SCG data identified in certain recordings, addressed through preprocessing

    Metadata

    Each recording includes:

    • Device model (e.g., iPhone Pro Max)
    • Recording time (UTC) and time zone
    • Platform (iOS or Android)
    • General demographic details (gender, race, age, height, weight)

    File Structure

    The dataset is organized as follows:


    MSCardio_SCG_Dataset/
    │── info/
    │ └── all_subject_data.csv # Consolidated metadata for all subjects
    │── MSCardio/
    │ ├── Subject_XXXX/ # Subject-specific folder
    │ │ ├── general_metadata.json # Demographic and device information
    │ │ ├── Recording_XXX/ # Individual recordings
    │ │ │ ├── scg.csv # SCG signal data
    │ │ │ ├── recording_metadata.json # Timestamp and device details
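Given the folder layout above, the recordings can be enumerated with a short directory walk. This is a sketch only: the folder and file names are taken from the tree shown, but the function name `list_recordings` is hypothetical, and the internal layout of `scg.csv` and the keys inside the JSON files are not described here, so the code does not assume them.

```python
import json
from pathlib import Path


def list_recordings(root):
    """Yield (subject, recording, scg_path, metadata) for each recording under root/MSCardio."""
    for subject_dir in sorted(Path(root, "MSCardio").glob("Subject_*")):
        for rec_dir in sorted(subject_dir.glob("Recording_*")):
            scg_path = rec_dir / "scg.csv"
            if not scg_path.is_file():
                continue  # skip folders without signal data
            meta_path = rec_dir / "recording_metadata.json"
            metadata = {}
            if meta_path.is_file():
                metadata = json.loads(meta_path.read_text(encoding="utf-8"))
            yield subject_dir.name, rec_dir.name, scg_path, metadata
```

A caller would iterate over `list_recordings("MSCardio_SCG_Dataset")` and load each `scg.csv` with whichever CSV reader suits the actual column layout.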

    Data Collection Protocol

    • Participants placed their smartphone on their chest while lying in a supine position.
    • The app recorded SCG signals for approximately two minutes.
    • Self-reported demographic data were collected.
    • Data were uploaded to the study's cloud storage.

    Usage and Applications

    This dataset is intended for research in:

    • SCG signal processing and feature extraction
    • Machine learning applications in cardiovascular monitoring
    • Investigating inter- and intra-subject variability in SCG signals
    • Remote cardiovascular health assessment
    • The Data_visualization.py script is provided for data visualization

    Citation

    If you use this dataset in your research, please cite:


    @article{rahman2025MSCardio,
    author = {Taebi, Amirtah{\`a} and Rahman, Mohammad Muntasir},
    title = {MSCardio: Initial insights from remote monitoring of cardiovascular-induced chest vibrations via smartphones},
    journal = {Data in Brief},
    year = {2025},
    publisher = {Elsevier}
    }

    Contact

    For any questions regarding the dataset, please contact:

    • Amirtahà Taebi and Mohammad Muntasir Rahman
    • E-mail: ataebi@abe.msstate.edu, mmr510@msstate.edu
    • Biomedical Engineering Program, Mississippi State University

    ---

    This dataset is provided under an open-access license. Please ensure ethical and responsible use when utilizing this dataset for research.

  16. Environmental behaviour data collected through smartphones in a...

    • b2find.eudat.eu
    Updated May 27, 2018
    Cite
    (2018). Environmental behaviour data collected through smartphones in a field-experimental setup - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/47cd73f2-a4b1-50d4-b6eb-687177627a1a
    Dataset updated
    May 27, 2018
    Description

    This pilot study data was collected to test the feasibility of a new methodological approach for investigating how environmental behaviour dilemmas (transport behaviour, energy consumption, food consumption, goods consumption, waste production) can be overcome at the individual level in real life, using smartphones to collect daily behavioural data in a field-experimental setup. The data includes information on the above-mentioned behaviour based on survey responses, GPS records, barcode scans and electric meter counter images.

    The data were collected daily over two weeks in June 2017 from 20 study participants, of whom 12 were female and 8 male. Moreover, 13 were University students and 7 had a professional background. The two field-experimental interventions were implemented in the second week of data collection and included (1) behavioural targeting (individualised message nudges based on past behaviour) and (2) social monitoring (messages that allowed participants to monitor their own and others' environmental performance). The 20 study participants were randomly and evenly assigned to the two field-experimental interventions. Given the lack of a control group (financial limitations prevented including more study participants), the first week serves as a reference point for assessing treatment effects.

    In addition to the smartphone-based daily data, basic socio-demographic and attitudinal data were collected through an initial online survey. This data includes information on study participants' gender, age, financial situation and environmental attitudes (e.g. on climate change and recycling). Moreover, a final online survey was conducted after the two-week smartphone-based data collection to assess study participants' experience with the study design. The study participants were compensated with a 50 GBP Amazon voucher for their participation.
    This project is a pilot (feasibility) research project to study environmental behaviour (transport behaviour, energy consumption, food consumption, goods consumption, waste production) in real-life situations by using smartphones to collect daily behavioural data over two weeks in a field-experimental setup. Demonstrating the feasibility of a novel approach to studying environmental behaviour will enable us to subsequently raise funds for and conduct a major study with additional field-experimental treatments and a larger, more representative sample.

    For the pilot project, 20 study participants will be recruited among University students and members of staff. They will be assigned to two groups to study to what extent two experimental treatments can alter environmental behaviour: (1) behavioural targeting: study participants' past behaviour will be analysed to deliver individually tailored tips on how they can increase the sustainability of their behaviour, testing nudge theory assumptions; (2) social monitoring: study participants (anonymised) will be able to monitor each other's environmental behaviour through the smartphone application, testing social influence theory assumptions. Data collection will include short survey question responses (e.g. type of transport used and why) on environmental behaviour, GPS coordinates, electric meter data and barcode scans. In the first week, the data will be collected without a field-experimental intervention. In the second week, the 20 study participants will be split into two groups of 10 in order to receive one of the two field-experimental treatments.

    The EpiCollect 5 smartphone application was used for data collection. The app operated on Android and iOS phones. The data collection fields implemented in the app and used in the project are free text entry (username), multiple choice and single choice responses to survey questions (see questionnaire), images (of electric meter counters, voluntarily), GPS coordinates (voluntarily) and barcode scans (voluntarily). The users could collect the data throughout the day and would then actively upload the data to the server in the evening via the EpiCollect 5 app. All data was time-stamped. Furthermore, initial and final online survey data was collected before and after the smartphone-based data collection. The online survey data was collected via Q-set. The initial survey data contains single choice survey responses. The final survey data contains single choice survey responses as well as free text entry data (see questionnaires).

  17. Pegasus Spyware Attack(Synthetic Dataset)

    • kaggle.com
    Updated Aug 1, 2024
    Cite
    Krishna1502 (2024). Pegasus Spyware Attack(Synthetic Dataset) [Dataset]. https://www.kaggle.com/datasets/krishna1502/pegasus-spyware-attacksynthetic-dataset
    Dataset updated
    Aug 1, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Krishna1502
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains synthetic logs designed to simulate the activity of the Pegasus malware, providing a resource for cybersecurity research, anomaly detection, and machine learning applications. The dataset includes 1000 entries with 17 columns, each capturing detailed information about device usage, network traffic, and potential security events.

    Columns:
    • user_id: Unique identifier for each user
    • device_type: Type of device used (e.g., iPhone, Android)
    • os_version: Operating system version of the device
    • app_usage_pattern: Usage pattern of the applications (Low, Normal, High)
    • timestamp: Timestamp of the recorded activity
    • source_ip: Source IP address of the network traffic
    • destination_ip: Destination IP address of the network traffic
    • protocol: Network protocol used (e.g., HTTPS, FTP, SSH)
    • data_volume: Volume of data transferred in the session
    • log_type: Type of log entry (system, application, security)
    • event: Specific event type (e.g., App Install, System Update, Logout, App Crash)
    • event_description: Description of the event
    • error_code: Error code associated with the event
    • file_accessed: File path accessed during the event
    • process: Process name involved in the event
    • anomaly_detected: Description of any detected anomalies (e.g., Unknown Process Execution, Suspicious File Access)
    • ioc: Indicators of Compromise (e.g., Pegasus Signature, Known Malicious IP)
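A first exploratory step with these columns might be to tally how often each anomaly or indicator of compromise appears. The sketch below uses only the standard library and an inline sample; the column names follow the list above, but the sample rows, the function name `count_by_column`, and the assumption that missing values appear as empty strings are illustrative and not taken from the actual file.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_text, column):
    """Count non-empty values of one column in CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader if row.get(column))


# Hypothetical rows using a subset of the documented columns.
sample_csv = """user_id,device_type,anomaly_detected,ioc
u1,iPhone,Unknown Process Execution,Pegasus Signature
u2,Android,,
u3,iPhone,Suspicious File Access,Known Malicious IP
u4,Android,Unknown Process Execution,Pegasus Signature
"""

ioc_counts = count_by_column(sample_csv, "ioc")
```

For the real file, the text would come from reading the downloaded CSV rather than an inline string, and the way empty fields are encoded should be checked first.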

    This dataset is ideal for those looking to develop and test cybersecurity solutions, understand malware behavior, or create educational tools for cybersecurity training. The data captures various scenarios of potential malware activities, making it a versatile resource for a range of cybersecurity applications.

  18. Arabic Newspaper, Magazine, and Books OCR Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Arabic Newspaper, Magazine, and Books OCR Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/ocr-dataset/arabic-newspaper-book-magazine-ocr-image-dataset
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    Introducing the Arabic Newspaper, Books, and Magazine Image Dataset - a diverse and comprehensive collection of images meticulously curated to propel the advancement of text recognition and optical character recognition (OCR) models designed specifically for the Arabic language.

    Dataset Contents & Diversity:

    Containing a total of 5000 images, this Arabic OCR dataset offers an equal distribution across newspapers, books, and magazines. Within, you'll find a diverse collection of content, including articles, advertisements, cover pages, headlines, call outs, and author sections from a variety of newspapers, books, and magazines. Images in this dataset showcase distinct fonts, writing formats, colors, designs, and layouts.

    To ensure the diversity of the dataset and to build robust text recognition models, we allow only a limited number (fewer than five) of unique images from a single source. Stringent measures have been taken to exclude any personally identifiable information (PII), and at least 80% of the space in each image contains visible Arabic text.

    Images have been captured under varying lighting conditions – both day and night – along with different capture angles and backgrounds, further enhancing dataset diversity. The collection features images in portrait and landscape modes.

    All images were captured by native Arabic speakers to ensure text quality and to avoid toxic content and PII. We used the latest iOS and Android mobile devices with cameras above 5 MP to capture the images and maintain image quality. Images in this training dataset are available in both JPEG and HEIC formats.

    Metadata:

    Along with the image data, you will also receive detailed, structured metadata in CSV format. For each image, it includes device information, source type (newspaper, magazine, or book), and image type (e.g., portrait or landscape). Each image is renamed to correspond to its metadata entry.

    The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of Arabic text recognition models.

    Update & Custom Collection:

    We're committed to expanding this dataset by continuously adding more images with the assistance of our native Arabic crowd community.

    If you require a custom dataset tailored to your guidelines or specific device distribution, feel free to contact us. We're equipped to curate specialized data to meet your unique needs.

    Furthermore, we can annotate or label the images with bounding boxes, or transcribe the text in the images, to align with your specific requirements using our crowd community.

    License:

    This Image dataset, created by FutureBeeAI, is now available for commercial use.

    Conclusion:

    Leverage the power of this image dataset to elevate the training and performance of text recognition, text detection, and optical character recognition models within the realm of the Arabic language. Your journey to enhanced language understanding and processing starts here.

  19. Data from: ADVIO: An Authentic Dataset for Visual-Inertial Odometry

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Cortés, Santiago (2020). ADVIO: An Authentic Dataset for Visual-Inertial Odometry [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1320824
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Cortés, Santiago
    Solin, Arno
    Rahtu, Esa
    Kannala, Juho
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    Data abstract: This Zenodo upload contains the ADVIO data for benchmarking and developing visual-inertial odometry methods. The data documentation is available on GitHub: https://github.com/AaltoVision/ADVIO

    Paper abstract: The lack of realistic and open benchmarking datasets for pedestrian visual-inertial odometry has made it hard to pinpoint differences in published methods. Existing datasets either lack a full six degree-of-freedom ground-truth or are limited to small spaces with optical tracking systems. We take advantage of advances in pure inertial navigation, and develop a set of versatile and challenging real-world computer vision benchmark sets for visual-inertial odometry. For this purpose, we have built a test rig equipped with an iPhone, a Google Pixel Android phone, and a Google Tango device. We provide a wide range of raw sensor data that is accessible on almost any modern-day smartphone together with a high-quality ground-truth track. We also compare resulting visual-inertial tracks from Google Tango, ARCore, and Apple ARKit with two recent methods published in academic forums. The data sets cover both indoor and outdoor cases, with stairs, escalators, elevators, office environments, a shopping mall, and metro station.

    Attribution: If you use this data set in your own work, please cite this paper:

    Santiago Cortés, Arno Solin, Esa Rahtu, and Juho Kannala (2018). ADVIO: An authentic dataset for visual-inertial odometry. In European Conference on Computer Vision (ECCV). Munich, Germany.

  20. Filipino Newspaper, Magazine, and Books OCR Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Filipino Newspaper, Magazine, and Books OCR Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/ocr-dataset/filipino-newspaper-book-magazine-ocr-image-dataset
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    Introducing the Filipino Newspaper, Books, and Magazine Image Dataset - a diverse and comprehensive collection of images meticulously curated to propel the advancement of text recognition and optical character recognition (OCR) models designed specifically for the Filipino language.

    Dataset Contents & Diversity:

    Containing a total of 5000 images, this Filipino OCR dataset offers an equal distribution across newspapers, books, and magazines. Within, you'll find a diverse collection of content, including articles, advertisements, cover pages, headlines, call outs, and author sections from a variety of newspapers, books, and magazines. Images in this dataset showcase distinct fonts, writing formats, colors, designs, and layouts.

    To ensure the diversity of the dataset and to build robust text recognition models, we allow only a limited number (fewer than five) of unique images from a single source. Stringent measures have been taken to exclude any personally identifiable information (PII), and at least 80% of the space in each image contains visible Filipino text.

    Images have been captured under varying lighting conditions – both day and night – along with different capture angles and backgrounds, further enhancing dataset diversity. The collection features images in portrait and landscape modes.

    All images were captured by native Filipino speakers to ensure text quality and to avoid toxic content and PII. We used the latest iOS and Android mobile devices with cameras above 5 MP to capture the images and maintain image quality. Images in this training dataset are available in both JPEG and HEIC formats.

    Metadata:

    Along with the image data, you will also receive detailed, structured metadata in CSV format. For each image, it includes device information, source type (newspaper, magazine, or book), and image type (e.g., portrait or landscape). Each image is renamed to correspond to its metadata entry.

    The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of Filipino text recognition models.

    Update & Custom Collection:

    We're committed to expanding this dataset by continuously adding more images with the assistance of our native Filipino crowd community.

    If you require a custom dataset tailored to your guidelines or specific device distribution, feel free to contact us. We're equipped to curate specialized data to meet your unique needs.

    Furthermore, we can annotate or label the images with bounding boxes, or transcribe the text in the images, to align with your specific requirements using our crowd community.

    License:

    This Image dataset, created by FutureBeeAI, is now available for commercial use.

    Conclusion:

    Leverage the power of this image dataset to elevate the training and performance of text recognition, text detection, and optical character recognition models within the realm of the Filipino language. Your journey to enhanced language understanding and processing starts here.

Market share of mobile operating systems worldwide 2009-2025, by quarter (Statista, 2025): 398 scholarly articles cite this dataset.