This dataset was created by Michael Lomuscio
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset simulates anonymized mobile screen time and app usage data collected from Android/iOS users over a 3-month period (Jan–April 2024). It captures daily usage trends across various app categories including:
Productivity: Google Docs, Notion, Slack
Entertainment: YouTube, Netflix, TikTok
Social Media: Instagram, WhatsApp, Facebook
Utilities: Chrome, Gmail, Maps
For YouTube, additional engagement statistics such as views, likes, and comments are included to analyze video popularity and content consumption behavior.
The dataset enables exploration of:
Productivity vs. entertainment screen time patterns
Daily usage fluctuations
App-specific user engagement
Correlation between time spent and user interactions
YouTube content virality metrics
This is a great resource for:
EDA projects
Behavioral clustering
Dashboard development
Time series and anomaly detection
Building recommendation or focus-assistive apps
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset offers a comprehensive overview of the iPhone's journey in the global smartphone market from 2010 to 2024. It includes:
📊 Number of iPhone Users: Total users worldwide and within the USA.
📈 Sales Figures: Yearly iPhone sales data.
🏆 Market Share: Comparison of iOS and Android market shares across years.
This dataset is perfect for:
Market forecasting and trend analysis.
Competitive landscape studies between iOS and Android.
Consumer behavior research in the tech industry.
Whether you're a data scientist, market analyst, or tech enthusiast, this dataset provides valuable insights to support your research and projects.
https://creativecommons.org/publicdomain/zero/1.0/
Benchmarks allow for easy comparison between multiple devices by scoring their performance on a standardized series of tests, and they are useful in many situations, for example when buying a new phone or tablet.
Newest data as of May 3rd, 2022. This dataset contains benchmarks of Android and iOS devices.
Benchmark apps give your device an overall numerical score as well as individual scores for each test they perform. The overall score is created by adding the results of those individual scores. These numbers don't mean much on their own; they're just helpful for comparing different devices. For example, if your device's score is 300000, a device with a score of 600000 is about twice as fast. You can use individual test scores to compare the relative performance of specific parts of different devices. For example, you could compare how fast your phone's storage performs compared to another phone's storage.
The first part of the overall score is your CPU score. The CPU score in turn includes the output of CPU Mathematical Operations, CPU Common Algorithms, and CPU Multi-Core. In simpler words, the CPU score means how fast your phone processes commands. Your device's central processing unit (CPU) does most of the number-crunching. A faster CPU can run apps faster, so everything on your device will seem faster. Of course, once you get to a certain point, CPU speed won't affect performance much. However, a faster CPU may still help when running more demanding applications, such as high-end games.
The second part of the overall score is your GPU score. This score is comprised of the output of graphical components like Metal, OpenGL or Vulkan, depending on your device. The GPU score means how well your phone displays 2D and 3D graphics. Your device's graphics processing unit (GPU) handles accelerated graphics. When you play a game, your GPU kicks into gear and renders the 3D graphics or accelerates the shiny 2D graphics. Many interface animations and other transitions also use the GPU. The GPU is optimized for these sorts of graphics operations. The CPU could perform them, but it's more general-purpose and would take more time and battery power. You can say that your GPU does the graphics number-crunching, so a higher score here is better.
The third part of the overall score is your MEM score. The MEM score includes the results of RAM Access, ROM APP IO, ROM Sequential Read and Write, and ROM Random Access. In simpler words, the MEM score reflects how fast your phone's memory is and how much of it the phone has. RAM stands for random-access memory, while ROM stands for read-only memory. Your device uses RAM as working memory, while flash storage or an internal SD card is used for long-term storage. The faster your device can write data to and read data from its RAM, the faster it will perform. Your RAM is constantly being used on your device, whatever you're doing. While RAM is volatile in nature, ROM is its opposite. RAM mostly stores temporary data, while ROM is used to store permanent data like the firmware of your phone. Both the RAM and ROM make up the memory of your phone, helping it to perform tasks efficiently.
The fourth and final part of the overall score is your UX score. The UX score is made up of the results of the Data Security, Data Processing, Image Processing, User Experience, and Video CTS and Decode tests. The UX score is an overall figure that represents how the device's "user experience" will be in the real world. It's a number you can look at to get a feel for a device's overall performance without digging into the above benchmarks or relying too much on the overall score.
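The scoring scheme described above can be illustrated with a small sketch. It assumes the overall score is simply the sum of the four sub-scores (CPU, GPU, MEM, UX), as the description states; the class name, the function name, and the example scores are hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkResult:
    """Sub-scores for one device (names are illustrative, not dataset columns)."""
    cpu: int
    gpu: int
    mem: int
    ux: int

    @property
    def overall(self) -> int:
        # Overall score = sum of the four individual scores.
        return self.cpu + self.gpu + self.mem + self.ux


def relative_speed(a: BenchmarkResult, b: BenchmarkResult) -> float:
    """How many times higher device a's overall score is than device b's."""
    return a.overall / b.overall


# Hypothetical devices chosen so the totals match the 300000 / 600000 example.
device_a = BenchmarkResult(cpu=80_000, gpu=100_000, mem=50_000, ux=70_000)
device_b = BenchmarkResult(cpu=160_000, gpu=200_000, mem=100_000, ux=140_000)

print(device_a.overall, device_b.overall)
print(relative_speed(device_b, device_a))
```

The same ratio can be taken on a single sub-score (for example `mem`) to compare one component across devices.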
Data scraped from AnTuTu, cross-platform adjusted using 3DMark and Geekbench.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Explore the Indian Smartphone Market Dataset, featuring demographics, brand preferences (iPhone vs Android), pricing, purchase behavior, and usage trends.
Percentage of smartphone users by selected smartphone use habits in a typical day.
This dataset encompasses mobile smartphone application (app) usage, collected from over 150,000 triple-opt-in first-party US Daily Active Users (DAU). Use it for measurement, attribution or surveying to understand the why. iOS and Android operating system coverage.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Video Dataset - 1,300+ files
The dataset comprises 1,300+ videos of 300+ people captured using mobile phones (including Android devices and iPhone) and webcams under varying lighting conditions. It is designed for research in face detection, object recognition, and event detection, leveraging high-quality videos from smartphone cameras and webcam streams.
Dataset characteristics:
Characteristic Data
Description Each person recorded 4 videos… See the full description on the dataset page: https://huggingface.co/datasets/ud-biometrics/phone-and-webcam-dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Differences between operating systems (Android, iOS, Mac OS, and Windows; Study 2).
This dataset encompasses mobile web clickstream behavior on any browser, collected from over 150,000 triple-opt-in first-party US Daily Active Users (DAU). Use it for measurement, attribution or path to purchase and consumer journey understanding. Full URL deliverable available including searches.
This is a GPS dataset acquired from Google.
Google tracks the user’s device location through Google Maps, which works on Android devices, the iPhone, and the web. The Timeline can be viewed from the user’s settings in the Google Maps app on Android or directly on the Google Timeline website. It holds detailed information such as when an individual is walking, driving, or flying. This tracking can be enabled or disabled on demand by the user, directly from the smartphone or via the website. Google's Takeout service lets users download all their data or select which Google products to export data from. The dataset contains 120,847 instances from a single user, covering 253 unique days over a period of 9 months, from February 2019 to October 2019. Each instance comprises a (latitude, longitude) pair and a timestamp. All the data was delivered in a single CSV file. As the locations in this dataset are well known to the researchers, it can be used as ground truth in mobility studies.
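As a rough illustration of how such a file might be inspected, the sketch below counts records and unique days in a CSV of latitude, longitude, and timestamp values. The column names, the ISO timestamp format, and the three sample rows are assumptions made for this example only; the actual file layout is not documented here and the sample values are invented.

```python
import csv
import io
from datetime import datetime

# Invented sample rows; the real column names and timestamp format may differ.
SAMPLE_CSV = """latitude,longitude,timestamp
41.1496,-8.6109,2019-02-01T08:15:00
41.1500,-8.6112,2019-02-01T08:16:00
41.1579,-8.6291,2019-02-02T09:00:00
"""


def summarize(csv_file):
    """Return (number of records, number of unique calendar days)."""
    days = set()
    count = 0
    for row in csv.DictReader(csv_file):
        # Parsing the coordinates validates that each row is well formed.
        float(row["latitude"])
        float(row["longitude"])
        days.add(datetime.fromisoformat(row["timestamp"]).date())
        count += 1
    return count, len(days)


records, unique_days = summarize(io.StringIO(SAMPLE_CSV))
print(records, unique_days)
```

Run against the full file, the two numbers would be expected to match the 120,847 instances and 253 unique days reported above, provided the assumed layout holds.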
Please cite the following papers in order to use the datasets:
T. Andrade, B. Cancela, and J. Gama, "Discovering locations and habits from human mobility data," Annals of Telecommunications, vol. 75, no. 9, pp. 505–521, 2020. DOI: 10.1007/s12243-020-00807-x
T. Andrade, B. Cancela, and J. Gama, "From mobility data to habits and common pathways," Expert Systems, vol. 37, no. 6, p. e12627, 2020. DOI: 10.1111/exsy.12627
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset has been artificially generated to mimic real-world user interactions within a mobile application. It contains 100,000 rows of data, each row of which represents a single event or action performed by a synthetic user. The dataset was designed to capture many of the attributes commonly tracked by app analytics platforms, such as device details, network information, user demographics, session data, and event-level interactions.
User & Session Metadata
User ID: A unique integer identifier for each synthetic user.
Session ID: Randomly generated session identifiers (e.g., S-123456), capturing the concept of user sessions.
IP Address: Fake IP addresses generated via Faker to simulate different network origins.
Timestamp: Randomized timestamps (within the last 30 days) indicating when each interaction occurred.
Session Duration: An approximate measure (in seconds) of how long a user remained active.
Device & Technical Details
Device OS & OS Version: Simulated operating systems (Android/iOS) with plausible version numbers.
Device Model: Common phone models (e.g., “Samsung Galaxy S22,” “iPhone 14 Pro,” etc.).
Screen Resolution: Typical screen resolutions found in smartphones (e.g., “1080x1920”).
Network Type: Indicates whether the user was on Wi-Fi, 5G, 4G, or 3G.
Location & Locale
Location Country & City: Random global locations generated using Faker.
App Language: Represents the user’s app language setting (e.g., “en,” “es,” “fr,” etc.).
User Properties
Battery Level: The phone’s battery level as a percentage (0–100).
Memory Usage (MB): Approximate memory consumption at the time of the event.
Subscription Status: Boolean flag indicating if the user is subscribed to a premium service.
User Age: Random integer ranging from teenagers to seniors (13–80).
Phone Number: Fake phone numbers generated via Faker.
Push Enabled: Boolean flag indicating if the user has push notifications turned on.
Event-Level Interactions
Event Type: The action taken by the user (e.g., “click,” “view,” “scroll,” “like,” “share,” etc.).
Event Target: The UI element or screen component interacted with (e.g., “home_page_banner,” “search_bar,” “notification_popup”).
Event Value: A numeric field indicating additional context for the event (e.g., intensity, count, rating).
App Version: Simulated version identifier for the mobile application (e.g., “4.2.8”).
Data Quality & “Noise”
To better approximate real-world data, 1% of all fields have been intentionally “corrupted” or altered:
Typos and Misspellings: Random single-character edits, e.g., “Andro1d” instead of “Android.”
Missing Values: Some cells might be blank (None) to reflect dropped or unrecorded data.
Random String Injections: Occasional random alphanumeric strings inserted where they don’t belong.
These intentional discrepancies can help data scientists practice data cleaning, outlier detection, and data wrangling techniques.
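One possible way to handle the three kinds of noise in a categorical field such as Device OS is sketched below. This is an illustration, not the dataset author's cleaning procedure: it assumes the valid values are exactly "Android" and "iOS", uses fuzzy matching from the Python standard library, and the 0.8 similarity cutoff and the function name `clean_os` are arbitrary choices for this example.

```python
from difflib import get_close_matches

# Assumed vocabulary for the Device OS field.
KNOWN_OS = ["Android", "iOS"]


def clean_os(value, cutoff=0.8):
    """Map a possibly corrupted OS string to a known value, or None.

    - Missing values (None or blank) stay None.
    - Single-character typos such as "Andro1d" are mapped to the closest
      known value when the similarity ratio is at least `cutoff`.
    - Strings that resemble nothing in the vocabulary (e.g. injected random
      strings) are returned as None so they can be reviewed or dropped.
    """
    if value is None or not str(value).strip():
        return None
    matches = get_close_matches(str(value).strip(), KNOWN_OS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for raw in ["Android", "Andro1d", "iOS", None, "zzQ81xk"]:
    print(repr(raw), "->", repr(clean_os(raw)))
```

Short values leave little room for fuzzy matching: a one-character typo in "iOS" changes a third of the string and falls below the cutoff used here, so short categories may need an explicit lookup table instead.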
Data Cleaning & Preprocessing: Ideal for practicing how to handle missing values, inconsistent data, and noise in a realistic scenario.
Analytics & Visualization: Demonstrate user interaction funnels, session durations, usage by device/OS, etc.
Machine Learning & Modeling: Suitable for building classification or clustering models (e.g., user segmentation, event classification).
Simulation for Feature Engineering: Experiment with deriving new features (e.g., session frequency, average battery drain, etc.).
Synthetic Data: All entries (users, device info, IPs, phone numbers, etc.) are artificially generated and do not correspond to real individuals.
Privacy & Compliance: Since no real personal data is present, there are no direct privacy concerns. However, always handle synthetic data ethically.
The number of mobile broadband connections in the Philippines was forecast to increase continuously between 2024 and 2029 by a total of 18.3 million connections (+20.46 percent). After nine consecutive years of growth, the number of connections is estimated to reach 107.69 million and therefore a new peak in 2029. Mobile broadband connections include cellular connections with a download speed of at least 256 kbit/s (without satellite or fixed-wireless connections). Cellular Internet-of-Things (IoT) or machine-to-machine (M2M) connections are excluded. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of mobile broadband connections in countries like Vietnam and Laos.
https://spdx.org/licenses/CC0-1.0.html
We are publishing a walking activity dataset including inertial and positioning information from 19 volunteers, together with reference distances measured using a trundle wheel. The dataset includes a total of 96.7 km walked by the volunteers, split into 203 separate tracks. Two types of trundle wheel were used: an analogue trundle wheel, which provides the total number of meters walked in a single track, and a sensorized trundle wheel, which measures every revolution of the wheel and therefore records a continuous incremental distance.
Each track has data from the accelerometer and gyroscope embedded in the phones, location information from the Global Navigation Satellite System (GNSS), and the step count obtained by the device. The dataset can be used to implement walking distance estimation algorithms and to explore data quality in the context of walking activity and physical capacity tests, fitness, and pedestrian navigation.
Methods
The proposed dataset is a collection of walks where participants used their own smartphones to capture inertial and positioning information. The participants involved in the data collection come from two sites. The first site is the Oxford University Hospitals NHS Foundation Trust, United Kingdom, where 10 participants (7 affected by cardiovascular diseases and 3 healthy individuals) performed unsupervised 6MWTs in an outdoor environment of their choice (ethical approval obtained by the UK National Health Service Health Research Authority, protocol reference number: 17/WM/0355). All participants involved provided informed consent. The second site is Malmö University, in Sweden, where a group of 9 healthy researchers collected data. This dataset can be used by researchers to develop distance estimation algorithms and to study how data quality impacts the estimation.
All walks were performed by holding a smartphone in one hand, with an app collecting inertial data, the GNSS signal, and the step count. In the other hand, participants held a trundle wheel to obtain the ground truth distance. Two different trundle wheels were used: an analogue trundle wheel that allowed the registration of a single total value of walked distance, and a sensorized trundle wheel which collected timestamps and distance at every 1-meter revolution, resulting in continuous incremental distance information. The latter configuration is innovative and allows the use of temporal windows of the IMU data as input to machine learning algorithms to estimate walked distance. In the case of data collected by researchers, if the walks were done simultaneously and at a close distance from each other, only one person used the trundle wheel, and the reference distance was associated with all walks that were collected at the same time.
The walked paths are of variable length, duration, and shape. Participants were instructed to walk paths of increasing curvature, from straight to rounded. Irregular paths are particularly useful in determining limitations in the accuracy of walked distance algorithms. Two smartphone applications were developed for collecting the information of interest from the participants' devices, both available for Android and iOS operating systems. The first is a web application that retrieves inertial data (acceleration, rotation rate, orientation) while connecting to the sensorized trundle wheel to record incremental reference distance [1]. The second app is the Timed Walk app [2], which guides the user in performing a walking test by signalling when to start and when to stop the walk while collecting both inertial and positioning data. All participants in the UK used the Timed Walk app.
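To make the idea of window-level reference labels concrete, the sketch below turns a list of sensorized-wheel revolution timestamps into the distance walked in each fixed-length time window. It is an illustration under stated assumptions, not the authors' pipeline: one revolution is taken to equal 1 meter as described above, timestamps are taken to be in milliseconds from the start of the walk, and the function name, window length, and sample timestamps are hypothetical.

```python
from collections import Counter


def distance_per_window(rev_timestamps_ms, window_ms, metres_per_rev=1.0):
    """Distance (in metres) recorded by the wheel in each consecutive window.

    Window i covers the interval [i * window_ms, (i + 1) * window_ms).
    Windows with no revolution are reported as 0.0.
    """
    if not rev_timestamps_ms:
        return []
    counts = Counter(int(t // window_ms) for t in rev_timestamps_ms)
    last_window = max(counts)
    return [counts.get(w, 0) * metres_per_rev for w in range(last_window + 1)]


# Invented revolution timestamps (ms since the start of the walk).
revolutions = [800, 1700, 2500, 3400, 7200]
print(distance_per_window(revolutions, window_ms=2000))
```

Each value in the returned list could then be paired with the IMU samples that fall in the same window, giving one (features, distance) training example per window.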
The data collected during the walk comes from the Inertial Measurement Unit (IMU) of the phone and, when available, the Global Navigation Satellite System (GNSS). In addition, the step count information is retrieved from the sensors embedded in each participant’s smartphone. With the dataset, we provide a descriptive table with the characteristics of each recording, including brand and model of the smartphone, duration, reference total distance, and types of signals included, and additionally scoring some relevant parameters related to the quality of the various signals. The path curvature is one of the most relevant parameters. Previous literature from our team, in fact, confirmed the negative impact of curved paths across multiple distance estimation algorithms [3]. We visually inspected the walked paths and clustered them into three groups: a) straight path, i.e. no turns wider than 90 degrees, b) gently curved path, i.e. between one and five turns wider than 90 degrees, and c) curved path, i.e. more than five turns wider than 90 degrees. Other features relevant to the quality of the collected signals are the total amount of time above a threshold (0.05 s and 6 s) during which, respectively, inertial and GNSS data were missing due to technical issues or due to the app going into the background and thus losing access to the sensors, the sampling frequency of the different data streams, the average walking speed, and the smartphone position. The start of each walk is set as 0 ms, so no absolute time information is reported.
Walk locations collected in the UK are anonymized using the following approach: the first position is fixed to a central location of the city of Oxford (latitude: 51.7520, longitude: -1.2577) and all other positions are reassigned by applying a translation along the longitudinal and latitudinal axes which maintains the original distance and angle between samples. This way, the exact geographical location is lost, but the path shape and distances between samples are maintained. The difference between consecutive points “as the crow flies” and the path curvature were numerically and visually inspected to confirm that they match the original walks. Computations were made possible by using the Haversine Python library.
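The two checks described above can be sketched as follows. The authors used the Haversine Python library; this example instead writes out the haversine great-circle formula directly so that it has no dependencies, and the mean Earth radius constant is an assumption of this sketch. The curvature grouping mirrors the three classes listed above, with the number of turns wider than 90 degrees supplied by hand, since the original grouping was done by visual inspection. All function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in metres (assumed constant)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle ("as the crow flies") distance in metres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def path_length_m(points):
    """Sum of haversine distances between consecutive (lat, lon) samples."""
    return sum(haversine_m(*p, *q) for p, q in zip(points, points[1:]))


def curvature_class(wide_turns):
    """Group a path by its number of turns wider than 90 degrees."""
    if wide_turns == 0:
        return "straight"
    if wide_turns <= 5:
        return "gently curved"
    return "curved"


# One degree of longitude along the equator, split into two segments.
print(round(path_length_m([(0.0, 0.0), (0.0, 0.5), (0.0, 1.0)])))
print(curvature_class(3))
```

Comparing `path_length_m` on a walk before and after the translation is one way to verify numerically that distances between samples were maintained.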
Multiple datasets are available regarding walking activity recognition among other daily living tasks. However, few studies are published with datasets that focus on distance for both indoor and outdoor environments and that provide relevant ground truth information for it. Yan et al. [4] introduced an inertial walking dataset within indoor scenarios using a smartphone placed in 4 positions (on the leg, in a bag, in the hand, and on the body) by six healthy participants. The reference measurement used in this study is a Visual Odometry System embedded in a smartphone that has to be worn at chest level, using a strap to hold it. While interesting and detailed, this dataset lacks GNSS data, which is likely to be used in outdoor scenarios, and the reference used for localization also suffers from accuracy issues, especially outdoors. Vezovcnik et al. [5] analysed estimation models for step length and provided an open-source dataset with a total of 22 km of inertial-only walking data from 15 healthy adults. While relevant, their dataset focuses on steps rather than total distance and was acquired on a treadmill, which limits its validity in real-world scenarios. Kang et al. [6] proposed a way to estimate travelled distance by using an Android app that uses outdoor walking patterns to match them in indoor contexts for each participant. They collect data outdoors, including both inertial and positioning information, and they use average speed values obtained from the GPS data as reference labels. Afterwards, they use deep learning models to estimate walked distance, obtaining high performance. They report that 3% to 11% of the data for each participant was discarded due to low quality. Unfortunately, the name of the app used is not reported and the paper does not mention whether the dataset can be made available.
This dataset is heterogeneous in multiple respects. It includes a majority of healthy participants; therefore, it is not possible to generalize the outcomes from this dataset to all walking styles or physical conditions. The dataset is also heterogeneous from a technical perspective, given the differences in devices, acquired data, and smartphone apps used (i.e. some tests lack IMU or GNSS, and the sampling frequency on iPhone was particularly low). We suggest selecting the appropriate tracks based on the desired characteristics to obtain reliable and consistent outcomes.
This dataset allows researchers to develop algorithms to compute walked distance and to explore data quality and reliability in the context of walking activity. The dataset was initiated to investigate the digitalization of the 6MWT; however, the collected information can also be useful for other physical capacity tests that involve walking (distance- or duration-based), or for other purposes such as fitness and pedestrian navigation.
The article related to this dataset will be published in the proceedings of the IEEE MetroXRAINE 2024 conference, held in St. Albans, UK, 21-23 October.
This research is partially funded by the Swedish Knowledge Foundation and the Internet of Things and People research center through the Synergy project Intelligent and Trustworthy IoT Systems.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Methods: The data was collected via verbal survey; students were asked their gender, age, and smartphone preference. Hypothesis: It is hypothesized that older individuals would prefer Android over iPhone because they would be able to make the most of Android's complex features, and that males would also prefer Android because of its affordability and durability. Prediction 1: Females prefer iPhone. Prediction 2: Older students prefer Android.
This dataset encompasses mobile app usage, web clickstream and location visitation behavior, collected from over 150,000 triple-opt-in first-party US Daily Active Users (DAU). The only omnichannel meter at scale representing iOS and Android platforms.
The number of smartphone users in the Philippines was forecast to increase between 2024 and 2029 by a total of 5.6 million users (+7.29 percent). This overall increase does not happen continuously, notably not in 2026, 2027, 2028 and 2029. The smartphone user base is estimated to amount to 82.33 million users in 2029. Notably, the number of smartphone users was continuously increasing over the past years. Smartphone users here are limited to internet users of any age using a smartphone. The shown figures have been derived from survey data that has been processed to estimate missing demographics. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information).
https://creativecommons.org/publicdomain/zero/1.0/
This dataset, titled "Phone Listings from GSMArena.com," consists of two primary files: data.json and processed_data.csv, each containing detailed information about various phone models available on the market.
data.json File
This file holds the raw, unprocessed data scraped from GSMArena.com. The columns and attributes include:
phone_brand: The brand or manufacturer of the phone (e.g., Apple, Samsung, Xiaomi).
phone_model: The specific model or number of the phone.
price: The price point of the phone, which can either be an exact figure or a rough estimate. This column might require data cleaning due to inconsistencies.
specs: A nested dictionary that details the phone’s technical specifications. This includes features such as screen size, camera resolution, processor type, battery life, and other relevant hardware components.
pricing: A nested dictionary containing price listings for the phone across various e-commerce platforms.
processed_data.csv File
This file contains cleaned and processed phone data, aggregated from various e-commerce sources. The columns are more refined, and each phone entry provides comprehensive details:
phone_brand: The manufacturer or brand of the phone.
phone_model: The specific model or name of the phone.
store: The particular store or e-commerce platform where the phone is listed.
price: The price of the phone as a floating-point number, set in the native currency.
currency: The currency in which the phone is priced (e.g., USD, EUR).
price_USD: The phone price converted into USD.
storage: The storage capacity of the phone, measured in gigabytes (GB).
ram: The amount of RAM available in the phone, also measured in gigabytes (GB).
Launch: The official launch date of the phone, represented in a datetime format.
Dimensions: The physical dimensions of the phone, typically provided in millimeters (e.g., 163.8 x 76.8 x 8.9 mm).
Weight: The weight of the phone, measured in grams.
Display_Type: The type of display technology used, for example, "LTPO Super Retina XDR OLED, 120Hz, HDR10."
Display_Size: The size of the phone's display in inches.
Display_Resolution: The resolution of the phone's display (e.g., 1280 x 2856 pixels).
OS: The phone's operating system, such as iOS 18 or Android 14.
NFC: A flag indicating the presence of Near Field Communication (NFC), with values of 1 for phones that have NFC and 0 for phones that do not.
USB: The type of USB port (e.g., USB Type-C 3.2 Gen 2).
BATTERY: The battery capacity of the phone, measured in milliampere hours (mAh).
Features_Sensors: Various features and sensors included with the phone (e.g., fingerprint scanner, accelerometer).
Colors: Available color options for the phone model (e.g., Black Titanium, White Titanium).
Video: Camera specifications for video recording, including supported resolutions and frame rates (e.g., 4K@30fps).
Chipset: The chipset model in the phone, such as "Apple A18 Pro (3 nm)."
CPU: Specifications of the central processing unit (CPU) (e.g., Hexa-core, 2x4.05 GHz).
GPU: Specifications of the graphical processing unit (GPU).
Year: The year in which the phone model was released.
Foldable: A flag indicating whether the phone is foldable (1 = foldable, 0 = not foldable).
PPI_Density: The pixel density of the display in pixels per inch (ppi).
quantile_10, quantile_50, quantile_90: These columns represent the 10th, 50th (median), and 90th quantiles of phone prices in a given year.
price_range: This column classifies phones into different price ranges (low, medium, or high), based on their position in the price distribution (quantiles).
Overall, this dataset provides extensive information on phone models, offering both raw and processed views of phone listings, along with important price and technical details.
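As an illustration of how a derived column such as PPI_Density relates to Display_Resolution and Display_Size, the sketch below applies the standard pixel-density formula (diagonal length in pixels divided by diagonal length in inches). Whether the dataset's PPI_Density column was computed this way or copied from the source listings is not stated, so this is an assumption; the function name and the 6.9-inch diagonal in the usage line are hypothetical.

```python
import math
import re


def ppi_density(resolution: str, diagonal_inches: float) -> float:
    """Pixel density from a resolution string like "1280 x 2856 pixels".

    Takes the first two integers in the string as width and height in
    pixels, then divides the pixel diagonal by the physical diagonal.
    """
    width, height = (int(n) for n in re.findall(r"\d+", resolution)[:2])
    return math.hypot(width, height) / diagonal_inches


# Resolution string from the column description; the diagonal is a made-up value.
print(round(ppi_density("1280 x 2856 pixels", 6.9)))
```

A value computed this way can be compared against the stored PPI_Density column as a quick consistency check on the resolution and size fields.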
Infant Crying smartphone speech dataset, collected with Android smartphones and iPhones, covering infant crying. The dataset was collected from an extensive and geographically diverse set of speakers (201 people in total, with a balanced gender distribution), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage; all our datasets comply with GDPR, CCPA, and PIPL.
The FourGroceries dataset is collected for research purposes on price detection analysis. This dataset was collected from four grocery stores in Turkey in 2022 using mobile phones running iOS or Android. The dataset consists of 84 images of shelf labels, 21 images from each store.