23 datasets found

Number of smartphone users in the United States 2014-2029
statista.com
Updated May 5, 2025
Cite
Statista Research Department (2025). Number of smartphone users in the United States 2014-2029 [Dataset]. https://www.statista.com/topics/2711/us-smartphone-market/
Explore at:
Dataset updated
May 5, 2025
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
United States
Description
The number of smartphone users in the United States was forecast to increase continuously between 2024 and 2029 by a total of 17.4 million users (+5.61 percent). After the fifteenth consecutive year of growth, the smartphone user base is estimated to reach 327.54 million users in 2029, a new peak. Notably, the number of smartphone users has increased continuously over the past years. Smartphone users here are limited to internet users of any age using a smartphone. The figures shown have been derived from survey data that has been processed to estimate missing demographics. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of smartphone users in countries like Mexico and Canada.
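The headline figures can be cross-checked with a little arithmetic. The sketch below assumes that the 17.4 million increase and the 327.54 million peak both refer to the 2024 to 2029 window, and derives the implied 2024 baseline from them; the variable names are illustrative and not part of the dataset.

```python
# Figures quoted in the description (millions of users).
users_2029 = 327.54
increase_2024_2029 = 17.4

# Implied 2024 baseline, assuming the increase is measured from 2024 to 2029.
users_2024 = users_2029 - increase_2024_2029

# Relative growth over the period, in percent.
growth_pct = increase_2024_2029 / users_2024 * 100

print(f"implied 2024 baseline: {users_2024:.2f} million")
print(f"relative growth: {growth_pct:.2f} percent")
```

Under that assumption the relative growth rounds to the +5.61 percent quoted in the description.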
Network Traffic Android Malware
kaggle.com
zip
Updated Sep 12, 2019
Cite
Christian Urcuqui (2019). Network Traffic Android Malware [Dataset]. https://www.kaggle.com/datasets/xwolf12/network-traffic-android-malware
Explore at:
Available download formats: zip (116603 bytes)
Dataset updated
Sep 12, 2019
Authors
Christian Urcuqui
Description
Introduction

Android is one of the most used mobile operating systems worldwide. Due to its technological impact, its open-source code and the possibility of installing applications from third parties without any central control, Android has recently become a malware target. Although it includes security mechanisms, recent news about malicious activities and Android's vulnerabilities points to the importance of continuing to develop methods and frameworks that improve its security.

To prevent malware attacks, researchers and developers have proposed different security solutions, applying static analysis, dynamic analysis, and artificial intelligence. Indeed, data science has become a promising area in cybersecurity, since analytical models based on data allow for the discovery of insights that can help to predict malicious activities.

In this work, we propose to consider some network layer features as the basis for machine learning models that can successfully detect malware applications, using open datasets from the research community.

Content

This dataset is based on another dataset (DroidCollector), which provides all the network traffic as pcap files. In our research we preprocessed those files to obtain the network features described in the following article:

López, C. C. U., Villarreal, J. S. D., Belalcazar, A. F. P., Cadavid, A. N., & Cely, J. G. D. (2018, May). Features to Detect Android Malware. In 2018 IEEE Colombian Conference on Communications and Computing (COLCOM) (pp. 1-6). IEEE.

Acknowledgements

Cao, D., Wang, S., Li, Q., Cheny, Z., Yan, Q., Peng, L., & Yang, B. (2016, August). DroidCollector: A High Performance Framework for High Quality Android Traffic Collection. In Trustcom/BigDataSE/I SPA, 2016 IEEE (pp. 1753-1758). IEEE
Global smartphone sales to end users 2007-2023
statista.com
Updated Oct 15, 2024
Cite
Statista (2024). Global smartphone sales to end users 2007-2023 [Dataset]. https://www.statista.com/statistics/263437/global-smartphone-sales-to-end-users-since-2007/
Explore at:
Dataset updated
Oct 15, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
In 2022, smartphone vendors sold around 1.39 billion smartphones worldwide, with this number forecast to drop to 1.34 billion in 2023.

Smartphone penetration rate still on the rise

Less than half of the world’s total population owned a smart device in 2016, but the smartphone penetration rate has continued climbing, reaching 78.05 percent in 2020. By 2025, it is forecast that almost 87 percent of all mobile users in the United States will own a smartphone, an increase from the 27 percent of mobile users in 2010.

Smartphone end user sales

In the United States alone, sales of smartphones were projected to be worth around 73 billion U.S. dollars in 2021, an increase from 18 billion dollars in 2010. Global sales of smartphones are expected to increase from 2020 to 2021 in every major region, as the market starts to recover from the initial impact of the coronavirus (COVID-19) pandemic.
RICO dataset
kaggle.com
Updated Dec 2, 2021
Cite
Onur Gunes (2021). RICO dataset [Dataset]. https://www.kaggle.com/datasets/onurgunes1993/rico-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 2, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Onur Gunes
Description
Context

Data-driven models help mobile app designers understand best practices and trends, and can be used to make predictions about design performance and support the creation of adaptive UIs. This paper presents Rico, the largest repository of mobile app designs to date, created to support five classes of data-driven applications: design search, UI layout generation, UI code generation, user interaction modeling, and user perception prediction. To create Rico, we built a system that combines crowdsourcing and automation to scalably mine design and interaction data from Android apps at runtime. The Rico dataset contains design data from more than 9.3k Android apps spanning 27 categories. It exposes visual, textual, structural, and interactive design properties of more than 66k unique UI screens. To demonstrate the kinds of applications that Rico enables, we present results from training an autoencoder for UI layout similarity, which supports query-by-example search over UIs.

Content

Rico was built by mining Android apps at runtime via human-powered and programmatic exploration. Like its predecessor ERICA, Rico’s app mining infrastructure requires no access to — or modification of — an app’s source code. Apps are downloaded from the Google Play Store and served to crowd workers through a web interface. When crowd workers use an app, the system records a user interaction trace that captures the UIs visited and the interactions performed on them. Then, an automated agent replays the trace to warm up a new copy of the app and continues the exploration programmatically, leveraging a content-agnostic similarity heuristic to efficiently discover new UI states. By combining crowdsourcing and automation, Rico can achieve higher coverage over an app’s UI states than either crawling strategy alone. In total, 13 workers recruited on UpWork spent 2,450 hours using apps on the platform over five months, producing 10,811 user interaction traces. After collecting a user trace for an app, we ran the automated crawler on the app for one hour.

Acknowledgements

UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN https://interactionmining.org/rico

Inspiration

The Rico dataset is large enough to support deep learning applications. We trained an autoencoder to learn an embedding for UI layouts, and used it to annotate each UI with a 64-dimensional vector representation encoding visual layout. This vector representation can be used to compute structurally — and often semantically — similar UIs, supporting example-based search over the dataset. To create training inputs for the autoencoder that embed layout information, we constructed a new image for each UI capturing the bounding box regions of all leaf elements in its view hierarchy, differentiating between text and non-text elements. Rico’s view hierarchies obviate the need for noisy image processing or OCR techniques to create these inputs.
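The example-based search described above amounts to a nearest-neighbour lookup over the 64-dimensional layout vectors. The following is a minimal sketch only: the Euclidean distance, the function name `nearest_layouts`, and the toy vectors are assumptions for illustration and are not taken from the Rico release.

```python
import math

def nearest_layouts(query, embeddings, k=3):
    """Return the ids of the k embeddings closest to `query` (Euclidean distance)."""
    scored = sorted(
        (math.dist(query, vec), ui_id) for ui_id, vec in embeddings.items()
    )
    return [ui_id for _, ui_id in scored[:k]]

# Toy 64-dimensional vectors standing in for the real layout embeddings.
embeddings = {
    "ui_a": [0.0] * 64,
    "ui_b": [0.1] * 64,
    "ui_c": [1.0] * 64,
}
query = [0.05] * 63 + [0.0]
print(nearest_layouts(query, embeddings, k=2))
```

A brute-force scan like this is enough for experimentation; for the full set of more than 66k screens an approximate nearest-neighbour index may be worth considering.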
Data from: AndroCT: Ten Years of App Call Traces in Android
zenodo.org
explore.openaire.eu
application/gzip, txt
Updated Mar 8, 2022
Cite
Wen Li; Xiaoqin Fu; Haipeng Cai (2022). AndroCT: Ten Years of App Call Traces in Android [Dataset]. http://doi.org/10.5281/zenodo.6336104
Explore at:
Available download formats: application/gzip, txt
Unique identifier
https://doi.org/10.5281/zenodo.6336104
Dataset updated
Mar 8, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Wen Li; Xiaoqin Fu; Haipeng Cai
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A large-scale dataset of dynamic profiles, based on function calls, of 35,974 benign and malicious Android apps from 10 historical years (2010 through 2019). Function calls are a commonly used means of modeling program behaviors, and they may contribute to various code analysis approaches for assuring software correctness, reliability, and security. In particular, our dataset includes dynamic profiles of each app resulting from the same length of time (10 minutes) of being exercised by randomly generated inputs on both an emulator and a real device, enabling interesting and useful app analyses that reason about app behaviors from an evolutionary perspective while revealing the differences in app behavior across run-time hardware platforms. Since we have 20 yearly datasets associated with 35,974 unique Android apps across the 10 years, profiling these apps took 12,000 hours. Considering the cost of filtering out apps that were originally sampled but that we were unable to profile (for reasons such as broken APKs, incompatibility issues that prevented execution, or apps that could not be instrumented), we took over two years to produce all these traces. We hope to save future researchers' time in producing such a set of dynamic data and so enable their empirical and technical work.
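The 12,000-hour figure can be sanity-checked from the numbers in the description. The sketch below assumes that every app was profiled once for 10 minutes on each of the two platforms (emulator and real device), which is how the "20 yearly datasets" over 10 years are read here; it is an illustration and not the authors' own accounting.

```python
apps = 35_974
minutes_per_run = 10
platforms = 2  # emulator and real device (assumed: one run per app on each)

total_hours = apps * minutes_per_run * platforms / 60
print(f"{total_hours:,.0f} hours")
```

Under these assumptions the total comes to roughly 11,991 hours, consistent with the rounded figure of 12,000 hours.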

==================

Thanks for your interest in our dataset. Collecting this dataset took tremendous computational and human effort. Thus, please observe the following restrictions in using our dataset:

- Do not redistribute this dataset without our consent.
- Do not make commercial usage of this dataset.
- Get a faculty, or someone in a permanent position, to agree and commit to these conditions.
- When publishing your work that uses our dataset, please cite the following MSR 2021 data paper.

@inproceedings{AndroidCT,
title = {AndroCT: Ten Years of App Call Traces in Android},
author = {Wen Li and Xiaoqin Fu and Haipeng Cai},
booktitle = {The 18th International Conference on Mining Software Repositories (MSR 2021), Data Showcase Track},
year = {2021},
}
Google Play Store Apps
kaggle.com
zip
Updated Feb 3, 2019
Cite
Lavanya (2019). Google Play Store Apps [Dataset]. https://www.kaggle.com/lava18/google-play-store-apps
Explore at:
Available download formats: zip (2037893 bytes)
Dataset updated
Feb 3, 2019
Authors
Lavanya
Description
Context

While many public datasets (on Kaggle and the like) provide Apple App Store data, there are not many counterpart datasets available for Google Play Store apps anywhere on the web. On digging deeper, I found out that the iTunes App Store page uses a nicely indexed appendix-like structure that allows for simple and easy web scraping. On the other hand, the Google Play Store uses sophisticated modern-day techniques (like dynamic page load) using jQuery, making scraping more challenging.

Content

Each app (row) has values for category, rating, size, and more.

Acknowledgements

This information is scraped from the Google Play Store. This app information would not be available without it.

Inspiration

The Play Store apps data has enormous potential to drive app-making businesses to success. Actionable insights can be drawn for developers to work on and capture the Android market!
Android Process Memory String Dumps Dataset
researchdata.se
su.figshare.com
+1more
Updated May 11, 2017
Cite
Irvin Homem; Panagiotis Papapetrou (2017). Android Process Memory String Dumps Dataset [Dataset]. http://doi.org/10.17045/STHLMUNI.4989773
Explore at:
Unique identifier
https://doi.org/10.17045/STHLMUNI.4989773
Dataset updated
May 11, 2017
Dataset provided by
Stockholm University
Authors
Irvin Homem; Panagiotis Papapetrou
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A dataset containing 2375 samples of Android Process Memory String Dumps. The dataset is broadly composed of 2 classes: "Benign App" Memory Dumps and "Malicious App" Memory Dumps, respectively, split into 2 ZIP archives. The ZIP archives in total are approximately 17GB in size, however the unzipped contents are approximately 67GB.

This dataset is derived from a subset of the APK files originally made freely available for research through the AndroZoo project [1]. The AndroZoo project collected millions of Android applications and scanned them with the VirusTotal online malware scanning service, thereby classifying most of the apps as either malicious or benign at the time of scanning. The process memory dumps in this dataset were generated through running the subset of APK files from the AndroZoo dataset in an Android Emulator, capturing the process memory of the individual process and subsequently extracting only the strings from the process memory dump. This was facilitated through building 2 applications: Coriander and AndroMemDumpBeta which facilitate the running of Apps on Android Emulators, and the capturing of process memory respectively. The source code for these software applications is available on Github.

The individual samples are labelled with the SHA256 hash filename from the original AndroZoo labeling and the application package names extracted from within the specific APK manifest file. They also contain a time-stamp for when the memory dumping process took place for the specific file. The file extension used is ".dmp" to indicate that the files are memory dumps, however they only contain strings, and thus can be viewed in any simple text editor.

A subset of the first 10000 APK files from the original AndroZoo dataset is also included within this dataset. The metadata of these APK files is present in the file "AndroZoo-First-10000", and the 2375 Android apps that are the main subjects of our dataset are extracted from it.

Our dataset is intended to be used in furthering our research related to Machine Learning-based Triage for Android Memory Forensics. It has been made openly available in order to foster opportunities for collaboration with other researchers, to enable validation of research results as well as to enhance the body of knowledge in related areas of research.

References: [1]. K. Allix, T. F. Bissyandé, J. Klein, and Y. Le Traon. AndroZoo: Collecting Millions of Android Apps for the Research Community. Mining Software Repositories (MSR) 2016
Dataset of "Extinguishing Ransomware - A Hybrid Approach to Android...
zenodo.org
data.niaid.nih.gov
Updated Jan 24, 2020
Cite
Alberto Ferrante; Francesco Mercaldo; Miroslaw Malek; Jelena Milosevic (2020). Dataset of "Extinguishing Ransomware - A Hybrid Approach to Android Ransomware Detection" [Dataset]. http://doi.org/10.5281/zenodo.1420449
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.1420449
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Alberto Ferrante; Francesco Mercaldo; Miroslaw Malek; Jelena Milosevic
Description
Protection against ransomware is particularly relevant in systems running the Android operating system, due to its huge user base and, therefore, its potential for monetization by attackers. In "Extinguishing Ransomware - A Hybrid Approach to Android Ransomware Detection" (see references for details), we describe a hybrid (static + dynamic) malware detection method that has extremely good accuracy (100% detection rate, with a false positive rate below 4%).

We release a dataset related to the dynamic detection part of the aforementioned method, containing execution traces of ransomware Android applications, in order to facilitate further research as well as the adoption of dynamic detection in practice. The dataset contains execution traces from 666 ransomware applications taken from the Heldroid project [https://github.com/necst/heldroid] (the app repository is unavailable at the moment). Execution records were obtained by running the applications, one at a time, on the Android emulator. For each application, a maximum of 20,000 stimuli were applied with a maximum execution time of 15 minutes. For most of the applications, all the stimuli could be applied in this timeframe. In some of the traces, neither of the two limits is reached due to emulator hiccups. Collected features are related to memory and CPU usage, network interaction, and system calls, and they are monitored with a period of two seconds. The Android emulator of the Android Software Development Kit for Android 4.0 (release 20140702) was used. To guarantee that the system was always in a mint condition when a new sample was started, thus avoiding possible interference (e.g., changed settings, running processes, and modifications of the operating system files) from previously run samples, the Android operating system was re-initialized before running each application. The application execution process was automated by means of a shell script that made use of the Android Debug Bridge (adb) and that was run on a Linux PC. The Monkey application exerciser was used in the script as a generator of the aforementioned stimuli. The Monkey is a command-line tool that can be run on any emulator instance or on a device; it sends a pseudo-random stream of user events (stimuli) into the system, which acts as a stress test on the application software.
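The two limits described above (at most 20,000 stimuli, at most 15 minutes) can be sketched as follows. This is not the authors' shell script: it is a hypothetical Python wrapper around the standard `adb shell monkey -p <package> <event-count>` invocation, and the package name, function names, and return labels are placeholders for illustration.

```python
import subprocess

MAX_EVENTS = 20_000        # stimuli cap quoted in the description
MAX_SECONDS = 15 * 60      # execution-time cap quoted in the description

def monkey_command(package, events=MAX_EVENTS):
    """Build the adb command that sends `events` pseudo-random events to `package`."""
    return ["adb", "shell", "monkey", "-p", package, str(events)]

def run_monkey(package):
    """Run Monkey and report which of the two limits ended the run."""
    try:
        subprocess.run(monkey_command(package), timeout=MAX_SECONDS, check=False)
        return "finished"
    except subprocess.TimeoutExpired:
        return "time_limit"

# Hypothetical package name, for illustration only.
print(" ".join(monkey_command("com.example.app")))
```

Note that the timeout here only stops the local adb process; a real harness may also need to stop Monkey on the emulator and re-initialize the system image before the next sample, as the description requires.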

In this dataset, we provide both per-app CSV files as well as unified files, in which CSV files of single applications have been concatenated. The CSV files contain the features extracted from the raw execution record. The provided files are listed below:

ransom-per_app-csv.zip - features obtained by executing ransomware applications, one CSV per application

ransom-unified-csv.zip - features obtained by executing ransomware applications, only one CSV file
The manifest and store data of 870,515 Android mobile applications - Dataset...
b2find.eudat.eu
Updated Oct 23, 2023
+ more versions
Cite
(2023). The manifest and store data of 870,515 Android mobile applications - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/b25ee20e-5268-50ae-9914-4bc70bd4ff1c
Explore at:
Dataset updated
Oct 23, 2023
Description
We built a crawler to collect data from the Google Play store, including each application's metadata and APK file. The manifest files were extracted from the APK files and then processed to extract the features. The data set is composed of 870,515 records/apps, and for each app we produced 48 features. The data set was used to build and test two bootstrap aggregations of multiple XGBoost machine learning classifiers. The dataset was collected between April 2017 and November 2018. We then checked the status of these applications on three different occasions: December 2018, February 2019, and May-June 2019.
Android Malware Dataset with VirusTotal Labels
zenodo.org
zip
Updated May 1, 2024
Cite
Mohammed Rashed; Juan Tapiador; Guillermo Suarez-Tangil (2024). Android Malware Dataset with VirusTotal Labels [Dataset]. http://doi.org/10.5281/zenodo.11095700
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.11095700
Dataset updated
May 1, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Mohammed Rashed; Juan Tapiador; Guillermo Suarez-Tangil
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains labels of 2.47 million Android apk hashes extracted from VirusTotal reports.

The dataset was used in the experiments of our publication titled An Analysis of Android Malware Classification Services

The CSV of the labels extracted from the VirusTotal reports is provided in labeling_dataset.csv.gz. A cell value of -1 is used whenever there was no result from the engine for the given apk file hash value. The column names are provided in cols_labeling_dataset.csv.

Note

-1 is a string and not an integer
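Because the missing-result marker is the string "-1" rather than an integer, a loader should compare against the string. The sketch below uses only the Python standard library and a tiny in-memory stand-in for labeling_dataset.csv.gz; the sample rows, their column layout, and the function name `read_labels` are made up for illustration.

```python
import csv
import gzip
import io

MISSING = "-1"  # per the note above, a string sentinel rather than an integer

def read_labels(fileobj):
    """Yield rows as lists, replacing the "-1" sentinel with None."""
    for row in csv.reader(fileobj):
        yield [None if cell == MISSING else cell for cell in row]

# Tiny in-memory stand-in for labeling_dataset.csv.gz; the values are made up.
sample = "abc123,2016-06-01,Trojan,-1\ndef456,2015-01-15,-1,Adware\n"
compressed = gzip.compress(sample.encode("utf-8"))

with gzip.open(io.BytesIO(compressed), mode="rt", newline="") as fh:
    rows = list(read_labels(fh))

print(rows)
```

With the real file, `gzip.open` can be given the path to labeling_dataset.csv.gz directly, and the header names can be read from cols_labeling_dataset.csv.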

If you use information from this repo, please cite our paper

Rashed M, Suarez-Tangil G. An Analysis of Android Malware Classification Services. Sensors. 2021; 21(16):5671. https://doi.org/10.3390/s21165671

BibTeX

@Article{s21165671,
AUTHOR = {Rashed, Mohammed and Suarez-Tangil, Guillermo},
TITLE = {An Analysis of Android Malware Classification Services},
JOURNAL = {Sensors},
VOLUME = {21},
YEAR = {2021},
NUMBER = {16},
ARTICLE-NUMBER = {5671},
URL = {https://www.mdpi.com/1424-8220/21/16/5671},
ISSN = {1424-8220},
DOI = {10.3390/s21165671}
}

Required Software

gzip

Debian-based Linux: you may install it using the following command: apt-get install gzip

MacOS: gzip is pre-installed

Windows: you may download gzip from http://gnuwin32.sourceforge.net/packages/gzip.htm

How to use the file?

There are two ways to use the file:

Extract the gzip file to obtain a CSV output file. To do so, install gzip and then extract the .csv.gz file, for example with the command gunzip labeling_dataset.csv.gz

Extract information from the zipped file directly (following the same logic of AndroZoo's csv):
To extract the first column and save to a file called list_of_selected_sha256, run the following command:
zcat labeling_dataset.csv.gz | cut -d',' -f1 > list_of_selected_sha256
To obtain rows of apk hashes that were first seen after the 1st of May, 2016, run this command:
zcat labeling_dataset.csv.gz | grep -v ',snaggamea' | awk -F, '{if ( $2 >= "2016-05" ) {print} }'
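For readers who prefer to stay in Python, the date filter in the last command can be approximated as below. This is a sketch, not part of the dataset's tooling: it assumes, as the awk expression suggests, that the second column holds the first-seen date in a sortable year-month-day form, it omits the grep step, and the sample rows are made up.

```python
import csv
import gzip
import io

def rows_first_seen_from(fileobj, cutoff="2016-05"):
    """Yield rows whose second column sorts at or after `cutoff` as a string."""
    for row in csv.reader(fileobj):
        if len(row) > 1 and row[1] >= cutoff:
            yield row

# Made-up rows standing in for the compressed label file.
sample = "aaa,2016-04-30\nbbb,2016-05-01\nccc,2017-01-02\n"
with gzip.open(io.BytesIO(gzip.compress(sample.encode())), "rt", newline="") as fh:
    selected = [row[0] for row in rows_first_seen_from(fh)]

print(selected)
```

Like the awk version, this relies on plain string comparison, so it only behaves as intended while the dates are zero-padded and ordered year first.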
Android Mischief Dataset
stratosphereips.org
zip
Updated May 7, 2021
+ more versions
Cite
Kamila Babayeva (2021). Android Mischief Dataset [Dataset]. https://www.stratosphereips.org/android-mischief-dataset
Explore at:
Available download formats: zip
Dataset updated
May 7, 2021
Dataset provided by
Stratosphere Lab, Department of Electrical Engineering, Czech Technical University
Authors
Kamila Babayeva
License
Attribution 2.0 (CC BY 2.0), https://creativecommons.org/licenses/by/2.0/
License information was derived automatically
Time period covered
2020
Area covered
Czech Republic, Prague
Description
The Android Mischief Dataset is a dataset of network traffic from mobile phones infected with Android RATs. Its goal is to offer the community a dataset for learning about and analyzing the network behaviour of RATs, in order to propose new detections to protect our devices. The current version of the dataset includes 7 packet captures from 7 executed Android RATs. The Android Mischief Dataset was created at the Stratosphere Laboratory, Czech Technical University in Prague.
Replication Package for Datashow Case Paper "AndroidCompass: A Dataset of...
zenodo.org
data.niaid.nih.gov
zip
Updated Aug 20, 2021
Cite
Sebastian Nielebock; Paul Blockhaus; Jacob Krüger; Frank Ortmeier (2021). Replication Package for Datashow Case Paper "AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories" [Dataset]. http://doi.org/10.5281/zenodo.4428340
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.4428340
Dataset updated
Aug 20, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sebastian Nielebock; Paul Blockhaus; Jacob Krüger; Frank Ortmeier
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository provides the dataset AndroidCompass and scripts for re-creation of it from the paper "AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories" by Sebastian Nielebock, Paul Blockhaus, Jacob Krüger, and Frank Ortmeier from the Faculty of Computer Science of the Otto-von-Guericke University Magdeburg, Germany. The paper has been accepted for publication at the Data Showcase track at the 18th International Working Conference on Mining Software Repositories.

All scripts and data sets are provided by the authors and come without any guarantee. For any issues regarding replication do not hesitate to contact us ({sebastian.nielebock,paul.blockhaus,jacob.krueger,frank.ortmeier}

If you use or refer to these datasets, please cite our paper using the following BibTex entry.

@InProceedings{nielebock2021androidcompass,
author = {Sebastian Nielebock and Paul Blockhaus and Jacob Kr\"{u}ger and Frank Ortmeier},
booktitle = {Proceedings of the 2021 IEEE/ACM 18th International Conference on Mining Software Repositories (MSR)},
title = {{AndroidCompass}: A Dataset of Android Compatibility Checks in Code Repositories},
year = {2021},
organization = {IEEE},
pages = {535-539},
note = {preprint available at \url{https://arxiv.org/abs/2103.09620}},
doi = {10.1109/MSR52588.2021.00069},
}
android_control_test
huggingface.co
Updated Aug 8, 2025
Cite
InfiX.ai (2025). android_control_test [Dataset]. https://huggingface.co/datasets/InfiX-ai/android_control_test
Explore at:
Dataset updated
Aug 8, 2025
Dataset provided by
InfiX.ai
License
Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Processed Android Control Test Set for InfiGUI-R1 Evaluation

Dataset Description

This repository contains the processed test set derived from the Android Control dataset by Google Research. It has been specifically prepared for evaluating the performance of our model, InfiGUI-R1. The InfiGUI-R1 model is detailed in our paper:

InfiGUI-R1: Advancing Multimodal GUI Agents from Reactive Actors to Deliberative Reasoners

This dataset facilitates standardized testing and… See the full description on the dataset page: https://huggingface.co/datasets/InfiX-ai/android_control_test.
Sample Beiwe Dataset
zenodo.org
data.niaid.nih.gov
zip
Updated Apr 20, 2022
+ more versions
Cite
Patrick Emedom-Nnamdi; Kenzie W. Carlson; Zachary Clement; Marta Karas; Marcin Straczkiewicz; Jukka-Pekka Onnela (2022). Sample Beiwe Dataset [Dataset]. http://doi.org/10.5281/zenodo.6471045
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.6471045
Dataset updated
Apr 20, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Patrick Emedom-Nnamdi; Kenzie W. Carlson; Zachary Clement; Marta Karas; Marcin Straczkiewicz; Jukka-Pekka Onnela
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is a public release of Beiwe-generated data. The Beiwe Research Platform collects high-density data from a variety of smartphone sensors such as GPS, WiFi, Bluetooth, gyroscope, and accelerometer, in addition to metadata from active surveys. A description of the passive and active data streams, and documentation on the use of Beiwe, can be found here. This data was collected from an internal test study and is made available solely for educational purposes. It contains no identifying information; subject locations are de-identified using the noise GPS feature of Beiwe.

As part of the internal test study, data from 6 participants were collected from the start of March 21, 2022 to the end of March 28, 2022. The local time zone of this study is Eastern Standard Time. Each participant was notified to complete a survey at 9am EST on Monday, Thursday, and Saturday of the study week. An additional survey was administered on Tuesday at 5:15pm EST. For each survey, subjects were asked to respond to the prompt "How much time (in hours) do you think you spent at home?".
The icsi/netalyzr-android dataset
impactcybertrust.org
Updated Jan 21, 2019
Cite
External Data Source (2019). The icsi/netalyzr-android dataset [Dataset]. http://doi.org/10.23721/100/1478847
Explore at:
Unique identifier
https://doi.org/10.23721/100/1478847
Dataset updated
Jan 21, 2019
Authors
External Data Source
Description
This dataset was collected by the ICSI Netalyzr app for Android to develop a characterization of how operational decisions, such as network configurations, business models, and relationships between operators introduce diversity in service quality and affect user security and privacy. We delve in detail beyond the radio link and into network configuration and business relationships in six countries. We identify the widespread use of transparent middleboxes such as HTTP and DNS proxies, analyzing how they actively modify user traffic, compromise user privacy, and potentially undermine user security. In addition, we identify network sharing agreements between operators, highlighting the implications of roaming and characterizing the properties of MVNOs, including that a majority are simply rebranded versions of major operators. More broadly, our findings using this data highlight the importance of considering higher-layer relationships when seeking to analyze mobile traffic in a sound fashion. ; narseo@icsi.berkeley.edu
📱Smartphone Processors Ranking & Scores📊
kaggle.com
Updated Jan 31, 2023
Cite
Alan Jo (2023). 📱Smartphone Processors Ranking & Scores📊 [Dataset]. https://www.kaggle.com/datasets/alanjo/smartphone-processors-ranking
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jan 31, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Alan Jo
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Welcome to the ultimate Android vs iOS battle with this Smartphone SoC dataset!

Includes three .csv files. Any analysis is appreciated, even if it is a short one 😎

Context

Benchmarks allow for easy comparison between multiple devices by scoring their performance on a standardized series of tests, and they are useful in many instances, for example when buying a new phone or tablet.

Content

smartphone cpu_stats.csv is the main data. Updated performance rating of smartphone SoCs as of 2022. Includes summary of Geekbench 5 and AnTuTu v9 scores. Includes CPU specs such as clock speed, core count, core config, and GPU.

ML ALL_benchmarks.csv is the Geekbench ML Benchmark data. It tells you how well each smartphone performs on machine learning tasks. The data is gathered from user-submitted Geekbench ML results in the Geekbench Browser. To make sure the results accurately reflect the average performance of each device, the dataset only includes devices with at least five unique results in the Geekbench Browser.

antutu android vs ios_v4.csv is the AnTuTu benchmarks data. It includes information about CPU, GPU, MEM, UX and Total score.

Antutu Benchmarks

1. Total Score

Benchmark apps give your device an overall numerical score as well as individual scores for each test they perform. The overall score is the sum of those individual scores. These numbers don't mean much on their own; they're just helpful for comparing different devices. For example, if your device's score is 300000, a device with a score of 600000 is about twice as fast. You can use individual test scores to compare the relative performance of specific parts of different devices. For example, you could compare how fast your phone's storage performs compared to another phone's storage.
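As a rough illustration of the arithmetic described here, the sketch below adds four sub-scores (CPU, GPU, MEM and UX, the components the dataset description lists) into a total and compares two devices by ratio. The function names and all the numbers are hypothetical and are not taken from the CSV files.

```python
def total_score(cpu, gpu, mem, ux):
    """Overall score as the sum of the four sub-scores."""
    return cpu + gpu + mem + ux


def relative_speed(score_a, score_b):
    """How many times faster device A is than device B, judged by score alone."""
    return score_a / score_b


# Hypothetical sub-scores, chosen only to illustrate the arithmetic.
device_a = total_score(cpu=150_000, gpu=250_000, mem=100_000, ux=100_000)
device_b = total_score(cpu=80_000, gpu=110_000, mem=60_000, ux=50_000)

print(device_a, device_b, relative_speed(device_a, device_b))
```

The same ratio can be taken on a single sub-score, such as MEM, to compare one component across devices.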

2. CPU Score

The first part of the overall score is your CPU score. The CPU score in turn includes the output of CPU Mathematical Operations, CPU Common Algorithms, and CPU Multi-Core. In simpler words, the CPU score means how fast your phone processes commands. Your device's central processing unit (CPU) does most of the number-crunching. A faster CPU can run apps faster, so everything on your device will seem faster. Of course, once you get to a certain point, CPU speed won't affect performance much. However, a faster CPU may still help when running more demanding applications, such as high-end games.

3. GPU Score

The second part of the overall score is your GPU score. This score is made up of the output of graphical components like Metal, OpenGL or Vulkan, depending on your device. The GPU score reflects how well your phone displays 2D and 3D graphics. Your device's graphics processing unit (GPU) handles accelerated graphics. When you play a game, your GPU kicks into gear and renders the 3D graphics or accelerates the shiny 2D graphics. Many interface animations and other transitions also use the GPU. The GPU is optimized for these sorts of graphics operations. The CPU could perform them, but it's more general-purpose and would take more time and battery power. You can say that your GPU does the graphics number-crunching, so a higher score here is better.

4. MEM score

The third part of the overall score is your MEM score. The MEM score combines the results of RAM Access, ROM APP IO, ROM Sequential Read and Write, and ROM Random Access. In simpler words, the MEM score reflects how fast your phone's memory is and how much of it there is. RAM stands for random-access memory, while ROM stands for read-only memory. Your device uses RAM as working memory, while flash storage or an internal SD card is used for long-term storage. The faster it can write to and read data from its RAM, the faster your device will perform. Your RAM is constantly being used on your device, whatever you're doing. While RAM is volatile in nature, ROM is its opposite. RAM mostly stores temporary data, while ROM is used to store permanent data like the firmware of your phone. Both the RAM and ROM make up the memory of your phone, helping it to perform tasks efficiently.

5. UX Score

The fourth and final part of the overall score is your UX score. The UX score combines the results of the Data Security, Data Processing, Image Processing, User Experience, and Video CTS and Decode tests. It represents how the device's "user experience" will be in the real world. It's a number you can look at to get a feel for a device's overall performance without digging into the above benchmarks or relying too much on the overall score.

Acknowledgements

Sourced from Geekbench and AnTuTu.

JUIndoorLoc: Indoor Localization using WiFi
kaggle.com
Updated Dec 9, 2024
Priya Roy (2024). JUIndoorLoc: Indoor Localization using WiFi [Dataset]. https://www.kaggle.com/datasets/priyaroycse/juindoorloc-wifi-fingerprint-indoor-localization/suggestions?status=pending&yourSuggestions=true
Dataset updated
Dec 9, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Priya Roy
Description
The WiFi Fingerprint Dataset, JUIndoorLoc, for Indoor Localization contains received signal strength data (RSS) collected from multiple WiFi access points (APs) across various predefined indoor locations. Each entry in the dataset corresponds to a unique location identified by specific coordinates as labels and includes RSS values from nearby APs. The dataset is typically structured to facilitate machine learning applications for indoor positioning, offering sufficient diversity in environmental conditions such as temporal, indoor ambiance, and device variability. It serves as a benchmark for evaluating machine learning algorithms in tasks like location classification, important AP identification, indoor region clustering, etc. This dataset is indispensable for research in indoor location-based services. The dataset is published and analyzed in the following research paper.

Roy P, Chowdhury C, Ghosh D, Bandyopadhyay S. JUIndoorLoc: A Ubiquitous Framework for Smartphone-Based Indoor Localization Subject to Context and Device Heterogeneity. Wireless Personal Communications. 2019:1-24, doi:10.1007/s11277-019-06188-2. Link: https://link.springer.com/article/10.1007/s11277-019-06188-2

Dataset Information:
- The data was captured on three floors of a five-story building at Jadavpur University at different times, under different indoor ambience, and with different devices.
- Data was collected with a 1 meter × 1 meter cell size so that different WiFi signal patterns of rooms, laboratories, corridors, and stairs can be investigated.
- 172 WiFi APs appear in the dataset.
- A total of 1000 location points across the three floors are covered.
- The RSS of the WiFi APs was collected by 4 Android devices with different configurations.
- The RSS values are negative integers ranging from -11 dBm to -100 dBm (extremely weak signal).
- A value of -110 dBm fills in missing entries, which indicate APs that were not detected.

Information about Features:

| Feature | Description |
| --- | --- |
| Cid | A unique number to identify the indoor region where the capture is taken. Each cell number has two parts; the first part is the floor number and the second part is the position of the cell on the two-dimensional building map. |
| AP001-AP172 | Received signal strength values of the 172 APs. Negative integer values from -11 dBm to -100 dBm, with -110 used to mark APs that were not detected during the scan. |
| Rs | Status of the room, either 1 or 0. 1 and 0 represent open and closed rooms, respectively. |
| Hpr | Presence or absence of humans, either 1 or 0. 1 and 0 represent the presence and absence of humans, respectively. |
| Did | A unique identifier assigned to each Android device used to capture data. D1: Samsung Galaxy Tab 2, Android version 4.1.1; D2: Samsung Galaxy Tab E, Android version 5.0; D3: Samsung Galaxy Tab 10, Android version 4.0; D4: Motorola Moto E 2nd Generation, Android version 5.1. |
| Ts | 13-digit integer value recording the time when the fingerprint is taken. |
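A minimal sketch of how one fingerprint row might be handled, given the feature descriptions above. It drops the -110 "not detected" sentinel and converts `Ts` to a timestamp. Two things here are assumptions, not statements from the dataset page: that the 13-digit `Ts` is Unix epoch time in milliseconds, and that a row is available as a dictionary keyed by feature name. The sample row is invented.

```python
from datetime import datetime, timezone

MISSING_RSS = -110  # sentinel the description uses for APs not detected in a scan


def detected_aps(row):
    """Return {AP column: RSS in dBm} for the APs actually seen in this scan."""
    return {
        name: rss
        for name, rss in row.items()
        if name.startswith("AP") and rss != MISSING_RSS
    }


def ts_to_datetime(ts):
    # Assumption: the 13-digit Ts value is Unix epoch time in milliseconds.
    return datetime.fromtimestamp(ts / 1000, tz=timezone.utc)


# Invented example row with only three of the 172 AP columns.
row = {"AP001": -63, "AP002": -110, "AP003": -88,
       "Rs": 1, "Hpr": 0, "Did": "D1", "Ts": 1500000000000}

print(detected_aps(row))
print(ts_to_datetime(row["Ts"]).isoformat())
```

Treating -110 as missing rather than as a real measurement keeps it from being averaged into signal-strength statistics. Whether to mask it or keep it as a feature value depends on the model being trained.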

Applications:

This dataset is ideal for: Predictive Modeling: Train and test models for predicting the location of a user in an indoor region.

Educational and Research Purpose: Practice data exploration, cleaning, and analysis in a realistic WiFi fingerprint dataset that considers time, indoor ambience, and device heterogeneity.

License: This dataset is shared for educational and research purposes. Please refer to the following publication when using it in any project or publication.

Roy P, Chowdhury C, Ghosh D, Bandyopadhyay S. JUIndoorLoc: A Ubiquitous Framework for Smartphone-Based Indoor Localization Subject to Context and Device Heterogeneity. Wireless Personal Communications. 2019:1-24, doi:10.1007/s11277-019-06188-2. Link: https://link.springer.com/article/10.1007/s11277-019-06188-2
ANTI-PHISHING IN ANDROID PHONE PROJECT REPORT
kaggle.com
Updated Jul 7, 2025
Kamal Acharya (2025). ANTI-PHISHING IN ANDROID PHONE PROJECT REPORT [Dataset]. http://doi.org/10.34740/kaggle/dsv/12394228
Unique identifier
https://doi.org/10.34740/kaggle/dsv/12394228
Dataset updated
Jul 7, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Kamal Acharya
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Phishing is a word derived from 'fishing'. It refers to an attack in which the attacker lures users to a fake Web site by sending them fake e-mails (or instant messages) and stealthily obtains the victim's personal information, such as user name, password, and national security ID. This information can then be used for targeted advertisements or even identity theft attacks (e.g., transferring money from the victim's bank account). The most frequently used attack method is to send potential victims e-mails that appear to come from banks, online organizations, or ISPs. These e-mails make up a pretext, e.g. that the password of your credit card has been mis-entered many times, or that an upgrade service is being offered, to lure you to visit the attacker's Web site and confirm or modify your account number and password through the hyperlink provided in the e-mail (Leon, 2008).
Number of smartphone users in the Philippines 2014-2029
statista.com
Statista Research Department, Number of smartphone users in the Philippines 2014-2029 [Dataset]. https://www.statista.com/topics/8230/smartphones-market-in-the-philippines/
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
Philippines
Description
The number of smartphone users in the Philippines was forecast to increase between 2024 and 2029 by a total of 5.6 million users (+7.29 percent). This overall increase does not happen continuously, notably not in 2026, 2027, 2028 and 2029. The smartphone user base is estimated to amount to 82.33 million users in 2029. Smartphone users here are limited to internet users of any age using a smartphone. The figures shown have been derived from survey data that has been processed to estimate missing demographics. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of smartphone users in countries like Thailand and Indonesia.
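The three figures quoted in this description can be cross-checked with a few lines of arithmetic. The sketch below derives the implied 2024 base from the 2029 estimate and the stated increase, then computes the growth rate against that base. The variable names are illustrative, and the inputs are the rounded values from the text, so only approximate agreement should be expected.

```python
users_2029 = 82.33  # million users in 2029, from the description
increase = 5.6      # million users added between 2024 and 2029, from the description

users_2024 = users_2029 - increase          # implied 2024 base, in millions
pct_increase = increase / users_2024 * 100  # growth relative to that base

print(f"implied 2024 base: {users_2024:.2f} million")
print(f"implied growth: {pct_increase:.2f} percent")
```

With these rounded inputs the result lands within a few hundredths of a point of the stated +7.29 percent; the small gap presumably comes from the increase being rounded to 5.6 million.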
The ORBIT (Object Recognition for Blind Image Training)-India Dataset
data.niaid.nih.gov
zenodo.org
Updated Jul 2, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Grayson, Martin (2024). The ORBIT (Object Recognition for Blind Image Training)-India Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11394528
Dataset updated
Jul 2, 2024
Dataset provided by
Massiceti, Daniela
Pearson, Jennifer
Robinson, Simon
Jones, Matt
India, Gesu
Morrison, Cecily
Grayson, Martin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
India
Description
The ORBIT (Object Recognition for Blind Image Training)-India Dataset is a collection of 105,243 images of 76 commonly used objects, collected by 12 individuals in India who are blind or have low vision. This dataset is an "Indian subset" of the original ORBIT dataset [1, 2], which was collected in the UK and Canada. In contrast to the ORBIT dataset, which was created in a Global North, Western, and English-speaking context, the ORBIT-India dataset features images taken in a low-resource, non-English-speaking, Global South context, home to 90% of the world’s population of people with blindness. Since it is easier for blind or low-vision individuals to gather high-quality data by recording videos, this dataset, like the ORBIT dataset, contains images (each sized 224x224) derived from 587 videos. These videos were taken by our data collectors from various parts of India using the Find My Things [3] Android app. Each data collector was asked to record eight videos of at least 10 objects of their choice.

Collected between July and November 2023, this dataset represents a set of objects commonly used by people who are blind or have low vision in India, including earphones, talking watches, toothbrushes, and typical Indian household items like a belan (rolling pin), and a steel glass. These videos were taken in various settings of the data collectors' homes and workspaces using the Find My Things Android app.

The image dataset is stored in the ‘Dataset’ folder, organized by folders assigned to each data collector (P1, P2, ...P12) who collected them. Each collector's folder includes sub-folders named with the object labels as provided by our data collectors. Within each object folder, there are two subfolders: ‘clean’ for images taken on clean surfaces and ‘clutter’ for images taken in cluttered environments where the objects are typically found. The annotations are saved inside an ‘Annotations’ folder containing a JSON file per video (e.g., P1--coffee mug--clean--231220_084852_coffee mug_224.json) that contains keys corresponding to all frames/images in that video (e.g., "P1--coffee mug--clean--231220_084852_coffee mug_224--000001.jpeg": {"object_not_present_issue": false, "pii_present_issue": false}, "P1--coffee mug--clean--231220_084852_coffee mug_224--000002.jpeg": {"object_not_present_issue": false, "pii_present_issue": false}, ...). The ‘object_not_present_issue’ key is true if the object is not present in the image, and the ‘pii_present_issue’ key is true if personally identifiable information (PII) is present in the image. Note that all PII present in the images has been blurred to protect the identity and privacy of our data collectors. This dataset version was created by cropping images originally sized at 1080 × 1920; therefore, an unscaled version of the dataset will follow soon.
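The annotation layout described above can be read with a short script. The sketch below parses one per-video JSON document, keeps the frames where neither issue flag is set, and splits a frame name on `--`. The field names `collector`, `label`, `setting`, `video` and `frame` are inferred from the single example filename and are an assumption; an object label that itself contained `--` would break the split. The sample data is modelled on the example in the description, with one flag flipped for illustration.

```python
import json


def usable_frames(annotation_text):
    """Return frame names whose annotation reports no issue."""
    frames = json.loads(annotation_text)
    return [
        name
        for name, flags in frames.items()
        if not flags["object_not_present_issue"] and not flags["pii_present_issue"]
    ]


def parse_frame_name(name):
    # Assumed layout: collector--label--setting--video--frame.jpeg
    collector, label, setting, video, frame_file = name.split("--")
    return {
        "collector": collector,
        "label": label,
        "setting": setting,
        "video": video,
        "frame": int(frame_file.split(".")[0]),
    }


# Sample modelled on the description; the second frame's flag is invented.
sample = json.dumps({
    "P1--coffee mug--clean--231220_084852_coffee mug_224--000001.jpeg":
        {"object_not_present_issue": False, "pii_present_issue": False},
    "P1--coffee mug--clean--231220_084852_coffee mug_224--000002.jpeg":
        {"object_not_present_issue": True, "pii_present_issue": False},
})

kept = usable_frames(sample)
print(kept)
print(parse_frame_name(kept[0]))
```

Whether to drop frames flagged for PII is a design choice: the description says PII has already been blurred, so those frames may still be usable for training.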

This project was funded by the Engineering and Physical Sciences Research Council (EPSRC) Industrial ICASE Award with Microsoft Research UK Ltd. as the Industrial Project Partner. We would like to acknowledge and express our gratitude to our data collectors for their efforts and time invested in carefully collecting videos to build this dataset for their community. The dataset is designed for developing few-shot learning algorithms, aiming to support researchers and developers in advancing object-recognition systems. We are excited to share this dataset and would love to hear from you if and how you use this dataset. Please feel free to reach out if you have any questions, comments or suggestions.

REFERENCES:

[1] Daniela Massiceti, Lida Theodorou, Luisa Zintgraf, Matthew Tobias Harris, Simone Stumpf, Cecily Morrison, Edward Cutrell, and Katja Hofmann. 2021. ORBIT: A real-world few-shot dataset for teachable object recognition collected from people who are blind or low vision. DOI: https://doi.org/10.25383/city.14294597

[2] microsoft/ORBIT-Dataset. https://github.com/microsoft/ORBIT-Dataset

[3] Linda Yilin Wen, Cecily Morrison, Martin Grayson, Rita Faia Marques, Daniela Massiceti, Camilla Longden, and Edward Cutrell. 2024. Find My Things: Personalized Accessibility through Teachable AI for People who are Blind or Low Vision. In Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (CHI EA '24). Association for Computing Machinery, New York, NY, USA, Article 403, 1–6. https://doi.org/10.1145/3613905.3648641

Number of smartphone users in the United States 2014-2029

46 scholarly articles cite this dataset (View in Google Scholar)
