9 datasets found
  1. Network Traffic Android Malware

    • kaggle.com
    zip
    Updated Sep 12, 2019
    Cite
    Christian Urcuqui (2019). Network Traffic Android Malware [Dataset]. https://www.kaggle.com/datasets/xwolf12/network-traffic-android-malware
    Available download formats: zip (116603 bytes)
    Dataset updated
    Sep 12, 2019
    Authors
    Christian Urcuqui
    Description

    Introduction

    Android is one of the most used mobile operating systems worldwide. Due to its technological impact, its open-source code and the possibility of installing applications from third parties without any central control, Android has recently become a malware target. Even though it includes security mechanisms, recent news about malicious activities and Android's vulnerabilities points to the importance of continuing the development of methods and frameworks to improve its security.

    To prevent malware attacks, researchers and developers have proposed different security solutions, applying static analysis, dynamic analysis, and artificial intelligence. Indeed, data science has become a promising area in cybersecurity, since analytical models based on data allow for the discovery of insights that can help to predict malicious activities.

    In this work, we propose to consider some network layer features as the basis for machine learning models that can successfully detect malware applications, using open datasets from the research community.

    Content

    This dataset is based on another dataset (DroidCollector), which provides all the network traffic in pcap files. In our research, we preprocessed the files to extract the network features described in the following article:

    López, C. C. U., Villarreal, J. S. D., Belalcazar, A. F. P., Cadavid, A. N., & Cely, J. G. D. (2018, May). Features to Detect Android Malware. In 2018 IEEE Colombian Conference on Communications and Computing (COLCOM) (pp. 1-6). IEEE.

    Acknowledgements

    Cao, D., Wang, S., Li, Q., Chen, Z., Yan, Q., Peng, L., & Yang, B. (2016, August). DroidCollector: A High Performance Framework for High Quality Android Traffic Collection. In Trustcom/BigDataSE/ISPA, 2016 IEEE (pp. 1753-1758). IEEE.

  2. Myket Android Application Install Dataset

    • paperswithcode.com
    Updated Aug 12, 2023
    + more versions
    Cite
    Erfan Loghmani; Mohammadamin Fazli (2023). Myket Android Application Install Dataset [Dataset]. https://paperswithcode.com/dataset/myket-android-application-install
    Dataset updated
    Aug 12, 2023
    Authors
    Erfan Loghmani; Mohammadamin Fazli
    Description

    This dataset contains information on application install interactions of users in the Myket android application market. The dataset was created for the purpose of evaluating interaction prediction models, requiring user and item identifiers along with timestamps of the interactions. Hence, the dataset can be used for interaction prediction and building a recommendation system. Furthermore, the data forms a dynamic network of interactions, and we can also perform network representation learning on the nodes in the network, which are users and applications.

    Data Creation

    The dataset was initially generated by the Myket data team, and later cleaned and subsampled by Erfan Loghmani, a master's student at Sharif University of Technology at the time. The data team focused on a two-week period and randomly sampled 1/3 of the users with interactions during that period. They then selected install and update interactions for three months before and after the two-week period, resulting in interactions spanning about 6 months and two weeks.

    We further subsampled and cleaned the data to focus on application download interactions. We identified the top 8000 most installed applications and selected interactions related to them. We retained users with more than 32 interactions, resulting in 280,391 users. From this group, we randomly selected 10,000 users, and the data was filtered to include only interactions for these users. The detailed procedure can be found here.
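
The subsampling steps described above can be sketched in Python. This is a minimal illustration on toy data, not the authors' actual procedure: the function name `subsample`, its parameters, and the order in which the filters are applied are assumptions, and the real thresholds (8000 applications, more than 32 interactions, 10,000 users) would be passed in as arguments.

```python
import random
from collections import Counter


def subsample(interactions, top_n_apps, min_interactions, n_users, seed=0):
    """Filter (user_id, app_name, timestamp) triplets in three steps.

    Hypothetical helper: the step order follows the prose description only.
    """
    # Step 1: keep only interactions with the most-installed applications.
    app_counts = Counter(app for _, app, _ in interactions)
    top_apps = {app for app, _ in app_counts.most_common(top_n_apps)}
    kept = [row for row in interactions if row[1] in top_apps]

    # Step 2: keep users with more than `min_interactions` interactions.
    user_counts = Counter(user for user, _, _ in kept)
    eligible = sorted(u for u, c in user_counts.items() if c > min_interactions)

    # Step 3: randomly select users and keep only their interactions.
    rng = random.Random(seed)
    chosen = set(rng.sample(eligible, min(n_users, len(eligible))))
    return [row for row in kept if row[0] in chosen]


# Toy data (made up for illustration).
toy = [
    ("u1", "app.a", 1), ("u1", "app.b", 2), ("u1", "app.a", 3),
    ("u2", "app.a", 4), ("u2", "app.c", 5),
    ("u3", "app.b", 6), ("u3", "app.a", 7), ("u3", "app.b", 8),
]
sample = subsample(toy, top_n_apps=2, min_interactions=2, n_users=1)
```

With the real data, the equivalent call would use `top_n_apps=8000`, `min_interactions=32`, and `n_users=10000`.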

    Data Structure

    The dataset has two main files.

    - myket.csv: This file contains the interaction information and follows the same format as the datasets used in the "JODIE: Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks" (ACM SIGKDD 2019) project. However, this data does not contain state labels and interaction features, resulting in associated columns being all zero.
    - app_info_sample.csv: This file comprises features associated with applications present in the sample. For each individual application, information such as the approximate number of installs, average rating, count of ratings, and category are included. These features provide insights into the applications present in the dataset.

    Dataset Details

    - Total Instances: 694,121 install interaction instances
    - Instances Format: Triplets of user_id, app_name, timestamp
    - 10,000 users and 7,988 Android applications
    - Item features for 7,606 applications

    For a detailed summary of the data's statistics, including information on users, applications, and interactions, please refer to the Python notebook available at summary-stats.ipynb. The notebook provides an overview of the dataset's characteristics and can be helpful for understanding the data's structure before using it for research or analysis.

    Top 20 Most Installed Applications

    | Package Name                       | Count of Interactions |
    | ---------------------------------- | --------------------- |
    | com.instagram.android              | 15292                 |
    | ir.resaneh1.iptv                   | 12143                 |
    | com.tencent.ig                     | 7919                  |
    | com.ForgeGames.SpecialForcesGroup2 | 7797                  |
    | ir.nomogame.ClutchGame             | 6193                  |
    | com.dts.freefireth                 | 6041                  |
    | com.whatsapp                       | 5876                  |
    | com.supercell.clashofclans         | 5817                  |
    | com.mojang.minecraftpe             | 5649                  |
    | com.lenovo.anyshare.gps            | 5076                  |
    | ir.medu.shad                       | 4673                  |
    | com.firsttouchgames.dls3           | 4641                  |
    | com.activision.callofduty.shooter  | 4357                  |
    | com.tencent.iglite                 | 4126                  |
    | com.aparat                         | 3598                  |
    | com.kiloo.subwaysurf               | 3135                  |
    | com.supercell.clashroyale          | 2793                  |
    | co.palang.QuizOfKings              | 2589                  |
    | com.nazdika.app                    | 2436                  |
    | com.digikala                       | 2413                  |

    Comparison with SNAP Datasets

    The Myket dataset introduced in this repository exhibits distinct characteristics compared to the real-world datasets used by the JODIE project. The table below provides a comparative overview of the key dataset characteristics:

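    | Dataset   | #Users | #Items | #Interactions | Average Interactions per User | Average Unique Items per User |
    | --------- | ------ | ------ | ------------- | ----------------------------- | ----------------------------- |
    | Myket     | 10,000 | 7,988  | 694,121       | 69.4                          | 54.6                          |
    | LastFM    | 980    | 1,000  | 1,293,103     | 1,319.5                       | 158.2                         |
    | Reddit    | 10,000 | 984    | 672,447       | 67.2                          | 7.9                           |
    | Wikipedia | 8,227  | 1,000  | 157,474       | 19.1                          | 2.2                           |
    | MOOC      | 7,047  | 97     | 411,749       | 58.4                          | 25.3                          |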

    The Myket dataset stands out by having an ample number of both users and items, highlighting its relevance for real-world, large-scale applications. Unlike LastFM, Reddit, and Wikipedia datasets, where users exhibit repetitive item interactions, the Myket dataset contains a comparatively lower amount of repetitive interactions. This unique characteristic reflects the diverse nature of user behaviors in the Android application market environment.
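
The two per-user columns in the comparison can be reproduced from interaction triplets. The sketch below is illustrative only: `per_user_averages` is a hypothetical helper and the toy rows are made up. For Myket, the first figure is simply 694,121 interactions divided by 10,000 users, which rounds to 69.4.

```python
from collections import defaultdict


def per_user_averages(interactions):
    """Return (average interactions per user, average unique items per user)."""
    items_by_user = defaultdict(list)
    for user_id, item_id, _timestamp in interactions:
        items_by_user[user_id].append(item_id)
    n_users = len(items_by_user)
    avg_interactions = sum(len(v) for v in items_by_user.values()) / n_users
    avg_unique = sum(len(set(v)) for v in items_by_user.values()) / n_users
    return avg_interactions, avg_unique


# Toy data: u1 repeats item "a" three times, u2 never repeats an item.
toy = [
    ("u1", "a", 1), ("u1", "a", 2), ("u1", "b", 3), ("u1", "a", 4),
    ("u2", "c", 5), ("u2", "d", 6),
]
avg_interactions, avg_unique = per_user_averages(toy)
print(avg_interactions, avg_unique)  # 3.0 2.0
```

A small gap between the two averages, as in the Myket row (69.4 versus 54.6), indicates that users rarely interact with the same item twice.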

    Citation

    If you use this dataset in your research, please cite the following preprint:

    @misc{loghmani2023effect,
      title={Effect of Choosing Loss Function when Using T-batching for Representation Learning on Dynamic Networks},
      author={Erfan Loghmani and MohammadAmin Fazli},
      year={2023},
      eprint={2308.06862},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
    }

  3. The icsi/netalyzr-android dataset

    • impactcybertrust.org
    Updated Jan 21, 2019
    Cite
    External Data Source (2019). The icsi/netalyzr-android dataset [Dataset]. http://doi.org/10.23721/100/1478847
    Dataset updated
    Jan 21, 2019
    Authors
    External Data Source
    Description

    This dataset was collected by the ICSI Netalyzr app for Android to develop a characterization of how operational decisions, such as network configurations, business models, and relationships between operators, introduce diversity in service quality and affect user security and privacy. We delve in detail beyond the radio link and into network configuration and business relationships in six countries. We identify the widespread use of transparent middleboxes such as HTTP and DNS proxies, analyzing how they actively modify user traffic, compromise user privacy, and potentially undermine user security. In addition, we identify network sharing agreements between operators, highlighting the implications of roaming and characterizing the properties of MVNOs, including that a majority are simply rebranded versions of major operators. More broadly, our findings using this data highlight the importance of considering higher-layer relationships when seeking to analyze mobile traffic in a sound fashion. Contact: narseo@icsi.berkeley.edu

  4. ITC-Net-MingledApp: A comprehensive dataset of mixed mobile application...

    • data.mendeley.com
    Updated Oct 7, 2024
    + more versions
    Cite
    Abolghasem Rezaei Khesal (2024). ITC-Net-MingledApp: A comprehensive dataset of mixed mobile application traffic for robust network traffic classification, domain adaptation, and generalization in diverse environments - Tehran Dataset #1 [Dataset]. http://doi.org/10.17632/9frgkybxhn.1
    Dataset updated
    Oct 7, 2024
    Authors
    Abolghasem Rezaei Khesal
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tehran
    Description

    This repository is part of the ITC-Net-MingledApp dataset, which includes network traffic data from 36 Android applications, with each capture featuring concurrent traffic from multiple applications and smartphones. This repository contains part #1 of the data related to the Iran-Tehran scenario. Each capture is stored in a compressed file containing the relevant PCAP files of the associated applications. The PCAP files are named according to the convention {TimeStamp}_{Application Name}{Download-Upload Speed}.pcap. Part #2 of the Iran-Tehran scenario is in the Tehran Dataset #2 repository (https://doi.org/10.17632/zsffy3j9y6.1).
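
As a rough sketch of how the naming convention might be handled in Python, the helper below splits a capture name at its first underscore. It assumes the timestamp itself contains no underscore; the function name and the example file name are hypothetical. Because the convention shows no separator between the application name and the speed field, the sketch leaves those two parts joined rather than guessing where one ends.

```python
from pathlib import PurePosixPath


def split_capture_name(filename):
    """Split '{TimeStamp}_{rest}.pcap' into (timestamp, rest).

    Assumption: the timestamp contains no underscore. The remainder still
    holds the application name and the download-upload speed together.
    """
    path = PurePosixPath(filename)
    if path.suffix != ".pcap":
        raise ValueError(f"not a .pcap file: {filename}")
    timestamp, separator, remainder = path.stem.partition("_")
    if not separator:
        raise ValueError(f"no underscore in capture name: {filename}")
    return timestamp, remainder


# Hypothetical file name, made up for illustration only.
parts = split_capture_name("20240101T120000_ExampleApp10-5.pcap")
print(parts)  # ('20240101T120000', 'ExampleApp10-5')
```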

  5. Brainport, Platooning, formation improvement by traffic light data

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 24, 2020
    + more versions
    Cite
    TNO; TASS; NXP; Technolution; NEVS (National Electric Vehicle Sweden) (2020). Brainport, Platooning, formation improvement by traffic light data [Dataset]. http://doi.org/10.5281/zenodo.3606608
    Available download formats: zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    TNO; TASS; NXP; Technolution; NEVS (National Electric Vehicle Sweden)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Scenario description:

    Platoon formation with live traffic light data included in planner.
    - Enabled live traffic light data included in planner
    - Not using the Android app
    - Starting at default locations
    - This test was filmed, including the GUI.

    Session description:

    Platoon formation improvement by traffic light data.

    Datasets descriptions:

    AUTOPILOT_BrainPort_Platooning_DriverVehicleInteraction: Data extracted from the CAN of the vehicle

    This dataset contains e.g. throttlestatus, clutchstatus, brakestatus, brakeforce, wipersstatus, steeringwheel for the vehicle

    AUTOPILOT_BrainPort_Platooning_EnvironmentSensorsAbsolute: Data extracted from the vehicle environment sensors

    This dataset contains information about detected objects, with absolute coordinates

    AUTOPILOT_BrainPort_Platooning_EnvironmentSensorsRelative: Data extracted from the vehicle environment sensors

    This dataset contains information about detected objects, with relative coordinates

    AUTOPILOT_BrainPort_Platooning_IotVehicleMessage: Data sent between all devices, vehicles and services

    Each sensor data submission is a Message. A Message has an Envelope, a Path, and optionally (but likely) Path Events and Path Media. The envelope bears fundamental information about the individual sender (the vehicle), but not at a level that would allow the owner of the vehicle to be identified or different messages to be linked to a single vehicle.

    AUTOPILOT_BrainPort_Platooning_PlatoonFormation: Data sent from PlatoonService to vehicle

    This dataset contains information about the route and speed for a specific vehicle for forming a platoon

    AUTOPILOT_BrainPort_Platooning_PlatooningAction: Data logged by vehicle

    This dataset contains information about the current status of the platooning

    AUTOPILOT_BrainPort_Platooning_PlatooningEvent: Data logged by vehicle

    This dataset contains information about the identifiers used for each specific platooning event

    AUTOPILOT_BrainPort_Platooning_PlatoonStatus: Data sent by vehicle to PlatoonService

    This dataset contains information about the current status of the platooning

    AUTOPILOT_BrainPort_Platooning_PositioningSystem: Data from GPS on the vehicle

    This dataset contains speed, longitude, latitude, heading from the GPS

    AUTOPILOT_BrainPort_Platooning_PositioningSystemResample: Data from GPS on the vehicle

    This dataset contains speed, longitude, latitude, heading from the GPS, resampled to 100 milliseconds

    AUTOPILOT_BrainPort_Platooning_PSInfo: Data sent by PlatoonService to the vehicle

    This dataset contains speed and route information for the vehicle to create a platoon

    AUTOPILOT_BrainPort_Platooning_Target: Data from sensors on the vehicle

    Target detection in the vicinity of the host vehicle, by a vehicle sensor or virtual sensor

    AUTOPILOT_BrainPort_Platooning_Vehicle: Data from the CAN and sensors about the state of the vehicle

    This dataset contains, among other things, the temperature and battery state of the vehicles

    AUTOPILOT_BrainPort_Platooning_VehicleDynamics: Data from the CAN and sensors about the state of the vehicle

    This dataset contains, among other things, accelerations and speedlimit of the vehicle, as observed from the CAN and the external sensors
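
The PositioningSystemResample dataset listed above is described as GPS data resampled to 100 milliseconds. The source does not say which resampling method was used, so the following Python sketch is only one plausible approach: it carries the last observed value forward onto a fixed 100 ms grid. The function name, the `(timestamp_ms, value)` layout, and the sample values are all assumptions.

```python
from bisect import bisect_right


def resample_hold(samples, step_ms=100):
    """Resample (timestamp_ms, value) pairs onto a fixed grid.

    Assumes `samples` is sorted by timestamp. Each grid point takes the
    value of the last sample at or before it (zero-order hold).
    """
    if not samples:
        return []
    times = [t for t, _ in samples]
    resampled = []
    t = times[0]
    while t <= times[-1]:
        idx = bisect_right(times, t) - 1  # last sample at or before t
        resampled.append((t, samples[idx][1]))
        t += step_ms
    return resampled


# Made-up speed readings arriving every 130 ms.
gps_speed = [(0, 10.0), (130, 11.0), (260, 12.0), (390, 12.5)]
resampled = resample_hold(gps_speed)
print(resampled)  # [(0, 10.0), (100, 10.0), (200, 11.0), (300, 12.0)]
```

Holding the last value avoids interpolating across the 0/360 degree wrap of a heading field, which is why it is used here instead of linear interpolation.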

  6. Brainport, Platooning, platoon with live traffic light

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 24, 2020
    + more versions
    Cite
    TNO; TASS; NXP; Technolution; NEVS (National Electric Vehicle Sweden) (2020). Brainport, Platooning, platoon with live traffic light [Dataset]. http://doi.org/10.5281/zenodo.3606589
    Available download formats: zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    TNO; TASS; NXP; Technolution; NEVS (National Electric Vehicle Sweden)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Scenario description:

    Platoon formation and platooning, from Helmond to Eindhoven and back to the Automotive Campus.
    - Starting in urban area with speed limits of 15 and 30 km/h.
    - Driving East on the Europaweg with speed limits of 50 and 70 km/h. This includes 3 crossings with traffic lights.
    - Driving on the N270, along the Automotive Campus. One crossing with traffic lights, just before the A270.
    - Driving on the A270 (speed limit 100 km/h). Interrupted by one traffic light.
    - U-turn at the fly-over or at the end of the A270, to return the same way to the Automotive Campus.

    Session description:

    Platoon formation and platooning, with live traffic light data included in planner.
    - Live traffic light data available for planner
    - Driver uses the Android app
    - Starting at default locations
    - Platooning (CACC and lane keeping) on the A270 when possible.

    Datasets descriptions:

    AUTOPILOT_BrainPort_Platooning_DriverVehicleInteraction: Data extracted from the CAN of the vehicle

    This dataset contains e.g. throttlestatus, clutchstatus, brakestatus, brakeforce, wipersstatus, steeringwheel for the vehicle

    AUTOPILOT_BrainPort_Platooning_EnvironmentSensorsAbsolute: Data extracted from the vehicle environment sensors

    This dataset contains information about detected objects, with absolute coordinates

    AUTOPILOT_BrainPort_Platooning_EnvironmentSensorsRelative: Data extracted from the vehicle environment sensors

    This dataset contains information about detected objects, with relative coordinates

    AUTOPILOT_BrainPort_Platooning_IotVehicleMessage: Data sent between all devices, vehicles and services

    Each sensor data submission is a Message. A Message has an Envelope, a Path, and optionally (but likely) Path Events and Path Media. The envelope bears fundamental information about the individual sender (the vehicle), but not at a level that would allow the owner of the vehicle to be identified or different messages to be linked to a single vehicle.

    AUTOPILOT_BrainPort_Platooning_PlatoonFormation: Data sent from PlatoonService to vehicle

    This dataset contains information about the route and speed for a specific vehicle for forming a platoon

    AUTOPILOT_BrainPort_Platooning_PlatooningAction: Data logged by vehicle

    This dataset contains information about the current status of the platooning

    AUTOPILOT_BrainPort_Platooning_PlatooningEvent: Data logged by vehicle

    This dataset contains information about the identifiers used for each specific platooning event

    AUTOPILOT_BrainPort_Platooning_PlatoonStatus: Data sent by vehicle to PlatoonService

    This dataset contains information about the current status of the platooning

    AUTOPILOT_BrainPort_Platooning_PositioningSystem: Data from GPS on the vehicle

    This dataset contains speed, longitude, latitude, heading from the GPS

    AUTOPILOT_BrainPort_Platooning_PositioningSystemResample: Data from GPS on the vehicle

    This dataset contains speed, longitude, latitude, heading from the GPS, resampled to 100 milliseconds

    AUTOPILOT_BrainPort_Platooning_PSInfo: Data sent by PlatoonService to the vehicle

    This dataset contains speed and route information for the vehicle to create a platoon

    AUTOPILOT_BrainPort_Platooning_Target: Data from sensors on the vehicle

    Target detection in the vicinity of the host vehicle, by a vehicle sensor or virtual sensor

    AUTOPILOT_BrainPort_Platooning_Vehicle: Data from the CAN and sensors about the state of the vehicle

    This dataset contains, among other things, the temperature and battery state of the vehicles

    AUTOPILOT_BrainPort_Platooning_VehicleDynamics: Data from the CAN and sensors about the state of the vehicle

    This dataset contains, among other things, accelerations and speedlimit of the vehicle, as observed from the CAN and the external sensors

  7. Potential weaknesses and other security issues per app.

    • figshare.com
    Updated Jun 11, 2023
    Cite
    Vasileios Kouliaridis; Georgios Kambourakis; Efstratios Chatzoglou; Dimitrios Geneiatakis; Hua Wang (2023). Potential weaknesses and other security issues per app. [Dataset]. http://doi.org/10.1371/journal.pone.0251867.t006
    Dataset updated
    Jun 11, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Vasileios Kouliaridis; Georgios Kambourakis; Efstratios Chatzoglou; Dimitrios Geneiatakis; Hua Wang
    Description

    Potential weaknesses and other security issues per app.

  8. CoronaMelder Statistics

    • ckan.mobidatalab.eu
    Updated Jul 13, 2023
    + more versions
    Cite
    OverheidNl (2023). CoronaMelder Statistics [Dataset]. https://ckan.mobidatalab.eu/dataset/coronamelder-statistieken
    Available download formats: csv (http://publications.europa.eu/resource/authority/file-type/csv)
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    OverheidNl
    License

    Public Domain Mark 1.0 https://creativecommons.org/publicdomain/mark/1.0/
    License information was derived automatically

    Description

    In this table you will find information about CoronaMelder. This concerns two variables:

    1. The number of people who downloaded CoronaMelder
    2. The number of people who warned others via CoronaMelder

    1. The number of downloads is based on data from:
    - App Store (iOS)
    - Play Store (Android)
    - Huawei App Gallery (Android)

    2. If you have tested positive for corona, you can voluntarily indicate this in the app, together with an employee of the GGD. The numbers show how many people have done this.

  9. cuckoo

    • impactcybertrust.org
    • search.datacite.org
    Updated Jun 15, 2019
    Cite
    External Data Source (2019). cuckoo [Dataset]. http://doi.org/10.23721/100/1503942
    Dataset updated
    Jun 15, 2019
    Authors
    External Data Source
    Description

    Cuckoo Sandbox is the leading open source automated malware analysis system. You can throw any suspicious file at it and in a matter of seconds Cuckoo will provide you back some detailed results outlining what such file did when executed inside an isolated environment.

    Cuckoo Sandbox is free software that automates the task of analyzing any malicious file under Windows, OS X, Linux, and Android.

    What can it do?

    Cuckoo Sandbox is an advanced, extremely modular, and 100% open source automated malware analysis system with infinite application opportunities. By default it is able to:

    - Analyze many different malicious files (executables, office documents, pdf files, emails, etc.) as well as malicious websites under Windows, Linux, Mac OS X, and Android virtualized environments.
    - Trace API calls and general behavior of the file and distill this into high level information and signatures comprehensible by anyone.
    - Dump and analyze network traffic, even when encrypted with SSL/TLS, with native network routing support to drop all traffic or route it through InetSIM, a network interface, or a VPN.
    - Perform advanced memory analysis of the infected virtualized system through Volatility as well as on a process memory granularity using YARA.

    Due to Cuckoo's open source nature and extensive modular design, one may customize any aspect of the analysis environment, analysis results processing, and reporting stage. Cuckoo provides you all the requirements to easily integrate the sandbox into your existing framework and backend in the way you want, with the format you want, and all of that without licensing requirements.


Network Traffic Android Malware

Malware and Benign Apps

6 scholarly articles cite this dataset