30 datasets found
  1. ec2_cpu_utilization

    • kaggle.com
    Updated Aug 3, 2025
    Cite
    #Piyush (2025). ec2_cpu_utilization [Dataset]. https://www.kaggle.com/datasets/piyushnaik/ec2-cpu-utilization/discussion?sort=undefined
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 3, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    #Piyush
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    CPU utilization time series dataset for anomaly detection

  2. Data from: Large-Scale Curated Multivariate Time Series Anomaly Detection Dataset for Laptop Performance Metrics

    • data.mendeley.com
    Updated Jul 7, 2025
    Cite
    Veena More (2025). Large-Scale Curated Multivariate Time Series Anomaly Detection Dataset for Laptop Performance Metrics [Dataset]. http://doi.org/10.17632/97jn6xrs84.1
    Explore at:
    Dataset updated
    Jul 7, 2025
    Authors
    Veena More
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    High-quality multivariate time-series datasets are significantly less accessible compared to more common data types such as images or text, due to the resource-intensive process of continuous monitoring, precise annotation, and long-term observation. This paper introduces a cost-effective solution in the form of a large-scale, curated dataset specifically designed for anomaly detection in computing systems’ performance metrics. The dataset encompasses 45 GB of multivariate time-series data collected from 66 systems, capturing key performance indicators such as CPU usage, memory consumption, disk I/O, system load, and power consumption across diverse hardware configurations and real-world usage scenarios. Annotated anomalies, including performance degradation and resource inefficiencies, provide a reliable benchmark and ground truth for evaluating anomaly detection models. By addressing the accessibility challenges associated with time-series data, this resource facilitates advancements in machine learning applications, including anomaly detection, predictive maintenance, and system optimisation. Its comprehensive and practical design makes it a foundational asset for researchers and practitioners dedicated to developing reliable and efficient computing systems.
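
    Since the anomalies are annotated with start and end times, the labels can serve as ground truth for scoring detectors. A minimal point-wise scoring sketch (the label and prediction vectors below are synthetic illustrations, not values from the dataset):

```python
# Point-wise precision/recall/F1 of a detector against annotated anomaly labels.
def point_scores(labels, preds):
    tp = sum(l == 1 and p == 1 for l, p in zip(labels, preds))
    fp = sum(l == 0 and p == 1 for l, p in zip(labels, preds))
    fn = sum(l == 1 and p == 0 for l, p in zip(labels, preds))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

labels = [0, 0, 1, 1, 1, 0, 0, 1]  # 1 = annotated anomaly (illustrative)
preds  = [0, 1, 1, 1, 0, 0, 0, 1]  # 1 = detector flagged (illustrative)
precision, recall, f1 = point_scores(labels, preds)
```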

  3. AI-Based Job Site Matching

    • kaggle.com
    Updated Feb 11, 2023
    Cite
    The Devastator (2023). AI-Based Job Site Matching [Dataset]. https://www.kaggle.com/datasets/thedevastator/ai-based-job-site-matching/versions/2
    Explore at:
    Croissant
    Dataset updated
    Feb 11, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    License

    CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    AI-Based Job Site Matching

    Leveraging 400k+ Hours of Resource & Performance Data


    About this dataset

    Selecting an optimal site for GlideinWMS jobs is no small feat: it means weighing many critical variables and performing sophisticated calculations to maximize the gains. This dataset offers a helping hand, with detailed resource metrics and time-series analysis spanning over 400K hours of data to speed the search for the right site for every job.

    Specifically, the dataset contains three files: dataset_classification.csv, which provides information on critical elements such as disk usage and CPU cache size; dataset_time_series_analysis.csv, featuring takeaways from careful time-series analysis; and dataset_400k_hour.csv, gathering computation results from over 400K hours of testing. With columns such as Failure (whether or not the job failed), TotalCpus (the total number of CPUs used by the job), CpuIsBusy (whether or not the CPU is busy), and SlotType (the type of slot used by the job), plotting a path to a good match is straightforward.


    How to use the dataset

    This dataset can be used to help identify the most suitable site for GlideinWMS jobs. It contains resource metrics and time-series analysis, which can provide useful insight into the suitability of each potential site. The dataset consists of three sets: dataset_classification.csv, dataset_time_series_analysis.csv and dataset_400k_hour.csv.

    The first set provides a high-level view of the critical resource metrics that matter when matching a job to a site: DiskUsage, TotalCpus, TotalMemory, TotalDisk, CpuCacheSize, TotalVirtualMemory, and TotalSlots, along with whether the CpuIsBusy and the SlotType for each job at each potential site. It also records Failure, should an issue arise during the process, and Site, so users can restrict matches to sites within their own environment where policy or business rules require it.

    The second set provides detailed time-series analysis of these metrics over longer timeframes, with LastUpdate indicating when the analysis was generated and ydate, mdate, and hdate giving the year, month, and hour of the last update. The data is refreshed on a regular basis, so up-to-the-minute decisions can be made during peak workloads or during reallocations caused by anomalous usage patterns in existing systems.

    Finally, the third set goes one step further, with detailed information from the 400K+ hours of analytical data collection, allowing you to select the best possible matches across multiple sites and criteria using a single tool.

    By taking advantage of this data, an AI-driven approach can support optimal job selection across many scenarios: maximizing efficiency, boosting throughput through real-time scaling, and strengthening governance when moving from static scheduling strategies to more reactive, utilization-driven ones, increasing stability while lowering maintenance costs over the long run.
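
    As a sketch of how the described files might be explored, the snippet below computes per-site failure rates with pandas. The column names follow the description above; the inline frame is a hypothetical stand-in for pd.read_csv("dataset_classification.csv"):

```python
import pandas as pd

# Hypothetical rows using the column names from the description above;
# replace the literal frame with pd.read_csv("dataset_classification.csv").
df = pd.DataFrame({
    "Site": ["A", "A", "B", "B"],
    "SlotType": ["Static", "Dynamic", "Static", "Dynamic"],
    "Failure": [0, 1, 0, 0],
})

# Fraction of failed jobs per site: one simple site-ranking heuristic.
failure_rate = df.groupby("Site")["Failure"].mean()
```

    Ranking candidate sites by observed failure rate (or by available TotalCpus/TotalMemory) is one simple matching heuristic this data supports.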

    Research Ideas

    • Use the total CPU, memory, and disk usage metrics to identify jobs that need additional resources to complete quickly, and suggest alternative sites with more optimal resource availability
    • Utilize the time-series analysis (failure rate, last-update time series, and the month/hour/year of last update) to build predictive models for job-site matching and failure avoidance on future jobs
    • Identify scheduling inefficiencies by cross-examining job types (slot type) and CPU cache size requirements against historical data, to find opportunities for optimization or new approaches to job organization

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0

  4. BigDataAD Benchmark Dataset

    • figshare.com
    zip
    Updated Sep 29, 2023
    Cite
    Kingsley Pattinson (2023). BigDataAD Benchmark Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.24040563.v8
    Explore at:
    zip (available download formats)
    Dataset updated
    Sep 29, 2023
    Dataset provided by
    figshare
    Authors
    Kingsley Pattinson
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0.html

    Description

    The largest real-world dataset for multivariate time series anomaly detection (MTSAD), drawn from the AIOps system of a Real-Time Data Warehouse (RTDW) at a top cloud computing company. All metrics and labels in the dataset are derived from real-world scenarios. All metrics were obtained from the RTDW instance monitoring system and cover a rich variety of metric types, including CPU usage, queries per second (QPS), and latency, which relate to many important modules within the RTDW. We obtain labels from the ticket system, which integrates three main sources of instance anomalies: user service requests, instance unavailability, and fault simulations. User service requests refer to tickets submitted directly by users, whereas instance unavailability is typically detected through existing monitoring tools or discovered by Site Reliability Engineers (SREs). Since the system is usually very stable, we augment the anomaly samples by conducting fault simulations. A fault simulation is a special type of anomaly, planned beforehand, which is introduced to the system to test its performance under extreme conditions. All records in the ticket system are subject to follow-up processing by engineers, who meticulously mark the start and end times of each ticket. This rigorous approach ensures the accuracy of the labels in our dataset.

  5. harpertokenSysMon

    • huggingface.co
    Cite
    harper, harpertokenSysMon [Dataset]. https://huggingface.co/datasets/harpertoken/harpertokenSysMon
    Explore at:
    Authors
    harper
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    harpertokenSysMon Dataset

      Dataset Summary
    

    This open-source dataset captures real-time system metrics from macOS for time-series analysis, anomaly detection, and predictive maintenance.

      Dataset Features
    

    OS Compatibility: macOS
    Data Collection Interval: 1-5 seconds
    Total Storage Limit: 4GB
    File Format: CSV & Parquet
    Data Fields:
    timestamp: Date and time of capture
    cpu_usage: CPU usage percentage per core
    memory_used_mb: RAM usage in MB… See the full description on the dataset page: https://huggingface.co/datasets/harpertoken/harpertokenSysMon.
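
    As a sketch of preparing the fields above for time-series analysis (the rows are synthetic stand-ins for the CSV/Parquet records; field names follow the dataset card):

```python
import pandas as pd

# Synthetic rows using the dataset card's field names; the real data
# arrives at a 1-5 second cadence in CSV or Parquet form.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2024-01-01 00:00:01", "2024-01-01 00:00:03", "2024-01-01 00:00:06"]
    ),
    "cpu_usage": [12.5, 80.0, 15.0],
    "memory_used_mb": [2048, 2100, 2050],
})

# Regularize the variable capture interval onto a fixed 5-second grid.
resampled = df.set_index("timestamp").resample("5s").mean()
```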

  6. CPU hours, institutions, and PI's by year.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Richard Knepper; Katy Börner (2023). CPU hours, institutions, and PI's by year. [Dataset]. http://doi.org/10.1371/journal.pone.0157628.t003
    Explore at:
    xls (available download formats)
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Richard Knepper; Katy Börner
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CPU hours, institutions, and PI's by year.

  7. MIT Supercloud Dataset

    • kaggle.com
    Updated Jun 30, 2022
    Cite
    SkylarkPhantom (2022). MIT Supercloud Dataset [Dataset]. https://www.kaggle.com/datasets/skylarkphantom/mit-datacenter-challenge-data
    Explore at:
    Croissant
    Dataset updated
    Jun 30, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    SkylarkPhantom
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    For full details of the data please refer to the paper "The MIT Supercloud Dataset", available at https://ieeexplore.ieee.org/abstract/document/9622850 or https://arxiv.org/abs/2108.02037

    Dataset

    Datacenter monitoring systems offer a variety of data streams and events. The Datacenter Challenge datasets are a combination of high-level data (e.g. Slurm Workload Manager scheduler data) and low-level job-specific time series data. The high-level data includes parameters such as the number of nodes requested, number of CPU/GPU/memory requests, exit codes, and run time data. The low-level time series data is collected on the order of seconds for each job. This granular time series data includes CPU/GPU/memory utilization, amount of disk I/O, and environmental parameters such as power drawn and temperature. Ideally, leveraging both high-level scheduler data and low-level time series data will facilitate the development of AI/ML algorithms which not only predict/detect failures, but also allow for the accurate determination of their cause.

    Here I will only include the high-level data.

    If you are interested in using the dataset, please cite this paper:

    @INPROCEEDINGS{9773216,
      author={Li, Baolin and Arora, Rohin and Samsi, Siddharth and Patel, Tirthak and Arcand, William and Bestor, David and Byun, Chansup and Roy, Rohan Basu and Bergeron, Bill and Holodnak, John and Houle, Michael and Hubbell, Matthew and Jones, Michael and Kepner, Jeremy and Klein, Anna and Michaleas, Peter and McDonald, Joseph and Milechin, Lauren and Mullen, Julie and Prout, Andrew and Price, Benjamin and Reuther, Albert and Rosa, Antonio and Weiss, Matthew and Yee, Charles and Edelman, Daniel and Vanterpool, Allan and Cheng, Anson and Gadepally, Vijay and Tiwari, Devesh},
      booktitle={2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA)},
      title={AI-Enabling Workloads on Large-Scale GPU-Accelerated System: Characterization, Opportunities, and Implications},
      year={2022},
      pages={1224-1237},
      doi={10.1109/HPCA53966.2022.00093}
    }

    Reference: https://dcc.mit.edu/ https://github.com/boringlee24/HPCA22_SuperCloud

  8. Microservices Bottleneck Localization Dataset

    • kaggle.com
    Updated Feb 17, 2024
    Cite
    Gagan Somashekar (2024). Microservices Bottleneck Localization Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/7638732
    Explore at:
    Croissant
    Dataset updated
    Feb 17, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Gagan Somashekar
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Prior works have noted that existing public traces on anomaly detection and bottleneck localization in microservices applications only contain single, severe bottlenecks that are not representative of real-world scenarios. When such a bottleneck is introduced, the resulting latency increases by an order of magnitude (100x), making it trivial to detect that single bottleneck using a simple grid search or threshold-based approaches.

    To create a more realistic dataset that includes traces with multiple bottlenecks at different intensities, we carefully benchmarked the social networking application under different interference intensities and duration of interference. We chose intensities and duration values that degrade the application performance but do not cause any faults or errors that can be trivially detected. We induced interference on different VMs at different times and also simultaneously. A single VM could be induced with different types of interference (e.g., CPU and memory), resulting in the hosted microservices experiencing a mixture of interference patterns. The resulting dataset consists of around 40 million request traces along with corresponding time series of CPU, memory, I/O, and network metrics. The dataset also includes application, VM, and Kubernetes logs.

    A detailed description of the files is provided in the Data Explorer section. Please reach out to gagan at cs dot stonybrook dot edu if you have any questions or concerns.

    If you find the dataset useful, please cite our WWW'24 paper "GAMMA: Graph Neural Network-Based Multi-Bottleneck Localization for Microservices Applications." Citation format (bibtex):

    @inproceedings{somashekar2024gamma,
      author = {Somashekar, Gagan and Dutt, Anurag and Adak, Mainak and Lorido Botran, Tania and Gandhi, Anshul},
      title = {GAMMA: Graph Neural Network-Based Multi-Bottleneck Localization for Microservices Applications},
      year = {2024},
      publisher = {Association for Computing Machinery},
      address = {New York, NY, USA},
      url = {https://doi.org/10.1145/3589334.3645665},
      doi = {10.1145/3589334.3645665},
      booktitle = {Proceedings of the ACM Web Conference 2024},
      location = {Singapore},
      series = {WWW '24}
    }
    
  9. Replication data for "Lightweight Behavior-Based Malware Detection"

    • dataverse.unimi.it
    Updated Nov 3, 2024
    Cite
    Nicola Bena; Marco Anisetti; Claudio A. Ardagna; Gabriele Gianini; Vincenzo Giandomenico (2024). Replication data for "Lightweight Behavior-Based Malware Detection" [Dataset]. http://doi.org/10.13130/RD_UNIMI/LJ6Z8V
    Explore at:
    bin, tsv, txt, zip, text/x-python, text/markdown, application/x-ipynb+json (available download formats)
    Dataset updated
    Nov 3, 2024
    Dataset provided by
    UNIMI Dataverse
    Authors
    Nicola Bena; Marco Anisetti; Claudio A. Ardagna; Gabriele Gianini; Vincenzo Giandomenico
    License

    Custom dataset license: https://dataverse.unimi.it/api/datasets/:persistentId/versions/2.1/customlicense?persistentId=doi:10.13130/RD_UNIMI/LJ6Z8V

    Description

    Dataset containing real-world and synthetic time-series samples of legitimate and malware executions. The samples comprise machine-level performance metrics: CPU usage, RAM usage, and the number of bytes read from and written to disk and network. Synthetic samples are generated using a GAN.

  10. anomaly_detection_metrics_data

    • huggingface.co
    Updated Jul 20, 2023
    Cite
    Shreyas Patil (2023). anomaly_detection_metrics_data [Dataset]. https://huggingface.co/datasets/ShreyasP123/anomaly_detection_metrics_data
    Explore at:
    Croissant
    Dataset updated
    Jul 20, 2023
    Authors
    Shreyas Patil
    Description

    Dataset Card: Anomaly Detection Metrics Data

      Dataset Summary
    

    This dataset contains system performance metrics collected over time for anomaly detection in time series data. It includes multiple system metrics such as CPU load, memory usage, and other resource utilization statistics, along with timestamps and additional attributes.

      Dataset Details
    

    Size: ~7.3 MB (raw JSON), 345 kB (auto-converted Parquet)
    Rows: 46,669
    Format: JSON
    Libraries: datasets, pandas
    … See the full description on the dataset page: https://huggingface.co/datasets/ShreyasP123/anomaly_detection_metrics_data.
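
    As a sketch of the intended anomaly-detection use, a rolling z-score can flag outliers in one of the metric series. The column name and values below are hypothetical, not drawn from the dataset:

```python
import pandas as pd

# Hypothetical CPU-load series; a real run would load the dataset's JSON or
# Parquet (e.g., with the datasets library) and pick a metric column.
s = pd.Series([50, 52, 51, 49, 95, 50, 48], name="cpu_load")

# Baseline statistics exclude the current point (via shift) so that a spike
# cannot inflate its own window's mean and standard deviation.
baseline = s.shift(1).rolling(window=3, min_periods=1)
z = (s - baseline.mean()) / baseline.std().fillna(0).replace(0, 1)
anomalies = s[z.abs() > 3]  # threshold is illustrative; tune on real data
```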

  11. Project and allocation data from XDCDB.

    • plos.figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Richard Knepper; Katy Börner (2023). Project and allocation data from XDCDB. [Dataset]. http://doi.org/10.1371/journal.pone.0157628.t002
    Explore at:
    xls (available download formats)
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Richard Knepper; Katy Börner
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Project and allocation data from XDCDB.

  12. Data for "Thermal transport of glasses via machine learning driven simulations"

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 9, 2024
    Cite
    Pegolo, Paolo; Grasselli, Federico (2024). Data for "Thermal transport of glasses via machine learning driven simulations" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10225315
    Explore at:
    Dataset updated
    Feb 9, 2024
    Dataset provided by
    Scuola Internazionale Superiore di Studi Avanzati
    École Polytechnique Fédérale de Lausanne
    Authors
    Pegolo, Paolo; Grasselli, Federico
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains input and analysis scripts supporting the findings of "Thermal transport of glasses via machine learning driven simulations", by P. Pegolo and F. Grasselli. Content:

    • README.md: this file, information about the repository
    • SiO2: vitreous silica parent folder
      • NEP: folder with datasets and input scripts for NEP training
        • train.xyz: training dataset
        • test.xyz: validation dataset
        • nep.in: NEP input script
        • nep.txt: NEP model
        • nep.restart: NEP restart file
      • DP: folder with datasets and input scripts for DP training
        • input.json: DeePMD training input
        • dataset: DeePMD training dataset
        • validation: DeePMD validation dataset
        • frozen_model.pb: DP model
      • GKMD: scripts for the GKMD simulations
        • Tersoff: Tersoff reference simulation
          • model.xyz: initial configuration
          • run.in: GPUMD script
          • SiO2.gpumd.tersoff88: Tersoff model parameters
          • convert_movie_to_dump.py: script to convert the GPUMD XYZ trajectory to LAMMPS format for re-running the trajectory with the MLPs
        • DP: DP simulation
          • init.data: LAMMPS initial configuration
          • in.lmp: LAMMPS input to re-run the Tersoff trajectory with the DP
        • NEP: NEP simulation
          • init.data: LAMMPS initial configuration
          • in.lmp: LAMMPS input to re-run the Tersoff trajectory with the NEP. Note that this needs the NEP-CPU user package installed in LAMMPS; at the moment it is not possible to re-run a trajectory with GPUMD.
      • QHGK: scripts for the QHGK simulations
        • DP: DP data
          • second.npy: second-order interatomic force constants
          • third.npy: third-order interatomic force constants
          • replicated_atoms.xyz: configuration
          • dynmat: scripts to compute interatomic force constants with the DP model (analogous scripts were used to compute IFCs with the other potentials)
            • initial.data: non-optimized configuration
            • in.dynmat.lmp: LAMMPS script to minimize the structure and compute second-order interatomic force constants
            • in.third.lmp: LAMMPS script to compute third-order interatomic force constants
        • Tersoff: Tersoff data
          • second.npy: second-order interatomic force constants
          • third.npy: third-order interatomic force constants
          • replicated_atoms.xyz: configuration
        • NEP: NEP data
          • second.npy: second-order interatomic force constants
          • third.npy: third-order interatomic force constants
          • replicated_atoms.xyz: configuration
        • qhgk.py: script to compute QHGK lifetimes and thermal conductivity
    • Si: vitreous silicon parent folder
      • QHGK: scripts for the QHGK simulations
        • qhgk.py: script to compute QHGK lifetimes
        • [N]: folder with the calculations on an N-atom system
          • second.npy: second-order interatomic force constants
          • third.npy: third-order interatomic force constants
          • replicated_atoms.xyz: configuration
    • LiSi: vitreous lithium-intercalated silicon parent folder
      • NEP: folder with datasets and input scripts for NEP training
        • train.xyz: training dataset
        • test.xyz: validation dataset
        • nep.in: NEP input script
        • nep.txt: NEP model
        • nep.restart: NEP restart file
      • EMD: folder with data on the equilibrium molecular dynamics simulations
        • 70k: data of the simulations with ~70k atoms
          • 1-45: folders with input scripts for the simulations at different Li concentrations
            • fraction.dat: Li fraction, y, as in Li_{y}Si
            • quench: scripts for the melt-quench-anneal sample preparation
              • model.xyz: initial configuration
              • restart.xyz: final configuration
              • run.in: GPUMD input
            • gk: scripts for the GKMD simulation
              • model.xyz: initial configuration
              • restart.xyz: final configuration
              • run.in: GPUMD input
            • cepstral: folder for cepstral analysis
              • analyze.py: Python script for cepstral analysis of the flux time series generated by the GKMD runs

  13. Transition and Drivers of Elastic-Inelastic Deformation in the Abarkuh Plain from InSAR Multi-Sensor Time Series and Hydrogeological Data

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jul 8, 2023
    Cite
    Sayyed Mohammad Javad Mirzadeh; Sayyed Mohammad Javad Mirzadeh; Shuanggen Jin; Shuanggen Jin; Estelle Chaussard; Estelle Chaussard; Roland Bürgmann; Roland Bürgmann; Abolfazl Rezaei; Abolfazl Rezaei; Saba Ghotbi; Saba Ghotbi; Andreas Braun; Andreas Braun (2023). Transition and Drivers of Elastic-Inelastic Deformation in the Abarkuh Plain from InSAR Multi-Sensor Time Series and Hydrogeological Data [Dataset]. http://doi.org/10.5281/zenodo.7786448
    Explore at:
    zip (available download formats)
    Dataset updated
    Jul 8, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sayyed Mohammad Javad Mirzadeh; Sayyed Mohammad Javad Mirzadeh; Shuanggen Jin; Shuanggen Jin; Estelle Chaussard; Estelle Chaussard; Roland Bürgmann; Roland Bürgmann; Abolfazl Rezaei; Abolfazl Rezaei; Saba Ghotbi; Saba Ghotbi; Andreas Braun; Andreas Braun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Abarkuh
    Description

    This repository contains the datasets used in Mirzadeh et al., 2022. It includes three InSAR time-series datasets from the Envisat descending orbit, ALOS-1 ascending orbit, and Sentinel-1A in ascending and descending orbits, acquired over the Abarkuh Plain, Iran, as well as the geological map of the study area and the GNSS and hydrogeological data used in this research.

    Dataset 1: Envisat descending track 292

    • Date: 06 Oct 2003 - 05 Sep 2005 (12 acquisitions)
    • Processor: ISCE/stripmapStack + MintPy
    • Displacement time-series (in HDF-EOS5 format): timeseries_LOD_tropHgt_ramp_demErr.h5
    • Mean LOS Velocity (in HDF-EOS5 format): velocity.h5
    • Mask Temporal Coherence (in HDF-EOS5 format): maskTempCoh.h5
    • Geometry (in HDF-EOS5 format): geometryRadar.h5

    Dataset 2: ALOS-1 ascending track 569

    • Date: 06 Dec 2006 - 17 Dec 2010 (14 acquisitions)
    • Processor: ISCE/stripmapStack + MintPy
    • Displacement time-series (in HDF-EOS5 format): timeseries_ERA5_ramp_demErr.h5
    • Mean LOS Velocity (in HDF-EOS5 format): velocity.h5
    • Mask Temporal Coherence (in HDF-EOS5 format): maskTempCoh.h5
    • Geometry (in HDF-EOS5 format): geometryRadar.h5

    Dataset 3: Sentinel-1 ascending track 130 and descending track 137

    • Date: 14 Oct 2014 - 28 Mar 2020 (129 ascending acquisitions) + 27 Oct 2014 - 29 Mar 2020 (114 descending acquisitions)
    • Processor: ISCE/topsStack + MintPy
    • Displacement time-series (in HDF-EOS5 format): timeseries_ERA5_ramp_demErr.h5
    • Mean LOS Velocity (in HDF-EOS5 format): velocity.h5
    • Mask Temporal Coherence (in HDF-EOS5 format): maskTempCoh.h5
    • Geometry (in HDF-EOS5 format): geometryRadar.h5

    The time-series and mean LOS velocity (MLV) products can be georeferenced and resampled using the maskTempCoh and geometryRadar products and the MintPy commands/functions.

  14. Jetson Nano Resource Usage and Performance Dataset

    • kaggle.com
    Updated Feb 3, 2025
    Cite
    shaily-20 (2025). Jetson Nano Resource Usage and Performance Dataset [Dataset]. https://www.kaggle.com/datasets/shaily20/jetson-nano-resource-usage-and-performance-dataset/versions/2
    Explore at:
    Croissant
    Dataset updated
    Feb 3, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    shaily-20
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Description: Jetson Nano Bob Waveshare NVIDIA Performance Metrics

    Overview:

    This dataset contains performance metrics collected from the NVIDIA Jetson Nano development board, specifically using the Bob Waveshare module. The data captures various system parameters over time, providing insights into the performance and resource utilization of the device. Data Structure:

    The dataset is structured in a CSV format with the following columns:

    1. RAM: Current and total RAM usage (e.g., "3084/3964MB").
    2. lfb: Local framebuffer memory usage (e.g., "7x4MB").
    3. SWAP: Current and total swap memory usage (e.g., "152/6078MB").
    4. cached: Cached memory usage (e.g., "12MB").
    5. IRAM: Internal RAM usage (e.g., "0/252kB").
    6. CPU-Core 1 to 4: CPU utilization percentages and frequencies for each core (e.g., "16%@921").
    7. EMC_FREQ: Memory controller frequency (e.g., "6%@1600").
    8. GR3D_FREQ: Graphics processing frequency (e.g., "99%@460").
    9. VIC_FREQ: Video interface controller frequency (e.g., "0%@192").
    10. Temperature Metrics: Various temperature readings for components such as PLL, CPU, GPU, and others (e.g., "PLL@30C").
    11. Power Metrics: Power input readings for different components (e.g., "POM_5V_IN").
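
    The CPU, EMC, GR3D, and VIC columns above use a "utilization%@frequency" encoding, and the RAM/SWAP columns use "used/totalMB"; a minimal parsing sketch (the field values are examples from the column list above):

```python
import re

# Split a "16%@921"-style field into (utilization, frequency) integers.
def parse_pct_at_freq(field: str):
    m = re.fullmatch(r"(\d+)%@(\d+)", field)
    if m is None:
        raise ValueError(f"unexpected field format: {field!r}")
    return int(m.group(1)), int(m.group(2))

util, freq = parse_pct_at_freq("16%@921")  # CPU-Core 1 example value
# "used/total" memory fields: pull out the two numbers.
ram_used, ram_total = map(int, re.findall(r"\d+", "3084/3964MB"))
```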

    Purpose:

    The dataset is intended for performance analysis, benchmarking, and optimization of applications running on the Jetson Nano. It can be used to monitor system health, identify bottlenecks, and evaluate the impact of different workloads on system resources.

    Applications:

    Researchers and developers can utilize this dataset to:

    • Analyze CPU and GPU performance under various workloads.
    • Monitor thermal performance and power consumption.
    • Optimize software applications for better resource management.
    • Conduct comparative studies with other embedded systems.

    Data Collection:

    The data was collected over a series of tests and benchmarks, capturing real-time performance metrics during operation. Each entry represents a snapshot of the system's state at a specific point in time.

    Usage Notes:

    Users should be aware of the context in which the data was collected, including the specific configurations and workloads applied during testing. This information is crucial for interpreting the results accurately. This dataset serves as a valuable resource for anyone looking to understand the performance characteristics of the NVIDIA Jetson Nano platform, particularly in the context of embedded AI and machine learning applications.

  15. f

    Summary of multiple linear regression analysis for total time prediction:...

    • plos.figshare.com
    xls
    Updated Dec 19, 2024
    Mohammed Hlayel; Hairulnizam Mahdin; Mohammad Hayajneh; Saleh H. AlDaajeh; Siti Salwani Yaacob; Mazidah Mat Rejab (2024). Summary of multiple linear regression analysis for total time prediction: Effects of vertex count and video size. [Dataset]. http://doi.org/10.1371/journal.pone.0314691.t007
    Explore at:
    xls
    Dataset updated
    Dec 19, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Mohammed Hlayel; Hairulnizam Mahdin; Mohammad Hayajneh; Saleh H. AlDaajeh; Siti Salwani Yaacob; Mazidah Mat Rejab
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary of multiple linear regression analysis for total time prediction: Effects of vertex count and video size.

  16. c

    Research data supporting "21st century progress in computing".

    • repository.cam.ac.uk
    zip
    Updated Mar 31, 2025
    Coyle, Diane; Hampton, Lucy (2025). Research data supporting "21st century progress in computing". [Dataset]. http://doi.org/10.17863/CAM.113404
    Explore at:
    zip (638,529 bytes)
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    Apollo
    University of Cambridge
    Authors
    Coyle, Diane; Hampton, Lucy
    Description

    CPU and GPU time series of the cost of computing, together with a time series of the cost of cloud computing in the UK. Detailed descriptions of the series are available in the associated paper.

  17. f

    Second data set.

    • figshare.com
    xls
    Updated Dec 19, 2024
    Mohammed Hlayel; Hairulnizam Mahdin; Mohammad Hayajneh; Saleh H. AlDaajeh; Siti Salwani Yaacob; Mazidah Mat Rejab (2024). Second data set. [Dataset]. http://doi.org/10.1371/journal.pone.0314691.t005
    Explore at:
    xls
    Dataset updated
    Dec 19, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Mohammed Hlayel; Hairulnizam Mahdin; Mohammad Hayajneh; Saleh H. AlDaajeh; Siti Salwani Yaacob; Mazidah Mat Rejab
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The rapid development of Digital Twin (DT) technology has underlined challenges in resource-constrained mobile devices, especially in the application of extended realities (XR), which includes Augmented Reality (AR) and Virtual Reality (VR). These challenges lead to computational inefficiencies that negatively impact user experience when dealing with sizeable 3D model assets. This article applies multiple lossless compression algorithms to improve the efficiency of digital twin asset delivery in Unity’s AssetBundle and Addressable asset management frameworks. In this study, an optimal model will be obtained that reduces both bundle size and time required in visualization, simultaneously reducing CPU and RAM usage on mobile devices. This study has assessed compression methods, such as LZ4, LZMA, Brotli, Fast LZ, and 7-Zip, among others, for their influence on AR performance. This study also creates mathematical models for predicting resource utilization, like RAM and CPU time, required by AR mobile applications. Experimental results show a detailed comparison among these compression algorithms, which can give insights and help choose the best method according to the compression ratio, decompression speed, and resource usage. It finally leads to more efficient implementations of AR digital twins on resource-constrained mobile platforms with greater flexibility in development and a better end-user experience. Our results show that LZ4 and Fast LZ perform best in speed and resource efficiency, especially with RAM caching. At the same time, 7-Zip/LZMA achieves the highest compression ratios at the cost of slower loading. Brotli emerged as a strong option for web-based AR/VR content, striking a balance between compression efficiency and decompression speed, outperforming Gzip in WebGL contexts. The Addressable Asset system with LZ4 offers the most efficient balance for real-time AR applications. This study will deliver practical guidance on optimal compression method selection to improve user experience and scalability for AR digital twin implementations.
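    The ratio-versus-speed tradeoff the abstract reports (LZMA-family codecs compress harder; LZ-family codecs decompress faster) can be sketched with Python's standard-library codecs. This is only an illustration under stated assumptions, not the study's Unity pipeline: zlib stands in for a fast LZ-family codec, lzma is the LZMA algorithm used by 7-Zip, and the payload is synthetic mesh-like text.

```python
import lzma
import time
import zlib

# Synthetic, highly repetitive payload standing in for a 3D asset bundle.
payload = b"vertex 1.0 2.0 3.0\n" * 50_000

for name, compress, decompress in [
    ("zlib (fast LZ-family)", zlib.compress, zlib.decompress),
    ("lzma (7-Zip family)", lzma.compress, lzma.decompress),
]:
    blob = compress(payload)
    t0 = time.perf_counter()
    decompress(blob)
    dt = time.perf_counter() - t0
    # Higher ratio means a smaller bundle; lower decompress time means
    # faster asset loading on the device.
    print(f"{name}: ratio {len(payload) / len(blob):.1f}x, "
          f"decompress {dt * 1000:.2f} ms")
```

    Absolute numbers depend on the machine and payload; the point is the shape of the tradeoff, which matches the study's LZ4-versus-LZMA findings.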

  18. f

    CPU time for different values of α.

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Shakoor Ahmad; Shumaila Javeed; Saqlain Raza; Dumitru Baleanu (2023). CPU time for different values of α. [Dataset]. http://doi.org/10.1371/journal.pone.0277472.t003
    Explore at:
    xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Shakoor Ahmad; Shumaila Javeed; Saqlain Raza; Dumitru Baleanu
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CPU time for different values of α.

  19. b

    Autonomous Underwater Vehicle Monterey Bay Time Series - CTD from AUV Makai...

    • datacart.bco-dmo.org
    • bco-dmo.org
    • +1more
    csv
    Updated Aug 15, 2023
    Dr Chris Scholin (2023). Autonomous Underwater Vehicle Monterey Bay Time Series - CTD from AUV Makai on 2016-02-03 [Dataset]. http://doi.org/10.26008/1912/bco-dmo.644012.1
    Explore at:
    csv (2.80 MB)
    Dataset updated
    Aug 15, 2023
    Dataset provided by
    Biological and Chemical Data Management Office
    Authors
    Dr Chris Scholin
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 3, 2016
    Area covered
    Variables measured
    lat, lon, sal, temp, depth, chl_a_fluor, ISO_DateTime_UTC
    Measurement technique
    Environmental Sample Processor, Autonomous Underwater Vehicle
    Description

    Autonomous Underwater Vehicle (AUV) Monterey Bay Time Series from Feb 2016. This data set includes CTD and fluorometer data from the Makai AUV, as context for ecogenomic sampling using an onboard Environmental Sample Processor (ESP).

  20. m

    Ingenic Semiconductor - Diluted-Average-Shares

    • macro-rankings.com
    csv, excel
    Updated Jul 23, 2025
    macro-rankings (2025). Ingenic Semiconductor - Diluted-Average-Shares [Dataset]. https://www.macro-rankings.com/markets/stocks/300223-she/income-statement/diluted-average-shares
    Explore at:
    excel, csv
    Dataset updated
    Jul 23, 2025
    Dataset authored and provided by
    macro-rankings
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    China
    Description

    Diluted-Average-Shares Time Series for Ingenic Semiconductor. Ingenic Semiconductor Co.,Ltd. engages in the research and development, design, and sale of integrated circuit chip products in China and internationally. It offers multi-core crossover IoT micro-processor, multi-core heterogeneous crossover micro-processor, low-power AIoT micro-processor, low power image recognition micro-processor, ultra-low-power IoT micro-processor, low power AI video processor, 4K video and AI vision application processor, balanced video processor, dual camera low power video processor, 2K HEVC video-IOT MCU, and professional security backend processor. The company also provides computing, storage, analog, and interconnect chips. Its products are used in automotive electronics, industrial and medical, communication equipment, consumer electronics, and other fields. The company was founded in 2005 and is headquartered in Beijing, China.
