40 datasets found
  1. Alibaba GPU Cluster Dataset 2025

    • kaggle.com
    Updated Aug 12, 2025
    Cite
    Sultanul Ovi (2025). Alibaba GPU Cluster Dataset 2025 [Dataset]. https://www.kaggle.com/datasets/mdsultanulislamovi/alibaba-gpu-cluster-dataset-2025
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Aug 12, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sultanul Ovi
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    GPU-Disaggregated DLRM Trace Dataset

    This dataset records latency-sensitive inference instances for GPU-disaggregated serving of deep learning recommendation models. It contains per-instance resource reservations and life cycle timestamps for scheduling analysis and capacity planning.

    This dataset represents a groundbreaking trace collection from production GPU-disaggregated serving systems for Deep Learning Recommendation Models (DLRMs), accompanying the NSDI'25 paper on GPU-disaggregated serving at scale. The dataset captures real-world operational characteristics of inference services in a large-scale production environment, providing invaluable insights into resource allocation patterns, temporal dynamics, and system behavior for latency-sensitive ML workloads.

    Scope

    • Total rows: 23,871
    • Unique apps: 156
    • Role split: CN 16,485; HN 7,386

    Key fields

    Instance ID. Role. Application group. Requests and limits for CPU, GPU, RDMA, memory, and disk. Density cap per node. Creation, scheduling, and deletion timestamps relative to the trace start.
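    A minimal pandas sketch of how these fields support scheduling analysis. The file name and the role, creation_time, scheduled_time, and deletion_time column names are assumptions based on the field list above, so verify them against the actual header first.

    import pandas as pd

    # Hypothetical file and column names; check the real CSV header.
    df = pd.read_csv("dlrm_trace.csv")

    # Scheduling delay per instance (timestamps are relative to trace start).
    df["sched_delay"] = df["scheduled_time"] - df["creation_time"]

    # Runtime for instances whose deletion falls inside the trace window.
    done = df.dropna(subset=["deletion_time"])
    runtime = done["deletion_time"] - done["scheduled_time"]

    print(df.groupby("role")["sched_delay"].describe())
    print(runtime.quantile([0.5, 0.9, 0.99]))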

    High-level observations

    • All workloads are marked latency sensitive. Instances are typically long-running with high priority, as stated by the authors.
    • Scheduling delay distribution and runtime distribution are included in the figures. Concurrency over time gives a view of system load.
    • RDMA percentage and max instances per node expose placement constraints that influence packing on heterogeneous nodes.

    🎯 Key Characteristics

    Scale and Scope

    • Total Instances: 23,871 inference instances
    • Services: 156 unique inference services (applications)
    • Workload Type: 100% Latency-Sensitive (LS) workloads
    • Priority Level: High-priority, long-running inference instances
    • System Architecture: GPU-disaggregated architecture separating compute and GPU resources

    Instance Distribution

    • CPU Nodes (CN): 16,485 instances (69.1%)
      • Pure CPU-based inference workloads
      • No GPU allocation
      • Lower RDMA requirements (mean: 3.4%)
    • Heterogeneous GPU Nodes (HN): 7,386 instances (30.9%)
      • GPU-accelerated inference workloads
      • All instances allocated exactly 1 GPU
      • Higher RDMA requirements (mean: 20.5%)

    🔍 Key Insights

    Workload Heterogeneity

    • Clear bimodal distribution between CPU and GPU workloads
    • CN instances optimized for CPU-intensive operations
    • HN instances balanced for GPU acceleration with supporting CPU resources

    Resource Efficiency

    • Tight coupling between CPU and memory allocation (correlation: 0.97)
    • Independent scaling of GPU resources from CPU/memory
    • RDMA bandwidth scaled based on disaggregation communication needs

    Production Patterns

    • All workloads classified as latency-sensitive
    • High-priority, long-running inference services
    • Immediate scheduling indicates sufficient resource availability

    Disaggregation Benefits

    • Efficient resource utilization through separation of concerns
    • CN nodes handle CPU-intensive preprocessing/postprocessing
    • HN nodes focus on GPU-accelerated model inference
    • RDMA enables efficient data movement between disaggregated components

    📈 Research Applications

    This dataset enables research in:

    • Resource Allocation: Optimal scheduling strategies for disaggregated systems
    • Performance Modeling: Understanding latency-throughput tradeoffs
    • System Design: Architectural decisions for ML serving infrastructure
    • Workload Characterization: Production DLRM inference patterns
    • Capacity Planning: Resource provisioning for ML workloads
    • Fault Tolerance: Instance distribution and anti-affinity strategies

    🎓 Academic Contribution

    This dataset represents one of the first publicly available production traces for GPU-disaggregated DLRM serving, providing:

    • Real-world validation data for system research
    • Baseline for performance comparisons
    • Foundation for reproducible research in ML systems
    • Insights into production-scale ML infrastructure

    This dataset provides a unique window into production GPU-disaggregated systems, offering researchers and practitioners valuable insights for advancing the field of large-scale ML serving infrastructure.

  2. run_model_on_gpu

    • kaggle.com
    zip
    Updated Oct 20, 2025
    Cite
    Alexis T. (2025). run_model_on_gpu [Dataset]. https://www.kaggle.com/datasets/alexismorgant/run-model-on-gpu
    Explore at:
    zip (966 bytes)
    Dataset updated
    Oct 20, 2025
    Authors
    Alexis T.
    Description

    Dataset

    This dataset was created by Alexis T.

    Contents

  3. IceVision for CUDA11

    • kaggle.com
    zip
    Updated Dec 23, 2021
    Cite
    Aaron B. (2021). IceVision for CUDA11 [Dataset]. https://www.kaggle.com/abee82/icevision
    Explore at:
    zip (4278676246 bytes)
    Dataset updated
    Dec 23, 2021
    Authors
    Aaron B.
    Description

    Dataset

    This dataset was created by Aaron B.

    Contents

  4. Accident Detection Model Dataset

    • universe.roboflow.com
    zip
    Updated Apr 8, 2024
    Cite
    Accident detection model (2024). Accident Detection Model Dataset [Dataset]. https://universe.roboflow.com/accident-detection-model/accident-detection-model/model/1
    Explore at:
    zip
    Dataset updated
    Apr 8, 2024
    Dataset authored and provided by
    Accident detection model
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    Accident Bounding Boxes
    Description

    Accident-Detection-Model

    Accident Detection Model is made using YOLOv8, Google Colab, Python, Roboflow, Deep Learning, OpenCV, Machine Learning, and Artificial Intelligence. It can detect an accident from a live camera feed, image, or video. The model is trained on a dataset of 3,200+ images, annotated on Roboflow.

    Problem Statement

    • Road accidents are a major problem in India, with thousands of people losing their lives and many more suffering serious injuries every year.
    • According to the Ministry of Road Transport and Highways, India witnessed around 4.5 lakh road accidents in 2019, which resulted in the deaths of more than 1.5 lakh people.
    • The age range that is most severely hit by road accidents is 18 to 45 years old, which accounts for almost 67 percent of all accidental deaths.

    Accidents survey

    Survey image: https://user-images.githubusercontent.com/78155393/233774342-287492bb-26c1-4acf-bc2c-9462e97a03ca.png

    Literature Survey

    • Sreyan Ghosh (Mar 2019): developed a system using a deep-learning convolutional neural network trained to classify video frames as accident or non-accident.
    • Deeksha Gour (Sep 2019): uses computer-vision technology, neural networks, deep learning, and various approaches and algorithms to detect objects.

    Research Gap

    • Lack of real-world data: we trained the model on more than 3,200 images.
    • Large interpretability time and space needed: we use Google Colab to reduce the time and space required.
    • Outdated versions in previous works: we are using the latest version of YOLOv8.

    Proposed methodology

    • We are using YOLOv8 to train on our custom dataset of 3,200+ images, collected from different platforms.
    • After training for 25 iterations, the model is ready to detect an accident with a significant probability.

    Model Set-up

    Preparing Custom dataset

    • We collected 1,200+ images from different sources like YouTube, Google Images, Kaggle.com, etc.
    • Then we annotated all of them individually on a tool called Roboflow.
    • During annotation we marked the images with no accident as NULL, and on the images containing an accident we drew a box around the site of the accident.
    • Then we divided the dataset into train, val, and test splits in the ratio of 8:1:1.
    • At the final step we downloaded the dataset in YOLOv8 format.

    Using Google Colab

    • We are using Google Colaboratory to code this model because Colab provides a GPU, which is faster than most local environments.
    • You can use Jupyter notebooks, which let you blend code, text, and visualisations in a single document, to write and run Python code in Google Colab.
    • Users can run individual code cells in Jupyter notebooks and quickly view the results, which is helpful for experimenting and debugging. They also enable visualisations built with well-known frameworks like Matplotlib, Seaborn, and Plotly.
    • In Google Colab, first of all we changed the runtime from TPU to GPU.
    • We cross-checked it by running the command ‘!nvidia-smi’.
    Coding

    • First of all, we installed YOLOv8 with the command ‘!pip install ultralytics==8.0.20’.
    • Then we imported YOLOv8 with ‘from ultralytics import YOLO’ and ‘from IPython.display import display, Image’.
    • Then we connected and mounted our Google Drive account with ‘from google.colab import drive’ and ‘drive.mount('/content/drive')’.
    • Then we ran the main training command: ‘%cd /content/drive/MyDrive/Accident Detection model’ followed by ‘!yolo task=detect mode=train model=yolov8s.pt data=data.yaml epochs=1 imgsz=640 plots=True’.
    • After training we validated and tested the model: ‘!yolo task=detect mode=val model=runs/detect/train/weights/best.pt data=data.yaml’ and ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt conf=0.25 source=data/test/images’.
    • To get results from any video or image we ran: ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt source="/content/drive/MyDrive/Accident-Detection-model/data/testing1.jpg/mp4"’.
    • The results are stored in the runs/detect/predict folder.

    Hence our model is trained, validated, and tested, and is able to detect accidents in any video or image.

    Challenges I ran into

    I ran into 3 major problems while making this model:

    • I had difficulty saving the results to a folder. Since YOLOv8 is the latest version, it is still under development, so I read some blogs and referred to Stack Overflow, and learned that in the new v8 we need to add an extra argument, ‘save=true’, to save the results to a folder.
    • I was facing a problem on the CVAT website because I was not sure what
  5. TabPFN

    • kaggle.com
    zip
    Updated Jun 14, 2023
    Cite
    Mark Inzhirov (2023). TabPFN [Dataset]. https://www.kaggle.com/datasets/neutrino404/tabpfn
    Explore at:
    zip (95945799 bytes)
    Dataset updated
    Jun 14, 2023
    Authors
    Mark Inzhirov
    Description

    Use this data set when submitting code offline for competitions otherwise just use !pip install tabpfn for online use. Usage for offline code submissions within Kaggle notebooks is as follows:

    1. First add the dataset by selecting "add data", searching for this dataset, and adding it to your input.

    2. Next add the following code to a code block in your notebook:

    !pip install tabpfn --no-index --find-links=file:///kaggle/input/tabpfn
    !mkdir -p /opt/conda/lib/python3.10/site-packages/tabpfn/models_diff
    !cp /kaggle/input/tabpfn/prior_diff_real_checkpoint_n_0_epoch_100.cpkt /opt/conda/lib/python3.10/site-packages/tabpfn/models_diff/

    3. Import:

    from tabpfn import TabPFNClassifier

    4. Now you are all set; you can create a classifier and run it offline for submission in offline Kaggle code competitions:

    classifier = TabPFNClassifier(device='cpu', N_ensemble_configurations=64)
    classifier.fit(X_train, Y_train)
    y_eval, p_eval = classifier.predict(X_cv, return_winning_probability=True)

    If you want to use TabPFN with a GPU, use the following when you create the model:

    classifier = TabPFNClassifier(device='cuda', N_ensemble_configurations=32)

    You can find documentation for this package on GitHub: https://github.com/automl/TabPFN.git. The original paper on TabPFN can be found at https://arxiv.org/abs/2207.01848.

    License

    Copyright 2022 Noah Hollmann, Samuel Müller, Katharina Eggensperger, Frank Hutter

    Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

  6. llamacpp cuda sm75 complete build

    • kaggle.com
    zip
    Updated Oct 17, 2025
    Cite
    Jiwei Liu (2025). llamacpp cuda sm75 complete build [Dataset]. https://www.kaggle.com/datasets/jiweiliu/llamacpp-sm75-complete-build
    Explore at:
    zip (169040223 bytes)
    Dataset updated
    Oct 17, 2025
    Authors
    Jiwei Liu
    Description

    Complete llama.cpp build folder compiled with CUDA support for compute capability 7.5 (Turing architecture GPUs: RTX 2060/2070/2080, Tesla T4, Quadro RTX).

    Build Configuration:

    • CUDA 12.5 (compatible with CUDA 12.x)
    • Compute capability: SM_75
    • Optimized for NVIDIA Turing GPUs
    • Complete build directory with CMake files

    Contents:

    • Complete build/ directory
    • build/bin/: all compiled executables and shared libraries
    • build/CMakeCache.txt: CMake configuration
    • build/compile_commands.json: compilation database
    • All build artifacts and intermediate files

    Usage:

    1. Extract the build folder
    2. Ensure the CUDA 12.x runtime is installed on the target system
    3. Set LD_LIBRARY_PATH to include the build/bin directory
    4. Run executables from build/bin/

    Requirements on target machine:

    • NVIDIA GPU with compute capability 7.5
    • CUDA 12.x runtime libraries
    • Linux x86_64
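    A minimal Python sketch of usage steps 3 and 4, assuming the archive is attached as a Kaggle input; the paths, the llama-cli binary name, and its flags are assumptions and may differ in this particular build.

    import os
    import subprocess

    # Assumed extraction path; adjust to wherever the build folder lives.
    build_bin = "/kaggle/input/llamacpp-sm75-complete-build/build/bin"

    env = os.environ.copy()
    # Step 3: make the bundled shared libraries visible to the loader.
    env["LD_LIBRARY_PATH"] = build_bin + ":" + env.get("LD_LIBRARY_PATH", "")

    # Step 4: run an executable from build/bin/. The binary name and model
    # path are placeholders; recent llama.cpp builds ship llama-cli.
    subprocess.run(
        [f"{build_bin}/llama-cli",
         "-m", "/kaggle/input/some-model/model.gguf",  # hypothetical model file
         "-p", "Hello",
         "-ngl", "99"],  # offload all layers to the Turing GPU
        env=env,
        check=True,
    )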

  7. Torch-TensorRT v2.2.0

    • kaggle.com
    zip
    Updated Apr 15, 2024
    Cite
    Gowri Shankar Penugonda (2024). Torch-TensorRT v2.2.0 [Dataset]. https://www.kaggle.com/datasets/gowrishankarp/torch-tensorrt-v2-2-0
    Explore at:
    zip (2810802392 bytes)
    Dataset updated
    Apr 15, 2024
    Authors
    Gowri Shankar Penugonda
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Torch-TensorRT v2.2.0

    # Installation
    !pip install -qq --no-index --find-links=../input/torch-tensorrt-v2-2-0 torch-tensorrt==2.2.0
    

    Usage

    import torch
    import torch_tensorrt
    
    trt_model_fp16 = torch_tensorrt.compile(mlm_model, 
      inputs= [torch_tensorrt.Input(shape=[batch_size, 1024], dtype=torch.int32), # input_ids
           torch_tensorrt.Input(shape=[batch_size, 1024], dtype=torch.int32)], # attention_mask
      enabled_precisions= {torch.float32}, # Run with 32-bit precision
      workspace_size=2000000000,
      truncate_long_and_double=True
    )
    
    torch.jit.save(trt_model_fp16, 'kaggle-mlm_model-1024-gpu-aug0-01-swa.trt_fp16.ts')
    
    trt_model_fp16 = torch.jit.load(model_path)
    .
    .
    .
    inputs = {k: v.type(torch.int32).cuda() for k, v in inputs.items()}
    output_trt = trt_model_fp16(inputs['input_ids'], inputs['attention_mask'])
    output_trt 
    
  8. nemotron-3-8b-base-4k

    • kaggle.com
    zip
    Updated Aug 31, 2024
    Cite
    Serhii Kharchuk (2024). nemotron-3-8b-base-4k [Dataset]. https://www.kaggle.com/datasets/serhiikharchuk/nemotron-3-8b-base-4k
    Explore at:
    zip (13688476176 bytes)
    Dataset updated
    Aug 31, 2024
    Authors
    Serhii Kharchuk
    Description

    Nemotron-3-8B-Base-4k Model Overview

    License

    The use of this model is governed by the NVIDIA AI Foundation Models Community License Agreement.

    Description

    Nemotron-3-8B-Base-4k is a large language foundation model for enterprises to build custom LLMs. This foundation model has 8 billion parameters, and supports a context length of 4,096 tokens. Nemotron-3-8B-Base-4k is part of Nemotron-3, which is a family of enterprise-ready generative text models compatible with NVIDIA NeMo Framework. For other models in this collection, see the collections page.

    NVIDIA NeMo is an end-to-end, cloud-native platform to build, customize, and deploy generative AI models anywhere. It includes training and inferencing frameworks, guardrailing toolkits, data curation tools, and pretrained models, offering enterprises an easy, cost-effective, and fast way to adopt generative AI. To get access to NeMo Framework, please sign up at this link.

    References

    Announcement Blog

    Model Architecture

    Architecture Type: Transformer

    Network Architecture: Generative Pre-Trained Transformer (GPT-3)

    Software Integration

    Runtime Engine(s): NVIDIA AI Enterprise

    Toolkit: NeMo Framework

    To get access to NeMo Framework, please sign up at this link. See the NeMo inference container documentation for details on how to set up and deploy an inference server with NeMo.

    Sample Inference Code:

    from nemo.deploy import NemoQuery

    # In this case, we run inference on the same machine
    nq = NemoQuery(url="localhost:8000", model_name="Nemotron-3-8B-4K")

    output = nq.query_llm(prompts=["The meaning of life is"],
                          max_output_token=200, top_k=1, top_p=0.0, temperature=0.1)
    print(output)

    Supported Hardware:

    H100
    A100 80GB, A100 40GB
    

    Model Version(s)

    Nemotron-3-8B-base-4k-BF16-1

    Dataset & Training

    The model uses a learning rate of 3e-4 with a warm-up period of 500M tokens and a cosine learning rate annealing schedule for 95% of the total training tokens. The decay stops at a minimum learning rate of 3e-5. The model is trained with a sequence length of 4096 and uses FlashAttention’s Multi-Head Attention implementation. 1,024 A100s were used for 19 days to train the model.

    NVIDIA models are trained on a diverse set of public and proprietary datasets. This model was trained on a dataset containing 3.8 trillion tokens of text. The dataset contains 53 different human languages (including English, German, Russian, Spanish, French, Japanese, Chinese, Italian, and Dutch) and 37 programming languages. The model also uses the training subsets of downstream academic benchmarks from sources like FLANv2, P3, and NaturalInstructions v2. NVIDIA is committed to the responsible development of large language models and conducts reviews of all datasets included in training.

    Evaluation

    Task | Num-shot | Score
    MMLU* | 5 | 54.4
    WinoGrande | 0 | 70.9
    Hellaswag | 0 | 76.4
    ARC Easy | 0 | 72.9
    TyDiQA-GoldP** | 1 | 49.2
    Lambada | 0 | 70.6
    WebQS | 0 | 22.9
    PiQA | 0 | 80.4
    GSM8K | 8-shot w/ maj@8 | 39.4

    * The calculation of MMLU follows the original implementation. See Hugging Face's explanation of different implementations of MMLU.

    ** The languages used are Arabic, Bangla, Finnish, Indonesian, Korean, Russian and Swahili.

    Intended use

    This is a completion model. For best performance, users are encouraged to customize the completion model using the NeMo Framework suite of customization tools, including Parameter-Efficient Fine-Tuning (P-tuning, Adapters, LoRA) and SFT/RLHF. For chat use cases, please consider using the Nemotron-3-8B chat variants.

    Ethical use

    Technology can have a profound impact on people and the world, and NVIDIA is committed to enabling trust and transparency in AI development. NVIDIA encourages users to adopt principles of AI ethics and trustworthiness to guide your business decisions by following the guidelines in the NVIDIA AI Foundation Models Community License Agreement.

    Limitations

    The model was trained on data that contains toxic language and societal biases originally crawled from the internet. Therefore, the model may amplify those biases and return toxic responses, especially when prompted with toxic prompts.
    The model may generate answers that are inaccurate, omit key information, or include irrelevant or redundant text, producing socially unacceptable or undesirable text, even if the prompt itself does not include anything explicitly offensive.
    
  9. BUTTER-E: Energy Data for Deep Learning Models

    • kaggle.com
    zip
    Updated Jan 11, 2025
    Cite
    Pavan Kumar S (2025). BUTTER-E: Energy Data for Deep Learning Models [Dataset]. https://www.kaggle.com/datasets/pavankumar4757/butter-e-energy-data-for-deep-learning-models
    Explore at:
    zip (2940491 bytes)
    Dataset updated
    Jan 11, 2025
    Authors
    Pavan Kumar S
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    The BUTTER-E - Energy Consumption Data for the BUTTER Empirical Deep Learning Dataset provides node-level energy consumption data collected via watt-meters, complementing the primary BUTTER dataset. This dataset records energy consumption and performance metrics for 1,059,206 experimental runs across diverse configurations of fully connected neural networks. Key attributes include:

    1. timestamp: the precise time of the energy consumption measurement.
    2. node: the hardware node identifier (e.g., r103u05) where the experiment was conducted.
    3. watts: the power draw (in watts) recorded for the corresponding node at the given timestamp.

    Highlights

    The data spans 30,582 distinct configurations, including variations across 13 datasets, 20 network sizes, 8 network shapes, and 14 depths. Measurements were taken on CPU and GPU hardware, offering insights into the relationship between neural network parameters and energy consumption. The dataset provides valuable information for analyzing the energy efficiency of deep learning models, particularly in relation to cache effects, dataset size, and network architecture.

    Use Cases

    This dataset is ideal for:

    • Energy-efficient AI research: understanding how energy consumption scales with model size, dataset properties, and network configurations.
    • Performance optimization: identifying configurations with optimal trade-offs between performance and energy usage.
    • Sustainability analysis: evaluating the carbon footprint of training and deploying deep learning models.
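    As a starting point for the energy-efficiency use case, a minimal sketch that integrates the watt-meter readings into per-node energy; the file name is an assumption, while the columns follow the attribute list above.

    import pandas as pd

    # File name is an assumption; columns follow the documented attributes.
    df = pd.read_csv("butter_e_energy.csv", parse_dates=["timestamp"])
    df = df.sort_values(["node", "timestamp"])

    # Approximate energy as a left Riemann sum of power over the sampling
    # interval (assumes reasonably uniform sampling per node).
    df["dt_s"] = df.groupby("node")["timestamp"].diff().dt.total_seconds()
    df["joules"] = df["watts"] * df["dt_s"]

    energy_kwh = df.groupby("node")["joules"].sum() / 3.6e6
    print(energy_kwh.sort_values(ascending=False).head())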

  10. Nvidia Database

    • kaggle.com
    zip
    Updated Jan 30, 2025
    Cite
    Ajay Tom (2025). Nvidia Database [Dataset]. https://www.kaggle.com/datasets/ajayt0m/nvidia-database
    Explore at:
    zip (8712 bytes)
    Dataset updated
    Jan 30, 2025
    Authors
    Ajay Tom
    License

    CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    This is a beginner-friendly SQLite database designed to help users practice SQL and relational database concepts. The dataset represents a basic business model inspired by NVIDIA and includes interconnected tables covering essential aspects like products, customers, sales, suppliers, employees, and projects. It's perfect for anyone new to SQL or data analytics who wants to learn and experiment with structured data.

    Tables and Their Contents:

    Products:

    Includes details of 15 products (e.g., GPUs, AI accelerators). Attributes: product_id, product_name, category, release_date, price.

    Customers:

    Lists 20 fictional customers with their industry and contact information. Attributes: customer_id, customer_name, industry, contact_email, contact_phone.

    Sales:

    Contains 100 sales records tied to products and customers. Attributes: sale_id, product_id, customer_id, sale_date, region, quantity_sold, revenue.

    Suppliers:

    Features 50 suppliers and the materials they provide. Attributes: supplier_id, supplier_name, material_supplied, contact_email.

    Supply Chain:

    Tracks materials supplied to produce products, proportional to sales. Attributes: supply_chain_id, supplier_id, product_id, supply_date, quantity_supplied.

    Departments:

    Lists 5 departments within the business. Attributes: department_id, department_name, location.

    Employees:

    Contains data on 30 employees and their roles in different departments. Attributes: employee_id, first_name, last_name, department_id, hire_date, salary.

    Projects:

    Describes 10 projects handled by different departments. Attributes: project_id, project_name, department_id, start_date, end_date, budget.

    Why Use This Dataset?

    • Perfect for Beginners: The dataset is simple and easy to understand.
    • Interconnected Tables: Provides a basic introduction to relational database concepts like joins and foreign keys.
    • SQL Practice: Run basic queries, filter data, and perform simple aggregations or calculations.
    • Learning Tool: Great for small projects and understanding business datasets.

    Potential Use Cases:

    • Practice SQL queries (SELECT, INSERT, UPDATE, DELETE, JOIN).
    • Understand how to design and query relational databases.
    • Analyze basic sales and supply chain data for patterns and trends.
    • Learn how to use databases in analytics tools like Excel, Power BI, or Tableau.

    Data Size:

    • Number of tables: 8
    • Total rows: around 230 across all tables, ensuring quick queries and easy exploration.
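    A minimal sketch of the kind of JOIN practice this database is built for, using the documented sales and products tables and their foreign key; the database file name is an assumption.

    import sqlite3

    # File name is an assumption; point this at the downloaded database.
    con = sqlite3.connect("nvidia.db")

    # Revenue per product: join sales to products on the documented key.
    query = """
    SELECT p.product_name,
           SUM(s.revenue)       AS total_revenue,
           SUM(s.quantity_sold) AS units_sold
    FROM   sales s
    JOIN   products p ON p.product_id = s.product_id
    GROUP  BY p.product_name
    ORDER  BY total_revenue DESC;
    """
    for row in con.execute(query):
        print(row)
    con.close()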

  11. efficientcube_1000cube

    • kaggle.com
    zip
    Updated Jan 31, 2025
    Cite
    Arbidos (2025). efficientcube_1000cube [Dataset]. https://www.kaggle.com/datasets/arabidopsisthalian/efficientcube-1000cube
    Explore at:
    zip (2459796 bytes)
    Dataset updated
    Jan 31, 2025
    Authors
    Arbidos
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Learn:

    • train_cube3_qtm_MLP2RB_04M_1735350897.csv (4 h 19 min)
    • train_cube3_qtm_MLP2RB_04M_1735378399.csv (4 h 19 min)

    1729775349
    1729775893
    1729776146
    1729776388
    1729776568 (47/69 optimal)

    subprocess.run([ "nvcc", "--version" ])
    subprocess.run( [ "nvidia-smi" ]) subprocess.run([ "cat" , "/etc/os-release" ])
    subprocess.run([ "uname" , "-srm" ]) subprocess.run([ "cat" , "/proc/version" ]) subprocess.run([ "lspci" , "-k" ]) subprocess.run([ "cat" , "/proc/cpuinfo" ]) subprocess.run([ "nvidia-smi" , "-q" ]) subprocess.run([ "arch" ])

  12. Image Matching Challenge 2023 Dockerfile

    • kaggle.com
    zip
    Updated Apr 18, 2023
    Cite
    Anil Ozturk (2023). Image Matching Challenge 2023 Dockerfile [Dataset]. https://www.kaggle.com/nlztrk/image-matching-challenge-2023-dockerfile
    Explore at:
    zip (726 bytes)
    Dataset updated
    Apr 18, 2023
    Authors
    Anil Ozturk
    Description

    This Dockerfile creates an environment that meets all the dependencies in the example submission file in Image Matching Challenge 2023. (including colmap and pycolmap)

    Steps (for Linux)

    Warning: You should replace YOURIMAGENAME with the name you desire for your environment at the later steps.

    1. Install Docker and NVIDIA Container Toolkit

    • Docker Installation: Link
    • NVIDIA Container Toolkit Installation: Link

    2. Set NVIDIA-Docker as the default builder

    Edit or create /etc/docker/daemon.json with the following content:

    {
      "runtimes": {
        "nvidia": {
          "path": "/usr/bin/nvidia-container-runtime",
          "runtimeArgs": []
        }
      },
      "default-runtime": "nvidia"
    }

    3. Build Your Docker Environment

    Go to the root folder of the repository and execute the following line in your terminal:

    DOCKER_BUILDKIT=0 docker build -f Dockerfile -t YOURIMAGENAME .

    4. Run

    4.1. Interactive

    You can simply run your environment with:

    docker run --gpus all -it --rm YOURIMAGENAME

    4.2. With Volume Mounting & Jupyter Server

    You can add these lines to your ~/.bashrc:

    djupyter() {
      docker run -v $PWD:/tmp/working -v ${HOME}/.cache:/container_cache \
        -w=/tmp/working -e "XDG_CACHE_HOME=/container_cache" -p 8888:8888 \
        --gpus all --rm -it YOURIMAGENAME \
        jupyter notebook --no-browser --ip="0.0.0.0" --notebook-dir=/tmp/working --allow-root
    }

    After resetting all your terminals, open a new terminal and type:

    djupyter

    and voila! You've just opened your docker, mounted your current folder and started a Jupyter server!

  13. Alibaba GPU Cluster Spot Resource Dataset

    • kaggle.com
    zip
    Updated Aug 13, 2025
    Cite
    Sultanul Ovi (2025). Alibaba GPU Cluster Spot Resource Dataset [Dataset]. https://www.kaggle.com/datasets/mdsultanulislamovi/alibaba-gpu-cluster-spot-resource-dataset
    Explore at:
    zip (5189979 bytes)
    Dataset updated
    Aug 13, 2025
    Authors
    Sultanul Ovi
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset provides a comprehensive trace of AI workloads running on a large-scale GPU cluster with spot resource provisioning capabilities. It captures real-world operational characteristics from a production environment, managing both high-priority workloads with strict Service Level Objectives (SLOs) and opportunistic spot workloads.

    Key Characteristics

    • Infrastructure Scale: 4,278 GPU nodes with 6 different GPU card types
    • Workload Volume: 466,867 job submissions tracked
    • Organization Diversity: 119 unique organizations/departments
    • Workload Types: Mixed high-priority (HP) and spot instance workloads

    🔬 Research Applications

    This dataset is valuable for:

    1. Scheduling Algorithm Development

      • Spot instance prediction models
      • Multi-resource scheduling optimization
      • SLO-aware preemption strategies
    2. Cluster Design Studies

      • GPU provisioning optimization
      • Heterogeneous resource planning
      • Cost-performance trade-off analysis
    3. Workload Characterization

      • AI/ML job pattern analysis
      • Organization behavior modeling
      • Resource demand forecasting
    4. Economic Analysis

      • Spot pricing strategies
      • Resource allocation fairness
      • Cost optimization for mixed workloads

    📝 Dataset Limitations and Considerations

    1. Temporal Coverage: Observation period spans approximately 113 days
    2. Anonymization: Organization and GPU model names are partially anonymized
    3. Missing Metrics: No information on job success/failure rates, actual vs requested resources, or pricing
    4. Static Infrastructure: Node configuration assumed constant throughout observation period

    🎯 Recommended Analysis Extensions

    1. Temporal Analysis: Job arrival patterns, peak usage periods, seasonal trends
    2. Failure Analysis: Spot preemption impact on job completion
    3. Efficiency Metrics: Resource waste, fragmentation, and utilization rates
    4. Predictive Modeling: Spot availability forecasting, job duration prediction
    5. Fair Sharing: Organization-level resource allocation and priority analysis

    This dataset represents a significant contribution to the understanding of large-scale GPU cluster operations and spot resource management in production AI/ML environments.
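    As a concrete starting point for the temporal analysis suggested above, a minimal pandas sketch; the file name and the submit_time and workload_type columns are assumptions, so map them onto the actual released schema first.

    import pandas as pd

    # Hypothetical file and column names; check the released schema.
    jobs = pd.read_csv("job_submissions.csv", parse_dates=["submit_time"])

    # Hourly job-arrival counts, split by workload type (HP vs spot).
    arrivals = (jobs
                .set_index("submit_time")
                .groupby("workload_type")
                .resample("1h")
                .size())
    print(arrivals.groupby(level="workload_type").describe())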

  14. GPU Performance and Hashrate Dataset - Crypto

    • kaggle.com
    zip
    Updated Dec 28, 2024
    Cite
    AnthonyTherrien (2024). GPU Performance and Hashrate Dataset - Crypto [Dataset]. https://www.kaggle.com/datasets/anthonytherrien/gpu-performance-and-hashrate-dataset-crypto
    Explore at:
    zip (11306 bytes)
    Dataset updated
    Dec 28, 2024
    Authors
    AnthonyTherrien
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Dataset Description

    This dataset provides detailed metrics on the performance and power consumption of various GPUs when running different cryptocurrency mining algorithms. It includes information about hash rates, power usage, and other technical specifications for multiple GPUs. The data was scraped from the Hashrate.no GPU Calculator.

    Features Overview

    The dataset has 111 columns, providing extensive insights into GPU performance. Key features include:

    • GPU Name: The name/model of the GPU (e.g., 4090, 4080S).
    • Algorithm Performance: Metrics like hash rates (e.g., Mh/s, Gh/s) for different algorithms, including:
      • AbelHash
      • Autolykos2
      • SHA512256d
      • zkSNARK
    • Power Consumption: Power usage in watts for each algorithm.
    • Economic Metrics:
      • kwh ($/kWh): Cost of electricity.
    • Algorithm-Specific Metrics: Data for specific GPUs across various mining algorithms.

    Use Cases

    This dataset is useful for:

    • Comparing GPU efficiency and profitability in cryptocurrency mining.
    • Analyzing power consumption for various algorithms.
    • Building predictive models for mining profitability.
    • Optimizing hardware selection for miners.
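    For the efficiency comparison, a minimal sketch computing hashrate per watt for one algorithm; the file name and exact column labels are assumptions inferred from the preview table below, so check the real CSV header.

    import pandas as pd

    # File and column names are guesses based on the preview table below.
    gpus = pd.read_csv("gpu_hashrate.csv")

    # Efficiency for AbelHash: megahashes per second per watt.
    gpus["abelhash_mh_per_w"] = gpus["AbelHash (Mh/s)"] / gpus["AbelHash Power (Watt)"]
    print(gpus[["Name", "abelhash_mh_per_w"]]
          .sort_values("abelhash_mh_per_w", ascending=False)
          .head(10))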

    Example Data

    Here’s a preview of the dataset:

    Name | AbelHash Power (Watt) | AbelHash (Mh/s) | zkSNARK Power (Watt) | zkSNARK (Mproof/s)
    4090 | 249.0 | 124.70 | 310.0 | 1.43
    4080S | 168.0 | 87.95 | 219.0 | 0.92
    4070TI | 108.0 | 64.54 | 141.0 | 0.64

    File Details

  15. Clickbait PDFs

    • kaggle.com
    zip
    Updated Sep 29, 2023
    Cite
    emerald101 (2023). Clickbait PDFs [Dataset]. https://www.kaggle.com/datasets/emerald101/from-attachments-to-seo
    Explore at:
    zip (61350597461 bytes)
    Dataset updated
    Sep 29, 2023
    Authors
    emerald101
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This repository contains the code and documentation for our ACSAC 2023 paper From Attachments to SEO: Click Here to Learn More about Clickbait PDFs!. With this artifact, we hope to foster future research on this subject.

    We provide the screenshots and file hashes of the PDFs in our dataset, allowing inspection of the images and download (from external sources, e.g. VirusTotal) of the same files. Moreover, we also share the URLs contained in the PDFs and the code to reproduce most of the findings of our paper (we are not allowed to share VirusTotal data due to their Terms of Service).

    We recommend inspecting our code from the Kaggle platform, as this does not involve any setup nor download of the data. Nonetheless, all our code can be executed in a regular laptop. We used Ubuntu 18.0 and Python 3.6.9. The dependencies for this code are minimal: Pandas 1.3.3 or higher, Numpy 1.21.2 and Matplotlib 3.4.3.

    Part of our experiments involve developing and training a deep learning model (based on DeepCluster). We created an additional Github repository containing the scripts that can help reproduce the clustering procedure. This code was developed using Ubuntu 19.0 and run on a TITAN RTX GPU. To support future research, we have shared the input and output data used in the clustering process. We have also provided the pairwise distances of the embeddings used in the second clustering step (input to DBSCAN), uploaded on Kaggle due to size restrictions of files on Github. The results of this specific experiment cannot be repeated due to manual analysis checks, but we have shared the input, output, and code to make it as reproducible as possible.

    Please feel free to leave a comment or reach out in case of any question or issue :)

  16. Unsloth for offline

    • kaggle.com
    zip
    Updated Jul 18, 2025
    Cite
    Zie Chan (2025). Unsloth for offline [Dataset]. https://www.kaggle.com/datasets/ziechan/unsloth-for-offline/code
    Explore at:
    zip (5156686198 bytes)
    Dataset updated
    Jul 18, 2025
    Authors
    Zie Chan
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Unsloth Usage

    If you want to import Unsloth with the internet turned off:

    !pip install --no-index --find-links=/kaggle/input/unsloth-for-offline torch torchvision torchaudio
    !pip install --no-index --find-links=/kaggle/input/unsloth-for-offline xformers
    !pip install --no-index --find-links=/kaggle/input/unsloth-for-offline unsloth
    !pip install --no-index --find-links=/kaggle/input/unsloth-for-offline bitsandbytes

    Then you can follow the standard notebook in the Unsloth documentation to fine-tune your model.

    Pipeline / model-splitting loading is also allowed, so if you do not have enough VRAM on one GPU to load, say, Llama 70B, no worries: the model will be split across your GPUs. To enable this, use the device_map = "balanced" flag:

    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        "unsloth/Llama-3.3-70B-Instruct",
        load_in_4bit = True,
        device_map = "balanced",
    )

    Opensloth

    Contributors have also created repos that enable or improve multi-GPU support with Unsloth. If you want to use Opensloth with the internet turned off, run the following code step by step:

    import tarfile
    import os

    # Define the source folder and output path
    source_dir = "/kaggle/input/unsloth-for-offline/fire-0.7.0/fire-0.7.0"
    output_path = "/kaggle/working/fire-0.7.0.tar.gz"  # You can change this path

    # Create a tar.gz archive
    with tarfile.open(output_path, "w:gz") as tar:
        tar.add(source_dir, arcname=os.path.basename(source_dir))
    print(f"Created: {output_path}")

    !pip install --no-index --find-links=/kaggle/working/ fire
    !pip install --no-index --find-links=/kaggle/input/unsloth-for-offline opensloth==0.1.7

  17. Nvidia GPU power limit benchmark

    • kaggle.com
    zip
    Updated Nov 18, 2020
    Cite
    Enrique Jesus Cardona Cebrian (2020). Nvidia GPU power limit benchmark [Dataset]. https://www.kaggle.com/enriquecardona/rtx-3070-power-limit-benchmark
    Explore at:
    zip (21201 bytes)
    Dataset updated
    Nov 18, 2020
    Authors
    Enrique Jesus Cardona Cebrian
    Description

    Context

    When I owned my Zotac GEFORCE RTX 3070 TWIN EDGE OC 8GB I was curious about power tuning under Ubuntu. After several manual iterations I decided to create this set of tests, which allows running load tests at different power levels. As a result, CSV files are generated that can then be analyzed to find the best performance/consumption ratio.

    Content

    The script generates files that, after converting to CSV, represent the performance of the card in different test scenarios and power levels (which depend on each model).

    Two csv files have been added, one with the raw data and the other with more features that allow analyzing the performance obtained.
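    A minimal sketch of the idea behind these tests, assuming root access and a placeholder workload script; nvidia-smi -pl sets the board power limit, and the valid wattage range depends on the card.

    import csv
    import subprocess
    import time

    # Candidate power limits in watts; the valid range depends on the card.
    POWER_LIMITS_W = [140, 160, 180, 200, 220]

    with open("power_sweep.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["power_limit_w", "elapsed_s"])
        for watts in POWER_LIMITS_W:
            # Set the power limit on GPU 0 (requires root).
            subprocess.run(["nvidia-smi", "-i", "0", "-pl", str(watts)], check=True)
            start = time.time()
            # Placeholder workload; substitute the actual benchmark here.
            subprocess.run(["python", "benchmark.py"], check=True)
            writer.writerow([watts, round(time.time() - start, 2)])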

    Acknowledgements

    My thanks to the guys at Lambda Labs because their tests are the basis for these tests. They can be found at https://github.com/lambdal/lambda-tensorflow-benchmark/tree/tf2

    Inspiration

    The interesting thing for this dataset would be to have much more data, both from the same card models and from other models, to generate a reliable knowledge base on the information generated.

    It would be interesting to obtain the best power levels depending on the Tensorflow models to run.

    License

    https://www.gnu.org/licenses/gpl-3.0.html

  18. PC video games requirements

    • kaggle.com
    Updated Jan 7, 2023
    Cite
    Baraa Zaid (2023). PC video games requirements [Dataset]. https://www.kaggle.com/datasets/baraazaid/pc-video-game-requirements
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jan 7, 2023
    Dataset provided by
    Kaggle
    Authors
    Baraa Zaid
    License

    CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    PC Game Requirements Dataset

    This dataset contains information on the PC requirements for a variety of different games. Each row represents a single game, and the following attributes are included:

    • Game: The name of the game
    • Memory: The minimum amount of memory (RAM) required to run the game
    • Graphics Card: The minimum required graphics card to run the game
    • CPU: The minimum required central processing unit (CPU) to run the game
    • File Size: The size of the game's installation file
    • OS: The minimum required operating system to run the game

    This dataset can be used to help determine whether a PC meets the requirements to run a particular game. It is important to note that meeting the minimum requirements does not guarantee optimal performance and that higher specifications may be needed for the best gaming experience.
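    A minimal pandas sketch of that check; the file name is an assumption, the column names follow the attribute list above, and the regex only handles memory strings with a leading number (e.g. "8 GB RAM").

    import pandas as pd

    # File name is an assumption; columns follow the attribute list above.
    games = pd.read_csv("pc_game_requirements.csv")

    # The Memory column is free text, so extract the leading number
    # before comparing against this machine's RAM.
    games["min_ram_gb"] = games["Memory"].str.extract(r"(\d+)", expand=False).astype(float)

    MY_RAM_GB = 16
    fits = games.loc[games["min_ram_gb"] <= MY_RAM_GB, ["Game", "Memory", "Graphics Card"]]
    print(fits.head(10))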

    The code used to scrape the data can be found here.

  19. sgemm-kernel

    • kaggle.com
    zip
    Updated Apr 2, 2020
    Cite
    Sujeeth Shetty (2020). sgemm-kernel [Dataset]. https://www.kaggle.com/isujeeth/sgemm
    Explore at:
    zip (3119506 bytes)
    Dataset updated
    Apr 2, 2020
    Authors
    Sujeeth Shetty
    Description

    Context

    SGEMM GPU kernel performance Dataset

    Content

    The dataset covers SGEMM GPU kernel performance and consists of 14 features and 241,600 records. It measures the running time of a matrix-matrix product A*B = C, where all matrices have size 2048 x 2048, using a parameterizable SGEMM GPU kernel with 241,600 possible parameter combinations. Of the 14 features, the first 10 are ordinal and can each take up to 4 different power-of-two values, and the last 4 are binary.
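    A minimal sketch for working with the timing data, assuming the UCI file layout (10 tuning parameters such as MWG, 4 binary flags, and four repeated runs Run1 (ms) through Run4 (ms)); verify the header before relying on these names.

    import pandas as pd

    # File and column names assume the UCI release; check the actual header.
    runs = pd.read_csv("sgemm_product.csv")

    # Average the four repeated timing runs into a single target.
    run_cols = ["Run1 (ms)", "Run2 (ms)", "Run3 (ms)", "Run4 (ms)"]
    runs["avg_ms"] = runs[run_cols].mean(axis=1)

    # Example: median runtime as a function of one tuning parameter.
    print(runs.groupby("MWG")["avg_ms"].median())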

    Acknowledgements

    https://archive.ics.uci.edu/ml/datasets/SGEMM+GPU+kernel+performance

  20. Alibaba Cluster Trace GPU 2020

    • kaggle.com
    zip
    Updated Feb 24, 2022
    Cite
    Derrick Mwiti (2022). Alibaba Cluster Trace GPU 2020 [Dataset]. https://www.kaggle.com/datasets/derrickmwiti/cluster-trace-gpu-v2020
    Explore at:
    zip (1460320128 bytes)
    Dataset updated
    Feb 24, 2022
    Authors
    Derrick Mwiti
    License

    CC0 1.0 Universal (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    The released trace contains a hybrid of training and inference jobs running state-of-the-art ML algorithms. It is collected from a large production cluster with over 6,500 GPUs (on ~1,800 machines) in Alibaba PAI (Platform for Artificial Intelligence), spanning July and August 2020. We also include a Jupyter notebook that parses the trace and highlights some of its main characteristics (see section 3, Demo of Data Analysis).

    We also present a characterization study of the trace in a paper, "MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters", published in NSDI ’22.

    https://github.com/alibaba/clusterdata/tree/master/cluster-trace-gpu-v2020
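    For a quick start alongside the bundled analysis notebook, a minimal sketch; the table and column names here are assumptions about the released schema, so consult the repository for the authoritative layout.

    import pandas as pd

    # Hypothetical table and column names; see the repository's notebook
    # for the real schema of the released trace.
    jobs = pd.read_csv("pai_job_table.csv")

    # Job duration in seconds from start/end timestamps.
    jobs["duration_s"] = jobs["end_time"] - jobs["start_time"]
    print(jobs["duration_s"].describe())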
