40 datasets found

Alibaba GPU Cluster Dataset 2025
kaggle.com
Updated Aug 12, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sultanul Ovi (2025). Alibaba GPU Cluster Dataset 2025 [Dataset]. https://www.kaggle.com/datasets/mdsultanulislamovi/alibaba-gpu-cluster-dataset-2025
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 12, 2025
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Sultanul Ovi
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
GPU-Disaggregated DLRM Trace Dataset

This dataset records latency-sensitive inference instances for GPU-disaggregated serving of deep learning recommendation models. It contains per-instance resource reservations and life cycle timestamps for scheduling analysis and capacity planning.

This dataset represents a groundbreaking trace collection from production GPU-disaggregated serving systems for Deep Learning Recommendation Models (DLRMs), accompanying the NSDI'25 paper on GPU-disaggregated serving at scale. The dataset captures real-world operational characteristics of inference services in a large-scale production environment, providing invaluable insights into resource allocation patterns, temporal dynamics, and system behavior for latency-sensitive ML workloads.

Scope

Total rows: 23871.

Unique apps: 156.

Role split, {'CN': 16485, 'HN': 7386}.

Key fields

Instance ID. Role. Application group. Requests and limits for CPU, GPU, RDMA, memory, and disk. Density cap per node. Creation, scheduling, and deletion timestamps relative to the trace start.

High-level observations

All workloads are marked latency sensitive. Instances are typically long-running with high priority, as stated by the authors.

Scheduling delay distribution and runtime distribution are included in the figures. Concurrency over time gives a view of system load.

RDMA percentage and max instances per node expose placement constraints that influence packing on heterogeneous nodes.

🎯 Key Characteristics

Scale and Scope

Total Instances: 23,871 inference instances

Services: 156 unique inference services (applications)

Workload Type: 100% Latency-Sensitive (LS) workloads

Priority Level: High-priority, long-running inference instances

System Architecture: GPU-disaggregated architecture separating compute and GPU resources

Instance Distribution

CPU Nodes (CN): 16,485 instances (69.1%)

Pure CPU-based inference workloads

No GPU allocation

Lower RDMA requirements (mean: 3.4%)

Heterogeneous GPU Nodes (HN): 7,386 instances (30.9%)

GPU-accelerated inference workloads

All instances allocated exactly 1 GPU

Higher RDMA requirements (mean: 20.5%)

🔍 Key Insights

Workload Heterogeneity

Clear bimodal distribution between CPU and GPU workloads

CN instances optimized for CPU-intensive operations

HN instances balanced for GPU acceleration with supporting CPU resources

Resource Efficiency

Tight coupling between CPU and memory allocation (correlation: 0.97)

Independent scaling of GPU resources from CPU/memory

RDMA bandwidth scaled based on disaggregation communication needs

Production Patterns

All workloads classified as latency-sensitive

High-priority, long-running inference services

Immediate scheduling indicates sufficient resource availability

Disaggregation Benefits

Efficient resource utilization through separation of concerns

CN nodes handle CPU-intensive preprocessing/postprocessing

HN nodes focus on GPU-accelerated model inference

RDMA enables efficient data movement between disaggregated components

📈 Research Applications

This dataset enables research in:

Resource Allocation: Optimal scheduling strategies for disaggregated systems

Performance Modeling: Understanding latency-throughput tradeoffs

System Design: Architectural decisions for ML serving infrastructure

Workload Characterization: Production DLRM inference patterns

Capacity Planning: Resource provisioning for ML workloads

Fault Tolerance: Instance distribution and anti-affinity strategies

🎓 Academic Contribution

This dataset represents one of the first publicly available production traces for GPU-disaggregated DLRM serving, providing:

Real-world validation data for system research

Baseline for performance comparisons

Foundation for reproducible research in ML systems

Insights into production-scale ML infrastructure

This dataset provides a unique window into production GPU-disaggregated systems, offering researchers and practitioners valuable insights for advancing the field of large-scale ML serving infrastructure.
run_model_on_gpu
kaggle.com
zip
Updated Oct 20, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexis T. (2025). run_model_on_gpu [Dataset]. https://www.kaggle.com/datasets/alexismorgant/run-model-on-gpu
Explore at:
zip(966 bytes)Available download formats
Dataset updated
Oct 20, 2025
Authors
Alexis T.
Description
Dataset

This dataset was created by Alexis T.

Contents
IceVision for CUDA11
kaggle.com
zip
Updated Dec 23, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Aaron B. (2021). IceVision for CUDA11 [Dataset]. https://www.kaggle.com/abee82/icevision
Explore at:
zip(4278676246 bytes)Available download formats
Dataset updated
Dec 23, 2021
Authors
Aaron B.
Description
Dataset

This dataset was created by Aaron B.

Contents
R
Accident Detection Model Dataset
universe.roboflow.com
zip
Updated Apr 8, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Accident detection model (2024). Accident Detection Model Dataset [Dataset]. https://universe.roboflow.com/accident-detection-model/accident-detection-model/model/1
Explore at:
zipAvailable download formats
Dataset updated
Apr 8, 2024
Dataset authored and provided by
Accident detection model
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Accident Bounding Boxes
Description
Accident-Detection-Model

Accident Detection Model is made using YOLOv8, Google Collab, Python, Roboflow, Deep Learning, OpenCV, Machine Learning, Artificial Intelligence. It can detect an accident on any accident by live camera, image or video provided. This model is trained on a dataset of 3200+ images, These images were annotated on roboflow.

Problem Statement

Road accidents are a major problem in India, with thousands of people losing their lives and many more suffering serious injuries every year.

According to the Ministry of Road Transport and Highways, India witnessed around 4.5 lakh road accidents in 2019, which resulted in the deaths of more than 1.5 lakh people.

The age range that is most severely hit by road accidents is 18 to 45 years old, which accounts for almost 67 percent of all accidental deaths.

Accidents survey

https://user-images.githubusercontent.com/78155393/233774342-287492bb-26c1-4acf-bc2c-9462e97a03ca.png" alt="Survey">

Literature Survey

Sreyan Ghosh in Mar-2019, The goal is to develop a system using deep learning convolutional neural network that has been trained to identify video frames as accident or non-accident.

Deeksha Gour Sep-2019, uses computer vision technology, neural networks, deep learning, and various approaches and algorithms to detect objects.

Research Gap

Lack of real-world data - We trained model for more then 3200 images.

Large interpretability time and space needed - Using google collab to reduce interpretability time and space required.

Outdated Versions of previous works - We aer using Latest version of Yolo v8.

Proposed methodology

We are using Yolov8 to train our custom dataset which has been 3200+ images, collected from different platforms.

This model after training with 25 iterations and is ready to detect an accident with a significant probability.

Model Set-up

Preparing Custom dataset

We have collected 1200+ images from different sources like YouTube, Google images, Kaggle.com etc.

Then we annotated all of them individually on a tool called roboflow.

During Annotation we marked the images with no accident as NULL and we drew a box on the site of accident on the images having an accident

Then we divided the data set into train, val, test in the ratio of 8:1:1

At the final step we downloaded the dataset in yolov8 format.
#### Using Google Collab

We are using google colaboratory to code this model because google collab uses gpu which is faster than local environments.

You can use Jupyter notebooks, which let you blend code, text, and visualisations in a single document, to write and run Python code using Google Colab.

Users can run individual code cells in Jupyter Notebooks and quickly view the results, which is helpful for experimenting and debugging. Additionally, they enable the development of visualisations that make use of well-known frameworks like Matplotlib, Seaborn, and Plotly.

In Google collab, First of all we Changed runtime from TPU to GPU.

We cross checked it by running command ‘!nvidia-smi’
#### Coding

First of all, We installed Yolov8 by the command ‘!pip install ultralytics==8.0.20’

Further we checked about Yolov8 by the command ‘from ultralytics import YOLO from IPython.display import display, Image’

Then we connected and mounted our google drive account by the code ‘from google.colab import drive drive.mount('/content/drive')’

Then we ran our main command to run the training process ‘%cd /content/drive/MyDrive/Accident Detection model !yolo task=detect mode=train model=yolov8s.pt data= data.yaml epochs=1 imgsz=640 plots=True’

After the training we ran command to test and validate our model ‘!yolo task=detect mode=val model=runs/detect/train/weights/best.pt data=data.yaml’ ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt conf=0.25 source=data/test/images’

Further to get result from any video or image we ran this command ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt source="/content/drive/MyDrive/Accident-Detection-model/data/testing1.jpg/mp4"’

The results are stored in the runs/detect/predict folder.
Hence our model is trained, validated and tested to be able to detect accidents on any video or image.

Challenges I ran into

I majorly ran into 3 problems while making this model

I got difficulty while saving the results in a folder, as yolov8 is latest version so it is still underdevelopment. so i then read some blogs, referred to stackoverflow then i got to know that we need to writ an extra command in new v8 that ''save=true'' This made me save my results in a folder.

I was facing problem on cvat website because i was not sure what
TabPFN
kaggle.com
zip
Updated Jun 14, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mark Inzhirov (2023). TabPFN [Dataset]. https://www.kaggle.com/datasets/neutrino404/tabpfn
Explore at:
zip(95945799 bytes)Available download formats
Dataset updated
Jun 14, 2023
Authors
Mark Inzhirov
Description
Use this data set when submitting code offline for competitions otherwise just use !pip install tabpfn for online use. Usage for offline code submissions within Kaggle notebooks is as follows:

1**.First add the dataset by selecting "add data" and searching for this dataset and adding it to your input. **

2.**Next add the following code to a code block in your notebook ** python !pip install tabpfn --no-index --find-links=file:///kaggle/input/tabpfn !mkdir -p /opt/conda/lib/python3.10/site-packages/tabpfn/models_diff !cp /kaggle/input/tabpfn/prior_diff_real_checkpoint_n_0_epoch_100.cpkt /opt/conda/lib/python3.10/site-packages/tabpfn/models_diff/ 3.** Import** :
from tabpfn import TabPFNClassifier

4.**Now you are all set you can create a classifier and run it offline for submission in offline kaggle code competitions:** python classifier = TabPFNClassifier(device='cpu',N_ensemble_configurations=64) classifier.fit(X_train, Y_train) y_eval, p_eval = classifier.predict(X_cv, return_winning_probability=True)

If you want to use TabPFN with GPU use the following code when you make the model: classifier = TabPFNClassifier(device='cuda',N_ensemble_configurations=32)

You can find documentation for this package on GitHub: https://github.com/automl/TabPFN.git Original paper on TabPFN can be found at: https://arxiv.org/abs/2207.01848 License Copyright 2022 Noah Hollmann, Samuel Müller, Katharina Eggensperger, Frank Hutter

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
llamacpp cuda sm75 complete build
kaggle.com
zip
Updated Oct 17, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jiwei Liu (2025). llamacpp cuda sm75 complete build [Dataset]. https://www.kaggle.com/datasets/jiweiliu/llamacpp-sm75-complete-build
Explore at:
zip(169040223 bytes)Available download formats
Dataset updated
Oct 17, 2025
Authors
Jiwei Liu
Description
Complete llama.cpp build folder compiled with CUDA support for compute capability 7.5 (Turing architecture GPUs: RTX 2060/2070/2080, Tesla T4, Quadro RTX).

Build Configuration: - CUDA 12.5 (compatibility with CUDA 12.x) - Compute Capability: SM_75 - Optimized for Nvidia Turing GPUs - Complete build directory with CMake files

Contents: - Complete build/ directory - build/bin/ - All compiled executables and shared libraries - build/CMakeCache.txt - CMake configuration - build/compile_commands.json - Compilation database - All build artifacts and intermediate files

Usage: 1. Extract the build folder 2. Ensure CUDA 12.x runtime is installed on target system 3. Set LD_LIBRARY_PATH to include build/bin directory 4. Run executables from build/bin/

Requirements on target machine: - Nvidia GPU with compute capability 7.5 - CUDA 12.x runtime libraries - Linux x86_64

Torch-TensorRT v2.2.0

kaggle.com

zip

Updated Apr 15, 2024

Facebook

Twitter

Click to copy link

Link copied

Cite

Gowri Shankar Penugonda (2024). Torch-TensorRT v2.2.0 [Dataset]. https://www.kaggle.com/datasets/gowrishankarp/torch-tensorrt-v2-2-0

Explore at:

zip(2810802392 bytes)Available download formats

Dataset updated

Apr 15, 2024

Authors

Gowri Shankar Penugonda

License

Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically

Description

Torch-TensorRT v2.2.0

# Installation
!pip install -qq --no-index --find-links=../input/torch-tensorrt-v2-2-0 torch-tensorrt==2.2.0

Usage

import torch
import torch_tensorrt

trt_model_fp16 = torch_tensorrt.compile(mlm_model, 
  inputs= [torch_tensorrt.Input(shape=[batch_size, 1024], dtype=torch.int32), # input_ids
       torch_tensorrt.Input(shape=[batch_size, 1024], dtype=torch.int32)], # attention_mask
  enabled_precisions= {torch.float32}, # Run with 32-bit precision
  workspace_size=2000000000,
  truncate_long_and_double=True
)

torch.jit.save(trt_model_fp16, 'kaggle-mlm_model-1024-gpu-aug0-01-swa.trt_fp16.ts')

trt_model_fp16 = torch.jit.load(model_path)
.
.
.
inputs = {k: v.type(torch.int32).cuda() for k, v in inputs.items()}
output_trt = trt_model_fp16(inputs['input_ids'], inputs['attention_mask'])
output_trt

nemotron-3-8b-base-4k
kaggle.com
zip
Updated Aug 31, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Serhii Kharchuk (2024). nemotron-3-8b-base-4k [Dataset]. https://www.kaggle.com/datasets/serhiikharchuk/nemotron-3-8b-base-4k
Explore at:
zip(13688476176 bytes)Available download formats
Dataset updated
Aug 31, 2024
Authors
Serhii Kharchuk
Description
Nemotron-3-8B-Base-4k Model Overview License

The use of this model is governed by the NVIDIA AI Foundation Models Community License Agreement. Description

Nemotron-3-8B-Base-4k is a large language foundation model for enterprises to build custom LLMs. This foundation model has 8 billion parameters, and supports a context length of 4,096 tokens. Nemotron-3-8B-Base-4k is part of Nemotron-3, which is a family of enterprise ready generative text models compatible with NVIDIA NeMo Framework. For other models in this collection, see the collections page.

NVIDIA NeMo is an end-to-end, cloud-native platform to build, customize, and deploy generative AI models anywhere. It includes training and inferencing frameworks, guardrailing toolkits, data curation tools, and pretrained models, offering enterprises an easy, cost-effective, and fast way to adopt generative AI. To get access to NeMo Framework, please sign up at this link. References

Announcement Blog Model Architecture

Architecture Type: Transformer

Network Architecture: Generative Pre-Trained Transformer (GPT-3) Software Integration

Runtime Engine(s): NVIDIA AI Enterprise

Toolkit: NeMo Framework

To get access to NeMo Framework, please sign up at this link. See NeMo inference container documentation for details on how to setup and deploy an inference server with NeMo.

Sample Inference Code:

from nemo.deploy import NemoQuery

In this case, we run inference on the same machine

nq = NemoQuery(url="localhost:8000", model_name="Nemotron-3-8B-4K")

output = nq.query_llm(prompts=["The meaning of life is"], max_output_token=200, top_k=1, top_p=0.0, temperature=0.1) print(output)

Supported Hardware:

H100 A100 80GB, A100 40GB

Model Version(s)

Nemotron-3-8B-base-4k-BF16-1 Dataset & Training

The model uses a learning rate of 3e-4 with a warm-up period of 500M tokens and a cosine learning rate annealing schedule for 95% of the total training tokens. The decay stops at a minimum learning rate of 3e-5. The model is trained with a sequence length of 4096 and uses FlashAttention’s Multi-Head Attention implementation. 1,024 A100s were used for 19 days to train the model.

NVIDIA models are trained on a diverse set of public and proprietary datasets. This model was trained on a dataset containing 3.8 Trillion tokens of text. The dataset contains 53 different human languages (including English, German, Russian, Spanish, French, Japanese, Chinese, Italian, and Dutch) and 37 programming languages. The model also uses the training subsets of downstream academic benchmarks from sources like FLANv2, P3, and NaturalInstructions v2. NVIDIA is committed to the responsible development of large language models and conducts reviews of all datasets included in training. Evaluation Task Num-shot Score MMLU* 5 54.4 WinoGrande 0 70.9 Hellaswag 0 76.4 ARC Easy 0 72.9 TyDiQA-GoldP** 1 49.2 Lambada 0 70.6 WebQS 0 22.9 PiQA 0 80.4 GSM8K 8-shot w/ maj@8 39.4

The calculation of MMLU follows the original implementation. See Hugging Face’s explanation of different implementations of MMLU.

** The languages used are Arabic, Bangla, Finnish, Indonesian, Korean, Russian and Swahili. Intended use

This is a completion model. For best performance, users are encouraged to customize the completion model using NeMo Framework suite of customization tools including Parameter-Efficient Fine-Tuning (P-tuning, Adapters, LoRA), and SFT/RLHF. For chat use cases, please consider using Nemotron-3-8B chat variants. Ethical use

Technology can have a profound impact on people and the world, and NVIDIA is committed to enabling trust and transparency in AI development. NVIDIA encourages users to adopt principles of AI ethics and trustworthiness to guide your business decisions by following the guidelines in the NVIDIA AI Foundation Models Community License Agreement. Limitations

The model was trained on data that contains toxic language and societal biases originally crawled from the internet. Therefore, the model may amplify those biases and return toxic responses especially when prompted with toxic prompts. The model may generate answers that may be inaccurate, omit key information, or include irrelevant or redundant text producing socially unacceptable or undesirable text, even if the prompt itself does not include anything explicitly offensive.
BUTTER-E: Energy Data for Deep Learning Models
kaggle.com
zip
Updated Jan 11, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Pavan Kumar S (2025). BUTTER-E: Energy Data for Deep Learning Models [Dataset]. https://www.kaggle.com/datasets/pavankumar4757/butter-e-energy-data-for-deep-learning-models
Explore at:
zip(2940491 bytes)Available download formats
Dataset updated
Jan 11, 2025
Authors
Pavan Kumar S
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
The BUTTER-E - Energy Consumption Data for the BUTTER Empirical Deep Learning Dataset provides node-level energy consumption data collected via watt-meters, complementing the primary BUTTER dataset. This dataset records energy consumption and performance metrics for 1,059,206 experimental runs across diverse configurations of fully connected neural networks. Key attributes include:

1.timestamp: The precise time of the energy consumption measurement. 2.node:The hardware node identifier (e.g., r103u05) where the experiment was conducted. 3.watts: The energy consumption (in watts) recorded for the corresponding node at the given timestamp.

Highlights Data spans 30,582 distinct configurations, including variations across 13 datasets, 20 network sizes, 8 network shapes, and 14 depths. Measurements were taken on CPU and GPU hardware, offering insights into the relationship between neural network parameters and energy consumption. The dataset provides valuable information for analyzing the energy efficiency of deep learning models, particularly in relation to cache effects, dataset size, and network architecture.

Use Cases This dataset is ideal for: Energy-efficient AI research: Understanding how energy consumption scales with model size, dataset properties, and network configurations. Performance optimization: Identifying configurations with optimal trade-offs between performance and energy usage. Sustainability analysis: Evaluating the carbon footprint of training and deploying deep learning models.
Nvidia Database
kaggle.com
zip
Updated Jan 30, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ajay Tom (2025). Nvidia Database [Dataset]. https://www.kaggle.com/datasets/ajayt0m/nvidia-database
Explore at:
zip(8712 bytes)Available download formats
Dataset updated
Jan 30, 2025
Authors
Ajay Tom
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
This is a beginner-friendly SQLite database designed to help users practice SQL and relational database concepts. The dataset represents a basic business model inspired by NVIDIA and includes interconnected tables covering essential aspects like products, customers, sales, suppliers, employees, and projects. It's perfect for anyone new to SQL or data analytics who wants to learn and experiment with structured data.

Tables and Their Contents:

Products:

Includes details of 15 products (e.g., GPUs, AI accelerators). Attributes: product_id, product_name, category, release_date, price.

Customers:

Lists 20 fictional customers with their industry and contact information. Attributes: customer_id, customer_name, industry, contact_email, contact_phone.

Sales:

Contains 100 sales records tied to products and customers. Attributes: sale_id, product_id, customer_id, sale_date, region, quantity_sold, revenue.

Suppliers:

Features 50 suppliers and the materials they provide. Attributes: supplier_id, supplier_name, material_supplied, contact_email.

Supply Chain:

Tracks materials supplied to produce products, proportional to sales. Attributes: supply_chain_id, supplier_id, product_id, supply_date, quantity_supplied.

Departments:

Lists 5 departments within the business. Attributes: department_id, department_name, location.

Employees:

Contains data on 30 employees and their roles in different departments. Attributes: employee_id, first_name, last_name, department_id, hire_date, salary.

Projects:

Describes 10 projects handled by different departments. Attributes: project_id, project_name, department_id, start_date, end_date, budget.

Why Use This Dataset?

Perfect for Beginners: The dataset is simple and easy to understand.

Interconnected Tables: Provides a basic introduction to relational database concepts like joins and foreign keys.

SQL Practice: Run basic queries, filter data, and perform simple aggregations or calculations.

Learning Tool: Great for small projects and understanding business datasets.

Potential Use Cases:

Practice SQL queries (SELECT, INSERT, UPDATE, DELETE, JOIN).

Understand how to design and query relational databases.

Analyze basic sales and supply chain data for patterns and trends.

Learn how to use databases in analytics tools like Excel, Power BI, or Tableau.

Data Size:

Number of Tables: 8 Total Rows: Around 230 across all tables, ensuring quick queries and easy exploration.
efficientcube_1000cube
kaggle.com
zip
Updated Jan 31, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Arbidos (2025). efficientcube_1000cube [Dataset]. https://www.kaggle.com/datasets/arabidopsisthalian/efficientcube-1000cube
Explore at:
zip(2459796 bytes)Available download formats
Dataset updated
Jan 31, 2025
Authors
Arbidos
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Learn: train_cube3_qtm_MLP2RB_04M_1735350897.csv 4ч 19 мин train_cube3_qtm_MLP2RB_04M_1735378399.csv 4ч 19 мин

1729775349
1729775893
1729776146
1729776388
1729776568 47/69 оптимально

subprocess.run([ "nvcc", "--version" ])
subprocess.run( [ "nvidia-smi" ]) subprocess.run([ "cat" , "/etc/os-release" ])
subprocess.run([ "uname" , "-srm" ]) subprocess.run([ "cat" , "/proc/version" ]) subprocess.run([ "lspci" , "-k" ]) subprocess.run([ "cat" , "/proc/cpuinfo" ]) subprocess.run([ "nvidia-smi" , "-q" ]) subprocess.run([ "arch" ])
Image Matching Challenge 2023 Dockerfile
kaggle.com
zip
Updated Apr 18, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Anil Ozturk (2023). Image Matching Challenge 2023 Dockerfile [Dataset]. https://www.kaggle.com/nlztrk/image-matching-challenge-2023-dockerfile
Explore at:
zip(726 bytes)Available download formats
Dataset updated
Apr 18, 2023
Authors
Anil Ozturk
Description
This Dockerfile creates an environment that meets all the dependencies in the example submission file in Image Matching Challenge 2023. (including colmap and pycolmap)

Steps (for Linux)

Warning: You should replace YOURIMAGENAME with the name you desire for your environment at the later steps.

1. Install Docker and NVIDIA Container Toolkit

Docker Installation: Link

NVIDIA Container Toolkit Installation: Link

2. Set NVIDIA-Docker as the default builder

Edit / Create the /etc/docker/daemon.json with content: { "runtimes": { "nvidia": { "path": "/usr/bin/nvidia-container-runtime", "runtimeArgs": [] } }, "default-runtime": "nvidia" }

3. Build Your Docker Environment

Just go to the root folder of the repository and execute the following lines in your terminal. DOCKER_BUILDKIT=0 docker build -f Dockerfile -t YOURIMAGENAME .

4. Run

4.1. Interactive

You can simply run your environment with: docker run --gpus all -it --rm YOURIMAGENAME

4.1. With Volume Mounting & Jupyter Server

You can add these lines to your ~/.bashrc: djupyter() { docker run -v $PWD:/tmp/working -v ${HOME}/.cache:/container_cache -w=/tmp/working -e "XDG_CACHE_HOME=/container_cache" -p 8888:8888 --gpus all --rm -it YOURIMAGENAME jupyter notebook --no-browser --ip="0.0.0.0" --notebook-dir=/tmp/working --allow-root } After resetting all your terminals you can simply open a new terminal then type: djupyter

and voila! You've just opened your docker, mounted your current folder and started a Jupyter server!
Alibaba GPU Cluster Spot Resource Dataset
kaggle.com
zip
Updated Aug 13, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sultanul Ovi (2025). Alibaba GPU Cluster Spot Resource Dataset [Dataset]. https://www.kaggle.com/datasets/mdsultanulislamovi/alibaba-gpu-cluster-spot-resource-dataset
Explore at:
zip(5189979 bytes)Available download formats
Dataset updated
Aug 13, 2025
Authors
Sultanul Ovi
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset provides a comprehensive trace of AI workloads running on a large-scale GPU cluster with spot resource provisioning capabilities. It captures real-world operational characteristics from a production environment, managing both high-priority workloads with strict Service Level Objectives (SLOs) and opportunistic spot workloads.

Key Characteristics

Infrastructure Scale: 4,278 GPU nodes with 6 different GPU card types

Workload Volume: 466,867 job submissions tracked

Organization Diversity: 119 unique organizations/departments

Workload Types: Mixed high-priority (HP) and spot instance workloads

🔬 Research Applications

This dataset is valuable for:

Scheduling Algorithm Development

Spot instance prediction models

Multi-resource scheduling optimization

SLO-aware preemption strategies

Cluster Design Studies

GPU provisioning optimization

Heterogeneous resource planning

Cost-performance trade-off analysis

Workload Characterization

AI/ML job pattern analysis

Organization behavior modeling

Resource demand forecasting

Economic Analysis

Spot pricing strategies

Resource allocation fairness

Cost optimization for mixed workloads

📝 Dataset Limitations and Considerations

Temporal Coverage: Observation period spans approximately 113 days

Anonymization: Organization and GPU model names are partially anonymized

Missing Metrics: No information on job success/failure rates, actual vs requested resources, or pricing

Static Infrastructure: Node configuration assumed constant throughout observation period

🎯 Recommended Analysis Extensions

Temporal Analysis: Job arrival patterns, peak usage periods, seasonal trends

Failure Analysis: Spot preemption impact on job completion

Efficiency Metrics: Resource waste, fragmentation, and utilization rates

Predictive Modeling: Spot availability forecasting, job duration prediction

Fair Sharing: Organization-level resource allocation and priority analysis

This dataset represents a significant contribution to the understanding of large-scale GPU cluster operations and spot resource management in production AI/ML environments.
GPU Performance and Hashrate Dataset - Crypto
kaggle.com
zip
Updated Dec 28, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
AnthonyTherrien (2024). GPU Performance and Hashrate Dataset - Crypto [Dataset]. https://www.kaggle.com/datasets/anthonytherrien/gpu-performance-and-hashrate-dataset-crypto
Explore at:
zip(11306 bytes)Available download formats
Dataset updated
Dec 28, 2024
Authors
AnthonyTherrien
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset Description

This dataset provides detailed metrics on the performance and power consumption of various GPUs when running different cryptocurrency mining algorithms. It includes information about hash rates, power usage, and other technical specifications for multiple GPUs. The data was scraped from the Hashrate.no GPU Calculator.

Features Overview

The dataset has 111 columns, providing extensive insights into GPU performance. Key features include:

GPU Name: The name/model of the GPU (e.g., 4090, 4080S).

Algorithm Performance: Metrics like hash rates (e.g., Mh/s, Gh/s) for different algorithms, including:

AbelHash

Autolykos2

SHA512256d

zkSNARK

Power Consumption: Power usage in watts for each algorithm.

Economic Metrics:

kwh ($/kWh): Cost of electricity.

Algorithm-Specific Metrics: Data for specific GPUs across various mining algorithms.

Use Cases

This dataset is useful for: - Comparing GPU efficiency and profitability in cryptocurrency mining. - Analyzing power consumption for various algorithms. - Building predictive models for mining profitability. - Optimizing hardware selection for miners.

Example Data

Here’s a preview of the dataset:

Name AbelHashPower AbelHash (Mh/s) zkSNARKPower (Watt) zkSNARK (Mproof/s)
4090 249.0 124.70 310.0 1.43
4080S 168.0 87.95 219.0 0.92
4070TI 108.0 64.54 141.0 0.64

File Details

Rows: Number of GPUs.

Columns: 111.

Source: Scraped from Hashrate.no GPU Calculator.
Clickbait PDFs
kaggle.com
zip
Updated Sep 29, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
emerald101 (2023). Clickbait PDFs [Dataset]. https://www.kaggle.com/datasets/emerald101/from-attachments-to-seo
Explore at:
zip(61350597461 bytes)Available download formats
Dataset updated
Sep 29, 2023
Authors
emerald101
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This repository contains the code and documentation for our ACSAC 2023 paper From Attachments to SEO: Click Here to Learn More about Clickbait PDFs!. With this artifact, we hope to foster future research on this subject.

We provide the screenshots and file hashes of the PDFs in our dataset, allowing inspection of the images and download (from external sources, e.g. VirusTotal) of the same files. Moreover, we also share the URLs contained in the PDFs and the code to reproduce most of the findings of our paper (we are not allowed to share VirusTotal data due to their Terms of Service).

We recommend inspecting our code from the Kaggle platform, as this does not involve any setup nor download of the data. Nonetheless, all our code can be executed in a regular laptop. We used Ubuntu 18.0 and Python 3.6.9. The dependencies for this code are minimal: Pandas 1.3.3 or higher, Numpy 1.21.2 and Matplotlib 3.4.3.

Part of our experiments involve developing and training a deep learning model (based on DeepCluster). We created an additional Github repository containing the scripts that can help reproduce the clustering procedure. This code was developed using Ubuntu 19.0 and run on a TITAN RTX GPU. To support future research, we have shared the input and output data used in the clustering process. We have also provided the pairwise distances of the embeddings used in the second clustering step (input to DBSCAN), uploaded on Kaggle due to size restrictions of files on Github. The results of this specific experiment cannot be repeated due to manual analysis checks, but we have shared the input, output, and code to make it as reproducible as possible.

Please feel free to leave a comment or reach out in case of any question or issue :)
Unsloth for offline
kaggle.com
zip
Updated Jul 18, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Zie Chan (2025). Unsloth for offline [Dataset]. https://www.kaggle.com/datasets/ziechan/unsloth-for-offline/code
Explore at:
zip(5156686198 bytes)Available download formats
Dataset updated
Jul 18, 2025
Authors
Zie Chan
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Unsloth Usage

If you want to import Unsloth while turning off the internet: !pip install --no-index --find-links=/kaggle/input/unsloth-for-offline torch torchvision torchaudio !pip install --no-index --find-links=/kaggle/input/unsloth-for-offline xformers !pip install --no-index --find-links=/kaggle/input/unsloth-for-offline unsloth !pip install --no-index --find-links=/kaggle/input/unsloth-for-offline bitsandbytes Then you can follow the stardard notebook in unsloth document to fine tune your model.

Pipeline / model splitting loading is also allowed, so if you do not have enough VRAM for 1 GPU to load say Llama 70B, no worries - we will split the model for you on each GPU! To enable this, use the device_map = "balanced" flag: from unsloth import FastLanguageModel model, tokenizer = FastLanguageModel.from_pretrained( "unsloth/Llama-3.3-70B-Instruct", load_in_4bit = True, device_map = "balanced", )

Opensloth

Contributors have also created a repos to enable or improve multi-GPU support with Unsloth. If you want to use opensloth while turning off internet, run the following code step-by-step: ``` import tarfile import os

Define the source folder and output path

source_dir = "/kaggle/input/unsloth-for-offline/fire-0.7.0/fire-0.7.0" output_path = "/kaggle/working/fire-0.7.0.tar.gz" # You can change this path

Create a tar.gz archive

with tarfile.open(output_path, "w:gz") as tar: tar.add(source_dir, arcname=os.path.basename(source_dir)) print(f"Created: {output_path}") !pip install --no-index --find-links=/kaggle/working/ fire !pip install --no-index --find-links=/kaggle/input/unsloth-for-offline opensloth==0.1.7 ```
Nvidia GPU power limit benchmark
kaggle.com
zip
Updated Nov 18, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Enrique Jesus Cardona Cebrian (2020). Nvidia GPU power limit benchmark [Dataset]. https://www.kaggle.com/enriquecardona/rtx-3070-power-limit-benchmark
Explore at:
zip(21201 bytes)Available download formats
Dataset updated
Nov 18, 2020
Authors
Enrique Jesus Cardona Cebrian
Description
Context

When I owned my Zotac GEFORCE RTX 3070 TWIN EDGE OC 8GB I was curious about power tuning under Ubuntu. After several manual iterations I decided to create this set of tests that allows to run load tests using different power levels. As a result, CSV files are generated that can then be analyzed to find the best performance / consumption ratio.

Content

The script generates files that, after converting to CSV, represent the performance of the card in different test scenarios and power levels (which depend on each model).

Two csv files have been added, one with the raw data and the other with more features that allow analyzing the performance obtained.

Acknowledgements

My thanks to the guys at Lambda Labs because their tests are the basis for these tests. They can be found at https://github.com/lambdal/lambda-tensorflow-benchmark/tree/tf2

Inspiration

The interesting thing for this dataset would be to have much more data, both from the same card models and from other models, to generate a reliable knowledge base on the information generated.

It would be interesting to obtain the best power levels depending on the Tensorflow models to run.

License

https://www.gnu.org/licenses/gpl-3.0.html
PC video games requirements
kaggle.com
Updated Jan 7, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Baraa Zaid (2023). PC video games requirements [Dataset]. https://www.kaggle.com/datasets/baraazaid/pc-video-game-requirements
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jan 7, 2023
Dataset provided by
Kaggle
Authors
Baraa Zaid
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
PC Game Requirements Dataset

This dataset contains information on the PC requirements for a variety of different games. Each row represents a single game, and the following attributes are included:**

Game: The name of the game

Memory: The minimum amount of memory (RAM) required to run the game

Graphics Card: The minimum required graphics card to run the game

CPU: The minimum required central processing unit (CPU) to run the game

File Size: The size of the game's installation file

OS: The minimum required operating system to run the game

This dataset can be used to help determine whether a PC meets the requirements to run a particular game. It is important to note that meeting the minimum requirements does not guarantee optimal performance and that higher specifications may be needed for the best gaming experience.

The code used to scrape the data can be found here.
sgemm-kernel
kaggle.com
zip
Updated Apr 2, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sujeeth Shetty (2020). sgemm-kernel [Dataset]. https://www.kaggle.com/isujeeth/sgemm
Explore at:
zip(3119506 bytes)Available download formats
Dataset updated
Apr 2, 2020
Authors
Sujeeth Shetty
Description
Context

SGEMM GPU kernel performance Dataset

Content

The data set is of SGEMM GPU kernel performance which consists of 14 features and 241600 records. This data set measures the running time of a matrix-matrix product A*B = C, where all matrices have size 2048 x 2048, using a parameterizable SGEMM GPU kernel with 261400 possible parameter combinations. Out of 14 features, the first 10 are ordinal and can only take up to 4 different powers of two values, and the 4 last variables are binary.

Acknowledgements

https://archive.ics.uci.edu/ml/datasets/SGEMM+GPU+kernel+performance
Alibaba Cluster Trace GPU 2020
kaggle.com
zip
Updated Feb 24, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Derrick Mwiti (2022). Alibaba Cluster Trace GPU 2020 [Dataset]. https://www.kaggle.com/datasets/derrickmwiti/cluster-trace-gpu-v2020
Explore at:
zip(1460320128 bytes)Available download formats
Dataset updated
Feb 24, 2022
Authors
Derrick Mwiti
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
The released trace contains a hybrid of training and inference jobs running state-of-the-art ML algorithms. It is collected from a large production cluster with over 6,500 GPUs (on ~1800 machines) in Alibaba PAI (Platform for Artificial Intelligence), spanning the July and August of 2020. We also include a Jupyter notebook that parses the trace and highlights some of the main characteristics (see section 3 Demo of Data Analysis).

We also present a characterization study of the trace in a paper, "MLaaS in the Wild: Workload Analysis and Scheduling in Large-Scale Heterogeneous GPU Clusters", published in NSDI ’22.

https://github.com/alibaba/clusterdata/tree/master/cluster-trace-gpu-v2020

Name	AbelHashPower	AbelHash (Mh/s)	zkSNARKPower (Watt)	zkSNARK (Mproof/s)
4090	249.0	124.70	310.0	1.43
4080S	168.0	87.95	219.0	0.92
4070TI	108.0	64.54	141.0	0.64

Facebook

Twitter

Click to copy link

Link copied

Cite

Sultanul Ovi (2025). Alibaba GPU Cluster Dataset 2025 [Dataset]. https://www.kaggle.com/datasets/mdsultanulislamovi/alibaba-gpu-cluster-dataset-2025

Alibaba GPU Cluster Dataset 2025

GPU-Disaggregated DLRM Trace Dataset

Explore at:

CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Dataset updated

Aug 12, 2025

Dataset provided by

Kagglehttp://kaggle.com/

Authors

Sultanul Ovi

License

MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically

Description

GPU-Disaggregated DLRM Trace Dataset

This dataset records latency-sensitive inference instances for GPU-disaggregated serving of deep learning recommendation models. It contains per-instance resource reservations and life cycle timestamps for scheduling analysis and capacity planning.

This dataset represents a groundbreaking trace collection from production GPU-disaggregated serving systems for Deep Learning Recommendation Models (DLRMs), accompanying the NSDI'25 paper on GPU-disaggregated serving at scale. The dataset captures real-world operational characteristics of inference services in a large-scale production environment, providing invaluable insights into resource allocation patterns, temporal dynamics, and system behavior for latency-sensitive ML workloads.

Scope

Total rows: 23871.
Unique apps: 156.
Role split, {'CN': 16485, 'HN': 7386}.

Key fields

Instance ID. Role. Application group. Requests and limits for CPU, GPU, RDMA, memory, and disk. Density cap per node. Creation, scheduling, and deletion timestamps relative to the trace start.

High-level observations

All workloads are marked latency sensitive. Instances are typically long-running with high priority, as stated by the authors.
Scheduling delay distribution and runtime distribution are included in the figures. Concurrency over time gives a view of system load.
RDMA percentage and max instances per node expose placement constraints that influence packing on heterogeneous nodes.

🎯 Key Characteristics

Scale and Scope

Total Instances: 23,871 inference instances
Services: 156 unique inference services (applications)
Workload Type: 100% Latency-Sensitive (LS) workloads
Priority Level: High-priority, long-running inference instances
System Architecture: GPU-disaggregated architecture separating compute and GPU resources

Instance Distribution

CPU Nodes (CN): 16,485 instances (69.1%)
- Pure CPU-based inference workloads
- No GPU allocation
- Lower RDMA requirements (mean: 3.4%)
Heterogeneous GPU Nodes (HN): 7,386 instances (30.9%)
- GPU-accelerated inference workloads
- All instances allocated exactly 1 GPU
- Higher RDMA requirements (mean: 20.5%)

🔍 Key Insights

Workload Heterogeneity

Clear bimodal distribution between CPU and GPU workloads
CN instances optimized for CPU-intensive operations
HN instances balanced for GPU acceleration with supporting CPU resources

Resource Efficiency

Tight coupling between CPU and memory allocation (correlation: 0.97)
Independent scaling of GPU resources from CPU/memory
RDMA bandwidth scaled based on disaggregation communication needs

Production Patterns

All workloads classified as latency-sensitive
High-priority, long-running inference services
Immediate scheduling indicates sufficient resource availability

Disaggregation Benefits

Efficient resource utilization through separation of concerns
CN nodes handle CPU-intensive preprocessing/postprocessing
HN nodes focus on GPU-accelerated model inference
RDMA enables efficient data movement between disaggregated components

📈 Research Applications

This dataset enables research in:

Resource Allocation: Optimal scheduling strategies for disaggregated systems
Performance Modeling: Understanding latency-throughput tradeoffs
System Design: Architectural decisions for ML serving infrastructure
Workload Characterization: Production DLRM inference patterns
Capacity Planning: Resource provisioning for ML workloads
Fault Tolerance: Instance distribution and anti-affinity strategies

🎓 Academic Contribution

This dataset represents one of the first publicly available production traces for GPU-disaggregated DLRM serving, providing:

Real-world validation data for system research
Baseline for performance comparisons
Foundation for reproducible research in ML systems
Insights into production-scale ML infrastructure

This dataset provides a unique window into production GPU-disaggregated systems, offering researchers and practitioners valuable insights for advancing the field of large-scale ML serving infrastructure.

Clear search

Close search

Google apps

Main menu

Alibaba GPU Cluster Dataset 2025

GPU-Disaggregated DLRM Trace Dataset

Scope

Key fields

High-level observations

🎯 Key Characteristics

Scale and Scope

Instance Distribution

🔍 Key Insights

Workload Heterogeneity

Resource Efficiency

Production Patterns

Disaggregation Benefits

📈 Research Applications

🎓 Academic Contribution

run_model_on_gpu

Dataset

Contents

IceVision for CUDA11

Dataset

Contents

Accident Detection Model Dataset

Accident-Detection-Model

Problem Statement

Accidents survey

Literature Survey

Research Gap

Proposed methodology

Model Set-up

Preparing Custom dataset

Challenges I ran into

I majorly ran into 3 problems while making this model

TabPFN

llamacpp cuda sm75 complete build

Torch-TensorRT v2.2.0

Torch-TensorRT v2.2.0

Usage

nemotron-3-8b-base-4k

In this case, we run inference on the same machine

BUTTER-E: Energy Data for Deep Learning Models

Nvidia Database

Tables and Their Contents:

Products:

Customers:

Sales:

Suppliers:

Supply Chain:

Departments:

Employees:

Projects:

Why Use This Dataset?

Potential Use Cases:

Data Size:

efficientcube_1000cube

Image Matching Challenge 2023 Dockerfile

Steps (for Linux)

1. Install Docker and NVIDIA Container Toolkit

2. Set NVIDIA-Docker as the default builder

3. Build Your Docker Environment

4. Run

4.1. Interactive

4.1. With Volume Mounting & Jupyter Server

Alibaba GPU Cluster Spot Resource Dataset

Key Characteristics

🔬 Research Applications

📝 Dataset Limitations and Considerations

🎯 Recommended Analysis Extensions

GPU Performance and Hashrate Dataset - Crypto

Dataset Description

Features Overview

Use Cases

Example Data

File Details

Clickbait PDFs

Unsloth for offline

Unsloth Usage

Opensloth

Define the source folder and output path

Create a tar.gz archive

Nvidia GPU power limit benchmark