100+ datasets found

Predictive Maintenance of Machines
kaggle.com
Updated Feb 28, 2024
Cite
RohithNair (2024). Predictive Maintenance of Machines [Dataset]. https://www.kaggle.com/datasets/nair26/predictive-maintenance-of-machines
Dataset updated
Feb 28, 2024
Dataset provided by
Kaggle
Authors
RohithNair
License
http://www.gnu.org/licenses/lgpl-3.0.html
Description
This dataset provides information about vibration levels, torque, process temperature and fault.

The dataset in the image is a spreadsheet containing information about engine performance. The spreadsheet has the following variables:

UDI: This is likely a unique identifier for each engine.

Product ID: This could be a specific code or identifier for the engine model.

Type: This indicates the type of engine, possibly categorized by fuel type (e.g., M - motor, L - liquid).

Air temperature (K): This is the air temperature in Kelvin around the engine.

Process temperature [K]: This is the internal temperature of the engine during operation, measured in Kelvin.

Speed (rpm): This is the rotational speed of the engine in revolutions per minute.

Torque (Nm): This is the twisting force exerted by the engine, measured in Newton meters.

Vibration Levels: This could be a measure of the engine's vibration intensity.

Operational Hours: This is the total number of hours the engine has been operational.

Failure Type: This indicates the type of failure the engine experienced, if any.

Rotational: This might be a specific type of failure related to the engine's rotation.

This dataset could be used for various analytical purposes related to engine performance and maintenance. Here are some examples:

Identifying patterns of engine failure: By analyzing the data, you could identify correlations between specific variables (e.g., air temperature, operational hours) and engine failures. This could help predict potential failures and schedule preventative maintenance.

Optimizing engine performance: By analyzing the data, you could identify the operating conditions (e.g., temperature, speed) that lead to optimal engine performance. This could help improve fuel efficiency and engine lifespan.

Comparing engine types: The data could be used to compare the performance and efficiency of different engine types under various operating conditions.

Building predictive models: The data could be used to train machine learning models to predict engine failures, optimize maintenance schedules, and improve overall engine performance.

It's important to note that the specific value of this dataset would depend on the context and the intended use case. For example, if you are only interested in a specific type of engine or a particular type of failure, you might need to filter or subset the data accordingly.
Predictive Maintenance Dataset - Air Compressor
kaggle.com
zip
Updated Mar 6, 2023
Cite
ahmet okudan (2023). Predictive Maintenance Dataset - Air Compressor [Dataset]. https://www.kaggle.com/datasets/afumetto/predictive-maintenance-dataset-air-compressor
Available download formats: zip (9542500 bytes)
Dataset updated
Mar 6, 2023
Authors
ahmet okudan
Description

https://www.buymeacoffee.com/ahmet17

This dataset is free on my kaggle page. However, to support me, you can buy me a coffee :)

Keep in mind that datasets like these take months of study and long measurement campaigns to prepare.

Hope it will be useful for you.

Classified datasets are required for predictive maintenance. A machine system has many parts that are difficult to replace and maintain. When these parts fail, the trained neural network should be able to predict with high accuracy which part has failed. That is why as much data as possible is collected. Some variables may be fully correlated with each other. They are still fed to the neural network, because a change in one parameter in the time domain can unexpectedly change other parameters. The artificial intelligence system required for predictive maintenance must include an LSTM alongside a DNN.

This data set has been prepared with measurements made on the compressor system feeding the air line of a factory. The compressor is driven by an AC electric motor and is a two-piston, water-cooled, single-stage unit capable of producing compressed air at a maximum of 8 bar.

Measurements were made with high resolution sensors and an industrial type data collector. To prepare a clean dataset, measurement lines with cable-induced noise were deleted.
Dataset for Predictive Maintenance
kaggle.com
zip
Updated Jul 13, 2018
Cite
Nafisur Rahman (2018). Dataset for Predictive Maintenance [Dataset]. https://www.kaggle.com/datasets/nafisur/dataset-for-predictive-maintenance
Available download formats: zip (1383996 bytes)
Dataset updated
Jul 13, 2018
Authors
Nafisur Rahman
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by Nafisur Rahman

Released under CC0: Public Domain
Microsoft Azure Predictive Maintenance
kaggle.com
zip
Updated Oct 15, 2020
Cite
arnab (2020). Microsoft Azure Predictive Maintenance [Dataset]. https://www.kaggle.com/arnabbiswas1/microsoft-azure-predictive-maintenance
Available download formats: zip (32497141 bytes)
Dataset updated
Oct 15, 2020
Authors
arnab
Description
Context

This is an example data source which can be used for Predictive Maintenance Model Building. It consists of the following data:

Machine conditions and usage: The operating conditions of a machine e.g. data collected from sensors.

Failure history: The failure history of a machine or component within the machine.

Maintenance history: The repair history of a machine, e.g. error codes, previous maintenance activities or component replacements.

Machine features: The features of a machine, e.g. engine size, make and model, location.

Details

Telemetry Time Series Data (PdM_telemetry.csv): It consists of hourly averages of voltage, rotation, pressure and vibration collected from 100 machines for the year 2015.

Error (PdM_errors.csv): These are errors encountered by the machines while in operating condition. Since these errors don't shut down the machines, they are not considered failures. The error dates and times are rounded to the closest hour since the telemetry data is collected at an hourly rate.

Maintenance (PdM_maint.csv): If a component of a machine is replaced, that is captured as a record in this table. Components are replaced under two situations: 1. During the regular scheduled visit, the technician replaced it (Proactive Maintenance) 2. A component breaks down and then the technician does an unscheduled maintenance to replace the component (Reactive Maintenance). This is considered as a failure and corresponding data is captured under Failures. Maintenance data has both 2014 and 2015 records. This data is rounded to the closest hour since the telemetry data is collected at an hourly rate.

Failures (PdM_failures.csv): Each record represents replacement of a component due to failure. This data is a subset of Maintenance data. This data is rounded to the closest hour since the telemetry data is collected at an hourly rate.

Metadata of Machines (PdM_Machines.csv): Model type & age of the Machines.
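
Because every table is rounded to the closest hour, failure records can be lined up with telemetry rows by an exact join. The sketch below illustrates that idea on tiny in-memory stand-ins for PdM_telemetry.csv and PdM_failures.csv. The column names (datetime, machineID, volt, vibration, failure) and all values are assumptions made for illustration, not taken from the listing.

```python
import pandas as pd

# Hypothetical stand-in for PdM_telemetry.csv (column names are assumed).
telemetry = pd.DataFrame({
    "datetime": pd.to_datetime(
        ["2015-01-01 06:00", "2015-01-01 07:00", "2015-01-01 08:00"]
    ),
    "machineID": [1, 1, 1],
    "volt": [176.2, 162.9, 170.9],
    "vibration": [45.1, 43.4, 34.2],
})

# Hypothetical stand-in for PdM_failures.csv (column names are assumed).
failures = pd.DataFrame({
    "datetime": pd.to_datetime(["2015-01-01 07:00"]),
    "machineID": [1],
    "failure": ["comp4"],
})

# Both tables use hourly timestamps, so an exact left join on
# (machineID, datetime) attaches each failure to its telemetry row.
labeled = telemetry.merge(failures, on=["machineID", "datetime"], how="left")
labeled["failed"] = labeled["failure"].notna()

print(labeled[["datetime", "machineID", "failed"]])
```

A left join keeps every telemetry row, so hours without a failure simply get failed = False.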

Acknowledgements

This dataset was available as a part of Azure AI Notebooks for Predictive Maintenance. But as of 15th Oct, 2020 the notebook (link) is no longer available. However, the data can still be downloaded using the following URLs:

https://azuremlsampleexperiments.blob.core.windows.net/datasets/PdM_telemetry.csv

https://azuremlsampleexperiments.blob.core.windows.net/datasets/PdM_errors.csv

https://azuremlsampleexperiments.blob.core.windows.net/datasets/PdM_maint.csv

https://azuremlsampleexperiments.blob.core.windows.net/datasets/PdM_failures.csv

https://azuremlsampleexperiments.blob.core.windows.net/datasets/PdM_machines.csv

Inspiration

Try to use this data to build Machine Learning models related to Predictive Maintenance.
Machine Predictive Maintenance Classification
kaggle.com
zip
Updated Nov 6, 2021
Cite
Shivam Bansal (2021). Machine Predictive Maintenance Classification [Dataset]. https://www.kaggle.com/datasets/shivamb/machine-predictive-maintenance-classification/code
Available download formats: zip (139819 bytes)
Dataset updated
Nov 6, 2021
Authors
Shivam Bansal
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Machine Predictive Maintenance Classification Dataset

Since real predictive maintenance datasets are generally difficult to obtain and in particular difficult to publish, we present and provide a synthetic dataset that reflects real predictive maintenance encountered in the industry to the best of our knowledge.

The dataset consists of 10 000 data points stored as rows with 14 features in columns:

- UID: unique identifier ranging from 1 to 10000
- productID: consisting of a letter L, M, or H for low (50% of all products), medium (30%), and high (20%) as product quality variants and a variant-specific serial number
- air temperature [K]: generated using a random walk process later normalized to a standard deviation of 2 K around 300 K
- process temperature [K]: generated using a random walk process normalized to a standard deviation of 1 K, added to the air temperature plus 10 K.
- rotational speed [rpm]: calculated from a power of 2860 W, overlaid with normally distributed noise
- torque [Nm]: torque values are normally distributed around 40 Nm with σ = 10 Nm and no negative values.
- tool wear [min]: The quality variants H/M/L add 5/3/2 minutes of tool wear to the used tool in the process.

A 'machine failure' label indicates whether the machine has failed at this particular data point, i.e. whether any of its failure modes is true.

Important: There are two targets. Do not make the mistake of using one of them as a feature, as it will lead to leakage.

Target : Failure or Not

Failure Type : Type of Failure
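
One way to avoid the leakage described above is to drop both target columns before building the feature matrix. The following is a minimal sketch on a made-up three-row frame. The target column names "Target" and "Failure Type" follow the description; the feature columns and all values are assumptions for illustration.

```python
import pandas as pd

# Made-up rows; only the two target column names come from the description.
df = pd.DataFrame({
    "Torque [Nm]": [42.8, 46.3, 49.4],
    "Tool wear [min]": [0, 3, 5],
    "Target": [0, 0, 1],
    "Failure Type": ["No Failure", "No Failure", "Power Failure"],
})

TARGETS = ["Target", "Failure Type"]

# Remove BOTH targets from the features, whichever one is being predicted.
X = df.drop(columns=TARGETS)
y_binary = df["Target"]        # failure or not
y_type = df["Failure Type"]    # type of failure

print(list(X.columns))
```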

Acknowledgements

UCI : https://archive.ics.uci.edu/ml/datasets/AI4I+2020+Predictive+Maintenance+Dataset
Predictive Maintenance: Aircraft Engine
kaggle.com
zip
Updated May 28, 2024
Cite
Mohammed HADANI (2024). Predictive Maintenance: Aircraft Engine [Dataset]. https://www.kaggle.com/datasets/mhadani/predictive-maintenance-aircraft-engine
Available download formats: zip (1349845 bytes)
Dataset updated
May 28, 2024
Authors
Mohammed HADANI
Description
This dataset is designed for predictive maintenance of aircraft engines and consists of three primary files:

Training Data ("PM_train.csv"): This file contains multiple multivariate time series data, with each time series representing the operational cycles of different aircraft engines of the same type. Each cycle includes 21 sensor readings. Engines start with varying initial wear and manufacturing differences, which are unknown to the user. Initially, engines operate normally and begin to degrade over time. The degradation increases until a predefined threshold is reached, marking the engine as unsafe for further use. The final cycle in each time series indicates the failure point of the engine.

Testing Data ("PM_test.csv"): This file shares the same schema as the training data but does not specify the failure points. For example, an engine might run from cycle 1 to cycle 31 without indicating how many more cycles it can last before failure.

Ground Truth Data ("PM_truth.csv"): This file provides the actual remaining working cycles for the engines in the testing data. For instance, it shows that an engine running from cycle 1 to cycle 31 in the testing data has 112 remaining cycles before failure.
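
The relationship between the three files can be sketched as a remaining-useful-life (RUL) label. In training data the last cycle is the failure point, so RUL is the distance to the last cycle. In testing data the ground truth gives the RUL at the last observed cycle. The column names id, cycle and RUL_at_end, and the toy values, are assumptions for illustration; the 31-cycle and 112-cycle figures mirror the example in the description.

```python
import pandas as pd

# Toy training data: two engines, each observed until failure.
train = pd.DataFrame({"id": [1, 1, 1, 2, 2], "cycle": [1, 2, 3, 1, 2]})

# The final cycle of each engine is its failure point, so RUL counts down to 0.
train["RUL"] = train.groupby("id")["cycle"].transform("max") - train["cycle"]

# Toy testing data: engine 1 is observed up to cycle 31, failure not shown.
test = pd.DataFrame({"id": [1, 1], "cycle": [30, 31]})

# Toy ground truth: 112 cycles remain after the last observed cycle.
truth = pd.DataFrame({"id": [1], "RUL_at_end": [112]})

test = test.merge(truth, on="id")
test["last_cycle"] = test.groupby("id")["cycle"].transform("max")
test["RUL"] = test["RUL_at_end"] + (test["last_cycle"] - test["cycle"])

print(train)
print(test[["id", "cycle", "RUL"]])
```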

This dataset enables the development and evaluation of predictive maintenance models, allowing for the prediction of engine degradation and failure, thereby enhancing maintenance schedules and ensuring operational safety.
Predictive Maintenance Dataset
kaggle.com
zip
Updated Nov 7, 2022
Cite
Himanshu Agarwal (2022). Predictive Maintenance Dataset [Dataset]. https://www.kaggle.com/datasets/hiimanshuagarwal/predictive-maintenance-dataset/code
Available download formats: zip (1798425 bytes)
Dataset updated
Nov 7, 2022
Authors
Himanshu Agarwal
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
A company has a fleet of devices transmitting daily sensor readings. They would like to create a predictive maintenance solution to proactively identify when maintenance should be performed. This approach promises cost savings over routine or time-based preventive maintenance, because tasks are performed only when warranted.

The task is to build a predictive model using machine learning to predict the probability of a device failure. When building this model, be sure to minimize false positives and false negatives. The column you are trying to predict is called failure, with binary value 0 for non-failure and 1 for failure.
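
False positives and false negatives can be counted directly from predicted and true labels. The sketch below uses made-up labels (1 for failure, 0 for non-failure) and a hypothetical helper named confusion_counts; it is an illustration of the bookkeeping, not a method prescribed by the dataset.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 means failure."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


# Made-up labels for six devices.
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)  # share of predicted failures that were real
recall = tp / (tp + fn)     # share of real failures that were caught

print(tp, fp, fn, tn)
```

Precision falls as false positives grow and recall falls as false negatives grow, so tracking both covers the two error types the task mentions.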
EVIoT-PredictiveMaint Dataset
kaggle.com
zip
Updated Mar 9, 2025
Cite
DatasetEngineer (2025). EVIoT-PredictiveMaint Dataset [Dataset]. https://www.kaggle.com/datasets/datasetengineer/eviot-predictivemaint-dataset
Available download formats: zip (44220545 bytes)
Dataset updated
Mar 9, 2025
Authors
DatasetEngineer
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
The EVIoT-PredictiveMaint Dataset is a comprehensive real-world dataset collected from IoT-enabled electric vehicles (EVs) operating in diverse environments. The dataset captures multi-modal telemetry, environmental conditions, and historical maintenance records at 15-minute intervals over a 5-year period (January 2020 to January 2025). It is specifically designed for multi-horizon predictive maintenance in EV fleet management, supporting federated learning applications for failure prediction, maintenance scheduling, and component health assessment.

With 175,393 records, this dataset is ideal for research in predictive maintenance, failure analysis, and energy optimization in electric vehicle fleets. It includes sensor data, telematics, environmental conditions, and maintenance history to facilitate advanced machine learning models for predicting vehicle reliability and optimizing maintenance strategies.
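
The record count can be checked against the stated sampling interval. The sketch below assumes a single gap-free series that starts at midnight on 1 January 2020 and ends at midnight on 1 January 2025, both endpoints included. The exact timestamps are an assumption, since the description only gives the months.

```python
from datetime import datetime, timedelta

# Assumed endpoints; the description only says January 2020 to January 2025.
start = datetime(2020, 1, 1)
end = datetime(2025, 1, 1)
step = timedelta(minutes=15)

# Number of 15-minute steps between the endpoints, plus the first record.
n_records = (end - start) // step + 1

print(n_records)  # 175393
```

Under these assumptions the count equals the stated 175,393 records, which is consistent with one continuous 15-minute series rather than separate series per vehicle.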

Features in EVIoT-PredictiveMaint Dataset

The dataset consists of 30+ features categorized into eight major groups:

Battery System Monitoring
SoC (State of Charge) – Battery charge percentage
SoH (State of Health) – Battery degradation level
Battery Voltage – Voltage levels across the battery pack
Battery Current – Current drawn or supplied by the battery
Battery Temperature – Temperature of battery cells
Charge Cycles – Total charge-discharge cycles of the battery

Electric Motor and Drivetrain Monitoring
Motor Temperature – Temperature of the electric motor
Motor Vibration – Vibration levels indicating wear or imbalance
Motor Torque – Torque generated by the motor
Motor RPM – Revolutions per minute of the motor
Power Consumption – Power usage by the drivetrain system

Brake System Monitoring
Brake Pad Wear – Thickness level of brake pads
Brake Pressure – Hydraulic pressure applied to the braking system
Regenerative Braking Efficiency – Efficiency of energy recovery during braking

Tire and Suspension Data
Tire Pressure – Air pressure within the tires
Tire Temperature – Surface temperature of the tires
Suspension Load – Load stress on the suspension system

Environmental and Usage Data
Ambient Temperature – External temperature conditions
Ambient Humidity – Humidity levels in the surrounding environment
Load Weight – Cargo or passenger weight carried by the vehicle
Driving Speed – Current vehicle speed

Telematics and Fleet Data
Distance Traveled – Cumulative distance covered
Idle Time – Duration of vehicle idling
Route Roughness – Road surface condition affecting vehicle wear

Maintenance Records
Maintenance Type – Categories: None (0), Preventive (1), Corrective (2), Predictive (3)

Target Labels for Predictive Maintenance
Remaining Useful Life (RUL) – Estimated time before maintenance is required
Failure Probability – Likelihood of system failure (0: No Failure, 1: Failure)
Time to Failure (TTF) – Estimated time before the next failure event
Component Health Score – A continuous score (0-1) indicating component condition
Predictive Maintenance Dataset
kaggle.com
Updated Jul 20, 2024
Cite
Abdelaziz Sami (2024). Predictive Maintenance Dataset [Dataset]. https://www.kaggle.com/datasets/abdelazizsami/predictive-maintenance-dataset
Dataset updated
Jul 20, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Abdelaziz Sami
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Predictive Maintenance Dataset

Characteristics:
- Type: Multivariate, Time-Series
- Subject Area: Computer Science
- Associated Tasks: Classification, Regression, Causal-Discovery
- Feature Type: Real
- Number of Instances: 10,000
- Number of Features: 6
- Missing Values: No

Description: The AI4I 2020 Predictive Maintenance Dataset is a synthetic dataset designed to mirror real-world predictive maintenance data typically encountered in industrial settings. It provides a valuable resource for developing and testing predictive maintenance models where real datasets are often scarce and challenging to share.

Dataset Information:
- Purpose: To offer a synthetic dataset reflecting real-world predictive maintenance scenarios.
- Funding: Not specified.
- Instances Representation: Each instance represents a data point in a predictive maintenance context.

Variables Table:
- UID (ID, Integer): Unique identifier ranging from 1 to 10,000
- Product ID (ID, Categorical): Product identifier consisting of a letter (L, M, or H) indicating product quality variants (low, medium, high) and a serial number
- Type (Feature, Categorical): Product type
- Air temperature (Feature, Continuous): Measured in Kelvin (K), generated using a random walk process and normalized to a standard deviation of 2 K around 300 K
- Process temperature (Feature, Continuous): Measured in Kelvin (K), generated using a random walk process normalized to a standard deviation of 1 K, added to the air temperature plus 10 K
- Rotational speed (Feature, Integer): Measured in revolutions per minute (rpm), calculated from a power of 2860 W with normally distributed noise
- Torque (Feature, Continuous): Measured in Newton meters (Nm), normally distributed around 40 Nm with a standard deviation of 10 Nm, and no negative values
- Tool wear (Feature, Integer): Measured in minutes (min), varies by product quality (H, M, L) adding 5, 3, or 2 minutes respectively
- Machine failure (Target, Integer): Indicates whether the machine failed at this data point
- TWF (Target, Integer): Tool wear failure

Additional Variable Information: The dataset consists of 10,000 data points stored as rows with 14 features in columns. Each row includes:

UID: Unique identifier

Product ID: Indicates product quality (L, M, H) and a serial number

Air temperature [K]: Normalized random walk process around 300 K

Process temperature [K]: Normalized random walk process added to air temperature plus 10 K

Rotational speed [rpm]: Calculated from power and overlaid with noise

Torque [Nm]: Normally distributed values around 40 Nm

Tool wear [min]: Additional wear based on product quality

Machine failure: Indicates overall failure status

Failure Modes: Includes tool wear failure (TWF), heat dissipation failure (HDF), power failure (PWF), overstrain failure (OSF), and random failures (RNF)

Failure Mode Details:
- Tool wear failure (TWF): Tool failure or replacement between 200-240 mins, randomly assigned
- Heat dissipation failure (HDF): Failure if temperature difference is below 8.6 K and rotational speed is below 1380 rpm
- Power failure (PWF): Failure if power (torque * rotational speed in rad/s) is below 3500 W or above 9000 W
- Overstrain failure (OSF): Failure if product of tool wear and torque exceeds thresholds (11,000 minNm for L, 12,000 for M, 13,000 for H)
- Random failures (RNF): Each process has a 0.1% chance of failure regardless of parameters
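
The three deterministic rules above (HDF, PWF, OSF) can be written out as a small function. This is a sketch of the stated thresholds only: the function name failure_modes, its parameter names and the sample inputs are hypothetical, and the randomly assigned modes (TWF, RNF) are left out.

```python
import math

# Overstrain thresholds in min*Nm per product quality, as listed above.
OSF_LIMIT = {"L": 11_000, "M": 12_000, "H": 13_000}


def failure_modes(air_k, process_k, rpm, torque_nm, tool_wear_min, quality):
    """Evaluate the deterministic failure rules for one data point."""
    # Power in watts: torque times rotational speed converted to rad/s.
    power_w = torque_nm * rpm * 2 * math.pi / 60
    return {
        "HDF": (process_k - air_k) < 8.6 and rpm < 1380,
        "PWF": power_w < 3500 or power_w > 9000,
        "OSF": tool_wear_min * torque_nm > OSF_LIMIT[quality],
    }


# Hypothetical data points: one healthy, one hot, slow and overstrained.
healthy = failure_modes(298.1, 308.6, 1551, 42.8, 0, "L")
stressed = failure_modes(302.0, 310.0, 1300, 60.0, 200, "L")

print(healthy)
print(stressed)
```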

Introductory Paper: "Explainable Artificial Intelligence for Predictive Maintenance Applications" by S. Matzka, 2020, published in the International Conference on Artificial Intelligence for Industries.
Predictive Maintenance System data set
kaggle.com
zip
Updated Oct 6, 2023
Cite
Complex Infinite Solutions (2023). Predictive Maintenance System data set [Dataset]. https://www.kaggle.com/datasets/favadhassanjaskani/predictive-maintenance-system-data-set
Available download formats: zip (5389 bytes)
Dataset updated
Oct 6, 2023
Authors
Complex Infinite Solutions
Description
Dataset

This dataset was created by Complex Infinite Solutions

Released under Other (specified in description)
Predictive maintenance dataset
kaggle.com
zip
Updated May 24, 2023
Cite
AngeValli (2023). Predictive maintenance dataset [Dataset]. https://www.kaggle.com/datasets/angevalli/predictive-maintenance-dataset
Available download formats: zip (54380924 bytes)
Dataset updated
May 24, 2023
Authors
AngeValli
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Datasets for performing predictive maintenance and predicting future breakdowns and their root causes. The analysis of these datasets relies on life distributions and on the choice among several maintenance strategies. The differing nature of the datasets allows a wide range of analyses, which are presented in the notebook associated with this dataset.
Data from: Machine Failure Prediction using Sensor data
kaggle.com
zip
Updated Jun 25, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Umer Naeem (2024). Machine Failure Prediction using Sensor data [Dataset]. https://www.kaggle.com/datasets/umerrtx/machine-failure-prediction-using-sensor-data
Available download formats: zip (6953 bytes)
Dataset updated
Jun 25, 2024
Authors
Umer Naeem
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset Overview

This dataset contains sensor data collected from various machines, with the aim of predicting machine failures in advance. It includes a variety of sensor readings as well as the recorded machine failures.

Columns Description

footfall: The number of people or objects passing by the machine.

tempMode: The temperature mode or setting of the machine.

AQ: Air quality index near the machine.

USS: Ultrasonic sensor data, indicating proximity measurements.

CS: Current sensor readings, indicating the electrical current usage of the machine.

VOC: Volatile organic compounds level detected near the machine.

RP: Rotational position or RPM (revolutions per minute) of the machine parts.

IP: Input pressure to the machine.

Temperature: The operating temperature of the machine.

fail: Binary indicator of machine failure (1 for failure, 0 for no failure).
Predictive Maintenance Dataset (AI4I 2020)
kaggle.com
zip
Updated Nov 6, 2022
Cite
Stephan Matzka (2022). Predictive Maintenance Dataset (AI4I 2020) [Dataset]. https://www.kaggle.com/datasets/stephanmatzka/predictive-maintenance-dataset-ai4i-2020/data
Available download formats: zip (138762 bytes)
Dataset updated
Nov 6, 2022
Authors
Stephan Matzka
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
Please note that this is the original dataset with additional information and proper attribution. There is at least one other version of this dataset on Kaggle that was uploaded without permission. Please be fair and attribute the original author.

This synthetic dataset is modeled after an existing milling machine and consists of 10 000 data points stored as rows with 14 features in columns:

UID: unique identifier ranging from 1 to 10000

product ID: consisting of a letter L, M, or H for low (50% of all products), medium (30%) and high (20%) as product quality variants and a variant-specific serial number

type: just the product type L, M or H from column 2

air temperature [K]: generated using a random walk process later normalized to a standard deviation of 2 K around 300 K

process temperature [K]: generated using a random walk process normalized to a standard deviation of 1 K, added to the air temperature plus 10 K.

rotational speed [rpm]: calculated from a power of 2860 W, overlaid with a normally distributed noise

torque [Nm]: torque values are normally distributed around 40 Nm with a SD = 10 Nm and no negative values.

tool wear [min]: The quality variants H/M/L add 5/3/2 minutes of tool wear to the used tool in the process.

a 'machine failure' label that indicates whether the machine has failed in this particular datapoint because any of the following failure modes is true.

The machine failure consists of five independent failure modes:

tool wear failure (TWF): the tool will be replaced or fail at a randomly selected tool wear time between 200 and 240 mins (120 times in our dataset). At this point in time, the tool is replaced 69 times, and fails 51 times (randomly assigned).

heat dissipation failure (HDF): heat dissipation causes a process failure if the difference between air and process temperature is below 8.6 K and the tool's rotational speed is below 1380 rpm. This is the case for 115 data points.

power failure (PWF): the product of torque and rotational speed (in rad/s) equals the power required for the process. If this power is below 3500 W or above 9000 W, the process fails, which is the case 95 times in our dataset.

overstrain failure (OSF): if the product of tool wear and torque exceeds 11,000 minNm for the L product variant (12,000 M, 13,000 H), the process fails due to overstrain. This is true for 98 datapoints.

random failures (RNF): each process has a chance of 0.1 % to fail regardless of its process parameters. This is the case for only 5 datapoints, less than could be expected for 10,000 datapoints in our dataset.

If at least one of the above failure modes is true, the process fails and the 'machine failure' label is set to 1. It is therefore not transparent to the machine learning method which of the failure modes has caused the process to fail.
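
The remark that 5 random failures is fewer than expected can be made concrete with a short calculation. The sketch below assumes each of the 10,000 data points fails independently with probability 0.1 %, so the number of random failures is treated as binomial; that independence is an assumption of this sketch, not a statement from the dataset author.

```python
from math import comb

n = 10_000   # data points in the dataset
p = 0.001    # 0.1 % chance of a random failure per data point

# Expected number of random failures under the binomial assumption.
expected = n * p

# Probability of observing 5 or fewer random failures.
prob_at_most_5 = sum(
    comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(6)
)

print(expected)
print(prob_at_most_5)
```

The expected count is 10, so observing only 5 sits in the lower tail of the distribution without being impossible.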

This dataset is part of the following publication, please cite when using this dataset: S. Matzka, "Explainable Artificial Intelligence for Predictive Maintenance Applications," 2020 Third International Conference on Artificial Intelligence for Industries (AI4I), 2020, pp. 69-74, doi: 10.1109/AI4I49448.2020.00023.

The image of the milling process is the work of Daniel Smyth @ Pexels: https://www.pexels.com/de-de/foto/industrie-herstellung-maschine-werkzeug-10406128/
Petrochemical Predictive Maintenance Dataset
kaggle.com
Updated Aug 12, 2025
Cite
Python Developer (2025). Petrochemical Predictive Maintenance Dataset [Dataset]. https://www.kaggle.com/datasets/programmer3/petrochemical-predictive-maintenance-dataset
Dataset updated
Aug 12, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Python Developer
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset contains 5000 rows of multi-sensor readings collected from petrochemical rotating machinery operating under varying load and environmental conditions.

The dataset includes:

Sensor signals: 3-axis vibration, temperature, electrical current, rotational speed (RPM), and internal pressure.

Time-frequency features: Five features extracted using wavelet packet decomposition.

Labels: Multi-class fault type (no_fault, bearing_fault, rotor_imbalance, misalignment) and binary maintenance requirement flag.

Timestamps: High-resolution time intervals for time-series analysis.
Vehicle Maintenance Data
kaggle.com
zip
Updated Mar 30, 2024
Cite
Chavindu Dulaj (2024). Vehicle Maintenance Data [Dataset]. https://www.kaggle.com/datasets/chavindudulaj/vehicle-maintenance-data
Available download formats: zip (1854785 bytes)
Dataset updated
Mar 30, 2024
Authors
Chavindu Dulaj
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Vehicle Maintenance Dataset

Overview

This dataset provides synthetic data related to vehicle maintenance to help predict whether a vehicle requires maintenance or not based on various features.

Features

Vehicle_Model: Type of the vehicle (Car, SUV, Van, Truck, Bus, Motorcycle)

Mileage: Total mileage of the vehicle

Maintenance_History: Maintenance history of the vehicle (Good, Average, Poor)

Reported_Issues: Number of reported issues

Vehicle_Age: Age of the vehicle in years

Fuel_Type: Type of fuel used (Diesel, Petrol, Electric)

Transmission_Type: Transmission type (Automatic, Manual)

Engine_Size: Size of the engine in cc (Cubic Centimeters)

Odometer_Reading: Current odometer reading of the vehicle

Last_Service_Date: Date of the last service

Warranty_Expiry_Date: Date when the warranty expires

Owner_Type: Type of vehicle owner (First, Second, Third)

Insurance_Premium: Insurance premium amount

Service_History: Number of services done

Accident_History: Number of accidents the vehicle has been involved in

Fuel_Efficiency: Fuel efficiency of the vehicle in km/l (Kilometers per liter)

Tire_Condition: Condition of the tires (New, Good, Worn Out)

Brake_Condition: Condition of the brakes (New, Good, Worn Out)

Battery_Status: Status of the battery (New, Good, Weak)

Need_Maintenance: Target variable indicating whether the vehicle needs maintenance (1 = Yes, 0 = No)

Target Variable

Need_Maintenance: Indicates whether the vehicle requires maintenance or not based on specified conditions.

Data Range

Total number of records: 50,000

Source

This dataset is synthetic and was generated using Python. It is intended for educational and research purposes.

Acknowledgements

The dataset was generated using Python and the data is synthetic.
Smart Manufacturing Maintenance Dataset
kaggle.com
zip
Updated Jun 10, 2025
Cite
Ziya (2025). Smart Manufacturing Maintenance Dataset [Dataset]. https://www.kaggle.com/datasets/ziya07/smart-manufacturing-maintenance-dataset
Explore at:
zip (65206 bytes)
Dataset updated
Jun 10, 2025
Authors
Ziya
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset supports research on predictive maintenance and decision support in smart manufacturing systems. The dataset combines real-time sensor measurements, maintenance cost factors, and decision variables to prioritize equipment servicing based on failure risk and operational constraints.

The framework uses cloud computing to improve system scalability and responsiveness by synchronizing virtual asset models with real-time sensor and inspection data.

Researchers and practitioners can use this dataset to explore proactive maintenance scheduling, asset health diagnostics, and intelligent factory management.

⭐ Key Features

Real-Time Sensor Data: Includes temperature, vibration, pressure, and acoustic signals from simulated manufacturing equipment.

Maintenance Decision Criteria: Factors such as inspection duration, technician availability, and downtime cost, included to support multi-criteria decision-making (MCDM).

Failure Probability Score: Computed feature for training predictive models (range: 0–1).

Maintenance Priority Label: Target variable (High = 1, Medium = 2, Low = 3) based on failure likelihood and operational risk.
ai4i+2020+predictive+maintenance+dataset
kaggle.com
zip
Updated Jul 24, 2025
Cite
arpan001 (2025). ai4i+2020+predictive+maintenance+dataset [Dataset]. https://www.kaggle.com/datasets/arpan00io/ai4i-2020-predictive-maintenance-dataset
Explore at:
zip (138762 bytes)
Dataset updated
Jul 24, 2025
Authors
arpan001
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset

This dataset was created by arpan001

Released under MIT

Contents
Predictive maintenance
kaggle.com
zip
Updated May 15, 2023
Cite
Rasha A. Abu Rkab (2023). Predictive maintenance [Dataset]. https://www.kaggle.com/datasets/rashaali2003/predictive-maintenance
Explore at:
zip (139819 bytes)
Dataset updated
May 15, 2023
Authors
Rasha A. Abu Rkab
Description
Predictive maintenance (PdM) is a technique that uses data analysis tools and techniques to detect anomalies in your operation and possible defects in equipment and processes so you can fix them before they result in failure.

The dataset consists of 10,000 data points stored as rows, with 14 features in columns:

UID: unique identifier ranging from 1 to 10000

productID: a letter L, M, or H for the low (50% of all products), medium (30%), and high (20%) product quality variants, followed by a variant-specific serial number

air temperature [K]: generated using a random walk process, later normalized to a standard deviation of 2 K around 300 K

process temperature [K]: generated using a random walk process normalized to a standard deviation of 1 K, added to the air temperature plus 10 K

rotational speed [rpm]: calculated from a power of 2860 W, overlaid with normally distributed noise

torque [Nm]: normally distributed around 40 Nm with σ = 10 Nm and no negative values

tool wear [min]: the quality variants H/M/L add 5/3/2 minutes of tool wear to the tool used in the process

machine failure: label indicating whether the machine failed at this data point in any of its failure modes
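The generation recipe described above can be sketched in a few lines. This is only one literal reading of the description, not the code used to build the dataset: the function names, the seed, the rpm noise level (`rpm_noise_sd`), and the redraw rule for non-positive torque are all assumptions, and the published values may follow a different power-to-speed relation than the plain P = torque × angular velocity used here.

```python
import math
import random
import statistics


def random_walk(n, rng):
    """Cumulative sum of standard normal steps."""
    position, path = 0.0, []
    for _ in range(n):
        position += rng.gauss(0.0, 1.0)
        path.append(position)
    return path


def rescale(values, mean, std):
    """Shift and scale values to the given mean and population std. deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [mean + std * (v - mu) / sigma for v in values]


def simulate(n=1000, seed=0, rpm_noise_sd=10.0):
    rng = random.Random(seed)
    # Air temperature: random walk normalized to 2 K around 300 K.
    air = rescale(random_walk(n, rng), 300.0, 2.0)
    # Process temperature: second walk (std 1 K) added to air temperature + 10 K.
    offset = rescale(random_walk(n, rng), 0.0, 1.0)
    process = [a + 10.0 + o for a, o in zip(air, offset)]
    # Torque: normal around 40 Nm, sigma 10 Nm; non-positive draws are
    # redrawn (assumption; the source only says "no negative values").
    torque = []
    while len(torque) < n:
        t = rng.gauss(40.0, 10.0)
        if t > 0:
            torque.append(t)
    # Speed from P = torque * omega with P = 2860 W, converted from rad/s
    # to rpm, plus normal noise with an assumed standard deviation.
    rpm = [
        2860.0 / t * 60.0 / (2.0 * math.pi) + rng.gauss(0.0, rpm_noise_sd)
        for t in torque
    ]
    return air, process, torque, rpm


air, process, torque, rpm = simulate(n=500, seed=1)
```

Rescaling after the walk is what keeps the temperature spread fixed at the stated standard deviation regardless of how far the raw walk drifts.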

Important:

There are two targets. Do not make the mistake of using one of them as a feature, as it will lead to leakage.

Target: Failure or Not

Failure Type: Type of Failure
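One way to guard against this leakage is to remove both target columns from the feature set no matter which one is being predicted. The sketch below assumes the columns are named "Target" and "Failure Type" as described above; the helper name and the example row (including its values) are made up for illustration.

```python
# Both target columns, as named in the description above.
TARGET_COLUMNS = ("Target", "Failure Type")


def split_features_and_label(row, label):
    """Return (features, y) for one record, dropping BOTH targets from features."""
    if label not in TARGET_COLUMNS:
        raise ValueError(f"label must be one of {TARGET_COLUMNS}")
    features = {k: v for k, v in row.items() if k not in TARGET_COLUMNS}
    return features, row[label]


# Hypothetical record; the values are illustrative, not taken from the dataset.
row = {
    "Air temperature [K]": 298.1,
    "Torque [Nm]": 42.8,
    "Target": 0,
    "Failure Type": "No Failure",
}
features, y = split_features_and_label(row, "Target")
```

Because "Failure Type" directly implies the value of "Target" (and vice versa), leaving either one among the features would let a model read the answer rather than learn from the sensor values.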

Acknowledgments: UCI
Preventive Maintenance for Marine Engines
kaggle.com
zip
Updated Feb 12, 2025
Cite
Fijabi J. Adekunle (2025). Preventive Maintenance for Marine Engines [Dataset]. https://www.kaggle.com/datasets/jeleeladekunlefijabi/preventive-maintenance-for-marine-engines
Explore at:
zip (436025 bytes)
Dataset updated
Feb 12, 2025
Authors
Fijabi J. Adekunle
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Preventive Maintenance for Marine Engines: Data-Driven Insights

Introduction:

Marine engine failures can lead to costly downtime, safety risks and operational inefficiencies. This project leverages machine learning to predict maintenance needs, helping ship operators prevent unexpected breakdowns. Using a simulated dataset, we analyze key engine parameters and develop predictive models to classify maintenance status into three categories: Normal, Requires Maintenance, and Critical.

Overview

This project explores preventive maintenance strategies for marine engines by analyzing operational data and applying machine learning techniques.

Key steps include:

1. Data Simulation: Creating a realistic dataset with engine performance metrics.
2. Exploratory Data Analysis (EDA): Understanding trends and patterns in engine behavior.
3. Model Training & Evaluation: Comparing machine learning models (Decision Tree, Random Forest, XGBoost) to predict maintenance needs.
4. Hyperparameter Tuning: Using GridSearchCV to optimize model performance.

Tools Used

1. Python: Data processing, analysis and modeling
2. Pandas & NumPy: Data manipulation
3. Scikit-Learn & XGBoost: Machine learning model training
4. Matplotlib & Seaborn: Data visualization

Skills Demonstrated

✔ Data Simulation & Preprocessing
✔ Exploratory Data Analysis (EDA)
✔ Feature Engineering & Encoding
✔ Supervised Machine Learning (Classification)
✔ Model Evaluation & Hyperparameter Tuning

Key Insights & Findings

📌 Engine Temperature & Vibration Level: Strong indicators of potential failures.
📌 Random Forest vs. XGBoost: After hyperparameter tuning, both models achieved comparable performance, with Random Forest performing slightly better.
📌 Maintenance Status Distribution: Balanced dataset ensures unbiased model training.
📌 Failure Modes: The most common issues were Mechanical Wear & Oil Leakage, aligning with real-world engine failure trends.

Challenges Faced

🚧 Simulating Realistic Data: Ensuring the dataset reflects real-world marine engine behavior was a key challenge.
🚧 Model Performance: The accuracy was limited (~35%) due to the complexity of failure prediction.
🚧 Feature Selection: Identifying the most impactful features required extensive analysis.
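To put the reported accuracy in context: with three maintenance classes that are roughly balanced, always predicting a single class already scores about 33%, so a result near 35% is only slightly above that baseline. A minimal sketch of the baseline calculation follows; the label counts are invented for illustration and the helper name is an assumption.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy of a classifier that always predicts the most frequent label."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Hypothetical, perfectly balanced label set using the three classes named
# in the introduction; the real class counts are not given in the source.
labels = (
    ["Normal"] * 100
    + ["Requires Maintenance"] * 100
    + ["Critical"] * 100
)
baseline = majority_baseline_accuracy(labels)
```

Comparing any model's accuracy against this number is a quick check of whether the features carry usable signal.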

Call to Action

🔍 Explore the Dataset & Notebook: Try running different models and tweaking hyperparameters.
📊 Extend the Analysis: Incorporate additional sensor data or alternative machine learning techniques.
🚀 Real-World Application: This approach can be adapted for industrial machinery, aircraft engines, and power plants.
predictive maintenance
kaggle.com
zip
Updated Apr 5, 2024
Cite
IRFAN ULLAH KHAN (2024). predictive maintenance [Dataset]. https://www.kaggle.com/datasets/programmarself/predictive-maintenance
Explore at:
zip (138682 bytes)
Dataset updated
Apr 5, 2024
Authors
IRFAN ULLAH KHAN
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset

This dataset was created by IRFAN ULLAH KHAN

Released under Apache 2.0

Contents

Predictive Maintenance of Machines

Action-Oriented: Optimize, Predict, Prevent: Unleashing the Power of Data

273 scholarly articles cite this dataset (View in Google Scholar)

