The dataset contains S.M.A.R.T. attributes of ST4000DM000 hard drives from BackBlaze DC, covering 2015 to 2018. It has been preprocessed and is ready to use.
The dataset includes hard drive S.M.A.R.T. attributes along with model, serial number, date and capacity. It has been heavily preprocessed.
First, this specific model was chosen because it has the greatest number of failures. Also, because healthy drives vastly outnumber failed ones, all failed drives and only 10k healthy drives were taken from each year.
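The per-year sampling described above can be sketched as follows. This is only an illustration, not the author's preprocessing code: the keys `year` and `failure`, the function name `downsample_by_year`, and the treatment of each row as one drive are assumptions made for the example.

```python
import random
from collections import defaultdict


def downsample_by_year(rows, healthy_per_year=10_000, seed=0):
    """Keep every failed drive and at most `healthy_per_year` healthy drives per year.

    Each row is a dict with the hypothetical keys 'year' (int) and
    'failure' (1 for a failed drive, 0 for a healthy one).
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    healthy_by_year = defaultdict(list)
    kept = []
    for row in rows:
        if row["failure"] == 1:
            kept.append(row)  # failed drives are rare, so keep all of them
        else:
            healthy_by_year[row["year"]].append(row)
    for year in sorted(healthy_by_year):
        pool = healthy_by_year[year]
        k = min(healthy_per_year, len(pool))
        kept.extend(rng.sample(pool, k))  # random subset of healthy drives
    return kept


# Toy data: 100 drives over two years, every tenth drive has failed.
rows = [
    {"serial": f"S{i}", "year": 2015 + i % 2, "failure": int(i % 10 == 0)}
    for i in range(100)
]
sample = downsample_by_year(rows, healthy_per_year=5)
```

With the toy data, all failed drives survive and each year contributes at most five healthy drives, which keeps the classes far less imbalanced than in the raw records.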
Data was processed according to the following rules:
You can find more details here: https://github.com/awant/sd_failure_predictions
The original BackBlaze data: https://www.backblaze.com/b2/hard-drive-test-data.html. You may use this dataset for your own purposes, but you must cite BackBlaze as the source and must not sell the data.
For solutions to be comparable, I suggest using the 2018 data as the test set and the other years as the training set.
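A minimal sketch of the suggested split, assuming each record carries a `date` field holding a `datetime.date`. The key names and the helper `split_by_year` are hypothetical, chosen for the example rather than taken from the dataset files.

```python
from datetime import date


def split_by_year(rows, test_year=2018):
    """Return (train, test): records from `test_year` go to test, the rest to train."""
    train = [r for r in rows if r["date"].year != test_year]
    test = [r for r in rows if r["date"].year == test_year]
    return train, test


# Toy records; the field names are assumptions, not the real column names.
rows = [
    {"serial_number": "A", "date": date(2015, 3, 1), "failure": 0},
    {"serial_number": "B", "date": date(2016, 7, 9), "failure": 1},
    {"serial_number": "C", "date": date(2017, 1, 20), "failure": 0},
    {"serial_number": "D", "date": date(2018, 5, 5), "failure": 1},
    {"serial_number": "E", "date": date(2018, 11, 30), "failure": 0},
]
train, test = split_by_year(rows)
```

Splitting by calendar year rather than at random keeps the evaluation forward-looking: the model is trained only on data recorded before the test period.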
Description:
This dataset is sourced from the "Roast-PC by Gemini" website, a platform that provides AI-powered roasting (critical feedback) on custom PC builds. Users input the components of their PC build, including CPU, GPU, motherboard, RAM, PSU, disk, and intended use case. The dataset captures the logs of these submissions, along with the roasting comments generated by Gemini AI, Google's AI model.
Dataset Overview:
Column Names and Descriptions:
Time: Date and time of the request.
cpu: The CPU model specified by the user (e.g., "AMD Ryzen 5 5500", "Intel i7 1200K").
gpu: The GPU model specified by the user (e.g., "NVIDIA RTX 3080", "AMD Radeon RX 6800").
motherboard: The motherboard model specified by the user (e.g., "ASUS ROG Strix B550-F", "MSI B450 TOMAHAWK").
ram: The RAM configuration specified by the user, including size and speed (e.g., "16GB DDR4 3200MHz").
psu: The PSU (Power Supply Unit) model specified by the user, including wattage (e.g., "Corsair RM750x 750W").
disk: The storage devices specified by the user, including type and capacity (e.g., "1TB NVMe SSD", "500GB SATA HDD").
use_case: The intended use of the PC as specified by the user (e.g., "gaming", "video editing", "general use").
roast_comments: The AI-generated feedback or roasting comments provided by Gemini AI, critiquing the PC build based on the components and use case (in Indonesian).
Functionality:
This dataset serves multiple purposes:
This dataset is ideal for those interested in PC building, hardware analysis, AI-generated content, or anyone curious about trends in custom PC configurations.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘VertebralColumnDataSet’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/caesarlupum/vertebralcolumndataset on 30 September 2021.
--- Dataset description provided by original source is as follows ---
Download: Data Folder-http://archive.ics.uci.edu/ml/machine-learning-databases/00212/
Data Set Description, http://archive.ics.uci.edu/ml/machine-learning-databases/00212/
Abstract: Data set containing values for six biomechanical features used to classify orthopaedic patients into 3 classes (normal, disk hernia or spondylolisthesis) or 2 classes (normal or abnormal).
Guilherme de Alencar Barreto (guilherme '@' deti.ufc.br) & Ajalmar Rêgo da Rocha Neto (ajalmar '@' ifce.edu.br), Department of Teleinformatics Engineering, Federal University of Ceará, Fortaleza, Ceará, Brazil.
Henrique Antonio Fonseca da Mota Filho (hdamota '@' gmail.com), Hospital Monte Klinikum, Fortaleza, Ceará, Brazil.
Biomedical data set built by Dr. Henrique da Mota during a medical residency period in the Group of Applied Research in Orthopaedics (GARO) of the Centre Médico-Chirurgical de Réadaptation des Massues, Lyon, France. The data have been organized into two different but related classification tasks. The first task consists of classifying patients as belonging to one of three categories: Normal (100 patients), Disk Hernia (60 patients) or Spondylolisthesis (150 patients). For the second task, the categories Disk Hernia and Spondylolisthesis were merged into a single category labelled 'abnormal'. Thus, the second task consists of classifying patients as belonging to one of two categories: Normal (100 patients) or Abnormal (210 patients). We also provide files for use within the WEKA environment.
Each patient is represented in the data set by six biomechanical attributes derived from the shape and orientation of the pelvis and lumbar spine (in this order): pelvic incidence, pelvic tilt, lumbar lordosis angle, sacral slope, pelvic radius and grade of spondylolisthesis. The following convention is used for the class labels: DH (Disk Hernia), SL (Spondylolisthesis), NO (Normal) and AB (Abnormal).
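The relationship between the two tasks can be sketched as a simple label mapping. This is an illustration built from the label codes and patient counts stated in the description; the names `THREE_TO_TWO` and `to_binary_labels` are hypothetical and do not come from the original files.

```python
from collections import Counter

# Map the three-class labels onto the two-class task:
# Disk Hernia (DH) and Spondylolisthesis (SL) both become Abnormal (AB).
THREE_TO_TWO = {"DH": "AB", "SL": "AB", "NO": "NO"}


def to_binary_labels(labels):
    """Convert three-class labels (DH, SL, NO) to two-class labels (AB, NO)."""
    return [THREE_TO_TWO[label] for label in labels]


# Class sizes as stated in the description: 100 Normal, 60 DH, 150 SL.
labels = ["NO"] * 100 + ["DH"] * 60 + ["SL"] * 150
binary = to_binary_labels(labels)
counts = Counter(binary)
```

Applied to the stated class sizes, the mapping gives 100 Normal and 210 Abnormal patients, matching the counts quoted for the second task.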
(1) Berthonnaud, E., Dimnet, J., Roussouly, P. & Labelle, H. (2005). 'Analysis of the sagittal balance of the spine and pelvis using shape and orientation parameters', Journal of Spinal Disorders & Techniques, 18(1):40–47.
(2) Rocha Neto, A. R. & Barreto, G. A. (2009). 'On the Application of Ensembles of Classifiers to the Diagnosis of Pathologies of the Vertebral Column: A Comparative Analysis', IEEE Latin America Transactions, 7(4):487-496.
(3) Rocha Neto, A. R., Sousa, R., Barreto, G. A. & Cardoso, J. S. (2011). 'Diagnostic of Pathology on the Vertebral Column with Embedded Reject Option', Proceedings of the 5th Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA'2011), Gran Canaria, Spain, Lecture Notes in Computer Science, vol. 6669, p. 588-595.
--- Original source retains full ownership of the source dataset ---
https://creativecommons.org/publicdomain/zero/1.0/
A Secchi disk is a circular disk with alternating black and white quadrants that is used to measure the clarity or transparency of water. It is typically lowered into the water using a line, and the depth at which it disappears from view is measured. The depth at which the disk disappears is called the "Secchi depth" and it provides an indication of the water clarity.
Measuring water transparency is important for a few reasons. First, it can indicate the health of aquatic ecosystems. Clear water allows sunlight to penetrate deeper, which is important for photosynthesis by aquatic plants and algae. If the water becomes cloudy or turbid, it can indicate that there is too much sediment or other particles in the water, which can have negative impacts on the ecosystem.
Second, water transparency can also impact water quality for human use. For example, if the water is too turbid, it can make it difficult to treat for drinking water or for use in industrial processes. Additionally, if the water is too cloudy or turbid, it can impact recreational uses such as swimming or fishing.
Overall, measuring water transparency using a Secchi disk can provide important information about the health and quality of aquatic ecosystems, as well as the suitability of water for human use.
This dataset was sampled at 135 locations along the eastern coast of Georgian Bay in Ontario, Canada, from 2003 to 2005. It provides data on the Secchi depth alongside the characteristics of the water.