16 datasets found
  1. mlops-dataset

    • kaggle.com
    zip
    Updated Nov 12, 2025
    Cite
    Daniil Krizhanovskyi (2025). mlops-dataset [Dataset]. https://www.kaggle.com/datasets/daniilkrizhanovskyi/mlops-dataset
    Available download formats: zip (4024 bytes)
    Dataset updated
    Nov 12, 2025
    Authors
    Daniil Krizhanovskyi
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Daniil Krizhanovskyi

    Released under MIT

    Contents

  2. MLops Pipeline

    • kaggle.com
    zip
    Updated Nov 20, 2024
    Cite
    Amy Okey (2024). MLops Pipeline [Dataset]. https://www.kaggle.com/datasets/amyokey/mlops-pipeline
    Available download formats: zip (8831585 bytes)
    Dataset updated
    Nov 20, 2024
    Authors
    Amy Okey
    License

    https://www.licenses.ai/ai-licenses

    Description

    Dataset

    This dataset was created by Amy Okey

    Released under RAIL (specified in description)

    Contents

  3. mlops-phase1

    • kaggle.com
    zip
    Updated Jun 30, 2023
    Cite
    Under The Hood (2023). mlops-phase1 [Dataset]. https://www.kaggle.com/datasets/hinetabi/mlops-phase1
    Available download formats: zip (10589173 bytes)
    Dataset updated
    Jun 30, 2023
    Authors
    Under The Hood
    Description

    Dataset

    This dataset was created by Under The Hood

    Contents

  4. boston MLOps Assignment

    • kaggle.com
    zip
    Updated Dec 1, 2024
    Cite
    Shubham Raj (2024). boston MLOps Assignment [Dataset]. https://www.kaggle.com/datasets/shubham14p3/boston-mlops-assignment/data
    Available download formats: zip (16746 bytes)
    Dataset updated
    Dec 1, 2024
    Authors
    Shubham Raj
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Shubham Raj

    Released under MIT

    Contents

  5. BATCH 8 MLOPS PROJECT

    • kaggle.com
    zip
    Updated Oct 7, 2024
    Cite
    Saibhargav Ch (2024). BATCH 8 MLOPS PROJECT [Dataset]. https://www.kaggle.com/datasets/saibhargavch/batch-8-mlops-project/code
    Available download formats: zip (263060 bytes)
    Dataset updated
    Oct 7, 2024
    Authors
    Saibhargav Ch
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Saibhargav Ch

    Released under Apache 2.0

    Contents

  6. MLOps-Task

    • kaggle.com
    zip
    Updated May 18, 2020
    Cite
    Jhagdu (2020). MLOps-Task [Dataset]. https://www.kaggle.com/jhagdu/mlopstask
    Available download formats: zip (32680540 bytes)
    Dataset updated
    May 18, 2020
    Authors
    Jhagdu
    Description

    Dataset

    This dataset was created by Jhagdu

    Contents

  7. extNPH-MLOps

    • kaggle.com
    zip
    Updated Jul 3, 2023
    Cite
    Jurgen (2023). extNPH-MLOps [Dataset]. https://www.kaggle.com/datasets/jurgendn/mlops-aihub-extnph
    Available download formats: zip (432508609 bytes)
    Dataset updated
    Jul 3, 2023
    Authors
    Jurgen
    Description

    Dataset

    This dataset was created by Jurgen

    Contents

  8. the mlops_whitepaper.pdf

    • kaggle.com
    zip
    Updated Apr 20, 2025
    Cite
    Mohsin Dahri (2025). the mlops_whitepaper.pdf [Dataset]. https://www.kaggle.com/datasets/mohsindahri/the-mlops-whitepaper-pdf
    Available download formats: zip (11900334 bytes)
    Dataset updated
    Apr 20, 2025
    Authors
    Mohsin Dahri
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Mohsin Dahri

    Released under CC0: Public Domain

    Contents

  9. indian_liver_patientmissing MLOps Assignment

    • kaggle.com
    zip
    Updated Dec 4, 2024
    Cite
    Shubham Raj (2024). indian_liver_patientmissing MLOps Assignment [Dataset]. https://www.kaggle.com/datasets/shubham14p3/indian-liver-patientmissing-mlops-assignment
    Available download formats: zip (7915 bytes)
    Dataset updated
    Dec 4, 2024
    Authors
    Shubham Raj
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Shubham Raj

    Released under MIT

    Contents

  10. Arabic Mental Health MLOps

    • kaggle.com
    zip
    Updated Nov 22, 2025
    Cite
    Ahmed Taha Shiha (2025). Arabic Mental Health MLOps [Dataset]. https://www.kaggle.com/datasets/ahmedtahashiha/arabic-mental-health-mlops/discussion
    Available download formats: zip (533396375 bytes)
    Dataset updated
    Nov 22, 2025
    Authors
    Ahmed Taha Shiha
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    MLOps project for Arabic mental health text classification

  11. Synthetic_Data_Satellite_Health

    • kaggle.com
    zip
    Updated Mar 2, 2024
    Cite
    JeffDJeffD (2024). Synthetic_Data_Satellite_Health [Dataset]. https://www.kaggle.com/datasets/jeffdjeffd/synthetic-data-satellite-health
    Available download formats: zip (458476 bytes)
    Dataset updated
    Mar 2, 2024
    Authors
    JeffDJeffD
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The purpose of this (imaginary) study is to detect faulty satellites in order to prevent communication interruptions. It also shows how confirmation bias can lead to misleading results, using a synthetic data set with a developed story line.

    Problem Statement:

    You are leading a team conducting a study that will allow the space agency to predict the health status of satellites from telemetry data, enabling proactive maintenance and optimal performance in space missions.

    Stakeholders' Concerns:

    1. Accurately predicting satellite health to minimize the risk of mission failures and optimize satellite usage.

    2. Identifying the most critical factors that affect satellite health to focus on improving those aspects during the satellite design and maintenance process.

    3. Reducing the rate of false positives and false negatives in predictions to avoid unnecessary maintenance efforts and ensure that actual issues are addressed promptly.

    Misclassification Costs (estimation):

    False Positive (predicting a malfunction when the component is healthy):

    • Unnecessary maintenance check: $5,000
    • Unwarranted component replacement: $50,000

    False Negative (predicting a component is healthy when it is malfunctioning):

    • Data loss or degradation: $100,000
    • Partial mission failure: $500,000
    • Total mission failure or satellite loss: $300,000,000
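
The cost figures above can be combined into a single total for comparing candidate models. The following is a minimal sketch, not part of the dataset: the cost tiers are taken from the description, while the error counts, the choice of one tier per error type, and the function name `total_misclassification_cost` are assumptions made for illustration.

```python
# Cost tiers copied from the dataset description (USD).
FALSE_POSITIVE_COSTS = {
    "maintenance_check": 5_000,
    "component_replacement": 50_000,
}
FALSE_NEGATIVE_COSTS = {
    "data_loss": 100_000,
    "partial_mission_failure": 500_000,
    "total_mission_failure": 300_000_000,
}


def total_misclassification_cost(false_positives, false_negatives,
                                 fp_tier="maintenance_check",
                                 fn_tier="data_loss"):
    """Total cost of a model's errors, assuming one cost tier per error type."""
    return (false_positives * FALSE_POSITIVE_COSTS[fp_tier]
            + false_negatives * FALSE_NEGATIVE_COSTS[fn_tier])


# Hypothetical error counts for two candidate models (not from the dataset).
model_a = total_misclassification_cost(false_positives=40, false_negatives=2)
model_b = total_misclassification_cost(false_positives=5, false_negatives=10)
print(model_a, model_b)
```

Note that the description treats a predicted malfunction as the "positive" outcome, even though the target column codes healthy as 1, so the error counts should be tallied with that convention in mind.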

    Team Focus:

    1. Thoroughly exploring the data to understand the relationships between various telemetry variables and satellite health.

    2. Ensuring the model is accurate and reliable by selecting appropriate algorithms, performing feature engineering, and validating the model's performance using relevant metrics.

    3. Identifying and addressing any data quality issues, such as missing values and incorrect data.

    4. Investigating the importance of each variable in the prediction task and communicating these insights to stakeholders for better decision-making.

    Data Dictionary:

    1. time_since_launch (days)

    Range: 0 to 3650
    Description: Time since the satellite was launched.

    2. orbital_altitude (km)

    Range: 300 to 2000
    Description: Altitude of the satellite's orbit.

    3. battery_voltage (V)

    Range: 20 to 30
    Description: Satellite's battery voltage.

    4. solar_panel_temperature (°C)

    Range: -50 to 50
    Description: Temperature of the satellite's solar panels.

    5. attitude_control_error (degrees)

    Range: 0 to 5
    Description: Error in the satellite's attitude control system.

    6. data_transmission_rate (Mbps)

    Range: 10 to 100
    Description: Rate of data transmission from the satellite to the ground station.

    7. thermal_control_status (0 or 1)

    Range: 0 (not working) or 1 (working)
    Description: Binary flag indicating if the thermal control system is working or not.

    8. satellite_health (0 or 1)

    Range: 0 (unhealthy) or 1 (healthy)
    Description: Target variable, a binary flag indicating if the satellite is healthy or unhealthy.
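
The ranges in the data dictionary above lend themselves to a simple data quality check, which matches the team's stated focus on missing and incorrect values. The sketch below assumes that each row is available as a Python dict keyed by the field names in the dictionary and that the range bounds are inclusive; neither assumption is confirmed by the dataset page, and `find_violations` is a hypothetical helper name.

```python
# Ranges copied from the data dictionary; bounds are assumed to be inclusive.
RANGES = {
    "time_since_launch": (0, 3650),
    "orbital_altitude": (300, 2000),
    "battery_voltage": (20, 30),
    "solar_panel_temperature": (-50, 50),
    "attitude_control_error": (0, 5),
    "data_transmission_rate": (10, 100),
}
BINARY_FIELDS = ("thermal_control_status", "satellite_health")


def find_violations(record):
    """Return (field, value) pairs that are missing or outside the documented range."""
    violations = []
    for field, (low, high) in RANGES.items():
        value = record.get(field)
        if value is None or not (low <= value <= high):
            violations.append((field, value))
    for field in BINARY_FIELDS:
        value = record.get(field)
        if value not in (0, 1):
            violations.append((field, value))
    return violations


# Made-up example rows, not values from the dataset.
good_record = {
    "time_since_launch": 1200,
    "orbital_altitude": 550,
    "battery_voltage": 27.5,
    "solar_panel_temperature": -12.0,
    "attitude_control_error": 0.4,
    "data_transmission_rate": 64,
    "thermal_control_status": 1,
    "satellite_health": 1,
}
bad_record = dict(good_record, battery_voltage=35, satellite_health=2)

print(find_violations(good_record))
print(find_violations(bad_record))
```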

  12. Google AI Whitepaper knowledgebase

    • kaggle.com
    zip
    Updated Apr 20, 2025
    Cite
    malavp1703 (2025). Google AI Whitepaper knowledgebase [Dataset]. https://www.kaggle.com/datasets/malavp1703/google-ai-whitepaer-knowledgebase
    Available download formats: zip (62415619 bytes)
    Dataset updated
    Apr 20, 2025
    Authors
    malavp1703
    Description

    This dataset is a group of whitepapers shared by Google in its AI workshop. It is a knowledge base on various GenAI topics, including prompt engineering, vector databases, embeddings, RAG, agents, agent companions, fine-tuning, and the use of MLOps in GenAI planning.

  13. Resume and Job Description

    • kaggle.com
    zip
    Updated Jul 19, 2025
    Cite
    Sayyed Faizan95 (2025). Resume and Job Description [Dataset]. https://www.kaggle.com/datasets/sayyedfaizan95/resume-and-job-description
    Available download formats: zip (63737 bytes)
    Dataset updated
    Jul 19, 2025
    Authors
    Sayyed Faizan95
    Description

    📄 Synthetic Resume Dataset (1,200 Records)

    🧠 Overview

    This dataset contains 1,200 high-quality, synthetic resume records carefully generated to simulate real-world professional profiles. It is ideal for HR analytics, recruitment model training, resume screening systems, and academic research.

    The dataset is divided into:

    • ✅ 60% IT Jobs (720 records)
    • ✅ 40% Non-IT Jobs (480 records)

    Each entry mimics realistic resume attributes across a wide range of roles, from fresh graduates to experienced professionals.

    📊 Features

    Each record includes the following 14 structured fields:

    • Name, Age, Gender
    • Education_Level, Field_of_Study, Degrees, Institute_Name, Graduation_Year
    • Experience_Years, Current_Job_Title, Previous_Job_Titles
    • Skills, Certifications, Target_Job_Description

    πŸ‘¨β€πŸ’» IT Job Titles (sample):

    • Data Scientist, Cloud Engineer, Prompt Engineer, DevOps Engineer, Full-Stack Developer, MLOps Specialist, Quantum Computing Specialist, and more.

    πŸ§‘β€βš•οΈ Non-IT Job Titles (sample):

    • Nurse, Dentist, Financial Analyst, Mental Health Practitioner, Business Development Manager, Customer Success Manager, and others.

    πŸ” Key Highlights

    • ✅ 100% synthetic data – no privacy concerns
    • ✅ Zero missing values
    • ✅ Realistic skill and certification pairings based on job roles
    • ✅ Professionally written target job descriptions
    • ✅ Balanced age (21–45) and experience (0–10 years) distribution
    • ✅ Includes both freshers (36.8%) and working professionals (63.2%)

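
The field list and the 60/40 split described above can be checked mechanically before any modelling. This is a rough sketch under two assumptions that the dataset page does not confirm: that the file's column headers use exactly the 14 field names listed in the description, and that the header row is available as a list of strings. The helper names `check_header` and `split_counts` are hypothetical.

```python
# The 14 field names listed in the dataset description.
EXPECTED_FIELDS = [
    "Name", "Age", "Gender",
    "Education_Level", "Field_of_Study", "Degrees",
    "Institute_Name", "Graduation_Year",
    "Experience_Years", "Current_Job_Title", "Previous_Job_Titles",
    "Skills", "Certifications", "Target_Job_Description",
]


def check_header(header):
    """Return (missing, unexpected) field names for a header row."""
    missing = [name for name in EXPECTED_FIELDS if name not in header]
    unexpected = [name for name in header if name not in EXPECTED_FIELDS]
    return missing, unexpected


def split_counts(total, it_share):
    """Number of IT and non-IT records implied by a total and an IT share."""
    it_count = round(total * it_share)
    return it_count, total - it_count


print(check_header(EXPECTED_FIELDS))
print(split_counts(1200, 0.60))
```
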
  14. Solana Price Prediction

    • kaggle.com
    zip
    Updated Jul 31, 2025
    Cite
    Dhaivat N Jambudia (2025). Solana Price Prediction [Dataset]. https://www.kaggle.com/datasets/dhaivatnjambudia/solana-price-prediction
    Available download formats: zip (56507 bytes)
    Dataset updated
    Jul 31, 2025
    Authors
    Dhaivat N Jambudia
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This data was extracted from the CoinGecko API and contains Price and Volume features. It requires a lot of feature engineering, which makes it good practice for anyone who wants to build an MLOps or data science project. This is a static dataset; for a real pipeline, see my repo, where I have built a real ETL pipeline with the data stored in MongoDB. Repo: https://github.com/Excergic/SOL-Price-Prediction Thank you.
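
As an illustration of the kind of feature engineering a price series like this typically needs, the sketch below derives period-over-period returns and a trailing moving average in plain Python. It is not taken from the linked repository; the sample prices are made up, and the function names `pct_returns` and `moving_average` are hypothetical.

```python
def pct_returns(prices):
    """Simple period-over-period returns; the first entry has no predecessor."""
    if not prices:
        return []
    return [None] + [(curr - prev) / prev
                     for prev, curr in zip(prices, prices[1:])]


def moving_average(values, window):
    """Trailing mean over `window` points; None until enough history exists."""
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            averages.append(None)
        else:
            chunk = values[i + 1 - window:i + 1]
            averages.append(sum(chunk) / window)
    return averages


# Made-up prices, not values from the dataset.
prices = [100.0, 110.0, 99.0, 121.0]
returns = pct_returns(prices)
averages = moving_average(prices, window=2)
print(returns)
print(averages)
```

The same two transforms would apply to the Volume feature; both leave the leading entries as `None` so that the derived series stay aligned with the original rows.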

  15. Social Power NBA

    • kaggle.com
    zip
    Updated Aug 1, 2017
    Cite
    Noah Gift (2017). Social Power NBA [Dataset]. https://www.kaggle.com/noahgift/social-power-nba
    Available download formats: zip (1397766 bytes)
    Dataset updated
    Aug 1, 2017
    Authors
    Noah Gift
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    This data set contains combined on-court performance data for NBA players in the 2016-2017 season, alongside salary, Twitter engagement, and Wikipedia traffic data.

    Further information can be found in a series of articles for IBM developerWorks: "Explore valuation and attendance using data science and machine learning" and "Exploring the individual NBA players".

    Slides from a talk about this dataset, given at Strata in March 2018:

    https://www.slideshare.net/noahgift/social-power-andinfluenceinthenba-89807740?qid=3f9f835a-f3d7-4174-8a8c-c97f9c82e614&v=&b=&from_search=1

    Further reading on this dataset is available in Chapter 6 of the book Pragmatic AI: An Introduction to Cloud-Based Machine Learning, and in lesson 9 of Essential Machine Learning and AI with Python and Jupyter Notebook.

    Followup Items

    Acknowledgement

    Data sources include ESPN, Basketball-Reference, Twitter, FiveThirtyEight, and Wikipedia. The source code for this dataset (in Python and R) can be found on GitHub. Links to more writing can be found at noahgift.com.

    Inspiration

    • Do NBA fans know more about who the best players are, or do owners?
    • What is the true worth of the social media presence of athletes in the NBA?
  16. UCI Heart Disease - Explainable AI Project Assets

    • kaggle.com
    zip
    Updated Nov 18, 2025
    Cite
    Ariyan_Pro (2025). UCI Heart Disease - Explainable AI Project Assets [Dataset]. https://www.kaggle.com/datasets/ariyannadeem/uci-heart-disease-explainable-ai-project-assets
    Available download formats: zip (1051043 bytes)
    Dataset updated
    Nov 18, 2025
    Authors
    Ariyan_Pro
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Medical-Grade Explainable AI Project Assets

    This dataset contains comprehensive assets for a production-ready Explainable AI (XAI) heart disease prediction system achieving 94.1% accuracy with full model transparency.

    📊 CONTEXT: Healthcare AI faces a critical "black box" problem where models make predictions without explanations. This project demonstrates how to build trustworthy medical AI using SHAP and LIME for real-time explainability.

    🎯 PROJECT GOAL: Create a clinically deployable AI system that not only predicts heart disease with high accuracy but also provides interpretable explanations for each prediction, enabling doctor-AI collaboration.

    🚀 KEY FEATURES:

    • 94.1% prediction accuracy (XGBoost + Optuna)
    • Real-time SHAP & LIME explanations
    • FastAPI backend with medical validation
    • Gradio clinical dashboard
    • Full MLOps pipeline (MLflow tracking)
    • 4-layer enterprise architecture

    ASSETS INCLUDED:

    • heart_clean.csv - Clinical dataset ready for analysis
    • SHAP summary plots for global explainability
    • Performance metrics and visualizations
    • Architecture diagrams
    • Model evaluation results

    🔗 COMPANION RESOURCES:

    • Live Demo: https://huggingface.co/spaces/Ariyan-Pro/HeartDisease-Predictor
    • Notebook: https://www.kaggle.com/code/ariyannadeem/heart-disease-prediction-with-explainable-ai
    • Source Code: https://github.com/Ariyan-Pro/ExplainableAI-HeartDisease

    Perfect for learning medical AI implementation, explainable AI techniques, and production deployment.

mlops-dataset

2 scholarly articles cite this dataset (View in Google Scholar)