  1. Gym User Dropout Prediction Dataset

    • kaggle.com
    Updated Aug 3, 2025
    Cite
    Hassan Abdul-razeq (2025). Gym User Dropout Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/hassanabdulrazeq/gym-user-dropout-prediction-dataset
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Hassan Abdul-razeq
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This synthetic dataset simulates user behavior in a fitness application and is designed for predicting the risk of gym membership dropout from attendance patterns and personal attributes. It contains 10,000 realistic user profiles with features that influence gym retention, which makes it well suited to classification tasks in behavioral analytics.

    Key Features

    • Realistic distributions matching actual gym user behavior patterns
    • Complex feature interactions that simulate real-world decision-making
    • Controlled noise to mimic natural data variability
    • Balanced classes for effective machine learning modeling

    Potential Use Cases

    • Predicting at-risk users for retention interventions
    • Analyzing factors contributing to gym commitment
    • Developing personalized workout recommendations
    • Behavioral segmentation of fitness app users

    Dataset Characteristics

    • Number of instances: 10,000
    • Number of features: 8 (7 predictive attributes plus the user_id identifier) + 1 target
    • Missing values: No
    • Synthetic but realistic: Yes
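
    The characteristics above can be checked directly against the file. The following is a minimal sketch using only the Python standard library; it assumes the CSV has a header row and a target column named dropout, as listed under Columns Description. The summarize helper and the three sample rows are hypothetical and exist only to make the example runnable.

```python
import csv
import io
from collections import Counter


def summarize(csv_file):
    """Return column count, row count, missing-cell count, and target class counts."""
    reader = csv.DictReader(csv_file)
    columns = len(reader.fieldnames or [])
    rows = 0
    missing = 0
    classes = Counter()
    for row in reader:
        rows += 1
        # A cell counts as missing when it is absent or blank.
        missing += sum(
            1 for value in row.values()
            if value is None or str(value).strip() == ""
        )
        classes[row["dropout"]] += 1
    return {
        "columns": columns,
        "rows": rows,
        "missing": missing,
        "classes": dict(classes),
    }


# Invented sample standing in for gym_user_dropout_dataset.csv (not real data).
SAMPLE = """user_id,age,gender,sessions_per_week,avg_session_duration,progress_score,mood_after,injury,dropout
1,29,Male,4,62.5,71.0,Energized,None,0
2,41,Female,1,25.0,18.3,Fatigued,Knee,1
3,35,Female,3,,55.2,Neutral,None,0
"""

summary = summarize(io.StringIO(SAMPLE))
print(summary)
```

    For the real file, passing open("gym_user_dropout_dataset.csv", newline="") to summarize should report 9 columns, 10,000 rows, and no missing cells if the description above holds; the class counts show how balanced the target actually is.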

    Columns Description

    Feature               Type         Description                        Value Range
    user_id               int          Unique user identifier             1-10000
    age                   int          User's age                         18-60 (peaked at 25-40)
    gender                categorical  User's gender                      Male/Female
    sessions_per_week     int          Weekly gym attendance              0-7 sessions
    avg_session_duration  float        Average workout length in minutes  10-120
    progress_score        float        Composite fitness progress metric  0-100
    mood_after            categorical  Post-workout emotional state       Energized/Neutral/Fatigued
    injury                categorical  Reported workout injuries          None/Knee/Back/Shoulder
    dropout               binary       Target variable - quit status      0 (active)/1 (quit)
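
    The documented types and value ranges can be turned into a per-row check. This is a sketch under assumptions: the range bounds are treated as inclusive, the category labels are spelled exactly as in the table, and each row arrives as a dict of strings (for example from csv.DictReader). The validate_row helper is hypothetical and not part of the dataset.

```python
# Allowed labels for the categorical columns, copied from the column table.
ALLOWED = {
    "gender": {"Male", "Female"},
    "mood_after": {"Energized", "Neutral", "Fatigued"},
    "injury": {"None", "Knee", "Back", "Shoulder"},
    "dropout": {"0", "1"},
}

# Inclusive numeric bounds (inclusiveness is an assumption).
NUMERIC_RANGES = {
    "user_id": (1, 10000),
    "age": (18, 60),
    "sessions_per_week": (0, 7),
    "avg_session_duration": (10.0, 120.0),
    "progress_score": (0.0, 100.0),
}
INTEGER_COLUMNS = {"user_id", "age", "sessions_per_week"}


def validate_row(row):
    """Return a list of problems found in one row given as a dict of strings."""
    problems = []
    for column, (low, high) in NUMERIC_RANGES.items():
        raw = row.get(column, "")
        try:
            value = int(raw) if column in INTEGER_COLUMNS else float(raw)
        except (TypeError, ValueError):
            problems.append(f"{column}: not a number ({raw!r})")
            continue
        if not low <= value <= high:
            problems.append(f"{column}: {value} outside [{low}, {high}]")
    for column, allowed in ALLOWED.items():
        if row.get(column) not in allowed:
            problems.append(f"{column}: unexpected value {row.get(column)!r}")
    return problems


good = {
    "user_id": "1", "age": "29", "gender": "Male", "sessions_per_week": "4",
    "avg_session_duration": "62.5", "progress_score": "71.0",
    "mood_after": "Energized", "injury": "None", "dropout": "0",
}
bad = dict(good, age="17", injury="Elbow")

print(validate_row(good))
print(validate_row(bad))
```

    Note that the injury column uses the literal string None as a category, so a loader that treats that string as a missing value would conflict with the "Missing values: No" claim.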

    Generation Methodology

    Data was programmatically generated with:

    1. Base distributions matching real gym statistics
    2. Logical correlations between features (e.g., more sessions → longer durations)
    3. Non-linear relationships in target variable
    4. Controlled noise injection (Gaussian + categorical variability)
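
    To make the four ingredients above concrete, here is an illustrative sketch of how such a generator could look. It is not the author's generator: the generate_user function and every constant, weight, and coefficient in it are invented for illustration, and only the column names and value ranges come from the description.

```python
import math
import random


def generate_user(user_id, rng):
    """Generate one synthetic profile; all constants are illustrative assumptions."""
    # 1. Base distributions: age centered in the 25-40 band, clipped to 18-60.
    age = min(60, max(18, round(rng.gauss(32, 8))))
    gender = rng.choice(["Male", "Female"])
    sessions = rng.randint(0, 7)

    # 2. Logical correlation: more weekly sessions -> longer average sessions,
    #    with Gaussian noise, clipped to the documented 10-120 minute range.
    duration = min(120.0, max(10.0, 30.0 + 8.0 * sessions + rng.gauss(0, 10)))
    progress = min(100.0, max(0.0, 10.0 * sessions + 0.2 * duration + rng.gauss(0, 10)))

    # 4. Categorical variability via weighted sampling.
    mood = rng.choices(["Energized", "Neutral", "Fatigued"], weights=[5, 3, 2])[0]
    injury = rng.choices(["None", "Knee", "Back", "Shoulder"], weights=[85, 5, 5, 5])[0]

    # 3. Non-linear target: logistic link plus an injury/mood interaction term.
    score = 1.5 - 0.6 * sessions - 0.02 * progress
    if injury != "None" and mood == "Fatigued":
        score += 1.5
    p_dropout = 1.0 / (1.0 + math.exp(-score))
    dropout = 1 if rng.random() < p_dropout else 0

    return {
        "user_id": user_id,
        "age": age,
        "gender": gender,
        "sessions_per_week": sessions,
        "avg_session_duration": round(duration, 1),
        "progress_score": round(progress, 1),
        "mood_after": mood,
        "injury": injury,
        "dropout": dropout,
    }


rng = random.Random(42)  # fixed seed so the sketch is reproducible
users = [generate_user(i, rng) for i in range(1, 1001)]
print(users[0])
```

    Clipping after adding noise keeps every value inside the documented ranges, at the cost of small pile-ups at the boundaries.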

    Suggested Evaluation Metrics

    For classification models:

    • Precision-Recall curves (class imbalance consideration)
    • F1 score
    • ROC AUC
    • Feature importance analysis
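
    As a reference for what these metrics compute, the sketch below implements precision, recall, F1, and ROC AUC in plain Python. The function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration; precision-recall curves and feature importance are not covered, and in practice a library such as scikit-learn would normally be used.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the chance a random positive outscores a random negative (ties count 0.5)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy labels (1 = quit) and predicted dropout probabilities, invented for the example.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(precision, recall, f1, auc)
```

    On the toy inputs above, precision, recall, and F1 each come out to 2/3 and the AUC to 7/9. The pairwise AUC is quadratic in the number of examples, which is fine for a sketch but slow for all 10,000 rows compared with a rank-based implementation.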

    License

    CC0: Public Domain (Free to use for any purpose)

    Acknowledgements

    Synthetic dataset created for machine learning education and benchmarking purposes. Inspired by real fitness app analytics challenges.

    Dataset Link

    gym_user_dropout_dataset.csv

