License: MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
This dataset was generated to simulate realistic conditions under which a marriage may or may not end in divorce, allowing for training and evaluation of binary classification models.
The dataset consists of 5,000 synthetic observations, each representing a unique couple, with an associated divorced target variable (1 = divorced, 0 = not divorced).
License: MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
This dataset explores marriage trends in India, comparing love marriages and arranged marriages across demographic, social, and economic factors. It captures key aspects such as age at marriage, caste and religion dynamics, parental approval, dowry exchange, marital satisfaction, divorce rates, income levels, and urban-rural differences.
The dataset aims to provide valuable insights into changing marriage patterns, the role of tradition vs. modernity, and their impact on marital outcomes. Researchers, sociologists, and data analysts can use this dataset to study relationship trends, predict marriage success, and analyze social influences on marriage in India.
ID – Unique identifier
Marriage_Type – Love / Arranged
Age_at_Marriage – Age of the person at marriage
Gender – Male / Female
Education_Level – School / Graduate / Postgraduate / PhD
Caste_Match – Same / Different
Religion – Hindu / Muslim / Christian / Sikh / Others
Parental_Approval – Yes / No / Partial
Urban_Rural – Urban / Rural
Dowry_Exchanged – Yes / No / Not Disclosed
Marital_Satisfaction – Low / Medium / High
Divorce_Status – Yes / No
Children_Count – Number of children (0-5)
Income_Level – Low / Middle / High
Years_Since_Marriage – Number of years since marriage
Spouse_Working – Yes / No
Inter-Caste – Yes / No
Inter-Religion – Yes / No
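As a rough illustration of the love vs. arranged comparison described above, the sketch below computes the share of divorced records per Marriage_Type. It assumes the data has already been loaded as a list of dictionaries keyed by the column names listed above; the helper name divorce_rate_by and the sample rows are made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict

# Made-up rows for illustration only; they are not taken from the dataset.
rows = [
    {"Marriage_Type": "Love", "Divorce_Status": "Yes"},
    {"Marriage_Type": "Love", "Divorce_Status": "No"},
    {"Marriage_Type": "Arranged", "Divorce_Status": "No"},
    {"Marriage_Type": "Arranged", "Divorce_Status": "No"},
    {"Marriage_Type": "Arranged", "Divorce_Status": "Yes"},
    {"Marriage_Type": "Love", "Divorce_Status": "No"},
    {"Marriage_Type": "Arranged", "Divorce_Status": "No"},
]

def divorce_rate_by(rows, key):
    """Return {group value: share of rows with Divorce_Status == 'Yes'}."""
    totals = defaultdict(int)
    divorced = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        if row["Divorce_Status"] == "Yes":
            divorced[row[key]] += 1
    return {group: divorced[group] / totals[group] for group in totals}

rates = divorce_rate_by(rows, "Marriage_Type")
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2%}")
```

The same helper can be called with another categorical column, for example "Urban_Rural" or "Parental_Approval", to compare divorce shares along a different dimension.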
License: MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
This is a synthetic dataset on marriage proposals for anyone who wants to carry out a classification experiment.
The dataset has the following features:
Height: Represents the height of an individual in centimeters (from 150 to 180).
Age: Represents the age of an individual (from 20 to 80).
Income: Represents the monthly income of an individual (from $5,000 to $20,000).
RomanticGestureScore: Represents a score (from 0 to 10) related to romantic gestures.
CompatibilityScore: Represents a score (from 0 to 9) related to compatibility.
CommunicationScore: Represents a score (from 0 to 9) related to communication.
DistanceKM: Represents the distance (from 1 to 99) in kilometers.
AgeCategory: A feature derived from 'Age', categorized into groups like 'Young', 'Middle-aged', etc.
Response: Represents the response variable, indicating marriage proposal acceptance as 1 or marriage proposal rejection as 0.
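A derived feature such as AgeCategory could be produced by binning Age, roughly as in the sketch below. The cut points (35 and 60) and the 'Senior' label are assumptions chosen for illustration; the description only names 'Young' and 'Middle-aged' and does not state the boundaries actually used.

```python
from bisect import bisect_right

# Hypothetical cut points and labels; the dataset description does not
# give the boundaries it used, and 'Senior' is an assumed third label.
AGE_EDGES = [35, 60]  # exclusive upper bounds of the first two bins
AGE_LABELS = ["Young", "Middle-aged", "Senior"]

def age_category(age):
    """Map an age in years to a label using the assumed cut points."""
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]

for age in (20, 34, 35, 59, 60, 80):
    print(age, age_category(age))
```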
License: https://cubig.ai/store/terms-of-service
1) Data introduction
• The Banking dataset was created to identify existing customers who are more likely to sign up for long-term deposits, so that marketing activities can focus on these customers.
2) Data utilization
(1) Banking data has the following characteristics:
• It predicts whether a customer subscribes to a term deposit (yes or no) based on 13 columns of data such as age, occupation, and marital status.
(2) Banking data can be used for:
• Customer Segmentation: The dataset segments customers by their likelihood of signing up for term deposits, enabling more personalized and relevant communication strategies.
• Financial Planning: Financial institutions can use insights from this dataset to predict future demand for term deposits and support strategic planning and resource allocation.
License: CC0 1.0 (https://creativecommons.org/publicdomain/zero/1.0/)
This dataset was obtained from https://datos.gob.mx/busca/dataset/registro-civil
This is the official Mexican government dataset on divorces in the city of Xalapa, Mexico.
The dataset contains records of more than 4,900 divorces over the 15-year period (2000-2015) in the city of Xalapa, Mexico. A notable feature of this dataset is that it includes the divorcees' birth dates, which are usually considered sensitive information and are usually not included in public datasets.
The dataset is originally in Spanish; I translated all the column headers into English. Files are as follows:
divorces_2000-2015_original.csv - original dataset (in Spanish)
descriptions_for_ column.csv - descriptions for each column (in Spanish)
divorces_2000-2015_translated.csv - the version with the English translated column headers (please note that only column headers were translated)
comp_matrix.csv - the table of the zodiac signs compatibility rates that were used in my notebook (https://www.kaggle.com/aagghh/testing-the-astrology-and-zodiac-claims). Note: ignore it if you do not need it
Major features are:
Date of divorce
Birth dates for both partners (man/woman)
Nationality for both partners (man/woman)
Place of birth and residence for both partners (man/woman)
Monthly income for both partners (man/woman)
Occupation for both partners (man/woman)
Date of marriage (man/woman)
Level of education for both partners (man/woman)
Employment status for both partners (man/woman)
Number of children and their custody
Other features - please refer to the file & columns descriptions below
A potential use case for this data is practicing classification or clustering in an attempt to predict a divorce.
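For classification or clustering, the date fields listed above would typically be turned into numeric features first. The sketch below derives marriage duration and age at divorce from three dates. The dictionary keys, the ISO date format, and the sample values are all assumptions for illustration; the real column names and date formats should be checked against the CSV files.

```python
from datetime import date

def years_between(start, end):
    """Whole years elapsed from start to end (both datetime.date)."""
    years = end.year - start.year
    # Subtract one if the anniversary has not yet occurred in the end year.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years

def derive_features(record):
    """Build numeric features from one record with ISO-formatted dates.

    The keys used here are hypothetical, not the dataset's column names.
    """
    married = date.fromisoformat(record["date_of_marriage"])
    divorced = date.fromisoformat(record["date_of_divorce"])
    born = date.fromisoformat(record["birth_date"])
    return {
        "marriage_duration_years": years_between(married, divorced),
        "age_at_divorce": years_between(born, divorced),
    }

example = {  # invented values for illustration
    "date_of_marriage": "2001-06-15",
    "date_of_divorce": "2010-03-01",
    "birth_date": "1975-11-30",
}
features = derive_features(example)
print(features)
```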
Other kinds of analysis can also be applied to this dataset. For instance, I plan to use this data to test horoscope/zodiac/astrology claims.
Please refer to my notebook on it here: https://www.kaggle.com/aagghh/is-astrology-right-testing-the-zodiac-claims
License: MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
This dataset delves into the complex topic of marital affairs, aiming to uncover the underlying social, demographic and psychological factors that may contribute to extramarital relationships. Originally derived from a well-known behavioral study, the dataset captures the characteristics of 6,365 married individuals, documenting their personal attributes and self-reported involvement in affairs.
📊 Features Overview:
- rate_marriage – Self-rated satisfaction with the marriage (scale: 1–5)
- age – Age of the respondent
- yrs_married – Number of years married
- children – Number of children
- religious – Level of religious commitment (scale: 1–5)
- educ – Education level (years of education)
- occupation – Occupation of the respondent (categorical encoded)
- occupation_husb – Occupation of the respondent's husband (for female respondents; also categorical encoded)
- affairs – Time spent in extramarital affairs (in months)
🎯 Target Variable:
- affairs is a continuous variable, but for classification purposes, a binary label can be created (e.g., Had Affair: Yes/No where 0 = no affair, >0 = had affair).
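The binary label described above can be sketched as follows. The helper name to_binary_label and the sample values are made up for illustration; only the rule (0 = no affair, >0 = had affair) comes from the description.

```python
def to_binary_label(affairs):
    """Return 1 if the affairs measure is greater than 0, else 0."""
    return 1 if affairs > 0 else 0

# Invented values for illustration; not taken from the dataset.
affairs_values = [0.0, 0.11, 3.5, 0.0, 1.2]
labels = [to_binary_label(v) for v in affairs_values]

# Share of positive labels, useful for checking class balance
# before training a classifier.
positive_share = sum(labels) / len(labels)
print(labels, positive_share)
```

Checking the positive share first helps decide whether plain accuracy is a meaningful metric or whether class imbalance needs to be handled.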
📌 Why This Dataset?
Understanding human behavior in relationships is both socially and psychologically significant. This dataset can be used for:
* Behavioral and psychological analysis
* Predictive modeling (classification/regression)
* Exploratory data analysis (EDA)
* Feature engineering and model interpretability exercises
* Educational purposes in social sciences and data science
🛠️ Use Case Ideas:
* Predicting the likelihood of having an affair based on personal and relationship characteristics
* Analyzing how marriage satisfaction correlates with infidelity
* Building interpretable models to identify key predictors
* Understanding the role of education, age, and religion in marital stability