32 datasets found
  1. ‘Income Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Income Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-income-dataset-530e/latest
    Explore at:
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Income Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mastmustu/income on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    The dataset provides predictive features such as education, employment status, and marital status to predict whether the salary is greater than $50K.

    It can be used to practice machine learning problems such as classification.

    --- Original source retains full ownership of the source dataset ---

  2. High School Graduate Outcomes Earnings by Industry

    • kaggle.com
    Updated Jun 14, 2025
    Cite
    Khaled Ben Ali (2025). High School Graduate Outcomes Earnings by Industry [Dataset]. https://www.kaggle.com/datasets/khaledxbenali/242424242424/code
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jun 14, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Khaled Ben Ali
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The High School Graduate Outcomes Earnings by Industry dataset provides information on post-graduation earnings of high school graduates across different industries. It offers insights into career pathways, income trends, and the economic impact of secondary education.

  3. Adult Income Prediction Classification

    • kaggle.com
    Updated Dec 13, 2024
    Cite
    Sathyam A (2024). Adult Income Prediction Classification [Dataset]. https://www.kaggle.com/datasets/isathyam31/adult-income-prediction-classification
    Explore at:
    Croissant
    Dataset updated
    Dec 13, 2024
    Dataset provided by
    Kaggle
    Authors
    Sathyam A
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset contains information about adult income prediction. It includes the following columns:

    • workclass: The type of employment (e.g., Private, Self-emp-not-inc, Federal-gov, Local-gov)
    • fnlwgt: The number of people the census believes the entry represents
    • education: The highest level of education achieved
    • education-num: The numeric representation of the previous column
    • marital-status: The marital status of the individual
    • occupation: The occupation of the individual
    • relationship: The relationship of the individual to their household
    • race: The race of the individual
    • sex: The gender of the individual
    • capital-gain: The capital gains of the individual
    • capital-loss: The capital losses of the individual
    • hours-per-week: The number of hours the individual works per week
    • country: The native country of the individual
    • salary: The income level of the individual, which is the target variable to predict

    The goal of this dataset is to build a model that can accurately predict the income level of an individual based on the provided features.
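
    As a rough illustration of the prediction task, the sketch below compares a majority-class baseline with a one-feature threshold rule. The rows, the `<=50K` / `>50K` label strings, the threshold of 13, and the helper names are all assumptions made for illustration; none of them are taken from the dataset itself.

```python
from collections import Counter

# Hypothetical rows using a subset of the listed columns; values are invented.
rows = [
    {"education-num": 9, "hours-per-week": 40, "salary": "<=50K"},
    {"education-num": 13, "hours-per-week": 50, "salary": ">50K"},
    {"education-num": 10, "hours-per-week": 35, "salary": "<=50K"},
    {"education-num": 14, "hours-per-week": 45, "salary": ">50K"},
    {"education-num": 7, "hours-per-week": 40, "salary": "<=50K"},
]

def majority_label(records, target="salary"):
    """Return the most frequent value of the target column."""
    return Counter(r[target] for r in records).most_common(1)[0][0]

def accuracy(records, predict, target="salary"):
    """Fraction of records whose target equals predict(record)."""
    hits = sum(1 for r in records if predict(r) == r[target])
    return hits / len(records)

# Baseline: always predict the most common label.
baseline = majority_label(rows)
baseline_acc = accuracy(rows, lambda r: baseline)

# A one-feature threshold rule to compare against the baseline.
rule_acc = accuracy(
    rows, lambda r: ">50K" if r["education-num"] >= 13 else "<=50K"
)
```

    Any real model should beat the majority-class baseline on held-out data; scoring on the training rows, as done here for brevity, overstates accuracy.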

  4. ‘Country Socioeconomic Status Scores: 1880-2010’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 24, 2018
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2018). ‘Country Socioeconomic Status Scores: 1880-2010’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-country-socioeconomic-status-scores-1880-2010-3da0/latest
    Explore at:
    Dataset updated
    Nov 24, 2018
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Country Socioeconomic Status Scores: 1880-2010’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sdorius/globses on 14 February 2022.

    --- Dataset description provided by original source is as follows ---

    This dataset contains estimates of the socioeconomic status (SES) position of each of 149 countries covering the period 1880-2010. Measures of SES, which are in decades, allow for a 130-year time-series analysis of the changing position of countries in the global status hierarchy. SES scores are the average of each country’s income and education ranking and are reported as percentile rankings ranging from 1-99. As such, they can be interpreted similarly to other percentile rankings, such as high school standardized test scores. If country A has an SES score of 55, for example, it indicates that 55 percent of the world’s people live in a country with a lower average income and education ranking than country A. ISO alpha and numeric country codes are included to allow users to merge these data with other variables, such as those found in the World Bank’s World Development Indicators Database and the United Nations Common Database.
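
    One way to read the scoring description is sketched below: rank countries by income and by education, average the two ranks, then report the share of population living in countries with a lower average rank. This is only an interpretation of the text, not the author's documented procedure; the four countries, their values, and the helper names are invented.

```python
# Invented values for four hypothetical countries; popshare sums to 1.
countries = {
    "A": {"gdppc": 1200.0, "yrseduc": 2.1, "popshare": 0.40},
    "B": {"gdppc": 5400.0, "yrseduc": 6.3, "popshare": 0.30},
    "C": {"gdppc": 9800.0, "yrseduc": 5.0, "popshare": 0.20},
    "D": {"gdppc": 21000.0, "yrseduc": 11.2, "popshare": 0.10},
}

def ranks(values):
    """Map each key to its 1-based rank, lowest value first (no ties assumed)."""
    ordered = sorted(values, key=values.get)
    return {name: i + 1 for i, name in enumerate(ordered)}

income_rank = ranks({k: v["gdppc"] for k, v in countries.items()})
educ_rank = ranks({k: v["yrseduc"] for k, v in countries.items()})

# Average of the income rank and the education rank for each country.
avg_rank = {k: (income_rank[k] + educ_rank[k]) / 2 for k in countries}

def pop_below(name):
    """Percent of population living in countries with a lower average rank."""
    share = sum(
        c["popshare"] for k, c in countries.items() if avg_rank[k] < avg_rank[name]
    )
    return 100 * share

ses_like = {k: pop_below(k) for k in countries}
```

    In this toy example countries B and C tie on average rank, so both receive the same score; how the real dataset resolves ties is not stated in the description.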

    See here for a working example of how the data might be used to better understand how the world came to look the way it does, at least in terms of status position of countries.

    VARIABLE DESCRIPTIONS:

    • UNID: ISO numeric country code (used by the United Nations)
    • WBID: ISO alpha country code (used by the World Bank)
    • SES: Socioeconomic status score (percentile) based on GDP per capita and educational attainment (n=174)
    • country: Short country name
    • year: Survey year
    • SES: Socioeconomic status score (1-99) for each of 174 countries
    • gdppc: GDP per capita: Single time-series (imputed)
    • yrseduc: Completed years of education in the adult (15+) population
    • popshare: Total population shares

    DATA SOURCES: The dataset was compiled by Shawn Dorius (sdorius@iastate.edu) from a large number of data sources, listed below.

    GDP per Capita:

    1. Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. Maddison population data in 000s; GDP & GDP per capita data in (1990 Geary-Khamis dollars, PPPs of currencies and average prices of commodities). Maddison data collected from: http://www.ggdc.net/MADDISON/Historical_Statistics/horizontal-file_02-2010.xls.
    2. World Development Indicators Database

    Years of Education:

    1. Morrisson and Murtin. 2009. 'The Century of Education'. Journal of Human Capital 3(1):1-42. Data downloaded from http://www.fabricemurtin.com/
    2. Cohen, Daniel & Marcelo Soto. 2007. 'Growth and human capital: Good data, good results'. Journal of Economic Growth 12(1):51-76. Data downloaded from http://soto.iae-csic.org/Data.htm
    3. Barro, Robert and Jong-Wha Lee. 2013. 'A New Data Set of Educational Attainment in the World, 1950-2010'. Journal of Development Economics, vol 104, pp.184-198. Data downloaded from http://www.barrolee.com/

    Total Population:

    1. Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. 13.
    2. United Nations Population Division. 2009.

    --- Original source retains full ownership of the source dataset ---

  5. USA Unemployment & Education Level

    • kaggle.com
    Updated Sep 29, 2021
    Cite
    Val Bauman (2021). USA Unemployment & Education Level [Dataset]. https://www.kaggle.com/valbauman/student-engagement-online-learning-supplement/activity
    Explore at:
    Croissant
    Dataset updated
    Sep 29, 2021
    Dataset provided by
    Kaggle
    Authors
    Val Bauman
    Area covered
    United States
    Description

    Context & Content

    This dataset consists of the unemployment rate and education level of adults in the USA by county. That is, for each county in the USA, this dataset provides the count and percentage of unemployed adults as well as the count and percentage of adults of various educational backgrounds. Each county has been assigned one of four locale categories (City, Suburb, Town, Rural) according to its 2013 Urban Influence Code, using the code descriptions provided in UIC_codes.csv and the descriptions of the locales "City", "Suburb", "Town", and "Rural" provided on page 2 of the locale user manual (locale_user_manual.pdf).

    The unemployment rate data includes the count and percentage of unemployed adults for each county in the USA for each year from 2000-2020. The median household income for 2019 is also included. The education level data includes the count and percentage of adults with less than a high school diploma, a high school diploma only, some college, and a bachelor's degree/four years of college or more for the years 1970, 1980, 1990, 2000, and 2019. The Urban Influence Code data includes the UIC and locale description of each county in the USA and the locale user manual has been included as a PDF as strictly a reference file, to understand how each county was assigned a locale within the unemployment.csv and education.csv files.

    Data Sources

    Source for the unemployment rate and education level data by county: "County-level Data Sets." USDA Economic Research Service, US Department of Agriculture. Access date: Sept 8, 2021. URL: https://www.ers.usda.gov/data-products/county-level-data-sets/

    Source for Urban Influence Codes by county: "Urban Influence Codes." USDA Economic Research Service, US Department of Agriculture. Access date: Sept 8, 2021. URL: https://www.ers.usda.gov/data-products/urban-influence-codes/#:~:text=The%202013%20Urban%20Influence%20Codes,to%20metro%20and%20micropolitan%20areas.&text=An%20update%20of%20the%20Urban,is%20planned%20for%20mid%2D2023.

    Inspiration

    This dataset was created to be used as an additional data source for the LearnPlatform COVID-19 Impact on Digital Learning Kaggle competition, but is suitable for other analyses related to unemployment rate and education level in the USA.

  6. Employee Attrition Uncleaned Dataset

    • kaggle.com
    Updated Aug 26, 2024
    Cite
    NIKHIL BHOSLE (2024). Employee Attrition Uncleaned Dataset [Dataset]. https://www.kaggle.com/datasets/nikhilbhosle/employee-attrition-uncleaned-dataset
    Explore at:
    Croissant
    Dataset updated
    Aug 26, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    NIKHIL BHOSLE
    Description

    The Synthetic Employee Attrition Dataset is a simulated dataset designed for the analysis and prediction of employee attrition. It contains detailed information about various aspects of an employee's profile, including demographics, job-related features, and personal circumstances.

    The dataset comprises 74,610 samples to facilitate model development and evaluation. Each record includes a unique Employee ID and features that influence employee attrition. The goal is to understand the factors contributing to attrition and develop predictive models to identify at-risk employees.

    This dataset is ideal for HR analytics, machine learning model development, and demonstrating advanced data analysis techniques. It provides a comprehensive and realistic view of the factors affecting employee retention, making it a valuable resource for researchers and practitioners in the field of human resources and organizational development.

    FEATURES:

    • Employee ID: A unique identifier assigned to each employee.
    • Age: The age of the employee, ranging from 18 to 60 years.
    • Gender: The gender of the employee.
    • Years at Company: The number of years the employee has been working at the company.
    • Monthly Income: The monthly salary of the employee, in dollars.
    • Job Role: The department or role the employee works in, encoded into categories such as Finance, Healthcare, Technology, Education, and Media.
    • Work-Life Balance: The employee's perceived balance between work and personal life (Poor, Below Average, Good, Excellent).
    • Job Satisfaction: The employee's satisfaction with their job (Very Low, Low, Medium, High).
    • Performance Rating: The employee's performance rating (Low, Below Average, Average, High).
    • Number of Promotions: The total number of promotions the employee has received.
    • Distance from Home: The distance between the employee's home and workplace, in miles.
    • Education Level: The highest education level attained by the employee (High School, Associate Degree, Bachelor’s Degree, Master’s Degree, PhD).
    • Marital Status: The marital status of the employee (Divorced, Married, Single).
    • Job Level: The job level of the employee (Entry, Mid, Senior).
    • Company Size: The size of the company the employee works for (Small, Medium, Large).
    • Company Tenure: The total number of years the employee has been working in the industry.
    • Remote Work: Whether the employee works remotely (Yes or No).
    • Leadership Opportunities: Whether the employee has leadership opportunities (Yes or No).
    • Innovation Opportunities: Whether the employee has opportunities for innovation (Yes or No).
    • Company Reputation: The employee's perception of the company's reputation (Very Poor, Poor, Good, Excellent).
    • Employee Recognition: The level of recognition the employee receives (Very Low, Low, Medium, High).
    • Attrition: Whether the employee has left the company, encoded as 0 (stayed) and 1 (left).
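
    Several of these features are ordered categories, which usually need a numeric encoding before modelling. The sketch below maps them to 0-based ranks; the level orderings are copied from the feature list, while the record values and the `encode_ordinal` helper are hypothetical.

```python
# Ordered category lists copied from the feature descriptions.
ORDINAL_LEVELS = {
    "Work-Life Balance": ["Poor", "Below Average", "Good", "Excellent"],
    "Job Satisfaction": ["Very Low", "Low", "Medium", "High"],
    "Performance Rating": ["Low", "Below Average", "Average", "High"],
    "Job Level": ["Entry", "Mid", "Senior"],
}

def encode_ordinal(record):
    """Return a copy of record with ordinal text fields replaced by 0-based ranks."""
    encoded = dict(record)
    for field, levels in ORDINAL_LEVELS.items():
        if field in encoded:
            # list.index raises ValueError on an unexpected category,
            # which helps surface dirty values in an uncleaned dataset.
            encoded[field] = levels.index(encoded[field])
    return encoded

# Hypothetical employee record; the values are invented for illustration.
employee = {
    "Employee ID": 1,
    "Work-Life Balance": "Good",
    "Job Level": "Senior",
    "Attrition": 0,
}
encoded_employee = encode_ordinal(employee)
```

    Unordered fields such as Job Role or Marital Status are left untouched here, since integer ranks would impose an order they do not have.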

  7. ‘Customer Personality Analysis’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 21, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Customer Personality Analysis’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-customer-personality-analysis-ff46/11756007/?iid=079-340&v=presentation
    Explore at:
    Dataset updated
    Nov 21, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Customer Personality Analysis’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/imakash3011/customer-personality-analysis on 21 November 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    Problem Statement

    Customer Personality Analysis is a detailed analysis of a company’s ideal customers. It helps a business to better understand its customers and makes it easier for them to modify products according to the specific needs, behaviors and concerns of different types of customers.

    Customer personality analysis helps a business to modify its product based on its target customers from different types of customer segments. For example, instead of spending money to market a new product to every customer in the company’s database, a company can analyze which customer segment is most likely to buy the product and then market the product only on that particular segment.

    Content

    Attributes

    People

    • ID: Customer's unique identifier
    • Year_Birth: Customer's birth year
    • Education: Customer's education level
    • Marital_Status: Customer's marital status
    • Income: Customer's yearly household income
    • Kidhome: Number of children in customer's household
    • Teenhome: Number of teenagers in customer's household
    • Dt_Customer: Date of customer's enrollment with the company
    • Recency: Number of days since customer's last purchase
    • Complain: 1 if customer complained in the last 2 years, 0 otherwise

    Products

    • MntWines: Amount spent on wine in last 2 years
    • MntFruits: Amount spent on fruits in last 2 years
    • MntMeatProducts: Amount spent on meat in last 2 years
    • MntFishProducts: Amount spent on fish in last 2 years
    • MntSweetProducts: Amount spent on sweets in last 2 years
    • MntGoldProds: Amount spent on gold in last 2 years

    Promotion

    • NumDealsPurchases: Number of purchases made with a discount
    • AcceptedCmp1: 1 if customer accepted the offer in the 1st campaign, 0 otherwise
    • AcceptedCmp2: 1 if customer accepted the offer in the 2nd campaign, 0 otherwise
    • AcceptedCmp3: 1 if customer accepted the offer in the 3rd campaign, 0 otherwise
    • AcceptedCmp4: 1 if customer accepted the offer in the 4th campaign, 0 otherwise
    • AcceptedCmp5: 1 if customer accepted the offer in the 5th campaign, 0 otherwise
    • Response: 1 if customer accepted the offer in the last campaign, 0 otherwise

    Place

    • NumWebPurchases: Number of purchases made through the company’s web site
    • NumCatalogPurchases: Number of purchases made using a catalogue
    • NumStorePurchases: Number of purchases made directly in stores
    • NumWebVisitsMonth: Number of visits to company’s web site in the last month

    Target

    Need to perform clustering to summarize customer segments.
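
    A minimal sketch of the clustering step is shown below, using a plain k-means loop on two features. The choice of k-means, the two features (Income and total amount spent), the invented values, and the seeding from the first k points are all assumptions; the source does not say which algorithm to use.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of floats; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids

# Invented (Income, total amount spent) pairs, both expressed in thousands.
customers = [
    (20.0, 0.1), (80.0, 1.5), (22.0, 0.2),
    (85.0, 1.8), (25.0, 0.15), (78.0, 1.2),
]
labels, centroids = kmeans(customers, k=2)
```

    With real data the features should be put on comparable scales first, otherwise the larger-valued column dominates the distance.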

    Solution

    The following link explains more about an approach to solving this problem. Visit this URL

    Inspiration

    Happy learning.

    Hope you like this dataset; please don't forget to like it.

    --- Original source retains full ownership of the source dataset ---

  8. Loan Approval Dataset

    • kaggle.com
    Updated Oct 15, 2024
    Cite
    Arbaaz Tamboli (2024). Loan Approval Dataset [Dataset]. https://www.kaggle.com/datasets/arbaaztamboli/loan-approval-dataset
    Explore at:
    Croissant
    Dataset updated
    Oct 15, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Arbaaz Tamboli
    Description

    This dataset contains a wealth of information from 52,000 loan applications, offering detailed insights into the factors that influence loan approval decisions. Collected from financial institutions, this data is highly valuable for credit risk analysis, financial modeling, and predictive analytics. The dataset is particularly useful for anyone interested in applying machine learning techniques to real-world financial decision-making scenarios.

    Overview: This dataset provides information about various applicants and the loans they applied for, including their demographic details, income, loan terms, and approval status. By analyzing this data, one can gain an understanding of which factors are most critical for determining the likelihood of loan approval. The dataset can also help in evaluating credit risk and building robust credit scoring systems.

    Dataset Columns:

    • Applicant_ID: Unique identifier for each loan application.
    • Gender: Gender of the applicant (Male/Female).
    • Age: Age of the applicant.
    • Marital_Status: Marital status of the applicant (Single/Married).
    • Dependents: Number of dependents the applicant has.
    • Education: Education level of the applicant (Graduate/Not Graduate).
    • Employment_Status: Employment status of the applicant (Employed, Self-Employed, Unemployed).
    • Occupation_Type: Type of occupation, which provides insights into the nature of the applicant’s job (Salaried, Business, Others).
    • Residential_Status: Type of residence (Owned, Rented, Mortgage).
    • City/Town: The city or town where the applicant resides.
    • Annual_Income: The total annual income of the applicant, a key factor in loan eligibility.
    • Monthly_Expenses: The monthly expenses of the applicant, indicating their financial obligations.
    • Credit_Score: The applicant's credit score, reflecting their creditworthiness.
    • Existing_Loans: Number of existing loans the applicant is servicing.
    • Total_Existing_Loan_Amount: The total amount of all existing loans the applicant has.
    • Outstanding_Debt: The remaining amount of debt yet to be paid by the applicant.
    • Loan_History: The applicant’s previous loan history (Good/Bad), indicating their repayment reliability.
    • Loan_Amount_Requested: The loan amount the applicant has applied for.
    • Loan_Term: The term of the loan in months.
    • Loan_Purpose: The purpose of the loan (e.g., Home, Car, Education, Personal, Business).
    • Interest_Rate: The interest rate applied to the loan.
    • Loan_Type: The type of loan (Secured/Unsecured).
    • Co-Applicant: Indicates if there is a co-applicant for the loan (Yes/No).
    • Bank_Account_History: Applicant’s banking history, showing past transactions and reliability.
    • Transaction_Frequency: The frequency of financial transactions in the applicant’s bank account (Low/Medium/High).
    • Default_Risk: The risk level of the applicant defaulting on the loan (Low/Medium/High).
    • Loan_Approval_Status: Final decision on the loan application (Approved/Rejected).

  9. Bangalore Consumer Food Delivery Survey

    • opendatabay.com
    Updated Jul 3, 2025
    Cite
    Datasimple (2025). Bangalore Consumer Food Delivery Survey [Dataset]. https://www.opendatabay.com/data/consumer/5a39206d-fcd2-472c-9bf9-d5100f226bcb
    Explore at:
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Bengaluru, Food & Beverage Consumption
    Description

    This dataset aims to explain the increasing demand for online food delivery services within metropolitan areas, specifically focusing on Bangalore, India. It compiles consumer survey data to shed light on the underlying reasons for this trend. The data is useful for building classification models, such as predicting customer repurchase behaviour, conducting text analysis on consumer reviews, and performing geo-spatial analysis related to consumer locations. This dataset originated from a master's thesis research project.

    Columns

    The dataset contains nearly 55 variables covering various aspects of consumer demographics and purchase decisions. Key columns include:

    • Age: The age of the consumer.
    • Gender: The gender of the consumer (e.g., Male, Female).
    • Marital Status: The marital status of the consumer (e.g., Single, Married, Other).
    • Occupation: The job or occupation status of the consumer (e.g., Student, Employee, Other).
    • Monthly Income: The income bracket of the consumer (e.g., No Income, 25001 to 50000, Other).
    • Educational Qualifications: The education level of the consumer (e.g., Graduate, Post Graduate, Other).
    • Family size: The number of family members or friends living with the consumer.
    • latitude: The latitude of the consumer's residence.
    • longitude: The longitude of the consumer's residence.
    • Pin code: The pincode of the consumer's residence within Bangalore.
    • Overall/general purchase decision: Information related to the consumer's general purchase choices.
    • Time of delivery influencing the purchase decision: Data on how delivery time affects purchase decisions.
    • Rating of Restaurant influencing the purchase decision: Data on how restaurant ratings influence purchase decisions.

    Distribution

    The dataset is structured with nearly 55 variables and is typically provided in a CSV file format. While specific total record counts are not available, value counts for various attributes offer insight into the data distribution. For example, 57% of consumers are Male and 43% are Female. Regarding marital status, 69% are Single, 28% are Married, and 3% fall into other categories. Students represent the largest occupation group at 53%, followed by Employees at 30%. Income distribution shows 48% with No Income. Educational qualifications are split almost evenly between Graduates (46%) and Post Graduates (45%). The data includes ranges and counts for attributes like age, family size, latitude, longitude, and Bangalore pincodes, indicating a diverse set of consumer responses.
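
    Percentage breakdowns like these can be reproduced from a column of raw responses. The sketch below shows one way with `collections.Counter`; the response list is invented to mirror the 57/43 split quoted above and is not the actual data, and `percentage_breakdown` is a hypothetical helper.

```python
from collections import Counter

def percentage_breakdown(values):
    """Return {category: percent of values}, rounded to whole percents."""
    counts = Counter(values)
    total = len(values)
    return {category: round(100 * n / total) for category, n in counts.items()}

# Invented responses for a hypothetical Gender column.
gender = ["Male"] * 57 + ["Female"] * 43
breakdown = percentage_breakdown(gender)
```

    Because each share is rounded independently, the percentages for a column with many categories may not sum to exactly 100.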

    Usage

    This dataset is ideal for:

    • Classification modelling: Predicting consumer behaviour, such as whether a consumer will make a repeat purchase.
    • Text analysis: Analysing consumer reviews to extract insights.
    • Geo-spatial analysis: Understanding purchasing patterns based on consumer location (latitude and longitude).
    • Market research: Gaining insights into consumer preferences and demographics impacting online food delivery.
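
    For the geo-spatial use case, a common first step is turning latitude and longitude pairs into distances. The sketch below uses the standard haversine formula; the reference point (approximate coordinates for central Bengaluru), the respondent location, and the function name are assumptions for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Approximate coordinates for central Bengaluru and a hypothetical respondent.
city_centre = (12.9716, 77.5946)
respondent = (12.9352, 77.6245)
distance_km = haversine_km(*city_centre, *respondent)
```

    A distance-to-centre column built this way could then serve as an extra feature in a classification model.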

    Coverage

    The dataset focuses geographically on the Bangalore region in India. It covers a diverse demographic scope of consumers residing in Bangalore, with detailed attributes including age, gender, marital status, occupation, income bracket, educational qualifications, and family size. Distributions for these demographic groups are available within the dataset. A specific time range for the data collection is not explicitly stated.

    License

    CC0

    Who Can Use It

    • Market Researchers: To analyse consumer demographics, purchase decisions, and the factors influencing online food delivery in Bangalore.
    • Food Delivery Companies: For strategic planning, customer segmentation, predicting repeat purchases, and optimising delivery services and restaurant partnerships.
    • Data Scientists and Analysts: For building predictive models, performing geo-spatial analysis, and conducting text mining on consumer feedback.
    • Academic Researchers and Students: Particularly those studying consumer behaviour, urban consumption trends, or e-commerce in India.

    Dataset Name Suggestions

    • Online Food Delivery Preferences - Bangalore Region
    • Bangalore Consumer Food Delivery Survey
    • Indian Metropolitan Food Delivery Habits
    • Consumer Demographics in Online Food Ordering
    • Food Delivery Behaviour - Bangalore

    Attributes

    Original Data Source: Online Food Delivery Preferences-Bangalore region

  10. Food Delivery Data

    • kaggle.com
    Updated Mar 19, 2024
    Cite
    ADtech_1234 (2024). Food Delivery Data [Dataset]. https://www.kaggle.com/datasets/adtech1234/food-delivery-data/discussion
    Explore at:
    Croissant
    Dataset updated
    Mar 19, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ADtech_1234
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The dataset titled "Online Delivery Data" comprises 388 entries, each representing an individual's response to a survey concerning their preferences and experiences with online food delivery services in Australia. The dataset is structured into 53 columns, encompassing a wide range of information from demographic details to specific preferences and feedback on online food delivery services. Below is an in-depth description of its structure and the types of information it contains.

    Dataset Overview

    • Entries: 388
    • Attributes: 53

    Core Attributes Description

    Demographic and Background Information

    • Age: The respondent's age.
    • Gender: The gender of the respondent.
    • Marital Status: Marital status of the respondent (e.g., Single, Married).
    • Occupation: The respondent's occupation.
    • Monthly Income: Monthly income category of the respondent.
    • Educational Qualifications: Educational level achieved by the respondent.
    • City: The city in Australia where the respondent resides.
    • Family size: Number of members in the respondent's family.

    Service Utilization Preferences

    • Medium of ordering (P1 and P2): Primary and secondary preferences for ordering mediums, such as food delivery apps or direct calls.
    • Meal preference (P1 and P2): Primary and secondary meal preferences.
    • Preference reasons (P1 and P2): Primary and secondary reasons for their preferences.

    Perceptions and Attitudes

    Various columns capture the respondent's attitudes towards ease and convenience, time-saving aspects, variety of choices, payment options, discounts and offers, food quality, tracking system, and several other factors related to online food delivery.

    Health and Hygiene Concerns

    Specific concerns regarding health, delivery punctuality, hygiene, and past negative experiences with online food delivery services.

    Service Quality and Feedback

    Attributes covering delivery time importance, packaging quality, customer service aspects (such as the number of calls to service and politeness), food freshness, temperature, taste, and quantity.

    • Output: Likely a binary response (e.g., Yes or No) to a specific survey question, which could pertain to the respondent's overall satisfaction or willingness to recommend the service.
    • Reviews: Open-ended feedback from respondents, providing qualitative insights into their experiences.

    Summary

    This dataset provides a comprehensive view of consumer preferences, behaviors, and satisfaction levels regarding online food delivery services in Australia. It encompasses a broad spectrum of variables from basic demographic information to detailed opinions on service quality, making it an invaluable resource for analyzing consumer trends, identifying areas for improvement in service delivery, and understanding the factors that influence customer satisfaction and loyalty in the online food delivery industry.

  11. Country Socioeconomic Status Scores, Part II

    • kaggle.com
    Updated Jul 14, 2017
    Cite
    sdorius (2017). Country Socioeconomic Status Scores, Part II [Dataset]. https://www.kaggle.com/datasets/sdorius/countryses/code
    Explore at:
    Croissant
    Dataset updated
    Jul 14, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    sdorius
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This dataset contains estimates of the socioeconomic status (SES) position of each of 149 countries covering the period 1880-2010. Measures of SES, which are in decades, allow for a 130-year time-series analysis of the changing position of countries in the global status hierarchy. SES scores are the average of each country’s income and education ranking and are reported as percentile rankings ranging from 1-99. As such, they can be interpreted similarly to other percentile rankings, such as high school standardized test scores. If country A has an SES score of 55, for example, it indicates that 55 percent of the countries in this dataset have a lower average income and education ranking than country A. ISO alpha and numeric country codes are included to allow users to merge these data with other variables, such as those found in the World Bank’s World Development Indicators Database and the United Nations Common Database.

    See here for a working example of how the data might be used to better understand how the world came to look the way it does, at least in terms of status position of countries.

    VARIABLE DESCRIPTIONS:

    unid: ISO numeric country code (used by the United Nations)

    wbid: ISO alpha country code (used by the World Bank)

    SES: Country socioeconomic status score (percentile) based on GDP per capita and educational attainment (n=174)

    country: Short country name

    year: Survey year

    gdppc: GDP per capita: Single time-series (imputed)

    yrseduc: Completed years of education in the adult (15+) population

    region5: Five category regional coding schema

    regionUN: United Nations regional coding schema

    DATA SOURCES:

    The dataset was compiled by Shawn Dorius (sdorius@iastate.edu) from a large number of data sources, listed below.

    GDP per Capita:

    1. Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. GDP & GDP per capita data in (1990 Geary-Khamis dollars, PPPs of currencies and average prices of commodities). Maddison data collected from: http://www.ggdc.net/MADDISON/Historical_Statistics/horizontal-file_02-2010.xls.

    2. World Development Indicators Database

    Years of Education:

    1. Morrisson and Murtin. 2009. 'The Century of Education'. Journal of Human Capital 3(1):1-42. Data downloaded from http://www.fabricemurtin.com/

    2. Cohen, Daniel & Marcelo Soto. 2007. 'Growth and human capital: Good data, good results'. Journal of Economic Growth 12(1):51-76. Data downloaded from http://soto.iae-csic.org/Data.htm

    3. Barro, Robert and Jong-Wha Lee, 2013, "A New Data Set of Educational Attainment in the World, 1950-2010." Journal of Development Economics, vol 104, pp.184-198. Data downloaded from http://www.barrolee.com/

    4. Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. 13.

    5. United Nations Population Division. 2009.

  12. 🌍 World Education Dataset 📚

    • kaggle.com
    Updated Nov 22, 2024
    Cite
    Bushra Qurban (2024). 🌍 World Education Dataset 📚 [Dataset]. https://www.kaggle.com/datasets/bushraqurban/world-education-dataset/versions/5
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Bushra Qurban
    License

    https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets

    Area covered
    World
    Description

    Dataset Overview 📝

    The dataset includes the following key indicators, collected for over 200 countries:

    • Government Expenditure on Education (% of GDP): Shows the percentage of a country’s GDP allocated to education.
    • Literacy Rate (Adult Total): Represents the percentage of the population aged 15 and above who can read and write.
    • Primary Completion Rate: The percentage of children who complete their primary education within the official age group.
    • Pupil-Teacher Ratio (Primary and Secondary Education): Indicates the average number of students per teacher at the primary and secondary levels.
    • School Enrollment Rates (Primary, Secondary, Tertiary): Reflects the percentage of the relevant age group enrolled in schools across different education levels.

    Data Source 🌐

    World Bank: This dataset is compiled from the World Bank's educational database, providing reliable, updated statistics on educational progress worldwide.

    Potential Use Cases 🔍

    This dataset is ideal for anyone interested in:

    • Educational Research: Understanding how education spending and policies impact literacy, enrollment, and overall educational outcomes.
    • Predictive Modeling: Building models to predict educational success factors, such as completion rates and literacy.
    • Global Education Analysis: Analyzing trends in global education systems and how different countries allocate resources to education.
    • Policy Development: Helping governments and organizations make data-driven decisions regarding educational reforms and funding.

    Key Questions You Can Explore 🤔

    • How does government expenditure on education correlate with literacy rates and school enrollment across different regions?
    • What are the trends in pupil-teacher ratios over time, and how do they affect educational outcomes?
    • How do education indicators differ between low-income and high-income countries?
    • Can we predict which countries will achieve universal primary education based on current trends?

    Important Notes ⚠️

    • Missing Data: Some values may be missing for certain years or countries. Consider using techniques like forward filling or interpolation when working with time series models.
    • Data Limitations: This dataset provides global averages and may not capture regional disparities within countries.
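
    The two gap-filling techniques mentioned in the notes can be sketched in plain Python as follows. This is a minimal illustration, assuming a yearly series stored as a list with None for missing years; the function names and the sample literacy values are hypothetical and not taken from the dataset.

```python
def forward_fill(series):
    """Replace each None with the most recent earlier observed value."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def interpolate(series):
    """Linearly interpolate None gaps that lie between two observed values."""
    result = list(series)
    known = [i for i, v in enumerate(series) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = series[left] + step * (i - left)
    return result


# Hypothetical yearly literacy rates with two missing years.
literacy = [80.0, None, None, 86.0, 87.0]
ffilled = forward_fill(literacy)
interpolated = interpolate(literacy)
```

    Forward filling repeats the last known value, which suits slowly changing indicators, while interpolation assumes a steady change between two observations. Neither fills gaps before the first observation, and interpolation leaves trailing gaps untouched.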

  13. Treadmill_company_dataset

    • kaggle.com
    Updated May 27, 2024
    Cite
    Atharva Sawai (2024). Treadmill_company_dataset [Dataset]. https://www.kaggle.com/datasets/atharvasawai/treadmill-company-dataset/code
    Dataset updated
    May 27, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Atharva Sawai
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset contains data from a fitness company that sells three treadmill models. It covers Product (i.e., the treadmill model), Age of the buyer, Gender, Education, Marital Status, Usage, Fitness, Income, and Miles.

  14. Data on Bike Buyers by using MS EXCEL

    • kaggle.com
    Updated Mar 25, 2022
    Cite
    Umasri (2022). Data on Bike Buyers by using MS EXCEL [Dataset]. https://www.kaggle.com/datasets/unica02/data-on-bike-buyers-by-using-ms-excel/code
    Dataset updated
    Mar 25, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Umasri
    Description

    The dataset includes Customer ID, Marital Status, Gender, Income, Children, Education, Occupation, Home Owner, Cars, Commute Distance, Region, Age, and Purchased Bike.

  15. Predicting Earnings from census data

    • kaggle.com
    Updated Dec 30, 2019
    Cite
    piAI (2019). Predicting Earnings from census data [Dataset]. https://www.kaggle.com/econdata/predicting-earnings-from-census-data/code
    Dataset updated
    Dec 30, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    piAI
    Description

    Context

    The United States government periodically collects demographic information by conducting a census.

    In this problem, we are going to use census information about an individual to predict how much a person earns -- in particular, whether the person earns more than $50,000 per year. This data comes from the UCI Machine Learning Repository.

    The file census.csv contains 1994 census data for 31,978 individuals in the United States.

    Content

    The dataset includes the following 13 variables:

    • age = the age of the individual in years
    • workclass = the classification of the individual's working status (does the person work for the federal government, work for the local government, work without pay, and so on)
    • education = the level of education of the individual (e.g., 5th-6th grade, high school graduate, PhD, so on)
    • maritalstatus = the marital status of the individual
    • occupation = the type of work the individual does (e.g., administrative/clerical work, farming/fishing, sales and so on)
    • relationship = relationship of individual to his/her household
    • race = the individual's race
    • sex = the individual's sex
    • capitalgain = the capital gains of the individual in 1994 (from selling an asset such as a stock or bond for more than the original purchase price)
    • capitalloss = the capital losses of the individual in 1994 (from selling an asset such as a stock or bond for less than the original purchase price)
    • hoursperweek = the number of hours the individual works per week
    • nativecountry = the native country of the individual
    • over50k = whether or not the individual earned more than $50,000 in 1994

    Acknowledgements

    MITx ANALYTIX

  16. Employment Of India CLeaned and Messy Data

    • kaggle.com
    Updated Apr 7, 2025
    Cite
    SONIA SHINDE (2025). Employment Of India CLeaned and Messy Data [Dataset]. https://www.kaggle.com/datasets/soniaaaaaaaa/employment-of-india-cleaned-and-messy-data
    Dataset updated
    Apr 7, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    SONIA SHINDE
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    India
    Description

    This dataset presents a dual-version representation of employment-related data from India, crafted to highlight the importance of data cleaning and transformation in any real-world data science or analytics project.

    🔹 Dataset Composition:

    It includes two parallel datasets:
    1. Messy Dataset (Raw): Represents a typical unprocessed dataset often encountered in data collection from surveys, databases, or manual entries.
    2. Cleaned Dataset: This version demonstrates how proper data preprocessing can significantly enhance the quality and usability of data for analytical and visualization purposes.

    Each record captures multiple attributes related to individuals in the Indian job market, including:
    - Age Group
    - Employment Status (Employed/Unemployed)
    - Monthly Salary (INR)
    - Education Level
    - Industry Sector
    - Years of Experience
    - Location
    - Perceived AI Risk
    - Date of Data Recording

    Transformations & Cleaning Applied:

    The raw dataset underwent comprehensive transformations to convert it into its clean, analysis-ready form:
    - Missing Values: Identified and handled using either row elimination (where critical data was missing) or imputation techniques.
    - Duplicate Records: Identified using row comparison and removed to prevent analytical skew.
    - Inconsistent Formatting: Unified inconsistent naming in columns (like 'monthly_salary_(inr)' → 'Monthly Salary (INR)'), capitalization, and string spacing.
    - Incorrect Data Types: Converted columns like salary from string/object to float for numerical analysis.
    - Outliers: Detected and handled based on domain logic and distribution analysis.
    - Categorization: Converted numeric ages into grouped age categories for comparative analysis.
    - Standardization: Uniform labels for employment status, industry names, education, and AI risk levels were applied for visualization clarity.
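
    A few of the cleaning steps listed above can be sketched in plain Python. This is a hedged illustration rather than the pipeline actually used for the dataset: only the column name 'monthly_salary_(inr)' comes from the description, while the other column names, the age bin edges, and the sample records are hypothetical.

```python
def parse_salary(raw):
    """Convert a salary string such as ' 45,000 ' to a float; return None if empty."""
    text = raw.strip().replace(",", "")
    return float(text) if text else None


def age_group(age):
    """Bucket a numeric age into a coarse category (hypothetical bin edges)."""
    if age < 25:
        return "Under 25"
    if age < 35:
        return "25-34"
    if age < 50:
        return "35-49"
    return "50+"


def clean(rows):
    """Drop exact duplicates and rows without a salary, then fix types and labels."""
    seen, cleaned = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        salary = parse_salary(row["monthly_salary_(inr)"])
        if salary is None:
            continue  # critical value missing: eliminate the row
        cleaned.append({
            "Monthly Salary (INR)": salary,
            "Age Group": age_group(int(row["age"])),
            "Employment Status": row["employment_status"].strip().title(),
        })
    return cleaned


# Hypothetical messy records; 'age' and 'employment_status' are assumed column names.
messy = [
    {"age": "23", "monthly_salary_(inr)": " 45,000 ", "employment_status": "employed "},
    {"age": "23", "monthly_salary_(inr)": " 45,000 ", "employment_status": "employed "},
    {"age": "41", "monthly_salary_(inr)": "", "employment_status": "UNEMPLOYED"},
    {"age": "52", "monthly_salary_(inr)": "82000", "employment_status": "Employed"},
]
tidy = clean(messy)
```

    Row elimination is used here for the missing salary because it is the simpler of the two options the description names; imputation would keep the row and fill in an estimate instead.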

    Purpose & Utility:

    This dataset is ideal for learners and professionals who want to understand:
    - The impact of messy data on visualization and insights
    - How transformation steps can dramatically improve data interpretation
    - Practical examples of preprocessing techniques before feeding into ML models or BI tools

    It's also useful for:
    - Training ML models with clean inputs
    - Data storytelling with visual clarity
    - Demonstrating reproducibility in data cleaning pipelines

    By examining both the messy and clean datasets, users gain a deeper appreciation for why “garbage in, garbage out” rings true in the world of data science.

  17. salary data sheet for a company

    • kaggle.com
    Updated Oct 12, 2024
    Cite
    Mohamed Elkahwagy (2024). salary data sheet for a company [Dataset]. https://www.kaggle.com/datasets/mohamedelkahwagy/salary-data-sheet-for-a-company/discussion
    Dataset updated
    Oct 12, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mohamed Elkahwagy
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The motivation behind analyzing salary data is to gain insights into compensation trends, identify factors that influence pay, and understand disparities across industries, locations, or job roles. For businesses, this analysis is crucial in shaping competitive compensation packages, attracting top talent, and ensuring fair pay practices. Additionally, individuals can benefit from understanding how their salaries compare to industry standards, aiding in negotiation strategies.

    Context

    With increasing attention on pay transparency and equity, salary data has become a critical dataset for human resources departments, economists, and policymakers. Companies and industries alike need to assess compensation against benchmarks, inflation, and the evolving job market. Salary datasets often contain variables such as job titles, experience levels, education, locations, and industries, which are essential in determining pay structures. This analysis allows for a deeper dive into trends like gender pay gaps, regional disparities, and the impact of education or experience on earnings.

    For the Kaggle community, salary datasets provide rich opportunities for performing exploratory data analysis, statistical modeling, and predictive analytics. It serves as a hands-on opportunity to practice data wrangling, feature engineering, and model building, especially in the realm of HR analytics.

    Description

    This CSV file contains anonymized company salary data across various industries, roles, and locations. The dataset includes key variables such as:

    • Job Title: The role of the employee (e.g., Data Analyst, Software Engineer).
    • Years of Experience: Number of years the employee has been in the workforce or industry.
    • Education Level: The highest degree obtained by the employee (e.g., Bachelor's, Master's).
    • Location: City or country where the employee works.
    • Industry: The sector in which the company operates (e.g., Finance, Technology).
    • Annual Salary: The employee’s yearly earnings, including bonuses or incentives.
    • Gender: Gender identification of the employee (if available).
    • Remote Work Percentage: The percentage of work conducted remotely, which may influence salary based on location independence.

    The dataset is perfect for understanding how salaries vary by job role, region, industry, and experience level. It can also be used to uncover trends such as salary growth over time, the impact of education or certifications on compensation, or potential gender pay gaps. Through data visualization, predictive models, and regression analysis, users can extract meaningful insights that could inform corporate strategy, HR policies, or even career decisions.

  18. Customer marketing (For Cluster Training)

    • kaggle.com
    Updated Nov 26, 2022
    Cite
    Mahdi Navaei (2022). Customer marketing (For Cluster Training) [Dataset]. https://www.kaggle.com/datasets/mahdinavaei/customermarketing/discussion
    Dataset updated
    Nov 26, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mahdi Navaei
    Description

    Context Problem Statement

    Customer Personality Analysis is a detailed analysis of a company’s ideal customers. It helps a business to better understand its customers. It makes it easier for them to modify products according to the specific needs, behaviors, and concerns of different types of customers.

    Customer personality analysis helps a business to modify its product based on its target customers from different types of customer segments. For example, instead of spending money to market a new product to every customer in the company’s database, a company can analyze which customer segment is most likely to buy the product and then market the product only to that particular segment.

    Content Attributes

    People

    • ID: Customer's unique identifier
    • Year_Birth: Customer's birth year
    • Education: Customer's education level
    • Marital_Status: Customer's marital status
    • Income: Customer's yearly household income
    • Kidhome: Number of children in customer's household
    • Teenhome: Number of teenagers in customer's household
    • Dt_Customer: Date of customer's enrollment with the company
    • Recency: Number of days since customer's last purchase
    • Complain: 1 if the customer complained in the last 2 years, 0 otherwise

    Products

    • MntWines: Amount spent on wine in last 2 years
    • MntFruits: Amount spent on fruits in last 2 years
    • MntMeatProducts: Amount spent on meat in last 2 years
    • MntFishProducts: Amount spent on fish in last 2 years
    • MntSweetProducts: Amount spent on sweets in last 2 years
    • MntGoldProds: Amount spent on gold in last 2 years

    Promotion

    • NumDealsPurchases: Number of purchases made with a discount
    • AcceptedCmp1: 1 if the customer accepted the offer in the 1st campaign, 0 otherwise
    • AcceptedCmp2: 1 if the customer accepted the offer in the 2nd campaign, 0 otherwise
    • AcceptedCmp3: 1 if the customer accepted the offer in the 3rd campaign, 0 otherwise
    • AcceptedCmp4: 1 if the customer accepted the offer in the 4th campaign, 0 otherwise
    • AcceptedCmp5: 1 if the customer accepted the offer in the 5th campaign, 0 otherwise
    • Response: 1 if the customer accepted the offer in the last campaign, 0 otherwise

    Place

    • NumWebPurchases: Number of purchases made through the company’s website
    • NumCatalogPurchases: Number of purchases made using a catalog
    • NumStorePurchases: Number of purchases made directly in stores
    • NumWebVisitsMonth: Number of visits to the company’s website in the last month

    Target

    Need to perform clustering to summarize customer segments.

    Inspiration happy learning….

    I hope you like this dataset please don't forget to like this dataset

  19. Country Socioeconomic Status Scores: 1880-2010

    • kaggle.com
    Updated Apr 18, 2017
    Cite
    sdorius (2017). Country Socioeconomic Status Scores: 1880-2010 [Dataset]. https://www.kaggle.com/sdorius/globses/discussion
    Dataset updated
    Apr 18, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    sdorius
    Description

    This dataset contains estimates of the socioeconomic status (SES) position of each of 149 countries covering the period 1880-2010. Measures of SES, which are in decades, allow for a 130-year time-series analysis of the changing position of countries in the global status hierarchy. SES scores are the average of each country’s income and education ranking and are reported as percentile rankings ranging from 1 to 99. As such, they can be interpreted similarly to other percentile rankings, such as high school standardized test scores. If country A has an SES score of 55, for example, it indicates that 55 percent of the world’s people live in a country with a lower average income and education ranking than country A. ISO alpha and numeric country codes are included to allow users to merge these data with other variables, such as those found in the World Bank’s World Development Indicators Database and the United Nations Common Database.
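
    The population-weighted reading of the score ("55 percent of the world’s people live in a country with a lower ranking") can be sketched as below. This is an assumption-laden illustration and not the author's published method: the function name and the toy average ranks and population shares are hypothetical, and ties are treated as not lower.

```python
def weighted_percentiles(avg_rank, popshare):
    """For each country, sum the population shares of countries ranked strictly lower."""
    return [
        100 * sum(share for other, share in zip(avg_rank, popshare) if other < mine)
        for mine in avg_rank
    ]


# Hypothetical average income/education ranks and population shares (shares sum to 1).
avg_rank = [1.5, 3.0, 4.0, 1.0]
popshare = [0.5, 0.25, 0.125, 0.125]
weighted = weighted_percentiles(avg_rank, popshare)
```

    Compared with counting countries equally, weighting by population share lets one large country move the scores of every country ranked above it.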

    See here for a working example of how the data might be used to better understand how the world came to look the way it does, at least in terms of status position of countries.

    VARIABLE DESCRIPTIONS:

    UNID: ISO numeric country code (used by the United Nations)
    WBID: ISO alpha country code (used by the World Bank)
    SES: Socioeconomic status score (percentile) based on GDP per capita and educational attainment (n=174)
    country: Short country name
    year: Survey year
    SES: Socioeconomic status score (1-99) for each of 174 countries
    gdppc: GDP per capita: Single time-series (imputed)
    yrseduc: Completed years of education in the adult (15+) population
    popshare: Total population shares

    DATA SOURCES: The dataset was compiled by Shawn Dorius (sdorius@iastate.edu) from a large number of data sources, listed below.

    GDP per Capita:
    1. Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. Maddison population data in 000s; GDP & GDP per capita data in (1990 Geary-Khamis dollars, PPPs of currencies and average prices of commodities). Maddison data collected from: http://www.ggdc.net/MADDISON/Historical_Statistics/horizontal-file_02-2010.xls.
    2. World Development Indicators Database

    Years of Education:
    1. Morrisson and Murtin. 2009. 'The Century of Education'. Journal of Human Capital 3(1):1-42. Data downloaded from http://www.fabricemurtin.com/
    2. Cohen, Daniel & Marcelo Soto. 2007. 'Growth and human capital: Good data, good results'. Journal of Economic Growth 12(1):51-76. Data downloaded from http://soto.iae-csic.org/Data.htm
    3. Barro, Robert and Jong-Wha Lee, 2013, "A New Data Set of Educational Attainment in the World, 1950-2010." Journal of Development Economics, vol 104, pp.184-198. Data downloaded from http://www.barrolee.com/

    Total Population:
    1. Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. 13.
    2. United Nations Population Division. 2009.

  20. bank_loan_data

    • kaggle.com
    Updated Feb 19, 2025
    Cite
    Uday Malviya (2025). bank_loan_data [Dataset]. http://doi.org/10.34740/kaggle/dsv/10791226
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Uday Malviya
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Overview

    This dataset contains 45,000 records of loan applicants, with various attributes related to personal demographics, financial status, and loan details. The dataset can be used for predictive modeling, particularly in credit risk assessment and loan default prediction.

    Dataset Content

    The dataset includes 14 columns representing different factors influencing loan approvals and defaults:

    Personal Information

    • person_age: Age of the applicant (in years).
    • person_gender: Gender of the applicant (male, female).
    • person_education: Educational background (High School, Bachelor, Master, etc.).
    • person_income: Annual income of the applicant (in USD).
    • person_emp_exp: Years of employment experience.
    • person_home_ownership: Type of home ownership (RENT, OWN, MORTGAGE).

    Loan Details

    • loan_amnt: Loan amount requested (in USD).
    • loan_intent: Purpose of the loan (PERSONAL, EDUCATION, MEDICAL, etc.).
    • loan_int_rate: Interest rate on the loan (percentage).
    • loan_percent_income: Ratio of loan amount to income.

    Credit & Loan History

    • cb_person_cred_hist_length: Length of the applicant's credit history (in years).
    • credit_score: Credit score of the applicant.
    • previous_loan_defaults_on_file: Whether the applicant has previous loan defaults (Yes or No).

    Target Variable

    • loan_status: 1 if the loan was repaid successfully, 0 if the applicant defaulted.

    Use Cases

    • Loan Default Prediction: Build a classification model to predict loan repayment.
    • Credit Risk Analysis: Analyze the relationship between income, credit score, and loan defaults.
    • Feature Engineering: Extract new insights from employment history, home ownership, and loan amounts.

    Acknowledgments

    This dataset is synthetic and designed for machine learning and financial risk analysis.
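
    As a small feature-engineering sketch for the columns described above, the snippet below recomputes the loan-to-income ratio and turns the Yes/No default history into a numeric flag. The column names come from the description, but the function name, the derived key previous_default, the two-decimal rounding, and the sample record are assumptions made for illustration.

```python
def engineer(record):
    """Return a copy of the record with a recomputed ratio and a numeric default flag."""
    features = dict(record)
    # Ratio of loan amount to annual income, as loan_percent_income is described.
    features["loan_percent_income"] = round(record["loan_amnt"] / record["person_income"], 2)
    # Hypothetical derived column: 1 if a previous default is on file, else 0.
    features["previous_default"] = 1 if record["previous_loan_defaults_on_file"] == "Yes" else 0
    return features


# Hypothetical applicant record (values are not taken from the dataset).
sample = {
    "person_income": 60000.0,
    "loan_amnt": 15000.0,
    "previous_loan_defaults_on_file": "No",
}
features = engineer(sample)
```

    Numeric flags like this are what most classifiers expect as input when predicting loan_status.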
