https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains synthetic data generated for customer churn analysis. It includes 1,000 entries representing customer information, such as demographics, account details, subscription types, and churn status. The data is suited to predictive modeling, machine learning algorithms, and exploratory data analysis (EDA).
Features:
- CustomerID: A unique identifier for each customer.
- Gender: Male or Female.
- Age: Customer's age in years.
- Geography: Country or region of the customer (e.g., Germany, France, UK).
- Tenure: Number of months the customer has been with the company.
- Contract: Type of subscription (Month-to-month, One-year, Two-year).
- MonthlyCharges: The amount billed monthly.
- TotalCharges: The total amount billed to date.
- PaymentMethod: Method used for payments (e.g., Credit card, Direct debit).
- IsActiveMember: Whether the customer is an active member (1 = Active, 0 = Inactive).
- Churn: Indicates whether the customer has churned (Yes/No).
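As a starting point for exploratory analysis, the following is a minimal sketch that computes the churn rate per contract type. It uses only the `Contract` and `Churn` columns named above; the sample rows and the function name `churn_rate_by_contract` are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows for illustration only; they are NOT from the real dataset.
SAMPLE_CSV = """CustomerID,Contract,Churn
1,Month-to-month,Yes
2,Month-to-month,No
3,One-year,No
4,Two-year,No
5,Month-to-month,Yes
"""


def churn_rate_by_contract(csv_file):
    """Return the fraction of churned customers for each contract type."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["Contract"]] += 1
        if row["Churn"] == "Yes":
            churned[row["Contract"]] += 1
    return {contract: churned[contract] / totals[contract] for contract in totals}


rates = churn_rate_by_contract(io.StringIO(SAMPLE_CSV))
for contract, rate in rates.items():
    print(f"{contract}: {rate:.2f}")
```

To run this on the downloaded file, pass an open file handle (for example `open(path, newline="")`) instead of the in-memory sample.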
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset was compiled to examine the use of ChatGPT 3.5 in educational settings, particularly for creating and personalizing concept maps. The data has been organized into three folders: Maps, Texts, and Questionnaires. The Maps folder contains the graphical representations of the concept maps and the PlantUML code for drawing them, in Italian and English. The Texts folder contains the source text used as input for creating the maps. The Questionnaires folder includes the students' responses to the three administered questionnaires.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
DISL
The DISL dataset is a collection of 514,506 unique Solidity files that have been deployed to Ethereum mainnet. It addresses the need for a large and diverse dataset of real-world smart contracts. DISL serves as a resource for developing machine learning systems and for benchmarking software engineering tools designed for smart contracts.
Content
The raw subset contains the full contract source code and is not deduplicated; it has 3,298,271 smart contracts the… See the full description on the dataset page: https://huggingface.co/datasets/ASSERT-KTH/DISL.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We include the sets of adversarial questions for each of the seven EquityMedQA datasets (OMAQ, EHAI, FBRT-Manual, FBRT-LLM, TRINDS, CC-Manual, and CC-LLM), the three other non-EquityMedQA datasets used in this work (HealthSearchQA, Mixed MMQA-OMAQ, and Omiye et al.), as well as the data generated as a part of the empirical study, including the generated model outputs (Med-PaLM 2 [1] primarily, with Med-PaLM [2] answers for pairwise analyses) and ratings from human annotators (physicians, health equity experts, and consumers). See the paper for details on all datasets.
We include other datasets evaluated in this work: HealthSearchQA [2], Mixed MMQA-OMAQ, and Omiye et al [3].
A limited number of data elements described in the paper are not included here. The following elements are excluded:
The reference answers written by physicians to HealthSearchQA questions, introduced in [2], and the set of corresponding pairwise ratings. This accounts for 2,122 rated instances.
The free-text comments written by raters during the ratings process.
Demographic information associated with the consumer raters (only age group information is included).
[1] Singhal, K., et al. Towards expert-level medical question answering with large language models. arXiv preprint arXiv:2305.09617 (2023).
[2] Singhal, K., Azizi, S., Tu, T. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023). https://doi.org/10.1038/s41586-023-06291-2
[3] Omiye, J.A., Lester, J.C., Spichak, S. et al. Large language models propagate race-based medicine. npj Digit. Med. 6, 195 (2023). https://doi.org/10.1038/s41746-023-00939-z
[4] Abacha, Asma Ben, et al. "Overview of the medical question answering task at TREC 2017 LiveQA." TREC. 2017.
[5] Abacha, Asma Ben, et al. "Bridging the gap between consumers’ medication questions and trusted answers." MEDINFO 2019: Health and Wellbeing e-Networks for All. IOS Press, 2019. 25-29.
Independent Ratings [ratings_independent.csv]: Contains ratings of the presence of bias and its dimensions in Med-PaLM 2 outputs using the independent assessment rubric for each of the datasets studied. The primary response regarding the presence of bias is encoded in the column bias_presence with three possible values (No bias, Minor bias, Severe bias). Binary assessments of the dimensions of bias are encoded in separate columns (e.g., inaccuracy_for_some_axes). Instances for the Mixed MMQA-OMAQ dataset are triple-rated for each rater group; other datasets are single-rated. Ratings were missing for five instances in MMQA-OMAQ and two instances in CC-Manual. This file contains 7,519 rated instances.
Paired Ratings [ratings_pairwise.csv]: Contains comparisons of the presence or degree of bias and its dimensions in Med-PaLM and Med-PaLM 2 outputs for each of the datasets studied. Pairwise responses are encoded in terms of two binary columns corresponding to which of the answers was judged to contain a greater degree of bias (e.g., Med-PaLM-2_answer_more_bias). Dimensions of bias are encoded in the same way as for ratings_independent.csv. Instances for the Mixed MMQA-OMAQ dataset are triple-rated for each rater group; other datasets are single-rated. Four ratings were missing (one for EHAI, two for FBRT-Manual, one for FBRT-LLM). This file contains 6,446 rated instances.
Counterfactual Paired Ratings [ratings_counterfactual.csv]: Contains ratings under the counterfactual rubric for pairs of questions defined in the CC-Manual and CC-LLM datasets. Contains a binary assessment of the presence of bias (bias_presence), columns for each dimension of bias, and categorical columns corresponding to other elements of the rubric (ideal_answers_diff, how_answers_diff). Instances for the CC-Manual dataset are triple-rated; instances for CC-LLM are single-rated. Due to a data processing error, we removed questions that refer to "Natal" from the analysis of the counterfactual rubric on the CC-Manual dataset. This affects three questions (corresponding to 21 pairs) derived from one seed question based on the TRINDS dataset. This file contains 1,012 rated instances.
Open-ended Medical Adversarial Queries (OMAQ) [equitymedqa_omaq.csv]: Contains questions that compose the OMAQ dataset. The OMAQ dataset was first described in [1].
Equity in Health AI (EHAI) [equitymedqa_ehai.csv]: Contains questions that compose the EHAI dataset.
Failure-Based Red Teaming - Manual (FBRT-Manual) [equitymedqa_fbrt_manual.csv]: Contains questions that compose the FBRT-Manual dataset.
Failure-Based Red Teaming - LLM (FBRT-LLM); full [equitymedqa_fbrt_llm.csv]: Contains questions that compose the extended FBRT-LLM dataset.
Failure-Based Red Teaming - LLM (FBRT-LLM) [equitymedqa_fbrt_llm_661_sampled.csv]: Contains questions that compose the sampled FBRT-LLM dataset used in the empirical study.
TRopical and INfectious DiseaseS (TRINDS) [equitymedqa_trinds.csv]: Contains questions that compose the TRINDS dataset.
Counterfactual Context - Manual (CC-Manual) [equitymedqa_cc_manual.csv]: Contains pairs of questions that compose the CC-Manual dataset.
Counterfactual Context - LLM (CC-LLM) [equitymedqa_cc_llm.csv]: Contains pairs of questions that compose the CC-LLM dataset.
HealthSearchQA [other_datasets_healthsearchqa.csv]: Contains questions sampled from the HealthSearchQA dataset [1,2].
Mixed MMQA-OMAQ [other_datasets_mixed_mmqa_omaq]: Contains questions that compose the Mixed MMQA-OMAQ dataset.
Omiye et al. [other_datasets_omiye_et_al]: Contains questions proposed in Omiye et al. [3].
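The bias_presence column described for ratings_independent.csv lends itself to a quick tally. The following is a minimal sketch, assuming the file is a regular comma-separated file with a header row; the `example_id` column and the sample rows are hypothetical placeholders, and only the column name bias_presence and its three values come from the description above.

```python
import csv
import io
from collections import Counter

# Placeholder rows for illustration only; they are NOT real ratings.
# `example_id` is a hypothetical column name used to make the sample readable.
SAMPLE_CSV = """example_id,bias_presence
a,No bias
b,Minor bias
c,No bias
d,Severe bias
"""


def count_bias_presence(csv_file):
    """Count how often each bias_presence value occurs."""
    reader = csv.DictReader(csv_file)
    return Counter(row["bias_presence"] for row in reader)


counts = count_bias_presence(io.StringIO(SAMPLE_CSV))
for label in ("No bias", "Minor bias", "Severe bias"):
    print(label, counts[label])
```

For the real file, replace the in-memory sample with `open("ratings_independent.csv", newline="")`. Because Mixed MMQA-OMAQ instances are triple-rated, counts taken this way are per rating, not per question.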
Version 2: Updated to include ratings and generated model outputs. Dataset files were updated to include unique ids associated with each question.
Version 1: Contained datasets of questions without ratings. Consistent with v1 available as a preprint on arXiv (https://arxiv.org/abs/2403.12025).
WARNING: These datasets contain adversarial questions designed specifically to probe biases in AI systems. They can include human-written and model-generated language and content that may be inaccurate, misleading, biased, disturbing, sensitive, or offensive.
NOTE: the content of this research repository (i) is not intended to be a medical device; and (ii) is not intended for clinical use of any kind, including but not limited to diagnosis or prognosis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The CIFAR-100 dataset is a dataset of labeled images that is widely used in machine learning and artificial intelligence research. The dataset consists of 60,000 32x32 color images. These images are split into 100 mutually exclusive classes, with 600 images per class. The classes cover animals, vehicles, and other objects.
This HydroShare resource was created to share large spatial sample (LSS) datasets in Maryland on GeoServer (https://geoserver.hydroshare.org/geoserver/web/wicket/bookmarkable/org.geoserver.web.demo.MapPreviewPage) and THREDDS (https://thredds.hydroshare.org/thredds/catalog/hydroshare/resources/catalog.html).
Users can check the uploaded LSS datasets on HydroShare-GeoServer and THREDDS using this HS resource id.
Then, through the RHESSys workflows, users can subset LSS datasets using OWSLib and xarray.
Big-Math: UNVERIFIED
WARNING: This dataset contains ONLY questions whose answers have not been verified to be correct. Use this dataset with caution.
Dataset Creation
Big-Math-Unverified was created as an offshoot of the Big-Math dataset. Big-Math-Unverified goes through the same filters as the rest of Big-Math (e.g., remove non-English, remove multiple choice, etc.), except that these problems were not solved in any of the… See the full description on the dataset page: https://huggingface.co/datasets/SynthLabsAI/Big-Math-RL-UNVERIFIED.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
We present DaMuEL, a large Multilingual Dataset for Entity Linking containing data in 53 languages. DaMuEL consists of two components: a knowledge base that contains language-agnostic information about entities, including their claims from Wikidata and named entity types (PER, ORG, LOC, EVENT, BRAND, WORK_OF_ART, MANUFACTURED); and Wikipedia texts with entity mentions linked to the knowledge base, along with language-specific text from Wikidata such as labels, aliases, and descriptions, stored separately for each language. The Wikidata QID is used as a persistent, language-agnostic identifier, enabling the combination of the knowledge base with language-specific texts and information for each entity. Wikipedia documents deliberately annotate only a single mention for every entity present; we further automatically detect all mentions of named entities linked from each document. The dataset contains 27.9M named entities in the knowledge base and 12.3G tokens from Wikipedia texts. The dataset is published under the CC BY-SA licence.
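To illustrate how a shared QID lets the language-agnostic knowledge base be combined with per-language text, here is a minimal sketch. The dictionary layout, the field names (`type`, `label`), the choice of languages, and the function `entity_view` are assumptions made for this example; they do not reflect DaMuEL's actual file format.

```python
# Hypothetical in-memory layout; DaMuEL's real storage format may differ.
# Language-agnostic knowledge base, keyed by Wikidata QID.
knowledge_base = {
    "Q90": {"type": "LOC"},
}

# Language-specific text, stored separately per language and keyed by the same QID.
texts_by_language = {
    "en": {"Q90": {"label": "Paris"}},
    "de": {"Q90": {"label": "Paris"}},
}


def entity_view(qid, language):
    """Merge language-agnostic facts with language-specific text for one entity."""
    merged = dict(knowledge_base[qid])
    # An entity may have no text in a given language; fall back to no extra fields.
    merged.update(texts_by_language.get(language, {}).get(qid, {}))
    merged["qid"] = qid
    return merged


view = entity_view("Q90", "en")
print(view)
```

Keeping the QID as the only join key means a new language can be added without touching the knowledge base.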
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Sergey Bochenkov
Released under CC0: Public Domain
AI/Human Music (Large variant)
A dataset that comprises both AI-generated music and human-composed music. This is the "large" variant of the dataset, which is around 70 GiB in size. It contains 10,000 human-composed audio files and 10,000 AI-generated audio files. The distribution is: 256 are from SunoCaps, 4,872 are from Udio, and 4,872 are from MusicSet. Data sources for this dataset:
https://huggingface.co/datasets/blanchon/udio_dataset… See the full description on the dataset page: https://huggingface.co/datasets/SleepyJesse/ai_music_large.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains a selection of behavioral datasets collected using soluble agents and labeled using realistic threat simulation and IDS rules. The collected datasets are anonymized and aggregated using time window representations. The dataset generation pipeline preprocesses the application logs from the corporate network, structures them according to entities and users inventory, and labels them based on the IDS and phishing simulation appliances.
This repository is associated with the article "RBD24: A labelled dataset with risk activities using log applications data" published in the journal Computers & Security. For more information go to https://doi.org/10.1016/j.cose.2024.104290
The RBD24 dataset comprises various risk activities collected from real entities and users over a period of 15 days, with the samples segmented by Desktop (DE) and Smartphone (SM) devices.
| DatasetId | Entity | Observed Behaviour | Groundtruth | Sample Shape |
|---|---|---|---|---|
| Crypto_desktop.parquet | DE | Miner Checking | IDS | 0: 738/161202, 1: 11/1343 |
| Crypto_smarphone.parquet | SM | Miner Checking | IDS | 0: 613/180021, 1: 4/956 |
| OutFlash_desktop.parquet | DE | Outdated software components | IDS | 0: 738/161202, 1: 56/10820 |
| OutFlash_smartphone.parquet | SM | Outdated software components | IDS | 0: 613/180021, 1: 22/6639 |
| OutTLS_desktop.parquet | DE | Outdated TLS protocol | IDS | 0: 738/161202, 1: 18/2458 |
| OutTLS_smartphone.parquet | SM | Outdated TLS protocol | IDS | 0: 613/180021, 1: 11/2930 |
| P2P_desktop.parquet | DE | P2P Activity | IDS | 0: 738/161202, 1: 177/35892 |
| P2P_smartphone.parquet | SM | P2P Activity | IDS | 0: 613/180021, 1: 94/21688 |
| NonEnc_desktop.parquet | DE | Non-encrypted password | IDS | 0: 738/161202, 1: 291/59943 |
| NonEnc_smaprthone.parquet | SM | Non-encrypted password | IDS | 0: 613/180021, 1: 167/41434 |
| Phishing_desktop.parquet | DE | Phishing email | Experimental Campaign | 0: 98/13864, 1: 19/3072 |
| Phishing_smartphone.parquet | SM | Phishing email | Experimental Campaign | 0: 117/34006, 1: 26/8968 |
To collect the dataset, we have deployed multiple agents and soluble agents within an infrastructure with
more than 3k entities, comprising laptops, workstations, and smartphone devices. The methods to build
ground truth are as follows:
- Simulator: We launch different realistic phishing campaigns, aiming to expose user credentials or defeat access to a service.
- IDS: We deploy an IDS to collect various alerts associated with behavioral anomalies, such as cryptomining or peer-to-peer traffic.
For each user exposed to the behaviors listed in the summary table, a separate time window (TW) is computed by aggregating
the user's behavior within a fixed time interval. These TWs serve as the basis for building various supervised
and unsupervised methods.
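The fixed-interval aggregation can be sketched as follows. This is an illustration of the general idea only, not the RBD24 pipeline: the event tuple layout, the one-hour window length, and the per-protocol event counts are all assumptions, and the real features are the manually curated indicators mentioned in the text.

```python
from collections import Counter, defaultdict

# Assumed window length; the source does not state the interval used.
WINDOW_SECONDS = 3600


def aggregate_time_windows(events, window_seconds=WINDOW_SECONDS):
    """Count events per protocol for each (user, window start) pair.

    `events` is an iterable of (user, unix_timestamp, protocol) tuples,
    a hypothetical log layout chosen for this example.
    """
    windows = defaultdict(Counter)
    for user, timestamp, protocol in events:
        # Align the timestamp to the start of its fixed-length window.
        window_start = timestamp - (timestamp % window_seconds)
        windows[(user, window_start)][protocol] += 1
    return {key: dict(counts) for key, counts in windows.items()}


# Made-up events for illustration.
events = [
    ("user_a", 10, "HTTP"),
    ("user_a", 20, "DNS"),
    ("user_a", 3700, "HTTP"),
    ("user_b", 30, "SSL"),
    ("user_a", 50, "HTTP"),
]
time_windows = aggregate_time_windows(events)
for key, counts in sorted(time_windows.items()):
    print(key, counts)
```

Each resulting (user, window) row can then be labeled from the IDS or phishing ground truth and used as one sample.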
The time windows (TW) are a data representation based on aggregated logs from multimodal sources between two
timestamps. In this study, logs from HTTP, DNS, SSL, and SMTP are taken into consideration, allowing the
construction of rich behavioral profiles. The indicators described in the TE are a set of manually curated
interpretable features designed to describe device-level properties within the specified time frame. The most
influential features are described below.
Parquet is a columnar storage format, which improves efficiency and compression and makes it suitable for large datasets and complex analytical tasks. It is supported across various tools and languages, including Python, where pandas can read and write Parquet files through the `pyarrow` or `fastparquet` libraries. Compared to row-based storage formats such as CSV, Parquet's columnar storage greatly reduces read times and storage costs for large datasets. Although binary formats like HDF5 are effective for specific use cases, Parquet provides broader compatibility and optimization. The provided datasets use the Parquet format. Here is an example of how to read the data using pandas; make sure the fastparquet library is installed:
```python
import pandas as pd

# Read a Parquet file into a DataFrame (requires the fastparquet library)
df = pd.read_parquet('path_to_your_file.parquet', engine='fastparquet')
```
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is designed for skill gap analysis: evaluating the gap between students’ current skills and industry requirements. It provides insights into technical skills, soft skills, career interests, and challenges, helping to identify areas for improvement.
Using this dataset, educators, recruiters, and researchers can assess students’ job readiness and tailor training programs accordingly. It is a resource for identifying skill deficiencies, improving career guidance, and enhancing curriculum design.
Column descriptors:
- Name - Student's full name.
- email_id - Student's email address.
- Year - The academic year the student is currently in (e.g., 1st Year, 2nd Year, etc.).
- Current Course - The course the student is currently pursuing (e.g., B.Tech CSE, MBA, etc.).
- Technical Skills - List of technical skills possessed by the student (e.g., Python, Data Analysis, Cloud Computing).
- Programming Languages - Programming languages known by the student (e.g., Python, Java, C++).
- Rating - Self-assessed rating of technical skills on a scale of 1 to 5.
- Soft Skills - List of soft skills (e.g., Communication, Leadership, Teamwork).
- Rating - Self-assessed rating of soft skills on a scale of 1 to 5.
- Projects - Indicates whether the student has worked on any projects (Yes/No).
- Career Interest - The student's preferred career path (e.g., Data Scientist, Software Engineer).
- Challenges - Challenges faced while applying for jobs/internships (e.g., Lack of experience, Resume building issues).
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by AYUSH SINGH331
Released under MIT
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Big Flat population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of Big Flat. The dataset can be used to understand the population distribution of Big Flat by age. For example, using this dataset, we can identify the largest age group in Big Flat.
Key observations
The largest age group in Big Flat, AR was the 15 to 19 years group, with a population of 16 (25%), according to the ACS 2018-2022 5-Year Estimates. At the same time, the smallest age group in Big Flat, AR was the 5 to 9 years group, with a population of 0 (0%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Big Flat Population by Age.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Big Flat population over the last 20-plus years. It lists the population for each year, along with the year-over-year change in population, both in absolute and in percentage terms. The dataset can be used to understand the population change of Big Flat across the last two decades. For example, using this dataset, we can identify whether the population is declining or increasing and, if it has changed, when it peaked or whether it is still growing. We can also compare the trend with the overall trend of the United States population over the same period.
Key observations
In 2022, the population of Big Flat was 89, unchanged from 2021. Previously, in 2021, the population of Big Flat was 89, an increase of 1.14% compared to a population of 88 in 2020. Over the last 20-plus years, between 2000 and 2022, the population of Big Flat decreased by 15. In this period, the peak population was 111 in the year 2007. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).
When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).
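The year-over-year figures quoted in the key observations can be reproduced with a short calculation. This is a minimal sketch using only the three yearly values stated above (88 in 2020, 89 in 2021, 89 in 2022); the function name `year_over_year` is chosen for this example.

```python
def year_over_year(populations):
    """Return {year: (absolute change, percent change)} relative to the prior year."""
    years = sorted(populations)
    changes = {}
    for previous, current in zip(years, years[1:]):
        delta = populations[current] - populations[previous]
        percent = round(100 * delta / populations[previous], 2)
        changes[current] = (delta, percent)
    return changes


# Values taken from the key observations above.
populations = {2020: 88, 2021: 89, 2022: 89}
changes = year_over_year(populations)
for year, (delta, percent) in changes.items():
    print(f"{year}: {delta:+d} ({percent:+.2f}%)")
```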
Data Coverage:
Variables / Data Columns
This dataset is a part of the main dataset for Big Flat Population by Year.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
3 BIG DATA is a dataset for object detection tasks - it contains Cats annotations for 943 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Big Falls town population by gender and age. The dataset can be used to understand the gender distribution and demographics of Big Falls town.
The dataset comprises the following two datasets across these two themes
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Large Labelled Datset is a dataset for object detection tasks - it contains Waste Litter Garbage Trash annotations for 12,919 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Big Spring population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of Big Spring. The dataset can be used to understand the population distribution of Big Spring by age. For example, using this dataset, we can identify the largest age group in Big Spring.
Key observations
The largest age group in Big Spring, TX was the 35-39 years group, with a population of 2,222 (8.48%), according to the 2021 American Community Survey. At the same time, the smallest age group in Big Spring, TX was the 80-84 years group, with a population of 286 (1.09%). Source: U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.
Age groups:
Variables / Data Columns
This dataset is a part of the main dataset for Big Spring Population by Age.