12 datasets found
  1. Movies Performance and Feature Statistics

    • kaggle.com
    Updated Jan 16, 2023
    Cite
    The Devastator (2023). Movies Performance and Feature Statistics [Dataset]. https://www.kaggle.com/datasets/thedevastator/movies-performance-and-feature-statistics
    Dataset updated
    Jan 16, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Movies Performance and Feature Statistics

    Analyzing Box Office Performance, Rating and Audience Reactions

    By Yashwanth Sharaff [source]

    About this dataset

    This dataset describes a variety of movies. It includes basic information such as each movie's title and budget, along with its MPAA rating, gross revenue, release date, genre, runtime, rating count and summary. The data can help you explore how these features and performance metrics relate to one another, and which of them are associated with a high-grossing feature film.

    How to use the dataset

    To get the most out of this dataset, you need to understand what each column represents. The 'Title' column gives the title of the movie, which you can use to look the film up on streaming services and movie information websites. The 'MPAA Rating' column lists the movie's Motion Picture Association of America (MPAA) rating, such as G (General Audiences), PG (Parental Guidance Suggested), PG-13 (Parents Strongly Cautioned) or R (Under 17 Requires Accompanying Parent or Guardian). The 'Budget' column gives an approximate idea of how much a production cost, the 'Gross' column gives its gross revenue, and the 'Release Date' column gives the date the film was released. The 'Genre', 'Runtime' and 'Rating Count' columns give the type of movie, its length in minutes, and the number of ratings it has received. Finally, the 'Summary' column gives a brief overview of the film, which is worth reading before choosing something to watch, especially with children.

    So go ahead - start exploring this interesting dataset today!

    Research Ideas

    • Creating a box office prediction model using budget, genre, release date and MPAA rating
    • Using the summary data to create a sentiment analysis tool for movie reviews
    • Building a recommendation engine for users based on their prior ratings and what other users with similar tastes have rated as highly

    License

    See the dataset description for more information.

    Columns

    File: movies.csv

    | Column name  | Description                                                                    |
    |:-------------|:-------------------------------------------------------------------------------|
    | Title        | The title of the movie. (String)                                               |
    | MPAA Rating  | The Motion Picture Association of America (MPAA) rating of the movie. (String) |
    | Budget       | The budget of the movie in US dollars. (Integer)                               |
    | Gross        | The gross revenue of the movie in US dollars. (Integer)                        |
    | Release Date | The date the movie was released. (Date)                                        |
    | Genre        | The genre of the movie. (String)                                               |
    | Runtime      | The length of the movie in minutes. (Integer)                                  |
    | Rating Count | The number of ratings the movie has received. (Integer)                        |
    | Summary      | A brief summary of the movie. (String)                                         |
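
    The column types listed for movies.csv suggest a simple typed loader. The following is a minimal sketch, not the dataset author's code: the two sample rows are invented placeholders, the ISO date format is an assumption about the file, and `parse_row` and `load_movies` are hypothetical helper names.

```python
import csv
import io
from datetime import date, datetime

# Hypothetical two-row sample in the documented column layout. The values are
# made up for illustration and are NOT taken from movies.csv.
SAMPLE = """Title,MPAA Rating,Budget,Gross,Release Date,Genre,Runtime,Rating Count,Summary
Example Film A,PG-13,1000000,2500000,2001-05-04,Drama,110,1200,A placeholder summary.
Example Film B,R,5000000,4000000,2010-11-19,Action,95,800,Another placeholder summary.
"""

# Columns documented as Integer in the column table.
INT_COLUMNS = ("Budget", "Gross", "Runtime", "Rating Count")


def parse_row(row: dict) -> dict:
    """Convert the integer and date columns; leave string columns unchanged."""
    parsed = dict(row)
    for name in INT_COLUMNS:
        parsed[name] = int(row[name])
    # Assumption: dates are ISO formatted. Adjust the format string if the
    # real file uses another layout.
    parsed["Release Date"] = datetime.strptime(row["Release Date"], "%Y-%m-%d").date()
    return parsed


def load_movies(handle) -> list:
    """Read every row from an open CSV handle into typed dictionaries."""
    return [parse_row(row) for row in csv.DictReader(handle)]


movies = load_movies(io.StringIO(SAMPLE))
for movie in movies:
    ratio = movie["Gross"] / movie["Budget"]
    print(f'{movie["Title"]}: gross/budget = {ratio:.2f}')
```

    To read the real file, pass `open("movies.csv", newline="", encoding="utf-8")` to `load_movies` instead of the in-memory sample, after checking the actual date format.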

    Acknowledgements

    If you use this dataset in your research, please credit the original authors and Yashwanth Sharaff.

  2. Official Netflix Viewership Database

    • kaggle.com
    Updated Dec 20, 2023
    Cite
    Sujay Kapadnis (2023). Official Netflix Viewership Database [Dataset]. https://www.kaggle.com/datasets/sujaykapadnis/official-netflix-streaming-data
    Dataset updated
    Dec 20, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sujay Kapadnis
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Methodology

    Every Tuesday, we publish four global Top 10 lists for films and TV: Film (English), TV (English), Film (Non-English), and TV (Non-English). These lists rank titles based on ‘views’ for each title from Monday to Sunday of the previous week. We define views for a title as the total hours viewed divided by the total runtime. Values are rounded to 100,000.
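
    The definition of views above reduces to one division and one rounding step. The sketch below illustrates it under stated assumptions: the function name `netflix_views` is hypothetical, the sample figures are invented, and rounding half up is an assumption because the text only says values are "rounded to 100,000".

```python
def netflix_views(hours_viewed: float, runtime_hours: float, step: int = 100_000) -> int:
    """Views = total hours viewed / total runtime, rounded to the nearest `step`."""
    if runtime_hours <= 0:
        raise ValueError("runtime_hours must be positive")
    raw_views = hours_viewed / runtime_hours
    # Assumption: round half up. The methodology does not state a rounding mode.
    return int(raw_views / step + 0.5) * step


# Hypothetical title: 10,000,000 hours viewed in the week, 1.5 hours long.
print(netflix_views(10_000_000, 1.5))
```

    Because views divide by runtime, a long series and a short film with the same hours viewed can end up with very different view counts.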

    We consider each season of a series and each film on their own, so you might see both Stranger Things seasons 2 and 3 in the Top 10. Because titles sometimes move in and out of the Top 10, we also show the total number of weeks that a season of a series or film has spent on the list.

    To give you a sense of what people are watching around the world, we also publish Top 10 lists for nearly 100 countries and territories (the same locations where there are Top 10 rows on Netflix). Country lists are also ranked by views.

    Finally, we provide a list of the Top 10 most popular Netflix films and TV overall (branded Netflix in any country) in each of the four categories based on the views of each title in its first 91 days.

    Some TV shows have multiple premiere dates, whether weekly or in parts, and therefore the runtime increases over time. For the weekly lists, we show the views based on the total hours viewed during the week divided by the total runtime available at the end of the week. On the Most Popular List, we wait until all episodes have premiered, so you see the views of the entire season. For titles that are Netflix branded in some countries but not others, we still include all of the hours viewed.
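
    For titles that premiere in parts, the weekly calculation above divides by the runtime available at the end of that week. A minimal sketch of that step follows; the episode list, dates and hours are invented, `runtime_available` and `weekly_views` are hypothetical names, and the final rounding to 100,000 is left out for clarity.

```python
from datetime import date


def runtime_available(episodes, week_end: date) -> float:
    """Total runtime in hours of episodes that premiered on or before week_end."""
    return sum(hours for premiere, hours in episodes if premiere <= week_end)


def weekly_views(hours_viewed_in_week: float, episodes, week_end: date) -> float:
    """Hours viewed during the week divided by the runtime available at week end."""
    runtime = runtime_available(episodes, week_end)
    if runtime == 0:
        raise ValueError("no episodes had premiered by week_end")
    return hours_viewed_in_week / runtime


# Hypothetical season released in two parts: (premiere date, runtime in hours).
EPISODES = [
    (date(2023, 1, 5), 1.0),
    (date(2023, 1, 5), 1.0),
    (date(2023, 1, 12), 1.0),
]

# Week ending Sunday 8 January: two hours of runtime are available.
print(weekly_views(4_000_000, EPISODES, date(2023, 1, 8)))
# Week ending Sunday 15 January: the third episode has premiered.
print(weekly_views(4_500_000, EPISODES, date(2023, 1, 15)))
```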

    Information on the site starts from June 28, 2021 and any lists published before June 20, 2023 are ranked by hours viewed.

  3. Netlifx_hour_2023

    • kaggle.com
    zip
    Updated Dec 25, 2023
    Cite
    willian oliveira (2023). Netlifx_hour_2023 [Dataset]. https://www.kaggle.com/datasets/willianoliveiragibin/netlifx-hour-2023
    Available download formats: zip (2117567 bytes)
    Dataset updated
    Dec 25, 2023
    Authors
    willian oliveira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Since launching our weekly Top 10 and Most Popular lists in 2021, Netflix has provided more information about what people are watching than any other streamer except YouTube. And now we believe it’s time to go further.

    Starting today we will publish What We Watched: A Netflix Engagement Report twice a year. This is a comprehensive report of what people watched on Netflix over a six-month period, including:

    Hours viewed for every title, original and licensed, watched for over 50,000 hours;

    The premiere date for any Netflix TV series or film; and

    Whether a title was available globally.

    In total, this report covers more than 18,000 titles — representing 99% of all viewing on Netflix — and nearly 100 billion hours viewed.

    Over 60% of Netflix titles released between January and June 2023 appeared on our weekly Top 10 lists. So while this report is broader in scope, the trends reflected in it are very similar to those in the Top 10 lists, including:

    The strength of returning favorites like Ginny & Georgia, Alice in Borderland, The Marked Heart, Outer Banks, You, Queen Charlotte: A Bridgerton Story, XO Kitty and film sequels Murder Mystery 2 and Extraction 2;

    The popularity of new series like The Night Agent, The Diplomat, Beef, The Glory, Alpha Males, FUBAR and Fake Profile, which generate huge audiences and fandoms;

    The size of the audience of our films across every genre including The Mother, Luther: The Fallen Sun, You People, AKA, ¡Que viva México! and Hunger;

    The enthusiasm for non-English stories, which generated 30% of all viewing;

    The staying power of titles on Netflix, which extends well beyond their premieres. All Quiet on the Western Front, for example, debuted in October 2022 and generated 80M hours viewed between January and June; and

    The demand for older, licensed titles, which generates tremendous value for our members and for rights holders.

    When reading the report it’s important to remember:

    Success on Netflix comes in all shapes and sizes, and is not determined by hours viewed alone. We have enormously successful movies and TV shows with both lower and higher hours viewed. It’s all about whether a movie or TV show thrilled its audience — and the size of that audience relative to the economics of the title; and

    To compare between titles it’s best to use our weekly Top 10 and Most Popular lists, which take into account run times and premiere dates.

    This is a big step forward for Netflix and our industry. We believe the viewing information in this report — combined with our weekly Top 10 and Most Popular lists — will give creators and our industry deeper insights into our audiences, and what resonates with them.

  4. Netflix Engagement Report

    • kaggle.com
    zip
    Updated Dec 20, 2023
    Cite
    Konrad Banachewicz (2023). Netflix Engagement Report [Dataset]. https://www.kaggle.com/datasets/konradb/netflix-engagement-report
    Available download formats: zip (349809 bytes)
    Dataset updated
    Dec 20, 2023
    Authors
    Konrad Banachewicz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    From the report page:

    Since launching our weekly Top 10 and Most Popular lists in 2021, Netflix has provided more information about what people are watching than any other streamer except YouTube. And now we believe it’s time to go further.

    Starting today we will publish What We Watched: A Netflix Engagement Report twice a year. This is a comprehensive report of what people watched on Netflix over a six-month period, including:

    Hours viewed for every title, original and licensed, watched for over 50,000 hours;

    The premiere date for any Netflix TV series or film; and

    Whether a title was available globally.

    In total, this report covers more than 18,000 titles — representing 99% of all viewing on Netflix — and nearly 100 billion hours viewed.

    Over 60% of Netflix titles released between January and June 2023 appeared on our weekly Top 10 lists. So while this report is broader in scope, the trends reflected in it are very similar to those in the Top 10 lists, including:

    The strength of returning favorites like Ginny & Georgia, Alice in Borderland, The Marked Heart, Outer Banks, You, Queen Charlotte: A Bridgerton Story, XO Kitty and film sequels Murder Mystery 2 and Extraction 2;

    The popularity of new series like The Night Agent, The Diplomat, Beef, The Glory, Alpha Males, FUBAR and Fake Profile, which generate huge audiences and fandoms;

    The size of the audience of our films across every genre including The Mother, Luther: The Fallen Sun, You People, AKA, ¡Que viva México! and Hunger;

    The enthusiasm for non-English stories, which generated 30% of all viewing;

    The staying power of titles on Netflix, which extends well beyond their premieres. All Quiet on the Western Front, for example, debuted in October 2022 and generated 80M hours viewed between January and June; and

    The demand for older, licensed titles, which generates tremendous value for our members and for rights holders.

    When reading the report it’s important to remember:

    Success on Netflix comes in all shapes and sizes, and is not determined by hours viewed alone. We have enormously successful movies and TV shows with both lower and higher hours viewed. It’s all about whether a movie or TV show thrilled its audience — and the size of that audience relative to the economics of the title; and

    To compare between titles it’s best to use our weekly Top 10 and Most Popular lists, which take into account run times and premiere dates.

    This is a big step forward for Netflix and our industry. We believe the viewing information in this report — combined with our weekly Top 10 and Most Popular lists — will give creators and our industry deeper insights into our audiences, and what resonates with them.

  5. IBM Telco Churn Data

    • kaggle.com
    zip
    Updated Dec 5, 2023
    Cite
    Ankur Shah (2023). IBM Telco Churn Data [Dataset]. https://www.kaggle.com/datasets/nikhilrajubiyyap/ibm-telco-churn-data
    Available download formats: zip (1919204 bytes)
    Dataset updated
    Dec 5, 2023
    Authors
    Ankur Shah
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Context

    The data describes a fictional telco company that provided home phone and Internet services to 7043 customers in California in Q3.

    Data Description

    7043 observations with 33 variables

    CustomerID: A unique ID that identifies each customer.

    Count: A value used in reporting/dashboarding to sum up the number of customers in a filtered set.

    Country: The country of the customer’s primary residence.

    State: The state of the customer’s primary residence.

    City: The city of the customer’s primary residence.

    Zip Code: The zip code of the customer’s primary residence.

    Lat Long: The combined latitude and longitude of the customer’s primary residence.

    Latitude: The latitude of the customer’s primary residence.

    Longitude: The longitude of the customer’s primary residence.

    Gender: The customer’s gender: Male, Female

    Senior Citizen: Indicates if the customer is 65 or older: Yes, No

    Partner: Indicates if the customer has a partner: Yes, No

    Dependents: Indicates if the customer lives with any dependents: Yes, No. Dependents could be children, parents, grandparents, etc.

    Tenure Months: Indicates the total number of months that the customer has been with the company by the end of the quarter specified above.

    Phone Service: Indicates if the customer subscribes to home phone service with the company: Yes, No

    Multiple Lines: Indicates if the customer subscribes to multiple telephone lines with the company: Yes, No

    Internet Service: Indicates if the customer subscribes to Internet service with the company: No, DSL, Fiber Optic, Cable.

    Online Security: Indicates if the customer subscribes to an additional online security service provided by the company: Yes, No

    Online Backup: Indicates if the customer subscribes to an additional online backup service provided by the company: Yes, No

    Device Protection: Indicates if the customer subscribes to an additional device protection plan for their Internet equipment provided by the company: Yes, No

    Tech Support: Indicates if the customer subscribes to an additional technical support plan from the company with reduced wait times: Yes, No

    Streaming TV: Indicates if the customer uses their Internet service to stream television programming from a third-party provider: Yes, No. The company does not charge an additional fee for this service.

    Streaming Movies: Indicates if the customer uses their Internet service to stream movies from a third party provider: Yes, No. The company does not charge an additional fee for this service.

    Contract: Indicates the customer’s current contract type: Month-to-Month, One Year, Two Year.

    Paperless Billing: Indicates if the customer has chosen paperless billing: Yes, No

    Payment Method: Indicates how the customer pays their bill: Bank Withdrawal, Credit Card, Mailed Check

    Monthly Charge: Indicates the customer’s current total monthly charge for all their services from the company.

    Total Charges: Indicates the customer’s total charges, calculated to the end of the quarter specified above.

    Churn Label: Yes = the customer left the company this quarter. No = the customer remained with the company. Directly related to Churn Value.

    Churn Value: 1 = the customer left the company this quarter. 0 = the customer remained with the company. Directly related to Churn Label.

    Churn Score: A value from 0-100 that is calculated using the predictive tool IBM SPSS Modeler. The model incorporates multiple factors known to cause churn. The higher the score, the more likely the customer will churn.

    CLTV: Customer Lifetime Value. A predicted CLTV is calculated using corporate formulas and existing data. The higher the value, the more valuable the customer. High value customers should be monitored for churn.

    Churn Reason: A customer’s specific reason for leaving the company. Directly related to Churn Category.
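
    Since Churn Label and Churn Value are described as directly related (Yes = 1, No = 0), one quick sanity check is to confirm that the two columns agree before computing a churn rate. This is a minimal sketch under assumptions: the rows and customer IDs are invented, one row is deliberately inconsistent to show the check firing, and `check_churn_consistency` and `churn_rate` are hypothetical helper names.

```python
# Mapping taken from the column descriptions: Yes = 1, No = 0.
LABEL_TO_VALUE = {"Yes": 1, "No": 0}


def check_churn_consistency(rows):
    """Return the CustomerIDs whose Churn Label and Churn Value disagree."""
    return [
        row["CustomerID"]
        for row in rows
        if LABEL_TO_VALUE[row["Churn Label"]] != int(row["Churn Value"])
    ]


def churn_rate(rows) -> float:
    """Share of customers with Churn Value equal to 1."""
    return sum(int(row["Churn Value"]) for row in rows) / len(rows)


# Hypothetical rows; C-003 is intentionally inconsistent.
ROWS = [
    {"CustomerID": "C-001", "Churn Label": "Yes", "Churn Value": "1"},
    {"CustomerID": "C-002", "Churn Label": "No", "Churn Value": "0"},
    {"CustomerID": "C-003", "Churn Label": "No", "Churn Value": "1"},
    {"CustomerID": "C-004", "Churn Label": "No", "Churn Value": "0"},
]

print(check_churn_consistency(ROWS))
print(churn_rate(ROWS))
```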

    Source

    This dataset is detailed in: https://community.ibm.com/community/user/businessanalytics/blogs/steven-macko/2019/07/11/telco-customer-churn-1113

    Downloaded from: https://community.ibm.com/accelerators/?context=analytics&query=telco%20churn&type=Data&product=Cognos%20Analytics

    There are several related datasets as documented in: https://community.ibm.com/community/user/businessanalytics/blogs/steven-macko/2018/09/12/base-samples-for-ibm-cognos-analytics

  6. Telco customer churn (11.1.3+)

    • kaggle.com
    zip
    Updated May 8, 2024
    Cite
    Al Fath Terry (2024). Telco customer churn (11.1.3+) [Dataset]. https://www.kaggle.com/datasets/alfathterry/telco-customer-churn-11-1-3
    Available download formats: zip (525781 bytes)
    Dataset updated
    May 8, 2024
    Authors
    Al Fath Terry
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Telco Customer Churn Dataset

    The Telco customer churn data contains information about a fictional telco company that provided home phone and Internet services to 7043 customers in California in Q3. It indicates which customers have left, stayed, or signed up for their service. Multiple important demographics are included for each customer, as well as a Satisfaction Score, Churn Score, and Customer Lifetime Value (CLTV) index.

    Columns Description:

    • CustomerID: A unique ID that identifies each customer.

    • Gender: The customer’s gender: Male, Female

    • Age: The customer’s current age, in years, at the time the fiscal quarter ended.

    • Senior Citizen: Indicates if the customer is 65 or older: Yes, No

    • Married: Indicates if the customer is married: Yes, No

    • Dependents: Indicates if the customer lives with any dependents: Yes, No. Dependents could be children, parents, grandparents, etc.

    • Number of Dependents: Indicates the number of dependents that live with the customer.

    • CustomerID: A unique ID that identifies each customer.

    • Count: A value used in reporting/dashboarding to sum up the number of customers in a filtered set.

    • Country: The country of the customer’s primary residence.

    • State: The state of the customer’s primary residence.

    • City: The city of the customer’s primary residence.

    • Zip Code: The zip code of the customer’s primary residence.

    • Latitude: The latitude of the customer’s primary residence.

    • Longitude: The longitude of the customer’s primary residence.

    • Zip Code: The zip code of the customer’s primary residence.

    • Population: A current population estimate for the entire Zip Code area.

    • CustomerID: A unique ID that identifies each customer.

    • Count: A value used in reporting/dashboarding to sum up the number of customers in a filtered set.

    • Quarter: The fiscal quarter that the data has been derived from (e.g. Q3).

    • Referred a Friend: Indicates if the customer has ever referred a friend or family member to this company: Yes, No

    • Number of Referrals: Indicates the number of referrals to date that the customer has made.

    • Tenure in Months: Indicates the total number of months that the customer has been with the company by the end of the quarter specified above.

    • Offer: Identifies the last marketing offer that the customer accepted, if applicable. Values include None, Offer A, Offer B, Offer C, Offer D, and Offer E.

    • Phone Service: Indicates if the customer subscribes to home phone service with the company: Yes, No

    • Avg Monthly Long Distance Charges: Indicates the customer’s average long distance charges, calculated to the end of the quarter specified above.

    • Multiple Lines: Indicates if the customer subscribes to multiple telephone lines with the company: Yes, No

    • Internet Service: Indicates if the customer subscribes to Internet service with the company: No, DSL, Fiber Optic, Cable.

    • Avg Monthly GB Download: Indicates the customer’s average download volume in gigabytes, calculated to the end of the quarter specified above.

    • Online Security: Indicates if the customer subscribes to an additional online security service provided by the company: Yes, No

    • Online Backup: Indicates if the customer subscribes to an additional online backup service provided by the company: Yes, No

    • Device Protection Plan: Indicates if the customer subscribes to an additional device protection plan for their Internet equipment provided by the company: Yes, No

    • Premium Tech Support: Indicates if the customer subscribes to an additional technical support plan from the company with reduced wait times: Yes, No

    • Streaming TV: Indicates if the customer uses their Internet service to stream television programming from a third-party provider: Yes, No. The company does not charge an additional fee for this service.

    • Streaming Movies: Indicates if the customer uses their Internet service to stream movies from a third party provider: Yes, No. The company does not charge an additional fee for this service.

    • Streaming Music: Indicates if the customer uses their Internet service to stream music from a third party provider: Yes, No. The company does not charge an additional fee for this service.

    • Unlimited Data: Indicates if the customer has paid an additional monthly fee to have unlimited data downloads/uploads: Yes, No

    • Contract: Indicates the customer’s current contract type: Month-to-Month, One Year, Two Year.

    • Paperless Billing: Indicates if the customer has chosen paperless billing: Yes, No

    • Payment Method: Indicates how the customer pays their bill: Bank Withdrawal, Credit Card, Mailed Check

    • Monthly Charge: Indicates the customer’s current total monthly charge for all their services from the company.

    • Total Charges: Indicates the customer’s total charges, calculated to the end of the quarter specified above.

    • Total Refunds: Indicates the customer’s t...

  7. Telco Customer Churn

    • kaggle.com
    zip
    Updated Feb 23, 2018
    Cite
    BlastChar (2018). Telco Customer Churn [Dataset]. https://www.kaggle.com/datasets/blastchar/telco-customer-churn
    Available download formats: zip (175758 bytes)
    Dataset updated
    Feb 23, 2018
    Authors
    BlastChar
    Description

    Context

    "Predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs." [IBM Sample Data Sets]

    Content

    Each row represents a customer, and each column contains one of the customer’s attributes, as described in the column metadata.

    The data set includes information about:

    • Customers who left within the last month – the column is called Churn
    • Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
    • Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
    • Demographic info about customers – gender, age range, and if they have partners and dependents

    Inspiration

    To explore this type of model and learn more about the subject.

    New version from IBM: https://community.ibm.com/community/user/businessanalytics/blogs/steven-macko/2019/07/11/telco-customer-churn-1113

  8. Cleaned Churn

    • kaggle.com
    zip
    Updated Oct 12, 2024
    Cite
    Gökhan Ergül (2024). Cleaned Churn [Dataset]. https://www.kaggle.com/datasets/gokhanergul/cleaned-churn
    Available download formats: zip (98170 bytes)
    Dataset updated
    Oct 12, 2024
    Authors
    Gökhan Ergül
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Customer Churn Dataset Description

    This dataset contains various features that describe customer information for a telecommunications company. The objective is to analyze these features to predict whether a customer will churn (i.e., discontinue service). Below is a breakdown of each feature in the dataset:

    | Feature | Description |
    |:--------|:------------|
    | gender | Indicates the gender of the customer (1 = Female, 0 = Male). |
    | seniorcitizen | Indicates if the customer is a senior citizen (1 = Yes, 0 = No). |
    | partner | Indicates if the customer has a partner (1 = Yes, 0 = No). |
    | dependents | Indicates if the customer has dependents (1 = Yes, 0 = No). |
    | tenure | The number of months the customer has been with the company. |
    | phoneservice | Indicates if the customer has a phone service (1 = Yes, 0 = No). |
    | multiplelines | Indicates if the customer has multiple lines (1 = Yes, 0 = No). |
    | onlinesecurity | Indicates if the customer has online security (1 = Yes, 0 = No). |
    | onlinebackup | Indicates if the customer has online backup (1 = Yes, 0 = No). |
    | deviceprotection | Indicates if the customer has device protection (1 = Yes, 0 = No). |
    | techsupport | Indicates if the customer has tech support (1 = Yes, 0 = No). |
    | streamingtv | Indicates if the customer has streaming TV services (1 = Yes, 0 = No). |
    | streamingmovies | Indicates if the customer has streaming movies services (1 = Yes, 0 = No). |
    | paperlessbilling | Indicates if the customer uses paperless billing (1 = Yes, 0 = No). |
    | monthlycharges | The amount charged to the customer each month. |
    | totalcharges | The total amount charged to the customer up to the current month. |
    | label | Indicates whether the customer has churned (1 = Churned, 0 = Not Churned). |
    | contract_Month-to-month | Indicates if the customer is on a month-to-month contract (1 = Yes, 0 = No). |
    | contract_One year | Indicates if the customer is on a one-year contract (1 = Yes, 0 = No). |
    | contract_Two year | Indicates if the customer is on a two-year contract (1 = Yes, 0 = No). |
    | paymentmethod_Electronic check | Indicates if the payment method is Electronic Check (1 = Yes, 0 = No). |
    | paymentmethod_Credit card (automatic) | Indicates if the payment method is Credit Card (Automatic) (1 = Yes, 0 = No). |
    | paymentmethod_Mailed check | Indicates if the payment method is Mailed Check (1 = Yes, 0 = No). |
    | paymentmethod_Bank transfer (automatic) | Indicates if the payment method is Bank Transfer (Automatic) (1 = Yes, 0 = No). |
    | internetservice_Fiber optic | Indicates if the customer has Fiber Optic internet service (1 = Yes, 0 = No). |
    | internetservice_DSL | Indicates if the customer has DSL internet service (1 = Yes, 0 = No). |
    | internetservice_No | Indicates if the customer has no internet service (1 = Yes, 0 = No). |
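
    The `contract_*`, `paymentmethod_*` and `internetservice_*` features above look like one-hot encodings of a single categorical column each. The sketch below shows how such columns could be produced; it is an illustration rather than the dataset author's preprocessing, and `one_hot` is a hypothetical helper name.

```python
# Category levels taken from the suffixes of the contract_* feature names.
CONTRACT_LEVELS = ("Month-to-month", "One year", "Two year")


def one_hot(prefix: str, value: str, levels) -> dict:
    """Expand one categorical value into 0/1 indicator columns named prefix_level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return {f"{prefix}_{level}": int(value == level) for level in levels}


print(one_hot("contract", "One year", CONTRACT_LEVELS))
```

    Exactly one indicator in each group is 1 for any customer, so the indicators in a group always sum to 1.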

    Summary

    The dataset captures a wide range of features that are likely to influence customer retention. By analyzing these features, we can build models to predict which customers are at risk of churning and develop strategies to retain them.

  9. Telco Customer Churn

    • kaggle.com
    zip
    Updated Apr 7, 2024
    Cite
    aephidayatuloh (2024). Telco Customer Churn [Dataset]. https://www.kaggle.com/datasets/aephidayatuloh/telco-customer-churn
    Available download formats: zip (1744352 bytes)
    Dataset updated
    Apr 7, 2024
    Authors
    aephidayatuloh
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Context

    This sample data tracks a fictional telco company's customer churn based on a variety of possible factors. The churn column indicates whether or not the customer left within the last month. Other columns include gender, dependents, monthly charges, and many with information about the types of services each customer has. Source: IBM.

    Inventory of Telco Assets

    A variety of objects have been updated/created that work together to tell a comprehensive story:

    Telco churn: This sample dashboard tracks a fictional telco company's customer churn based on a variety of factors. The Churn Label column indicates whether or not the customer left within the last month. Other columns include location, monthly charges, services, and customer lifetime value. Location: Team content > Samples > Dashboards.

    Quarterly churn update: This sample story shows quarterly changes of customer churn in a fictional telco company, and which contract and location has the highest churn in order to decide the goals for the next quarter. The churn label column indicates whether or not the customer left within the last quarter. Location: Team content > Samples > Stories.

    Telco customer churn: This sample data module tracks a fictional telco company's customer churn based on a variety of possible factors. The churn column indicates whether or not the customer left within the last month. Other columns include gender, dependents, monthly charges, and many with information about the types of services each customer has. The Telco customer churn data module is composed of 3 uploaded files:

    Telco_customer_churn_demographics.xlsx
    Telco_customer_churn_services.xlsx
    Telco_customer_churn_status.xlsx
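
    Because each of the three files carries a CustomerID column, one way to work with them together is to join them on that key. The following sketch uses small in-memory tables with invented rows in place of the spreadsheets; `index_by_customer` and `join_on_customer_id` are hypothetical names, the inner-join behaviour is a design choice, and the status table's `Churn Label` column is assumed from the dashboard description.

```python
def index_by_customer(rows) -> dict:
    """Map CustomerID to its row for one table."""
    return {row["CustomerID"]: row for row in rows}


def join_on_customer_id(*tables) -> list:
    """Inner-join any number of tables on CustomerID, sorted by ID."""
    indexed = [index_by_customer(table) for table in tables]
    common_ids = set(indexed[0]).intersection(*indexed[1:])
    merged = []
    for customer_id in sorted(common_ids):
        record = {}
        for table in indexed:
            record.update(table[customer_id])
        merged.append(record)
    return merged


# Hypothetical rows standing in for the three spreadsheets.
demographics = [
    {"CustomerID": "C-001", "Gender": "Female", "Age": 41},
    {"CustomerID": "C-002", "Gender": "Male", "Age": 67},
]
services = [
    {"CustomerID": "C-001", "Phone Service": "Yes"},
    {"CustomerID": "C-002", "Phone Service": "No"},
]
status = [
    {"CustomerID": "C-001", "Churn Label": "No"},
    # C-002 is missing here, so the inner join drops it.
]

for record in join_on_customer_id(demographics, services, status):
    print(record)
```

    An inner join silently drops customers missing from any table, so it is worth comparing row counts before and after the join.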

    Content

    Data

    Each table is described below.

    Demographics

    CustomerID: A unique ID that identifies each customer.

    Count: A value used in reporting/dashboarding to sum up the number of customers in a filtered set.

    Gender: The customer’s gender: Male, Female

    Age: The customer’s current age, in years, at the time the fiscal quarter ended.

    Senior Citizen: Indicates if the customer is 65 or older: Yes, No

    Married: Indicates if the customer is married: Yes, No

    Dependents: Indicates if the customer lives with any dependents: Yes, No. Dependents could be children, parents, grandparents, etc.

    Number of Dependents: Indicates the number of dependents that live with the customer.

    Services

    CustomerID: A unique ID that identifies each customer.

    Count: A value used in reporting/dashboarding to sum up the number of customers in a filtered set.

    Quarter: The fiscal quarter that the data has been derived from (e.g. Q3).

    Referred a Friend: Indicates if the customer has ever referred a friend or family member to this company: Yes, No

    Number of Referrals: Indicates the number of referrals to date that the customer has made.

    Tenure in Months: Indicates the total amount of months that the customer has been with the company by the end of the quarter specified above.

    Offer: Identifies the last marketing offer that the customer accepted, if applicable. Values include None, Offer A, Offer B, Offer C, Offer D, and Offer E.

    Phone Service: Indicates if the customer subscribes to home phone service with the company: Yes, No

    Avg Monthly Long Distance Charges: Indicates the customer’s average long distance charges, calculated to the end of the quarter specified above.

    Multiple Lines: Indicates if the customer subscribes to multiple telephone lines with the company: Yes, No

    Internet Service: Indicates if the customer subscribes to Internet service with the company: No, DSL, Fiber Optic, Cable.

    Avg Monthly GB Download: Indicates the customer’s average download volume in gigabytes, calculated to the end of the quarter specified above.

    Online Security: Indicates if the customer subscribes to an additional online security service provided by the company: Yes, No

    Online Backup: Indicates if the customer subscribes to an additional online backup service provided by the company: Yes, No

    Device Protection Plan: Indicates if the customer subscribes to an additional device protection plan for their Internet equipment provided by the company: Yes, No

    Premium Tech Support: Indicates if the customer subscribes to an additional technical support plan from the company with reduced wait times: Yes, No

    Streaming TV: Indicates if the customer uses their Internet service to stream television programming from a third party provider: Yes, No. The company does not charge an additional fee for this service.

    Streaming Movies: Indicates if the customer uses their Internet service to stream movies from a third party provider: Yes, No. The company does not charge an additional fee for this service.

    Streaming Music: Indicates if the customer uses their Internet service to stream music from a third party provider: Yes, No. The company does not charge an additional fee for ...

  10. 📊 Telco Customer Churn Dataset

    • kaggle.com
    zip
    Updated Jul 18, 2025
    Cite
    Austin Kleon (2025). 📊 Telco Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/jethwaaatmik/telco-customer-churn-dataset
    Explore at:
    Available download formats: zip (172687 bytes)
    Dataset updated
    Jul 18, 2025
    Authors
    Austin Kleon
    Description

    📝 Dataset Description

    This dataset contains information about customers of a telecommunications company, including their demographic details, account information, service subscriptions, and churn status. It is a modified version of the popular Telco Churn dataset, curated for exploratory data analysis, machine learning model development, and churn prediction tasks.

    The dataset includes simulated missing values in some columns to reflect real-world data issues and support preprocessing and imputation tasks. This makes it especially useful for demonstrating data cleaning techniques and evaluating model robustness.

    📂 Files Included

    telco_data_modified.csv: The main dataset with 21 columns and 7043 rows (some missing values are intentionally inserted).

    📌 Features

    Column Name: Description
    customerID: Unique identifier for each customer
    gender: Customer gender: Male/Female
    SeniorCitizen: Indicates if the customer is a senior citizen (0 = No, 1 = Yes)
    Partner: Whether the customer has a partner
    Dependents: Whether the customer has dependents
    tenure: Number of months the customer has stayed with the company
    PhoneService: Whether the customer has phone service
    MultipleLines: Whether the customer has multiple lines
    InternetService: Customer's internet service provider (DSL, Fiber optic, No)
    OnlineSecurity: Whether the customer has online security
    OnlineBackup: Whether the customer has online backup
    DeviceProtection: Whether the customer has device protection
    TechSupport: Whether the customer has tech support
    StreamingTV: Whether the customer has streaming TV
    StreamingMovies: Whether the customer has streaming movies
    Contract: Type of contract: Month-to-month, One year, Two year
    PaperlessBilling: Whether the customer uses paperless billing
    PaymentMethod: Payment method (e.g., Electronic check, Mailed check, etc.)
    MonthlyCharges: Monthly charges
    TotalCharges: Total charges to date
    Churn: Whether the customer has left the company (Yes/No)

    🔍 Use Cases

    Binary classification: Predict customer churn

    Data preprocessing and imputation exercises

    Feature engineering and importance analysis

    Customer segmentation and churn modeling

    ⚠️ Notes

    Missing values were intentionally inserted in the dataset to help simulate real-world conditions.

    Some preprocessing may be required before modeling (e.g., converting categorical to numerical data, handling TotalCharges as numeric).
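
    The preprocessing mentioned in the notes above could look roughly like the sketch below. It uses only the Python standard library and a hypothetical three-row excerpt in place of telco_data_modified.csv; the sample values, the helper names (`to_float`, `preprocess`), and the choice of median imputation are assumptions for illustration, not part of the dataset's documentation.

```python
import csv
import io
import statistics

# Hypothetical excerpt; the real file is telco_data_modified.csv.
SAMPLE = """customerID,Contract,TotalCharges,Churn
0001-A,Month-to-month,29.85,No
0002-B,One year,,Yes
0003-C,Two year,108.15,No
"""

CHURN_CODES = {"Yes": 1, "No": 0}


def to_float(text):
    """Return a float, or None when the cell is blank or not numeric."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def preprocess(rows):
    """Coerce TotalCharges to float, fill gaps with the median, encode Churn as 0/1."""
    rows = [dict(row) for row in rows]
    for row in rows:
        row["TotalCharges"] = to_float(row["TotalCharges"])
    known = [row["TotalCharges"] for row in rows if row["TotalCharges"] is not None]
    # Median imputation is one simple choice; leave gaps as None if nothing is known.
    fill = statistics.median(known) if known else None
    for row in rows:
        if row["TotalCharges"] is None:
            row["TotalCharges"] = fill
        # Unknown or missing labels stay None instead of being guessed.
        row["Churn"] = CHURN_CODES.get(row["Churn"])
    return rows


cleaned = preprocess(csv.DictReader(io.StringIO(SAMPLE)))
```

    Other imputation strategies (dropping rows, per-contract medians) may suit a given model better; the median is used here only because it is simple and robust to outliers.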

    🏷️ Tags

    #telecom #churn #classification #customer-analytics #data-cleaning #feature-engineering

    🙏 Acknowledgements

    This dataset is based on the original Telco Customer Churn dataset (initially provided by IBM). The current version has been modified for academic and practical exercises.

  11. Telco Customer Churn

    • kaggle.com
    zip
    Updated Jun 16, 2024
    + more versions
    Cite
    Md. Abdur Rahman (2024). Telco Customer Churn [Dataset]. https://www.kaggle.com/datasets/borhanitrash/telco-customer-churn
    Explore at:
    Available download formats: zip (585755 bytes)
    Dataset updated
    Jun 16, 2024
    Authors
    Md. Abdur Rahman
    Description

    Dataset Card for Telco Customer Churn

    This dataset contains information about customers of a fictional telecommunications company, including demographic information, services subscribed to, location details, and churn behavior. This merged dataset combines the information from the original Telco Customer Churn dataset with additional details.

    Dataset Details

    Dataset Description

    This merged Telco Customer Churn dataset provides a comprehensive view of customer attributes, service usage, location data, and churn behavior. This expanded dataset is a valuable resource for understanding churn patterns, customer segmentation, and developing targeted marketing strategies.

    Uses

    Direct Use

    This dataset can be used for various purposes, including:

    Customer churn prediction: Develop machine learning models to predict which customers are at risk of churning, leveraging the expanded features.

    Customer segmentation: Identify different customer segments based on demographics, service usage, location, and churn behavior.

    Targeted marketing campaigns: Develop targeted marketing campaigns to retain at-risk customers or attract new customers, tailoring campaigns based on the insights derived from the merged dataset.

    Location-based analysis: Analyze customer churn trends based on specific locations, cities, or zip codes, and identify potential regional differences.

    Out-of-Scope Use

    The dataset is not suitable for:

    Real-time churn prediction: The dataset lacks real-time data, making it inappropriate for immediate churn prediction.

    Personal identification: While the dataset contains customer information, it is anonymized and should not be used to identify individuals.

    Dataset Structure

    The dataset is structured as a CSV file with 49 columns, each representing a customer attribute. The columns include:

    Age: The customer's age in years.
    Avg Monthly GB Download: The customer's average monthly gigabyte download volume.
    Avg Monthly Long Distance Charges: The customer's average monthly long distance charges.
    Churn Category: A high-level category for the customer's reason for churning.
    Churn Label: Indicates whether the customer churned.
    Churn Reason: The customer's specific reason for leaving the company.
    Churn Score: A score from 0-100 indicating the likelihood of the customer churning.
    Churn Value: A numerical value representing whether the customer churned (1 for churned, 0 for not churned).
    City: The city of the customer's residence.
    CLTV: Customer Lifetime Value.
    Contract: The customer's contract type.
    Country: The country of the customer's residence.
    Customer ID: A unique identifier for each customer.
    Customer Status: The customer's status at the end of the quarter (Churned, Stayed, or Joined).
    Dependents: Whether the customer has dependents.
    Device Protection Plan: Whether the customer has a device protection plan.
    Gender: The customer's gender.
    Internet Service: Indicates whether the customer subscribes to internet service.
    Internet Type: The type of internet service provider.
    Lat Long: The combined latitude and longitude of the customer's residence.
    Latitude: The latitude of the customer's residence.
    Longitude: The longitude of the customer's residence.
    Married: Indicates if the customer is married.
    Monthly Charge: The customer's total monthly charge for all their services.
    Multiple Lines: Whether the customer has multiple phone lines.
    Number of Dependents: The number of dependents the customer has.
    Number of Referrals: The number of referrals made by the customer.
    Offer: The last marketing offer the customer accepted.
    Online Backup: Whether the customer has online backup service.
    Online Security: Whether the customer has online security service.
    Paperless Billing: Whether the customer has paperless billing.
    Partner: Whether the customer has a partner.
    Payment Method: The customer's payment method.
    Phone Service: Whether the customer has phone service.
    Population: The estimated population of the customer's zip code.
    Premium Tech Support: Whether the customer has premium tech support.
    Quarter: The fiscal quarter for the data.
    Referred a Friend: Indicates if the customer has referred a friend.
    Satisfaction Score: The customer's satisfaction rating.
    Senior Citizen: Whether the customer is a senior citizen.
    State: The state of the customer's residence.
    Streaming Movies: Whether the customer has streaming movies service.
    Streaming Music: Whether the customer has streaming music service.
    Streaming TV: Whether the customer has streaming TV service.
    Tenure in Months: The number of months the customer has been with the company.
    Total Charges: The customer's total charges.
    Total Extra Data Charges: The total charges for extra data downloads.
    Total Long Distance Charges: The total charges for long distance calls.
    Total Refunds: The total refunds received by the customer.
    Total Revenue: The total revenue generated by...
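
    As a rough illustration of the segmentation use described for this dataset, the sketch below computes a churn rate per contract type from the Contract and Churn Value columns. The rows and contract labels are hypothetical stand-ins, not values taken from the file, and `churn_rate_by` is a helper name invented for this example.

```python
from collections import defaultdict

# Hypothetical rows using two of the listed columns; real values may differ.
rows = [
    {"Contract": "Month-to-Month", "Churn Value": 1},
    {"Contract": "Month-to-Month", "Churn Value": 0},
    {"Contract": "Month-to-Month", "Churn Value": 1},
    {"Contract": "Two Year", "Churn Value": 0},
    {"Contract": "Two Year", "Churn Value": 0},
]


def churn_rate_by(rows, column):
    """Return the share of churned customers (Churn Value == 1) per value of `column`."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        totals[row[column]] += 1
        churned[row[column]] += int(row["Churn Value"])
    return {key: churned[key] / totals[key] for key in totals}


rates = churn_rate_by(rows, "Contract")
```

    The same helper can group by any other categorical column in the list, such as City or Offer.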

  12. Why Do Customers Leave? Can You Spot the Churners?

    • kaggle.com
    zip
    Updated Dec 21, 2024
    Cite
    Hassan El Fattmi (2024). Why Do Customers Leave? Can You Spot the Churners? [Dataset]. https://www.kaggle.com/datasets/hassanelfattmi/why-do-customers-leave-can-you-spot-the-churners/code
    Explore at:
    Available download formats: zip (3469602 bytes)
    Dataset updated
    Dec 21, 2024
    Authors
    Hassan El Fattmi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Telco customer churn dataset contains information about a fictional telco company that provided home phone and Internet services to 7043 customers in California in Q3. It indicates which customers have left, stayed, or signed up for their service. Multiple important demographics are included for each customer, as well as a Satisfaction Score, Churn Score, and Customer Lifetime Value (CLTV) index. It is designed to facilitate analysis of customer behavior and retention strategies.

    Customer_Info.csv

    Column Name: Description
    CustomerID: A unique ID that identifies each customer.
    Gender: The customer’s gender: Male, Female.
    Age: The customer’s current age, in years, at the time the fiscal quarter ended.
    Senior Citizen: Indicates if the customer is 65 or older: Yes, No.
    Married: Indicates if the customer is married: Yes, No.
    Dependents: Indicates if the customer lives with any dependents: Yes, No.
    Number of Dependents: Indicates the number of dependents that live with the customer.

    Location_Data.csv

    Column Name: Description
    CustomerID: A unique ID that identifies each customer.
    Country: The country of the customer’s primary residence.
    State: The state of the customer’s primary residence.
    City: The city of the customer’s primary residence.
    Zip Code: The zip code of the customer’s primary residence.
    Total Population: A current population estimate for the entire Zip Code area.
    Latitude: The latitude of the customer’s primary residence.
    Longitude: The longitude of the customer’s primary residence.

    Online_Services.csv

    Column Name: Description
    CustomerID: A unique ID that identifies each customer.
    Phone Service: Indicates if the customer subscribes to home phone service with the company: Yes, No
    Internet Service: Indicates if the customer subscribes to Internet service with the company: No, DSL, Fiber Optic, Cable.
    Online Security: Indicates if the customer subscribes to an additional online security service provided by the company: Yes, No
    Online Backup: Indicates if the customer subscribes to an additional online backup service provided by the company: Yes, No
    Device Protection Plan: Indicates if the customer subscribes to an additional device protection plan: Yes, No
    Premium Tech Support: Indicates if the customer subscribes to an additional technical support plan: Yes, No
    Streaming TV: Indicates if the customer uses their Internet service to stream television programming: Yes, No
    Streaming Movies: Indicates if the customer uses their Internet service to stream movies: Yes, No
    Streaming Music: Indicates if the customer uses their Internet service to stream music: Yes, No

    Payment_Info.csv

    Column Name: Description
    CustomerID: A unique ID that identifies each customer. ...
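
    Because each file in this dataset carries a CustomerID column, the tables can be combined on that key. The sketch below is one minimal way to do so with the standard library; the inline excerpts stand in for Customer_Info.csv and Location_Data.csv, their values are hypothetical, and `index_by_id` and `inner_join` are helper names invented for this example.

```python
import csv
import io

# Hypothetical excerpts standing in for Customer_Info.csv and Location_Data.csv.
CUSTOMER_INFO = """CustomerID,Gender,Age
C-1,Female,37
C-2,Male,46
"""
LOCATION_DATA = """CustomerID,City,Zip Code
C-1,Los Angeles,90022
C-3,San Diego,92101
"""


def index_by_id(text, key="CustomerID"):
    """Read CSV text into a dict of rows keyed by the customer ID."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def inner_join(*tables):
    """Merge rows whose CustomerID appears in every table."""
    shared = set(tables[0]).intersection(*tables[1:])
    merged = {}
    for customer_id in sorted(shared):
        row = {}
        for table in tables:
            row.update(table[customer_id])
        merged[customer_id] = row
    return merged


joined = inner_join(index_by_id(CUSTOMER_INFO), index_by_id(LOCATION_DATA))
```

    An inner join drops customers missing from any file; a left join would be the alternative if every row of Customer_Info.csv has to be kept.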
The Devastator (2023). Movies Performance and Feature Statistics [Dataset]. https://www.kaggle.com/datasets/thedevastator/movies-performance-and-feature-statistics

Movies Performance and Feature Statistics

Analyzing Box Office Performance, Rating and Audience Reactions

Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jan 16, 2023
Dataset provided by
Kaggle
Authors
The Devastator
Description

By Yashwanth Sharaff [source]

About this dataset

This dataset contains essential characteristics of a variety of movies: basic information such as each movie's title and budget, plus attributes and performance indicators such as MPAA rating, gross revenue, release date, genre, runtime, rating count, and summary. With this dataset we can better understand the film industry and explore how different features and performance metrics relate to one another and to a movie's success. It can also help identify which features are the strongest indicators of a high-grossing feature film.

How to use the dataset

To get the most out of this dataset, you need to understand what each column represents. The 'Title' column gives the title of the movie, which can be used for further search on streaming services and websites that provide detailed information about movies. The 'MPAA Rating' column lists the movie's Motion Picture Association of America (MPAA) rating, such as G (General Audiences), PG (Parental Guidance Suggested), PG-13 (Parents Strongly Cautioned), or R (Under 17 Requires Accompanying Parent or Guardian). The 'Budget' column gives an approximate idea of how much a production cost, and the 'Gross' column shows its earnings from its theatrical release. The 'Release Date' column shows when each film was released or is scheduled for release. The 'Genre' column describes the type of movie, 'Runtime' gives its length in minutes, and 'Rating Count' is the number of people who have rated or reviewed it, whether on IMDb, other streaming services, or print media such as newspapers. Finally, the 'Summary' field gives an overview of what to expect from the film, which is worth checking before watching, especially with children in the family.

So go ahead - start exploring this interesting dataset today!

Research Ideas

  • Creating a box office prediction model using budget, genre, release date and MPAA rating
  • Using the summary data to create a sentiment analysis tool for movie reviews
  • Building a recommendation engine for users based on their prior ratings and what other users with similar tastes have rated as highly

License

See the dataset description for more information.

Columns

File: movies.csv

| Column name  | Description                                                                    |
|:-------------|:-------------------------------------------------------------------------------|
| Title        | The title of the movie. (String)                                               |
| MPAA Rating  | The Motion Picture Association of America (MPAA) rating of the movie. (String) |
| Budget       | The budget of the movie in US dollars. (Integer)                               |
| Gross        | The gross revenue of the movie in US dollars. (Integer)                        |
| Release Date | The date the movie was released. (Date)                                        |
| Genre        | The genre of the movie. (String)                                               |
| Runtime      | The length of the movie in minutes. (Integer)                                  |
| Rating Count | The number of ratings the movie has received. (Integer)                        |
| Summary      | A brief summary of the movie. (String)                                         |
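
Given the column types listed above, a first pass over movies.csv might parse Budget and Gross as integers and derive a gross-to-budget ratio. This is only a sketch: the two sample rows are hypothetical, several columns are omitted for brevity, and `gross_to_budget` is a name made up for this example.

```python
import csv
import io

# Hypothetical excerpt standing in for movies.csv; some columns omitted for brevity.
SAMPLE = """Title,MPAA Rating,Budget,Gross,Genre,Runtime
Film A,PG,1000000,3000000,Comedy,95
Film B,R,2000000,1000000,Drama,120
"""


def gross_to_budget(rows):
    """Map each title to Gross / Budget, skipping rows with a non-positive budget."""
    ratios = {}
    for row in rows:
        budget = int(row["Budget"])
        if budget > 0:
            ratios[row["Title"]] = int(row["Gross"]) / budget
    return ratios


ratios = gross_to_budget(csv.DictReader(io.StringIO(SAMPLE)))
```

A ratio above 1 means the film grossed more than its budget; it says nothing about profit, since marketing and distribution costs are not among the columns.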

Acknowledgements

If you use this dataset in your research, please credit the original author, Yashwanth Sharaff.
