Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Discover the new, expanded version of this dataset with 20,000 ticket entries! Perfect for training models to classify and prioritize support tickets.
Definitely check out my other dataset:
Tickets from Github Issues
It includes priorities, queues, types, tags, and business types. This preview offers a detailed structure with classifications by department, type, priority, language, subject, full email text, and agent answers.
Field | Description | Values |
---|---|---|
🔀 Queue | Specifies the department to which the email ticket is routed | e.g. Technical Support, Customer Service, Billing and Payments, ... |
🚦 Priority | Indicates the urgency and importance of the issue | 🟢Low 🟠Medium 🔴Critical |
🗣️ Language | Indicates the language in which the email is written | EN, DE, ES, FR, PT |
Subject | Subject of the customer's email | |
Body | Body of the customer's email | |
Answer | The response provided by the helpdesk agent | |
Type | The type of ticket as picked by the agent | e.g. Incident, Request, Problem, Change ... |
🏢 Business Type | The business type of the support helpdesk | e.g. Tech Online Store, IT Services, Software Development Company |
Tags | Tags/categories assigned to the ticket, split into ten columns in the dataset | e.g. "Software Bug", "Warranty Claim" |
Specifies the department to which the email ticket is categorized. This helps in routing the ticket to the appropriate support team for resolution.
- 💻 Technical Support: Technical issues and support requests.
- 🈂️ Customer Service: Customer inquiries and service requests.
- 💰 Billing and Payments: Billing issues and payment processing.
- 🖥️ Product Support: Support for product-related issues.
- 🌐 IT Support: Internal IT support and infrastructure issues.
- 🔄 Returns and Exchanges: Product returns and exchanges.
- 📞 Sales and Pre-Sales: Sales inquiries and pre-sales questions.
- 🧑💻 Human Resources: Employee inquiries and HR-related issues.
- ❌ Service Outages and Maintenance: Service interruptions and maintenance.
- 📮 General Inquiry: General inquiries and information requests.
Indicates the urgency and importance of the issue. Helps in managing the workflow by prioritizing tickets that need immediate attention.
- 🟢 1 (Low): Non-urgent issues that do not require immediate attention. Examples: general inquiries, minor inconveniences, routine updates, and feature requests.
- 🟠 2 (Medium): Moderately urgent issues that need timely resolution but are not critical. Examples: performance issues, intermittent errors, and detailed user questions.
- 🔴 3 (Critical): Urgent issues that require immediate attention and quick resolution. Examples: system ...
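The three priority levels above form an ordinal scale, so a classifier or triage script would typically map the labels to their numeric codes before sorting or training. The following is a minimal sketch of that mapping; the `priority` and `subject` keys and the sample rows are hypothetical and are not taken from the dataset files.

```python
# Ordinal codes as listed in the priority description (1 = Low, 2 = Medium, 3 = Critical).
PRIORITY_CODES = {"low": 1, "medium": 2, "critical": 3}


def encode_priority(label: str) -> int:
    """Map a priority label to its ordinal code, ignoring case and padding."""
    key = label.strip().lower()
    if key not in PRIORITY_CODES:
        raise ValueError(f"unknown priority label: {label!r}")
    return PRIORITY_CODES[key]


def sort_by_urgency(tickets: list[dict]) -> list[dict]:
    """Return the tickets ordered from most urgent to least urgent."""
    return sorted(tickets, key=lambda t: encode_priority(t["priority"]), reverse=True)


# Made-up rows; "priority" and "subject" are assumed column names.
tickets = [
    {"subject": "Feature request", "priority": "Low"},
    {"subject": "Checkout is down", "priority": "Critical"},
    {"subject": "Slow dashboard", "priority": "Medium"},
]
ordered = sort_by_urgency(tickets)
print([t["subject"] for t in ordered])
```

Raising on an unknown label, rather than silently assigning a default code, keeps mislabeled rows visible during cleaning.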
http://opendatacommons.org/licenses/dbcl/1.0/
Overview
This dataset comprises detailed records of customer support tickets, providing valuable insights into various aspects of customer service operations. It is designed to aid in the analysis and modeling of customer support processes, offering a wealth of information for data scientists, machine learning practitioners, and business analysts.
Dataset Description
The dataset includes the following features:
- Ticket ID: Unique identifier for each support ticket.
- Customer Name: Name of the customer who submitted the ticket.
- Customer Email: Email address of the customer.
- Customer Age: Age of the customer.
- Customer Gender: Gender of the customer.
- Product Purchased: Product for which the customer has requested support.
- Date of Purchase: Date when the product was purchased.
- Ticket Type: Type of support ticket (e.g., Technical Issue, Billing Inquiry).
- Ticket Subject: Brief subject or title of the ticket.
- Ticket Description: Detailed description of the issue or inquiry.
- Ticket Status: Current status of the ticket (e.g., Open, Closed, Pending).
- Resolution: Description of how the ticket was resolved.
- Ticket Priority: Priority level of the ticket (e.g., High, Medium, Low).
- Ticket Channel: The channel through which the ticket was submitted (e.g., Email, Phone, Web).
- First Response Time: Time taken for the first response to the ticket.
- Time to Resolution: Total time taken to resolve the ticket.
- Customer Satisfaction Rating: Customer satisfaction rating for the support received.
Usage
This dataset can be utilized for various analytical and modeling purposes, including but not limited to:
- Customer Support Analysis: Understand trends and patterns in customer support requests, and analyze ticket volumes, response times, and resolution effectiveness.
- NLP for Ticket Categorization: Develop natural language processing models to automatically classify tickets based on their content.
- Customer Satisfaction Prediction: Build predictive models to estimate customer satisfaction based on ticket attributes.
- Ticket Resolution Time Prediction: Predict the time required to resolve tickets based on historical data.
- Customer Segmentation: Segment customers based on their support interactions and demographics.
- Recommender Systems: Develop systems to recommend products or solutions based on past support tickets.
Potential Applications:
- Enhancing customer support workflows by identifying bottlenecks and areas for improvement.
- Automating the ticket triaging process to ensure timely responses.
- Improving customer satisfaction through predictive analytics.
- Personalizing customer support based on segmentation and past interactions.
File information: The dataset is provided in CSV format and contains 8470 records and [number of columns] features.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Featuring Labeled Customer Emails and Support Responses
🔧 Synthetic IT Ticket Generator — Custom Dataset
Create a dataset tailored to your own queues & priorities (no PII). 👉 Generate custom data
Define your queues, priorities, language
Need an on-prem AI to auto-classify tickets? → Open Ticket AI
There are two versions of the dataset. The new version has more tickets but covers only English and German, so please look at both files to find what best fits your needs.… See the full description on the dataset page: https://huggingface.co/datasets/Tobi-Bueck/customer-support-tickets.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset contains fictitious data representing customer sales for TechGenius Inc. The dataset includes various attributes related to customers, their sales revenue for the years 2022 and 2023, customer satisfaction scores, customer ages, customer locations, and technical support tickets.
Data Description:
- Customer ID: Unique identifier for each customer.
- Year: The year of the sales data (2022 or 2023).
- Sales Revenue 2022 (USD): Total sales revenue generated by the customer in the year 2022 (USD).
- Sales Revenue 2023 (USD): Total sales revenue generated by the customer in the year 2023 (USD).
- Customer Satisfaction Score: The satisfaction score of the customer, ranging from 3.5 to 5.0.
- Customer Age: The age of the customer.
- Customer Location: The location (city) of the customer.
- Technical Support Tickets: The number of technical support tickets raised by the customer.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Yash Dixit
Released under Apache 2.0
https://creativecommons.org/publicdomain/zero/1.0/
Airline data holds immense importance as it offers insights into the functioning and efficiency of the aviation industry. It provides valuable information about flight routes, schedules, passenger demographics, and preferences, which airlines can leverage to optimize their operations and enhance customer experiences. By analyzing data on delays, cancellations, and on-time performance, airlines can identify trends and implement strategies to improve punctuality and mitigate disruptions. Moreover, regulatory bodies and policymakers rely on this data to ensure safety standards, enforce regulations, and make informed decisions regarding aviation policies. Researchers and analysts use airline data to study market trends, assess environmental impacts, and develop strategies for sustainable growth within the industry. In essence, airline data serves as a foundation for informed decision-making, operational efficiency, and the overall advancement of the aviation sector.
This dataset comprises diverse parameters relating to airline operations on a global scale. The dataset prominently incorporates fields such as Passenger ID, First Name, Last Name, Gender, Age, Nationality, Airport Name, Airport Country Code, Country Name, Airport Continent, Continents, Departure Date, Arrival Airport, Pilot Name, and Flight Status. These columns collectively provide comprehensive insights into passenger demographics, travel details, flight routes, crew information, and flight statuses. Researchers and industry experts can leverage this dataset to analyze trends in passenger behavior, optimize travel experiences, evaluate pilot performance, and enhance overall flight operations.
The dataset provided here is a simulated example and was generated using the online platform found at Mockaroo. This web-based tool offers a service that enables the creation of customizable Synthetic datasets that closely resemble real data. It is primarily intended for use by developers, testers, and data experts who require sample data for a range of uses, including testing databases, filling applications with demonstration data, and crafting lifelike illustrations for presentations and tutorials. To explore further details, you can visit their website.
Cover Photo by: Kevin Woblick on Unsplash
Thumbnail by: Airplane icons created by Freepik - Flaticon
http://opendatacommons.org/licenses/dbcl/1.0/
The Flights Booking Dataset of various airlines was scraped date-wise from a well-known website and stored in a structured format. The dataset contains records of flight travel details between cities in India. It includes multiple features, such as Source & Destination City, Arrival & Departure Time, Duration, and Price of the flight.
This data is available as a CSV file. We are going to analyze this data set using the Pandas DataFrame.
This analysis will be helpful for those working in the airline and travel domains.
Using this dataset, we answered multiple questions with Python in our Project.
Q.1. What are the airlines in the dataset, accompanied by their frequencies?
Q.2. Show Bar Graphs representing the Departure Time & Arrival Time.
Q.3. Show Bar Graphs representing the Source City & Destination City.
Q.4. Does price vary with airlines?
Q.5. Does ticket price change based on the departure time and arrival time?
Q.6. How does the price change with a change in Source and Destination?
Q.7. How is the price affected when tickets are bought just 1 or 2 days before departure?
Q.8. How does the ticket price vary between Economy and Business class?
Q.9. What will be the Average Price of Vistara airline for a flight from Delhi to Hyderabad in Business Class?
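A question like Q.9 reduces to filtering rows on several columns and averaging the price. The sketch below shows that logic in plain Python as a stand-in for the equivalent pandas filter; the column keys, the second airline name, and all prices are made up for illustration and do not come from the dataset.

```python
from statistics import mean

# Made-up rows; keys are assumed column names and prices are invented.
flights = [
    {"airline": "Vistara", "source_city": "Delhi", "destination_city": "Hyderabad",
     "class": "Business", "price": 50000},
    {"airline": "Vistara", "source_city": "Delhi", "destination_city": "Hyderabad",
     "class": "Business", "price": 46000},
    {"airline": "Vistara", "source_city": "Delhi", "destination_city": "Hyderabad",
     "class": "Economy", "price": 6000},
    {"airline": "OtherAir", "source_city": "Delhi", "destination_city": "Hyderabad",
     "class": "Business", "price": 40000},
]


def average_price(rows: list[dict], filters: dict) -> float:
    """Mean price over the rows whose fields equal every value in `filters`."""
    prices = [r["price"] for r in rows
              if all(r[key] == value for key, value in filters.items())]
    if not prices:
        raise ValueError("no rows match the filters")
    return mean(prices)


avg = average_price(flights, {
    "airline": "Vistara",
    "source_city": "Delhi",
    "destination_city": "Hyderabad",
    "class": "Business",
})
print(avg)
```

The filters are passed as a dictionary because `class` is a reserved word in Python and cannot be used as a keyword argument.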
These are the main Features/Columns available in the dataset :
1) Airline: The name of the airline company is stored in the airline column. It is a categorical feature having 6 different airlines.
2) Flight: Flight stores information regarding the plane's flight code. It is a categorical feature.
3) Source City: City from which the flight takes off. It is a categorical feature having 6 unique cities.
4) Departure Time: This is a derived categorical feature created by grouping time periods into bins. It stores information about the departure time and has 6 unique time labels.
5) Stops: A categorical feature with 3 distinct values that stores the number of stops between the source and destination cities.
6) Arrival Time: This is a derived categorical feature created by grouping time intervals into bins. It has six distinct time labels and keeps information about the arrival time.
7) Destination City: City where the flight will land. It is a categorical feature having 6 unique cities.
8) Class: A categorical feature that contains information on seat class; it has two distinct values: Business and Economy.
9) Duration: A continuous feature that displays the overall amount of time it takes to travel between cities in hours.
10) Days Left: This is a derived feature that is calculated by subtracting the booking date from the trip date.
11) Price: The target variable, which stores the ticket price.
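The Days Left feature is a plain date difference. A minimal sketch of the derivation, assuming both dates are available as `datetime.date` values (the function name and the sample dates are illustrative only):

```python
from datetime import date


def days_left(booking_date: date, trip_date: date) -> int:
    """Number of days between booking and travel (trip date minus booking date)."""
    delta = (trip_date - booking_date).days
    if delta < 0:
        raise ValueError("trip date is earlier than booking date")
    return delta


# A ticket booked two days before departure.
print(days_left(date(2022, 2, 10), date(2022, 2, 12)))
```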
DESCRIPTION
Comcast is an American global telecommunication company. The firm has been providing terrible customer service. They continue to fall short despite repeated promises to improve. Only last month (October 2016) the authority fined them $2.3 million, after receiving over 1000 consumer complaints. The existing database will serve as a repository of public customer complaints filed against Comcast. It will help to pin down what is wrong with Comcast's customer service.
Data Dictionary
Analysis Task
To perform these tasks, you can use any of the different Python libraries such as NumPy, SciPy, Pandas, scikit-learn, matplotlib, and BeautifulSoup.
- Which complaint types are maximum, i.e., around internet, network issues, or across any other domains.
- Create a new categorical variable with value as Open and Closed. Open & Pending is to be categorized as Open and Closed & Solved is to be categorized as Closed.
- Provide state wise status of complaints in a stacked bar chart. Use the categorized variable from Q3. Provide insights on:
  - Which state has the maximum complaints
  - Which state has the highest percentage of unresolved complaints
- Provide the percentage of complaints resolved till date, which were received through the Internet and customer care calls.
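The status recoding and the per-state unresolved percentage described in the tasks can be sketched as follows. This is a plain-Python illustration rather than a full solution; the `(state, status)` tuple layout and the sample rows are assumptions, not the real complaint records.

```python
from collections import defaultdict

# Recoding taken from the task text: Open & Pending -> Open, Closed & Solved -> Closed.
STATUS_GROUPS = {"Open": "Open", "Pending": "Open", "Closed": "Closed", "Solved": "Closed"}


def group_status(status: str) -> str:
    """Collapse the four raw statuses into the two categories Open / Closed."""
    return STATUS_GROUPS[status.strip().title()]


def unresolved_share_by_state(complaints: list[tuple[str, str]]) -> dict[str, float]:
    """Percentage of complaints per state whose grouped status is Open."""
    totals: dict[str, int] = defaultdict(int)
    unresolved: dict[str, int] = defaultdict(int)
    for state, status in complaints:
        totals[state] += 1
        if group_status(status) == "Open":
            unresolved[state] += 1
    return {state: 100.0 * unresolved[state] / totals[state] for state in totals}


# Made-up (state, status) pairs for illustration.
complaints = [
    ("Georgia", "Open"), ("Georgia", "Closed"), ("Georgia", "Pending"), ("Georgia", "Solved"),
    ("Florida", "Solved"), ("Florida", "Pending"), ("Florida", "Closed"), ("Florida", "Closed"),
]
shares = unresolved_share_by_state(complaints)
print(shares)
print(max(shares, key=shares.get))  # state with the highest unresolved percentage
```

The same per-state counts of Open and Closed complaints are what a stacked bar chart would plot.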
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is a combination of Stanford's Alpaca (https://github.com/tatsu-lab/stanford_alpaca) and FiQA (https://sites.google.com/view/fiqa/), with another 1.3k pairs custom generated using GPT3.5. Script for tuning through Kaggle's (https://www.kaggle.com) free resources using PEFT/LoRA: https://www.kaggle.com/code/gbhacker23/wealth-alpaca-lora. GitHub repo with performance analyses, training and data generation scripts, and inference notebooks: https://github.com/gaurangbharti1/wealth-alpaca… See the full description on the dataset page: https://huggingface.co/datasets/gbharti/finance-alpaca.
https://creativecommons.org/publicdomain/zero/1.0/
A high-quality synthetic dataset simulating 100,000+ customers with:
- Demographics (age, gender, location, income)
- Transaction history (purchase dates, spending, product categories)
- Behavioral metrics (churn risk, device usage, support tickets)
- Perfect for segmentation, CLV prediction, and churn modeling
No real PII – Safe for sharing
Realistic distributions – Mimics e-commerce patterns
Ready-to-use – Clean, structured, and documented
synthetic_customers.csv
(100K rows × 20 columns)
synthetic_customers_metadata.json
https://creativecommons.org/publicdomain/zero/1.0/
A ride-sharing company wants to implement a dynamic pricing strategy to optimize fares based on real-time market conditions. The company only uses ride duration to decide ride fares currently. The company aims to leverage data-driven techniques to analyze historical data and develop a predictive model that can dynamically adjust prices in response to changing factors.
The dataset containing historical ride data has been provided. It includes features such as the number of riders, number of drivers, location category, customer loyalty status, number of past rides, average ratings, time of booking, vehicle type, expected ride duration, and historical cost of the rides.
Your goal is to build a dynamic pricing model that incorporates the provided features to predict optimal fares for rides in real-time. The model must consider factors such as demand patterns and supply availability.
Features:
'Number_of_Riders', 'Number_of_Drivers', 'Location_Category', 'Customer_Loyalty_Status', 'Number_of_Past_Rides', 'Average_Ratings', 'Time_of_Booking', 'Vehicle_Type', 'Expected_Ride_Duration', 'Historical_Cost_of_Ride'
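One simple way to reflect demand and supply in a fare is to scale a base cost by the ratio of riders to drivers. The sketch below is only an illustrative heuristic and not the predictive model the task asks for; the clamping bounds `floor` and `cap` are assumed values chosen for the example, and the inputs correspond to the `Number_of_Riders`, `Number_of_Drivers`, and `Historical_Cost_of_Ride` features.

```python
def demand_supply_ratio(riders: int, drivers: int) -> float:
    """Riders per available driver; values above 1 indicate excess demand."""
    if drivers <= 0:
        raise ValueError("drivers must be positive")
    return riders / drivers


def adjusted_fare(base_cost: float, riders: int, drivers: int,
                  floor: float = 0.8, cap: float = 1.5) -> float:
    """Scale a base fare by the demand/supply ratio, clamped to [floor, cap].

    The floor and cap are hypothetical bounds, not values from the dataset.
    """
    multiplier = min(max(demand_supply_ratio(riders, drivers), floor), cap)
    return round(base_cost * multiplier, 2)


print(adjusted_fare(100.0, riders=90, drivers=45))  # high demand, multiplier hits the cap
print(adjusted_fare(100.0, riders=40, drivers=50))  # low demand, multiplier hits the floor
```

A learned model would replace the fixed multiplier with a prediction that also uses the remaining features, such as time of booking, vehicle type, and loyalty status.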
Some References:
- Dynamic Pricing Explained: Machine Learning in Revenue Management and Pricing Optimization
- Dynamic Pricing using Reinforcement Learning
- Dynamic Pricing on E-commerce Platform with Deep Reinforcement Learning: A Field Experiment
- Engineering Extreme Event Forecasting at Uber with Recurrent Neural Networks
http://www.gnu.org/licenses/fdl-1.3.html
Small toy dataset inspired by ITSM (IT service management) tickets. It deliberately includes noisy labels, multiple languages, and missing data. Here is one data examination and cleaning procedure written by me:
Feel free to add yours!
Description: The data represents Slack Queries of an Ed Tech Company and their resolution status.
Features:
- Ticket Id: Unique ticket id for each query raised.
- Student or WP: Category of the person asking the query (Student, Working Professional).
- Program Name: Program name ('Full stack Program', 'Backend Program', 'Fellowship Program').
- Status (Ticket): Status of the ticket.
- Created Time (Ticket): Time when the ticket was created.
- Ticket Closed Time: Time when the query was resolved.
- First Response Time: Time of the first response.
- Project Phase: Project phase ('trial phase', 'fullstack-phase-1', 'fullstack-phase-2', 'system-issues', 'backend-phase2', 'fullstack-phase-4', 'fullstack-phase-3', 'fellowship-phase-1', 'backend-phase-3', 'backend-phase1').
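The three timestamp fields above are enough to derive response and resolution durations for each ticket. A minimal sketch, assuming the timestamps are strings in `YYYY-MM-DD HH:MM:SS` form (the actual format in the file is not stated and may differ); the sample ticket is made up:

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format; adjust to match the real file


def hours_between(start: str, end: str) -> float:
    """Elapsed hours between two timestamp strings in TIME_FORMAT."""
    started = datetime.strptime(start, TIME_FORMAT)
    ended = datetime.strptime(end, TIME_FORMAT)
    return (ended - started).total_seconds() / 3600


# Made-up ticket using the field names from the feature list.
ticket = {
    "Created Time (Ticket)": "2023-01-05 09:00:00",
    "First Response Time": "2023-01-05 09:30:00",
    "Ticket Closed Time": "2023-01-05 13:00:00",
}
first_response_hours = hours_between(ticket["Created Time (Ticket)"], ticket["First Response Time"])
resolution_hours = hours_between(ticket["Created Time (Ticket)"], ticket["Ticket Closed Time"])
print(first_response_hours, resolution_hours)
```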