29 datasets found
  1. IBM🎗️ | Stock Prices Dataset📊

    • kaggle.com
    zip
    Updated May 8, 2024
    Cite
    Mursaleen Ameer (2024). IBM🎗️ | Stock Prices Dataset📊 [Dataset]. https://www.kaggle.com/datasets/innocentmfa/ibm-stock-prices-dataset
    Explore at:
    Available download formats: zip (27608 bytes)
    Dataset updated
    May 8, 2024
    Authors
    Mursaleen Ameer
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Description:

    This dataset contains historical stock price data for International Business Machines Corporation (IBM) from [Jan/01/2020] to [May/01/2024]. The dataset includes daily closing prices, adjusted closing prices, and other relevant information.

    Features:

    • Date:
    • Open:
    • High:
    • Low:
    • Close:
    • Adj Close:
    • Volume:

    Use Cases:

    • Predicting stock prices
    • Building stock forecasting models
    • Analyzing stock market trends
    • Backtesting investment strategies
    • Comparing machine learning models for stock prediction

      This dataset is perfect for data scientists, analysts, and students looking to practice their skills in:

    • Time series analysis

    • Stock market analysis

    • Predictive modeling

    • Machine learning

    Get started: Download the dataset and start exploring!
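As a minimal sketch of a first exploration step, the snippet below parses rows that use the column names listed under Features and computes daily returns and a trailing moving average of the closing price. The sample rows, the date format, and the helper names (`load_closes`, `daily_returns`, `moving_average`) are assumptions for illustration and are not taken from the dataset itself.

```python
import csv
import io

# Hypothetical sample rows using the dataset's column names; the values are made up.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,100.0,102.0,99.0,101.0,101.0,1000
2020-01-03,101.0,103.0,100.0,102.0,102.0,1200
2020-01-06,102.0,104.0,101.0,100.0,100.0,900
2020-01-07,100.0,105.0,100.0,104.0,104.0,1500
"""


def load_closes(text):
    """Return (date, close) pairs from CSV text with Date and Close columns."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["Date"], float(row["Close"])) for row in reader]


def daily_returns(closes):
    """Simple return between consecutive closing prices."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]


def moving_average(values, window):
    """Trailing moving average; one value per full window."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


rows = load_closes(SAMPLE_CSV)
closes = [close for _, close in rows]
returns = daily_returns(closes)
sma2 = moving_average(closes, 2)
```

With the real file, the same functions could be fed the text of the downloaded CSV instead of `SAMPLE_CSV`, provided the header matches the feature names above.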

  2. IBM Debater® - Recorded Debating Dataset - Release #4 (Compressed audio...

    • research.ibm.com
    Updated Sep 25, 2017
    + more versions
    Cite
    (2017). IBM Debater® - Recorded Debating Dataset - Release #4 (Compressed audio files) + Annotated general-purpose claim-rebuttal pairs [Dataset]. https://research.ibm.com/haifa/dept/vst/debating_data.shtml
    Explore at:
    Dataset updated
    Sep 25, 2017
    Description

    200 speeches recorded by professional debaters discussing 50 controversial topics (with their manual and automatic transcriptions), and 55 general-purpose claim-rebuttal pairs, along with the results of several annotation experiments performed on these data. The dataset includes:

    • Audio files of 200 debating speeches (down-sampled, mono, and compressed with FLAC). [first released in IBM Debater® - Recorded Debating Dataset - Release #2]
    • Manual and automatic transcripts of the speeches, in both raw and cleaned (processed) versions. [first released in IBM Debater® - Recorded Debating Dataset - Release #2]
    • 55 general-purpose claim-rebuttal pairs written by an expert human debater
    • The results of several annotation experiments performed using the general-purpose claim-rebuttal pairs and the speeches

    Size: 1.2 GB

  3. Coffee shop sample data (11.1.3+)

    • kaggle.com
    zip
    Updated Nov 8, 2019
    + more versions
    Cite
    Jack Chang (2019). Coffee shop sample data (11.1.3+) [Dataset]. https://www.kaggle.com/datasets/ylchang/coffee-shop-sample-data-1113/suggestions
    Explore at:
    Available download formats: zip (602708 bytes)
    Dataset updated
    Nov 8, 2019
    Authors
    Jack Chang
    Description

    Context

    This sample data module contains representative retail data from a fictional coffee chain. The source data is contained in an uploaded file named April Sales.zip. Source: IBM.

    We have created sample data for a fictional coffee shop chain with three locations in New York City. The chain has purchased IBM Cognos Analytics to identify factors that contribute to their success, and ultimately to make data-informed decisions.

    Amber and Sandeep are the co-founders of the coffee chain. They uploaded their data in a series of spreadsheets and created a data module. From that data, they designed an operations dashboard and a marketing dashboard.

    Inventory

    Amber and Sandeep have created two dashboards and one data module that is based on nine spreadsheets:

    • Coffee operations: This sample dashboard demonstrates operational data from a fictional coffee chain. Location: Team content > Samples > Dashboards.
    • Coffee marketing: This sample dashboard demonstrates marketing data from a fictional coffee chain. Location: Team content > Samples > Dashboards.
    • Coffee sales and marketing: This sample data module contains representative retail data from a fictional coffee chain. Location: Team content > Samples > Data.
    • April Sales.zip: This sample data contains representative retail data from a fictional coffee chain. This ZIP file contains nine related CSV files. Location: Team content > Samples > Data > Source files > Retail.

    Content

    Data

    The sample data module named Coffee sales and marketing can be found in Team content > Samples > Data. There are nine tables:

    • Sales Receipts
    • Pastry Inventory
    • Sales Targets
    • Customer
    • Dates
    • Product
    • Sales Outlet
    • Staff
    • Generation

    Acknowledgements

    https://community.ibm.com/community/user/businessanalytics/blogs/steven-macko/2019/07/12/beanie-coffee-1113

  4. earnings_call

    • huggingface.co
    • dataverse.nl
    Cite
    John Henning, earnings_call [Dataset]. http://doi.org/10.34894/TJE0D0
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Authors
    John Henning
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    The dataset reports a collection of earnings call transcripts, the related stock prices, and the sector index. In terms of volume, there is a total of 188 transcripts, 11970 stock prices, and 1196 sector index values. All of these data originated in the period 2016-2020 and are related to the NASDAQ stock market. The data collection was made possible by Yahoo Finance and Thomson Reuters Eikon: Yahoo Finance enabled the search for stock values, and Thomson Reuters Eikon provided the earnings call transcripts. The dataset can be used as a benchmark for evaluating several NLP techniques to understand their potential for financial applications. It is also possible to expand the dataset by extending the period in which the data originated, following a similar procedure.

  5. 2019 Stack Overflow Developer Survey Random Sample

    • kaggle.com
    zip
    Updated Mar 4, 2022
    Cite
    Nolan37 (2022). 2019 Stack Overflow Developer Survey Random Sample [Dataset]. https://www.kaggle.com/datasets/nolan37/m1surveydata
    Explore at:
    Available download formats: zip (5934140 bytes)
    Dataset updated
    Mar 4, 2022
    Authors
    Nolan37
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The underlying data is from Stack Overflow's 2019 Developer Survey responses and can be found at: https://stackoverflow.blog/2019/04/09/the-2019-stack-overflow-developer-survey-results-are-in/ Please note that my intent in uploading this is to showcase my experience working with the datasets. My goal is to build a centralized portfolio.

    Please note that we are using a randomized sample of 1/10th of the original data set. Conclusions may not reflect the real world.
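A 1/10th random sample like the one described can be sketched as follows. This is an illustration only: the description does not say how the sample was drawn, so sampling without replacement, the fixed seed, and the helper name `sample_tenth` are all assumptions.

```python
import random


def sample_tenth(rows, seed=0):
    """Draw one tenth of the rows without replacement (assumed method)."""
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    k = len(rows) // 10
    return rng.sample(rows, k)


# Stand-in for survey responses; the real data has one record per respondent.
responses = list(range(1000))
subset = sample_tenth(responses)
```

Fixing the seed makes the subset repeatable, which matters when a dashboard and a notebook are expected to agree on the same sample.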

    The goal of this project was to explore, analyze, and visualize.

    Follow this link to see the Cognos Dashboard I created: https://dataplatform.cloud.ibm.com/dashboards/ee7bf962-3882-4145-a41c-ecdda9323484/view/4427dc2d63b71c921ee1e6e4079c29002c362d5fe4bb860ad18c7b495d607297f3614099c82f4d5bde135661a7e8400f9d

    Feel free to filter and play with the dashboard as you want.

  6. Naturalistic Variation in Goal-Oriented Dialog datasets

    • github.com
    Updated Jul 22, 2024
    Cite
    IBM (2024). Naturalistic Variation in Goal-Oriented Dialog datasets [Dataset]. https://github.com/IBM/naturalistic-variation-goal-oriented-dialog-datasets
    Explore at:
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    IBM (http://ibm.com/)
    Description

    The datasets are new and more effective testbeds for bAbI dialog task 5 and Stanford Multi-Domain datasets, which incorporate naturalistic variation by the user. Existing benchmarks used to evaluate the performance of end-to-end neural dialog systems lack a key component: natural variation present in human conversations. Most datasets are constructed through crowdsourcing, where the crowd workers follow a fixed template of instructions while enacting the role of a user/agent. This results in straightforward, somewhat routine, and mostly trouble-free conversations, as crowd workers do not think to represent the full range of actions that occur naturally with real users. We observe that there is a significant drop in performance (more than 60% in Ent. F1 on SMD and 85% in per-dialog accuracy on the bAbI task) of recent state-of-the-art end-to-end neural methods such as BossNet and GLMP on both updated datasets which incorporate naturalistic variation by the user.

  7. IBM Debater® - Recorded Debating Dataset - Release #1 (Light version - no...

    • research.ibm.com
    Updated Sep 25, 2017
    + more versions
    Cite
    (2017). IBM Debater® - Recorded Debating Dataset - Release #1 (Light version - no audio files) [Dataset]. https://research.ibm.com/haifa/dept/vst/debating_data.shtml
    Explore at:
    Dataset updated
    Sep 25, 2017
    Description

    60 speeches recorded by professional debaters about controversial topics, and their manual and automatic transcripts, in both raw and cleaned (processed) versions. The dataset includes:

    • Manual and automatic transcripts of the speeches, raw and cleaned versions

    Size: 1 MB

  8. Telco Customer Churn

    • kaggle.com
    zip
    Updated Feb 23, 2018
    + more versions
    Cite
    BlastChar (2018). Telco Customer Churn [Dataset]. https://www.kaggle.com/datasets/blastchar/telco-customer-churn
    Explore at:
    Available download formats: zip (175758 bytes)
    Dataset updated
    Feb 23, 2018
    Authors
    BlastChar
    Description

    Context

    "Predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs." [IBM Sample Data Sets]

    Content

    Each row represents a customer; each column contains a customer attribute described in the column metadata.

    The data set includes information about:

    • Customers who left within the last month – the column is called Churn
    • Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
    • Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
    • Demographic info about customers – gender, age range, and if they have partners and dependents
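One way to start on data shaped like this is to compute the churn rate per group. The sketch below is a rough illustration under assumptions: only the `Churn` column name comes from the description above, while the other column names, the `Yes`/`No` encoding, the sample rows, and the helper `churn_rate_by` are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical rows; only the "Churn" column name is taken from the description.
SAMPLE_CSV = """customerID,Contract,MonthlyCharges,Churn
A1,Month-to-month,70.0,Yes
A2,Two year,50.0,No
A3,Month-to-month,80.0,Yes
A4,One year,60.0,No
A5,Month-to-month,65.0,No
"""


def churn_rate_by(text, column):
    """Share of churned customers for each distinct value of `column`."""
    totals, churned = Counter(), Counter()
    for row in csv.DictReader(io.StringIO(text)):
        key = row[column]
        totals[key] += 1
        if row["Churn"] == "Yes":  # assumed encoding of the churn flag
            churned[key] += 1
    return {key: churned[key] / totals[key] for key in totals}


rates = churn_rate_by(SAMPLE_CSV, "Contract")
```

Grouping by account attributes such as contract type is a common first cut before any predictive modeling, since it shows which segments a retention program might target.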

    Inspiration

    To explore this type of model and learn more about the subject.

    New version from IBM: https://community.ibm.com/community/user/businessanalytics/blogs/steven-macko/2019/07/11/telco-customer-churn-1113

  9. IBM Debater® - Recorded Debating Dataset - Release #5 (Full version - 2...

    • research.ibm.com
    Updated Sep 25, 2017
    + more versions
    Cite
    (2017). IBM Debater® - Recorded Debating Dataset - Release #5 (Full version - 2 parts) + Counter speech annotations [Dataset]. https://research.ibm.com/haifa/dept/vst/debating_data.shtml
    Explore at:
    Dataset updated
    Sep 25, 2017
    Description

    3,562 speeches recorded by professional debaters discussing 440 controversial topics (with their automatic and manually-corrected transcript texts), and an annotation specifying the response speeches recorded for each speech. The dataset will include:

    • Audio files of all debate speeches
    • Automatic and manually-corrected transcripts of the speeches, in both raw and cleaned (processed) versions
    • An annotation specifying the response speeches recorded for each speech, and the type of the response (explicit/implicit)
    • Metadata describing the speeches, such as the topic discussed in each speech

    Size: 30 + 21.7 GB

  10. MarketScan Dental

    • redivis.com
    • stanford.redivis.com
    application/jsonl +7
    Updated Jun 27, 2025
    Cite
    Stanford Center for Population Health Sciences (2025). MarketScan Dental [Dataset]. http://doi.org/10.57761/g33d-dy59
    Explore at:
    Available download formats: csv, avro, parquet, spss, arrow, application/jsonl, stata, sas
    Dataset updated
    Jun 27, 2025
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford Center for Population Health Sciences
    Time period covered
    Jan 1, 2007 - Dec 31, 2023
    Description

    Abstract

    The MarketScan Dental Database is a standalone product that corresponds with and is linkable to a given year and version of the IBM MarketScan Commercial Claims and Encounters Database and the MarketScan Medicare Supplemental and Coordination of Benefits Database. Currently, data is available for the years: 2005 - 2023. In order to view the MarketScan Dental user guide or data dictionary, you must have data access to this dataset.

    Usage

    In addition to what's on this page, we also have:

    Starting in 2026, there will be a data access fee for using the full dataset (though the 1% sample will remain free to use). The pricing structure and other relevant information can be found in this FAQ Sheet.

    Before Manuscript Submission

    All manuscripts (and other items you'd like to publish) must be submitted to support@stanfordphs.freshdesk.com for approval prior to journal submission.

    We will check your cell sizes and citations.

    For more information about how to cite PHS and PHS datasets, please visit:

    https://phsdocs.developerhub.io/need-help/citing-phs-data-core

    Data Documentation

    Data access is required to view this section.

  11. IBM Transactions for Anti Money Laundering (AML)

    • kaggle.com
    zip
    Updated Jul 8, 2025
    Cite
    Erik Altman (2025). IBM Transactions for Anti Money Laundering (AML) [Dataset]. https://www.kaggle.com/datasets/ealtman2019/ibm-transactions-for-anti-money-laundering-aml/code
    Explore at:
    Available download formats: zip (8176169418 bytes)
    Dataset updated
    Jul 8, 2025
    Authors
    Erik Altman
    License

    https://cdla.io/sharing-1-0/

    Description

    CONTEXT

    ========================================

    Money laundering is a multi-billion dollar issue. Detection of laundering is very difficult. Most automated algorithms have a high false positive rate: legitimate transactions incorrectly flagged as laundering. The converse is also a major problem -- false negatives, i.e. undetected laundering transactions. Naturally, criminals work hard to cover their tracks.
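The two error types described above can be made concrete with a small calculation. This is a sketch under assumptions: the labels and detector flags are made-up booleans, and `error_rates` is a hypothetical helper, not part of the dataset or of any IBM tooling.

```python
def error_rates(labels, flags):
    """Return (false positive rate, false negative rate).

    labels: True where a transaction really is laundering.
    flags:  True where a detector flagged the transaction.
    """
    false_pos = sum(1 for y, p in zip(labels, flags) if p and not y)
    false_neg = sum(1 for y, p in zip(labels, flags) if y and not p)
    legitimate = sum(1 for y in labels if not y)
    laundering = sum(1 for y in labels if y)
    fpr = false_pos / legitimate if legitimate else 0.0
    fnr = false_neg / laundering if laundering else 0.0
    return fpr, fnr


# Made-up example: four legitimate and two laundering transactions.
labels = [False, False, False, False, True, True]
flags = [True, False, False, True, True, False]
fpr, fnr = error_rates(labels, flags)
```

Here two of the four legitimate transactions are flagged and one of the two laundering transactions is missed, so both rates come out to 0.5.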

    Access to real financial transaction data is highly restricted -- for both proprietary and privacy reasons. Even when access is possible, it is problematic to provide a correct tag (laundering or legitimate) to each transaction -- as noted above. This synthetic transaction data from IBM avoids these problems.

    The data provided here is based on a virtual world inhabited by individuals, companies, and banks. Individuals interact with other individuals and companies. Likewise, companies interact with other companies and with individuals. These interactions can take many forms, e.g. purchase of consumer goods and services, purchase orders for industrial supplies, payment of salaries, repayment of loans, and more. These financial transactions are generally conducted via banks, i.e. the payer and receiver both have accounts, with accounts taking multiple forms from checking to credit cards to bitcoin.

    Some (small) fraction of the individuals and companies in the generator model engage in criminal behavior -- such as smuggling, illegal gambling, extortion, and more. Criminals obtain funds from these illicit activities, and then try to hide the source of these illicit funds via a series of financial transactions. Such financial transactions to hide illicit funds constitute laundering. Thus, the data available here is labelled and can be used for training and testing AML (Anti Money Laundering) models and for other purposes.

    The data generator that created the data here not only models illicit activity, but also tracks funds derived from illicit activity through arbitrarily many transactions -- thus creating the ability to label laundering transactions many steps removed from their illicit source. With this foundation, it is straightforward for the generator to label individual transactions as laundering or legitimate.

    Note that this IBM generator models the entire money laundering cycle:

    • Placement: Sources like smuggling of illicit funds.
    • Layering: Mixing the illicit funds into the financial system.
    • Integration: Spending the illicit funds.

    As another capability possible only with synthetic data, note that a real bank or other institution typically has access to only a portion of the transactions involved in laundering: the transactions involving that bank. Transactions happening at other banks or between other banks are not seen. Thus, models built on real transactions from one institution can have only a limited view of the world.

    By contrast, these synthetic transactions contain an entire financial ecosystem. Thus it may be possible to create laundering detection models that understand the broad sweep of transactions across institutions, but apply those models to make inferences only about transactions at a particular bank.

    As another point of reference, IBM previously released data from a very early version of this data generator: https://ibm.box.com/v/AML-Anti-Money-Laundering-Data

    The generator has been made significantly more robust since that previous data was released, and these transactions reflect improved realism, bug fixes, and other improvements compared to the previous release.

    Credit card transaction data labeled for fraud and built using a related generator is also available on Kaggle: https://www.kaggle.com/datasets/ealtman2019/credit-card-transactions

    CONTENT

    We release 6 datasets here divided into two groups of three:

    • Group HI has a relatively higher illicit ratio (more laundering).
    • Group LI has a relatively lower illicit ratio (less laundering).
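The illicit ratio that separates the two groups can be read as the fraction of transactions labelled as laundering. The sketch below assumes a 0/1 label per transaction; the label encoding, the toy label lists, and the helper `illicit_ratio` are illustrative assumptions, and the actual ratios of the HI and LI groups are not stated here.

```python
def illicit_ratio(is_laundering):
    """Fraction of transactions labelled laundering (1) rather than legitimate (0)."""
    labels = list(is_laundering)
    return sum(labels) / len(labels) if labels else 0.0


# Toy label lists, not real data: one with more laundering, one with less.
hi_like = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
li_like = [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
```

Checking this ratio first is useful because a very low value means a model can score high accuracy while detecting no laundering at all.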

    Both HI and LI internally have three sets of data: small, medium, and large. The goal is to support a broad degree of modeling and computational resources. All of these datasets are independent, e.g. the small datasets are not ...

  12. The Dionísio Effect: Perfect Quantum Coherence Stability Dataset

    • zenodo.org
    zip
    Updated Jun 15, 2025
    Cite
    Andre Luis Tomaz Dionisio (2025). The Dionísio Effect: Perfect Quantum Coherence Stability Dataset [Dataset]. http://doi.org/10.5281/zenodo.15668069
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 15, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andre Luis Tomaz Dionisio
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description
    Dataset containing 160 quantum computing experiments performed on IBM Quantum hardware
    demonstrating unprecedented perfect coherence stability. All experiments yielded
    identical suppression factors (64.0 ± 0.0) across 1,310,720 individual quantum
    measurements. This work was completed using only IBM's free quantum computing tier
    (10 minutes/month), proving that groundbreaking quantum research is accessible to all.
    Related to Physical Review Letters submissions LF19916 and LF20020.
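
The two counts in the description can be related by simple arithmetic. Assuming the measurements are split evenly across the experiments (the description does not say so), each experiment would account for the number of measurements computed below.

```python
experiments = 160
total_measurements = 1_310_720

# Even split is an assumption; divmod also reports any leftover measurements.
shots_per_experiment, remainder = divmod(total_measurements, experiments)
```

Under that assumption the split is exact, at 8192 (2 to the power 13) measurements per experiment.
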
  13. Detailed Data for Cluster Analysis in IBM SPSS Statistics

    • figshare.com
    xlsx
    Updated Sep 22, 2024
    Cite
    Vitaliy Kolomiets (2024). Detailed Data for Cluster Analysis in IBM SPSS Statistics [Dataset]. http://doi.org/10.6084/m9.figshare.27083131.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Sep 22, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Vitaliy Kolomiets
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    ASSESSMENT OF THE ECONOMIC EFFICIENCY OF FEDERAL FMCG RETAIL CHAINS IN RUSSIA: A CLUSTER APPROACH

  14. Multinomial logistic regression parameter estimates for the Allergy Model.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated May 31, 2023
    Cite
    Sydney Banton; Andrew Baynham; Júlia G. Pezzali; Michael von Massow; Anna K. Shoveller (2023). Multinomial logistic regression parameter estimates for the Allergy Model. [Dataset]. http://doi.org/10.1371/journal.pone.0250806.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Sydney Banton; Andrew Baynham; Júlia G. Pezzali; Michael von Massow; Anna K. Shoveller
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Multinomial logistic regression parameter estimates for the Allergy Model.

  15. Brainport, Automated valet parking, parking spot detected by drone

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    • +1 more
    Updated Jan 24, 2020
    Cite
    DLR (Deutsches Zentrum für Luft- und Raumfahrt); VICOMTECH; NEVS (National Electric Vehicle Sweden); TNO (2020). Brainport, Automated valet parking, parking spot detected by drone [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_3607937
    Explore at:
    Dataset updated
    Jan 24, 2020
    Authors
    DLR (Deutsches Zentrum für Luft- und Raumfahrt); VICOMTECH; NEVS (National Electric Vehicle Sweden); TNO
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Scenario description:

    The drone receives an AVP command message (message type AutoPilot.DroneAVPCommand) from the PMS via the IBM IoT Platform. The command message contains the instruction about the selected parking spots to be checked. The drone takes off, flies to the corresponding parking spots, detects the occupancy (FREE or OCCUPIED) of each parking spot, and publishes a message of type AutoPilot.ParkingSpotDetection to the PMS via the IBM Watson IoT Platform. It then returns to the landing position and lands. During the flight, the drone continuously sends messages about its current position and some status information, as messages of type AutoPilot.PositionEstimate, to the PMS via the IoT Platform.

    Session description:

    Selection of one free parking spot to be checked (see the command message content); the drone detects the parking spot and publishes the occupancy information to the PMS for parking management purposes.

    Datasets descriptions:

    AUTOPILOT_BrainPort_AutomatedValetParking_DriverVehicleInteraction: Data extracted from the CAN of the vehicle

    Dataset Description This dataset contains e.g. throttlestatus, clutchstatus, brakestatus, brakeforce, wipersstatus, steeringwheel for the vehicle

    AUTOPILOT_BrainPort_AutomatedValetParking_DroneAvpCommand: Data sent from drone

    Dataset Description This dataset contains route information for a vehicle to a designated parking spot

    AUTOPILOT_BrainPort_AutomatedValetParking_EnvironmentSensorsAbsolute: Data extracted from the vehicle environment sensors

    Dataset Description This dataset contains information about detected object, with absolute coordinates

    AUTOPILOT_BrainPort_AutomatedValetParking_EnvironmentSensorsRelative: Data extracted from the vehicle environment sensors

    Dataset Description This dataset contains information about detected object, with relative coordinates

    AUTOPILOT_BrainPort_AutomatedValetParking_IotVehicleMessage: Data sent between all devices, vehicles and services

    Dataset Description Each sensor data submission is a Message. A Message has an Envelope, a Path, and optionally (but likely) Path Events and optionally Path Media. The envelope bears fundamental information about the individual sender (the vehicle) but not to a level that owner of the vehicle can be identified or different messages can be identified that originate from a single vehicle.
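The message layout described above (an Envelope, a Path, and optional Path Events and Path Media) could be modelled roughly as follows. Every field name inside the classes is hypothetical, since the description names only the four parts and not their contents.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Envelope:
    # Hypothetical fields: sender-level information that does not identify the owner.
    message_id: str
    sent_at_ms: int


@dataclass
class PathPoint:
    # Hypothetical fields for one point along the Path.
    latitude: float
    longitude: float
    timestamp_ms: int


@dataclass
class Message:
    envelope: Envelope                        # required
    path: List[PathPoint]                     # required
    path_events: Optional[List[dict]] = None  # optional (but likely)
    path_media: Optional[List[dict]] = None   # optional


msg = Message(envelope=Envelope("m-1", 0), path=[PathPoint(51.4, 5.5, 0)])
```

Keeping the two optional parts as `None` by default mirrors the description, in which only the Envelope and the Path are always present.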

    AUTOPILOT_BrainPort_AutomatedValetParking_ParkingSpotDetection: Data sent from drone to parkingService

    Dataset Description This dataset contains information about detected parking spots

    AUTOPILOT_BrainPort_AutomatedValetParking_PositioningSystem: Data from GPS on the vehicle

    Dataset Description This dataset contains speed, longitude, latitude, heading from the GPS

    AUTOPILOT_BrainPort_AutomatedValetParking_PositioningSystemResampled: Data from GPS on the vehicle

    Dataset Description This dataset contains speed, longitude, latitude, heading from the GPS, resampled to 100 milliseconds

    AUTOPILOT_BrainPort_AutomatedValetParking_Vehicle: Data from the CAN and sensors about the state of the vehicle

    Dataset Description This dataset contains, among others, temperature and battery state of the vehicles

    AUTOPILOT_BrainPort_AutomatedValetParking_VehicleAvpCommand: Data sent from ParkingService to vehicle

    Dataset Description This dataset contains route to parkingspot, and some other environmental information

    AUTOPILOT_BrainPort_AutomatedValetParking_VehicleAvpStatus: Data sent from vehicle to ParkingService

    Dataset Description This dataset contains information about the current status and parkingstatus of the vehicle

    AUTOPILOT_BrainPort_AutomatedValetParking_VehicleDynamics: Data from the CAN and sensors about the state of the vehicle

    Dataset Description This dataset contains, among others, accelerations and speed limit of the vehicle, as observed from the CAN and the external sensors

  16. telco-customer-churn

    • huggingface.co
    Updated Feb 18, 2025
    + more versions
    Cite
    aai510-group1 (2025). telco-customer-churn [Dataset]. https://huggingface.co/datasets/aai510-group1/telco-customer-churn
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 18, 2025
    Dataset authored and provided by
    aai510-group1
    Description

    Dataset Card for Telco Customer Churn

    This dataset contains information about customers of a fictional telecommunications company, including demographic information, services subscribed to, location details, and churn behavior. This merged dataset combines the information from the original Telco Customer Churn dataset with additional details.

    Dataset Details

    Dataset Description

    This merged Telco Customer Churn dataset provides a comprehensive view of customer… See the full description on the dataset page: https://huggingface.co/datasets/aai510-group1/telco-customer-churn.

  17. Credit Card Fraud Dataset

    • kaggle.com
    zip
    Updated Jul 10, 2024
    Cite
    SAYIKANMI TITILAYO MARY (2024). Credit Card Fraud Dataset [Dataset]. https://www.kaggle.com/datasets/sayikanmititilayo/credit-card-fraud-dataset/code
    Explore at:
    Available download formats: zip (21958791 bytes)
    Dataset updated
    Jul 10, 2024
    Authors
    SAYIKANMI TITILAYO MARY
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This data was obtained through transformation and data engineering. The IBM source has 24,386,900 rows and 15 columns, but this process produced a dataset with over 300,000 rows and 28 columns. The remaining portion of the dataset is used with error-free datasets.

  18. DVSGesture128

    • kaggle.com
    zip
    Updated Nov 25, 2023
    Cite
    xingfenyizhen (2023). DVSGesture128 [Dataset]. https://www.kaggle.com/datasets/xingfenyizhen/dvsgesture128
    Explore at:
    zip (2942849907 bytes)
    Available download formats
    Dataset updated
    Nov 25, 2023
    Authors
    xingfenyizhen
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by xingfenyizhen

    Released under Apache 2.0

    Contents

  19. Supply Chain DataSet

    • kaggle.com
    zip
    Updated Jun 1, 2023
    Cite
    Amir Motefaker (2023). Supply Chain DataSet [Dataset]. https://www.kaggle.com/datasets/amirmotefaker/supply-chain-dataset
    Explore at:
    zip (9340 bytes)
    Available download formats
    Dataset updated
    Jun 1, 2023
    Authors
    Amir Motefaker
    Description

    Supply chain analytics is a valuable part of data-driven decision-making in various industries such as manufacturing, retail, healthcare, and logistics. It is the process of collecting, analyzing and interpreting data related to the movement of products and services from suppliers to customers.

  20. Synthetic Financial Datasets For Fraud Detection

    • kaggle.com
    zip
    Updated Apr 3, 2017
    Cite
    Edgar Lopez-Rojas (2017). Synthetic Financial Datasets For Fraud Detection [Dataset]. https://www.kaggle.com/datasets/ealaxi/paysim1
    Explore at:
    zip (186385561 bytes)
    Available download formats
    Dataset updated
    Apr 3, 2017
    Authors
    Edgar Lopez-Rojas
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Context

    There is a lack of publicly available datasets on financial services, especially in the emerging domain of mobile money transactions. Financial datasets are important to many researchers, and in particular to us, as we perform research in the domain of fraud detection. Part of the problem is the intrinsically private nature of financial transactions, which means that no such datasets are publicly available.

    We present a synthetic dataset generated using the simulator called PaySim as an approach to such a problem. PaySim uses aggregated data from the private dataset to generate a synthetic dataset that resembles the normal operation of transactions and injects malicious behaviour to later evaluate the performance of fraud detection methods.

    Content

    PaySim simulates mobile money transactions based on a sample of real transactions extracted from one month of financial logs from a mobile money service implemented in an African country. The original logs were provided by a multinational company, which is the provider of the mobile financial service and currently operates it in more than 14 countries around the world.

    This synthetic dataset is scaled down to 1/4 of the original dataset and was created just for Kaggle.

    NOTE: Transactions which are detected as fraud are cancelled, so for fraud detection these columns (oldbalanceOrg, newbalanceOrig, oldbalanceDest, newbalanceDest) must not be used.
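
    The NOTE above can be applied mechanically before any modeling. The following is a minimal sketch, assuming each transaction is held as a plain dictionary keyed by column name; the helper name `drop_balance_columns` is hypothetical and not part of the dataset or of any library.

```python
# Balance columns that the NOTE says must not be used for fraud detection.
EXCLUDED_COLUMNS = {
    "oldbalanceOrg",
    "newbalanceOrig",
    "oldbalanceDest",
    "newbalanceDest",
}


def drop_balance_columns(record: dict) -> dict:
    """Return a copy of the record without the excluded balance columns."""
    return {key: value for key, value in record.items() if key not in EXCLUDED_COLUMNS}


# Values taken from the sample row shown in the Headers section.
example = {
    "step": 1,
    "type": "PAYMENT",
    "amount": 1060.31,
    "oldbalanceOrg": 1089.0,
    "newbalanceOrig": 28.69,
    "isFraud": 0,
}
features = drop_balance_columns(example)
```

    The original record is left untouched, so the balance columns remain available for inspection even though they are kept out of the feature set.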

    Headers

    This is a sample of 1 row with headers explanation:

    1,PAYMENT,1060.31,C429214117,1089.0,28.69,M1591654462,0.0,0.0,0,0

    step - maps a unit of time in the real world. In this case 1 step is 1 hour, for a total of 744 steps (a 31-day simulation, since 744 / 24 = 31).

    type - CASH-IN, CASH-OUT, DEBIT, PAYMENT and TRANSFER.

    amount - amount of the transaction in local currency.

    nameOrig - customer who started the transaction

    oldbalanceOrg - initial balance before the transaction

    newbalanceOrig - new balance after the transaction.

    nameDest - customer who is the recipient of the transaction

    oldbalanceDest - initial balance of the recipient before the transaction. Note that there is no information for customers whose names start with M (Merchants).

    newbalanceDest - new balance of the recipient after the transaction. Note that there is no information for customers whose names start with M (Merchants).

    isFraud - marks the transactions made by the fraudulent agents inside the simulation. In this specific dataset the fraudulent agents aim to profit by taking control of customers' accounts and trying to empty the funds by transferring them to another account and then cashing out of the system.

    isFlaggedFraud - The business model aims to control massive transfers from one account to another and flags illegal attempts. An illegal attempt in this dataset is an attempt to transfer more than 200,000 in a single transaction.
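
    The header descriptions above can be sketched in code by parsing the sample row. This is an illustration only: it assumes the column order matches the order in which the headers are listed, that steps are 1-based (the sample row has step 1), and that the flagging threshold reads as 200,000 in local currency. The function names are hypothetical.

```python
import csv
import io

# Column order assumed to follow the order of the header descriptions.
COLUMNS = [
    "step", "type", "amount", "nameOrig", "oldbalanceOrg", "newbalanceOrig",
    "nameDest", "oldbalanceDest", "newbalanceDest", "isFraud", "isFlaggedFraud",
]
INT_COLUMNS = ("step", "isFraud", "isFlaggedFraud")
FLOAT_COLUMNS = (
    "amount", "oldbalanceOrg", "newbalanceOrig", "oldbalanceDest", "newbalanceDest",
)

SAMPLE_ROW = "1,PAYMENT,1060.31,C429214117,1089.0,28.69,M1591654462,0.0,0.0,0,0"


def parse_row(line: str) -> dict:
    """Parse one CSV line into a dictionary with typed values."""
    values = next(csv.reader(io.StringIO(line)))
    row = dict(zip(COLUMNS, values))
    for name in INT_COLUMNS:
        row[name] = int(row[name])
    for name in FLOAT_COLUMNS:
        row[name] = float(row[name])
    return row


def step_to_day_hour(step: int) -> tuple:
    """Convert a 1-based hourly step into (day, hour of day); the 1-based start is an assumption."""
    return (step - 1) // 24 + 1, (step - 1) % 24


def is_merchant(name: str) -> bool:
    """Names starting with M denote merchants, per the header descriptions."""
    return name.startswith("M")


def exceeds_transfer_limit(row: dict, limit: float = 200_000.0) -> bool:
    """Check the described flagging rule: a single TRANSFER above the limit."""
    return row["type"] == "TRANSFER" and row["amount"] > limit


row = parse_row(SAMPLE_ROW)
```

    For the sample row, `is_merchant(row["nameDest"])` is true, which is consistent with its destination balances being 0.0, and `step_to_day_hour(744)` places the last step on day 31.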

    Past Research

    There are 5 similar files that contain the runs of 5 different scenarios. These files are explained in more detail in chapter 7 of my PhD thesis (available at http://urn.kb.se/resolve?urn=urn:nbn:se:bth-12932).

    We ran PaySim several times using random seeds for 744 steps, representing each hour of one month of real time, which matches the original logs. Each run took around 45 minutes on an Intel i7 processor with 16 GB of RAM. The final result of a run contains approximately 24 million financial records divided into the 5 transaction types: CASH-IN, CASH-OUT, DEBIT, PAYMENT and TRANSFER.

    Acknowledgements

    This work is part of the research project "Scalable resource-efficient systems for big data analytics" funded by the Knowledge Foundation (grant: 20140032) in Sweden.

    Please refer to this dataset using the following citations:

    PaySim first paper of the simulator:

    E. A. Lopez-Rojas, A. Elmir, and S. Axelsson. "PaySim: A financial mobile money simulator for fraud detection". In: The 28th European Modeling and Simulation Symposium-EMSS, Larnaca, Cyprus. 2016.

Mursaleen Ameer (2024). IBM🎗️ | Stock Prices Dataset📊 [Dataset]. https://www.kaggle.com/datasets/innocentmfa/ibm-stock-prices-dataset

IBM🎗️ | Stock Prices Dataset📊

IBM (International Business Machines): Explore the Prices

Explore at:
zip (27608 bytes)
Available download formats
Dataset updated
May 8, 2024
Authors
Mursaleen Ameer
License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

Description

Description:

This dataset contains historical stock price data for International Business Machines Corporation (IBM) from [Jan/01/2020] to [May/01/2024]. The dataset includes daily closing prices, adjusted closing prices, and other relevant information.

Features:

  • Date:
  • Open:
  • High:
  • Low:
  • Close:
  • Adj Close:
  • Volume:

Use Cases:

  • Predicting stock prices
  • Building stock forecasting models
  • Analyzing stock market trends
  • Backtesting investment strategies
  • Comparing machine learning models for stock prediction

This dataset is perfect for data scientists, analysts, and students looking to practice their skills in:

  • Time series analysis
  • Stock market analysis
  • Predictive modeling
  • Machine learning

Get started: Download the dataset and start exploring!
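
As a starting point for the time series use cases listed above, here is a minimal sketch that computes simple daily returns from the Close column. The column names follow the Features list; the rows are made-up placeholder values (not real IBM prices), the ISO date format is an assumption about the file, and `daily_returns` is a hypothetical helper.

```python
import csv
import io

# Placeholder rows, NOT real IBM prices; they only mimic the listed columns.
SAMPLE_CSV = """Date,Open,High,Low,Close,Adj Close,Volume
2020-01-02,100.0,101.0,99.0,100.0,100.0,1000
2020-01-03,100.0,103.0,100.0,102.0,102.0,1200
2020-01-06,102.0,102.5,100.5,101.0,101.0,900
"""


def daily_returns(csv_text: str, price_column: str = "Close") -> list:
    """Return (date, simple return) pairs, one per row after the first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    prices = [float(row[price_column]) for row in rows]
    return [
        (row["Date"], current / previous - 1.0)
        for row, previous, current in zip(rows[1:], prices, prices[1:])
    ]


returns = daily_returns(SAMPLE_CSV)
```

To use the downloaded file instead, read its text from disk and pass it to `daily_returns`; passing `price_column="Adj Close"` switches to adjusted prices.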
