53 datasets found
  1. Car Insurance

    • kaggle.com
    Updated Nov 15, 2022
    Cite
    The Devastator (2022). Car Insurance [Dataset]. https://www.kaggle.com/datasets/thedevastator/insurance-companies-secret-sauce-finally-exposed
    Dataset updated
    Nov 15, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Allstate’s Car Insurance

    The Dataset From "Suckers List: How Allstate’s Secret Auto Insurance Algorithm"

    About this dataset

    This dataset contains insurance rate data from across the United States, providing insight into the premiums insurers charge, the underlying factors that affect those rates, and claims history analysis. It is designed to help researchers understand how the insurance industry works and how rates are calculated. It includes premiums, underlying factors, current, indicated, and selected premium prices, fixed expenses, and more.

    How to use the dataset

    Use this dataset to research insurance rates across the United States and to understand how those rates are determined. The data includes premiums, underlying factors, claims history analysis, and more.

    Research Ideas

    • Understand the inner workings of the insurance industry, and how rates are calculated
    • Help insurance companies better understand their own pricing models
    • Help policyholders understand how their premiums are calculated

    Acknowledgements

    I would like to acknowledge The Markup for providing the data for this dataset.

    License

    License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No copyright: you can copy, modify, distribute, and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: cgr-definitions-table.csv

    | Column name | Description |
    |:------------|:------------|
    | cgr | Combined grade rating. (Numeric) |
    | aa | Average annual premium. (Numeric) |
    | bb | Base premium. (Numeric) |
    | cc | Cost of capital. (Numeric) |
    | va | Value of assets. (Numeric) |
    | dd | Direct written premium. (Numeric) |
    | hh | Homeownership. (Categorical) |

    File: cgr-premiums-table.csv

    | Column name | Description |
    |:------------|:------------|
    | territory | The territory in which the person lives. (String) |
    | gender | The person's gender. (String) |
    | birthdate | The person's birthdate. (Date) |
    | ypc | The person's years of prior coverage. (Integer) |
    | current_premium | The person's current premium. (Float) |
    | indicated_premium | The person's indicated premium. (Float) |
    | selected_premium | The person's selected premium. (Float) |
    | underlying_premium | The person's underlying premium. (Float) |
    | fixed_expenses | The person's fixed expenses. (Float) |
    | underlying_total_premium | The person's underlying total premium. (Float) |
    | cgr_factor | The person's CGR factor. (Float) |

    File: territory-definitions-table.csv

    | Column name | Description |
    |:------------|:------------|
    | territory | The territory in which the person lives. (String) |
    | county | The county in which the person lives. (String) |
    | county_code | The county code for the county in which the person lives. (String) |
    | zipcode | The zip code for the county in which the person lives. (String) |
    | town | The town in which the person lives. (String) |
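
Since cgr-premiums-table.csv and territory-definitions-table.csv both carry a `territory` column, one plausible way to combine them is a lookup join on that key. The sketch below is only an illustration: the sample rows, the helper names `load_rows` and `join_on_territory`, and the derived `selected_change` field are all hypothetical, and only the column names come from the listing above.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the listing.
PREMIUMS_CSV = """territory,current_premium,indicated_premium,selected_premium
T01,1000.0,1200.0,1050.0
T02,800.0,760.0,790.0
"""
TERRITORIES_CSV = """territory,county,zipcode,town
T01,Example County,00001,Exampletown
T02,Sample County,00002,Sampleville
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_territory(premiums, territories):
    """Attach county and town to each premium row via the shared territory key."""
    # If a territory appears on several rows, the last one wins (an assumption).
    by_territory = {row["territory"]: row for row in territories}
    joined = []
    for row in premiums:
        place = by_territory.get(row["territory"], {})
        current = float(row["current_premium"])
        selected = float(row["selected_premium"])
        joined.append({
            "territory": row["territory"],
            "county": place.get("county"),
            "town": place.get("town"),
            # Relative change from the current to the selected premium.
            "selected_change": (selected - current) / current,
        })
    return joined


joined = join_on_territory(load_rows(PREMIUMS_CSV), load_rows(TERRITORIES_CSV))
for row in joined:
    print(row)
```

With the real files, a territory may map to several zip codes or towns, so the one-row-per-territory lookup used here would need revisiting.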


  2. Dataset of an actual motor vehicle insurance portfolio

    • data.mendeley.com
    • openicpsr.org
    Updated Jul 30, 2024
    Cite
    Josep Lledó (2024). Dataset of an actual motor vehicle insurance portfolio [Dataset]. http://doi.org/10.17632/5cxyb5fp4f.2
    Dataset updated
    Jul 30, 2024
    Authors
    Josep Lledó
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data is formatted as a spreadsheet covering the primary activity of a non-life motor insurance portfolio over three full years (November 2015 to December 2018). The dataset comprises 105,555 rows and 30 columns. Each row represents a policy transaction, and each column a distinct variable.

  3. Insurance Premium and Claims Data by Class of Insurance, Alberta, 2013

    • open.canada.ca
    • data.wu.ac.at
    csv, html, xlsx
    Updated Jul 24, 2024
    Cite
    Government of Alberta (2024). Insurance Premium and Claims Data by Class of Insurance, Alberta, 2013 [Dataset]. https://open.canada.ca/data/en/dataset/34eb85a2-1558-46b7-adca-a40c446cb05f
    Available download formats: xlsx, csv, html
    Dataset updated
    Jul 24, 2024
    Dataset provided by
    Government of Alberta
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Time period covered
    Jan 1, 2013 - Dec 31, 2013
    Area covered
    Alberta
    Description

    Data provided by insurers on the premiums written and claims incurred for the 2013 fiscal year, based on reporting on the consolidated pages of the P&C-1 or Life-1 annual returns. This data is also reported in the Superintendent of Insurance’s Annual Report.

  4. Comprehensive car claim frequency for physical damage in the U.S. 2007-2023

    • statista.com
    Updated Jul 9, 2025
    Cite
    Statista (2025). Comprehensive car claim frequency for physical damage in the U.S. 2007-2023 [Dataset]. https://www.statista.com/statistics/830114/comprehensive-car-claim-frequency-physical-damage-usa/
    Dataset updated
    Jul 9, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    The frequency of private passenger comprehensive auto insurance claims for physical damage in the United States rose to **** per 100 car years in 2023, compared to *** in 2020. This was the highest frequency recorded over the past 15 years.

  5. Customer Data

    • kaggle.com
    Updated Oct 5, 2021
    Cite
    Racholsan (2021). Customer Data [Dataset]. https://www.kaggle.com/racholsan/customer-data/discussion
    Dataset updated
    Oct 5, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Racholsan
    Description

    Context

    Your client is a car insurance company. They want to price their car insurance competitively, which means having a good model for customers at risk of getting into accidents.

    Content

    Each row corresponds to a customer; the outcome column records whether the customer made a claim in the previous year. The client has informed you that the other columns should be self-explanatory.

    Inspiration

    The client is interested to know if the customer data can be used to predict the likelihood that a claim is made in the next year. Your task is to investigate this and make a recommendation. You should complete the following tasks:

    1. Build a proof-of-concept model to predict the outcome column from the customer data, including any necessary data processing.
    2. Make the model interpretable: the client is particularly interested in understanding which features are most important to the model's decisions.
  6. Average annual minimum and full car insurance premiums in the U.S. 2024, by...

    • statista.com
    Updated Jul 15, 2025
    Cite
    Statista (2025). Average annual minimum and full car insurance premiums in the U.S. 2024, by age [Dataset]. https://www.statista.com/statistics/675367/annual-auto-insurance-premiums-usa-by-state/
    Dataset updated
    Jul 15, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2024
    Area covered
    United States
    Description

    Louisiana had the most expensive annual car insurance premiums at ***** U.S. dollars for full coverage. Alaska ranked in first place, having the highest annual cost for minimum car insurance coverage at *** U.S. dollars.

    Why it varies state by state

    The huge variance in premiums between states is due to the difference in state laws, the percentage of uninsured drivers in the state, the frequency of natural disasters, and claim rates. For instance, Michigan has a no-fault car insurance system, which means that claims are more common. This drives up the cost of insurance for all drivers because insurers need to pay out more money in claims.

    Male drivers also pay more

    There is also a difference between premiums among different age groups. In 2025, 25-year-old male drivers paid more per month than 25-year-old female drivers did. This is due to the higher incidence of accidents among young male drivers. This means that young drivers in states that already have higher premiums must pay a lot for car insurance.

  7. Event logs for process mining

    • kaggle.com
    Updated Apr 11, 2023
    Cite
    Alberto (2023). Event logs for process mining [Dataset]. https://www.kaggle.com/datasets/carlosalvite/car-insurance-claims-event-log-for-process-mining/code
    Dataset updated
    Apr 11, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Alberto
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This event log has been artificially generated and curated to provide a comprehensive view of car insurance claims, allowing users to discover and identify bottlenecks, automation opportunities, conformance issues, reworks, and potential fraudulent cases using any process mining software.

    You can find more event logs here: https://processminingdata.com/JfVPOR

    Standard Process flow: “First Notification of Loss (FNOL)” -> “Assign Claim” -> “Claim Decision” -> “Set Reserve” -> “Payment Sent” -> “Close Claim”

    Attributes:

    • case ID
    • activity name
    • timestamp
    • claimant name
    • agent name
    • adjuster name
    • claim amount
    • claimant age
    • type of policy
    • car make
    • car model
    • car year
    • date and time of the accident
    • type of accident
    • user type
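
The standard process flow and the case ID, activity name, and timestamp attributes suggest a simple conformance check: rebuild each case's trace and compare it with the standard flow. The sketch below assumes events arrive as `(case_id, activity, timestamp)` tuples with ISO 8601 timestamp strings; the helper names and the sample events are hypothetical, and dedicated process mining software would do considerably more.

```python
from collections import defaultdict

# Standard flow as listed in the dataset description.
STANDARD_FLOW = [
    "First Notification of Loss (FNOL)",
    "Assign Claim",
    "Claim Decision",
    "Set Reserve",
    "Payment Sent",
    "Close Claim",
]


def traces_by_case(events):
    """Group events into per-case activity lists ordered by timestamp."""
    grouped = defaultdict(list)
    for case_id, activity, timestamp in events:
        grouped[case_id].append((timestamp, activity))
    # ISO 8601 strings sort chronologically when compared as text.
    return {case: [a for _, a in sorted(items)] for case, items in grouped.items()}


def deviating_cases(events):
    """Return the case IDs whose trace differs from the standard flow."""
    traces = traces_by_case(events)
    return sorted(case for case, trace in traces.items() if trace != STANDARD_FLOW)


# Hypothetical events: C1 follows the flow, C2 repeats "Claim Decision" (a rework).
c2_trace = STANDARD_FLOW[:3] + ["Claim Decision"] + STANDARD_FLOW[3:]
events = [("C1", act, f"2021-01-0{i + 1}T09:00:00") for i, act in enumerate(STANDARD_FLOW)]
events += [("C2", act, f"2021-02-0{i + 1}T09:00:00") for i, act in enumerate(c2_trace)]

print(deviating_cases(events))
```

An exact-match comparison flags every deviation equally; separating reworks from skipped steps would need a finer-grained comparison.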

    Total number of claims: 30,000

    Dates: Claims belong to years 2020, 2021, and 2022.

    Disclaimer: Personal names are fake.

  8. Car Damage Images Dataset

    • universe.roboflow.com
    zip
    Updated Mar 29, 2023
    Cite
    Car Damage (2023). Car Damage Images Dataset [Dataset]. https://universe.roboflow.com/car-damage-kadad/car-damage-images/dataset/1
    Available download formats: zip
    Dataset updated
    Mar 29, 2023
    Dataset authored and provided by
    Car Damage
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Car Damage Bounding Boxes
    Description

    Here are a few use cases for this project:

    1. Insurance Assessment: This model can be used by insurance companies to automate the process of assessing car damage in insurance claims. By simply using photographs of the damaged vehicle, the model can identify the type and extent of damage, making the claim processing faster and more objective.

    2. Automotive Repair Estimates: Car repair shops can use this model to get an approximate idea of the damage and therefore provide a more accurate cost estimate for their clients. It can also assist in identifying nonobvious damage.

    3. Used Car Market Evaluation: This model can be used in used car platforms to evaluate the current condition of the cars listed for sale. By identifying existing damage, buyers can make more informed decisions and sellers can price their vehicles more accurately.

    4. Law Enforcement and Road Safety: Traffic police and accident investigation teams can use this model to evaluate the types of damage after a road accident. It can assist in reconstructing the accident scenario and provide insights during investigations.

    5. Auto-manufacturing Quality Control: Automobile manufacturers can use this model in their factories to automatically inspect new cars for any damage or misaligned/missing parts before they are dispatched from the factory, ensuring quality control.

  9. HackerEarth's Fast, Furious and Insured Challenge

    • kaggle.com
    Updated May 30, 2021
    Cite
    Hotson Honet (2021). HackerEarth's Fast, Furious and Insured Challenge [Dataset]. https://www.kaggle.com/datasets/hotsonhonet/hackerearths-fast-furious-and-insured-challenge/discussion
    Dataset updated
    May 30, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Hotson Honet
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Problem statement

    Vehicle insurance is insurance for cars, trucks, motorcycles, and other road vehicles. Its main purpose is to provide financial protection against the following:

    • Physical damage or bodily injury caused by traffic collisions
    • Liability that could arise from incidents in a vehicle

    Vehicle insurance may additionally offer financial protection against theft of the vehicle and against damage to the vehicle sustained because of events other than traffic collisions such as keying, weather, or natural disasters, and damage sustained by colliding with stationary objects.

    About the data

    You have been hired as a Machine Learning expert by a leading car insurance company. Your task is to predict the insurance claim of the cars that are provided in the dataset.

    The dataset folder contains the following:

    • The trainImages folder: Contains 1399 training images
    • The testImages folder: Contains 600 testing images
    • train.csv: Contains 1399 x 8 data points
    • test.csv: Contains 600 x 6 data points
    • sample_submission.csv: Contains 5 x 3 data points

    | Column name | Description |
    |:------------|:------------|
    | Image Path | Represents the name of the image |
    | Insurance_company | Represents masked values of some insurance companies |
    | Cost of Vehicles | Represents the cost of a vehicle present in the image |
    | Min Coverage | Represents the minimum coverage provided by an insurance company |
    | Expiry Date | Represents the expiry date of the insurance |
    | Max Coverage | Represents the maximum coverage provided by an insurance company |
    | Condition | Represents whether a vehicle is damaged |
    | Amount | Represents the insurance amount of a vehicle |

    Result submission guidelines

    • The index is Image_path.
    • The targets are the Condition and Amount columns.
    • The submission file must be submitted in .csv format only with the name submission.
    • The size of the submission file must be 600x3.

    Reasons for failed submission:

    • Incorrect index values as per the test file
    • Incorrect names of columns as provided in the sample_submission.csv
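
The submission rules above (a 600x3 CSV indexed by Image_path with Condition and Amount targets) can be checked before upload. The sketch below is a hypothetical pre-flight check, not the challenge's own validator: the exact header spelling and order, and the requirement that rows follow the test-file order, are assumptions.

```python
import csv
import io

# Assumed header; the authoritative names are those in sample_submission.csv.
EXPECTED_COLUMNS = ["Image_path", "Condition", "Amount"]
EXPECTED_ROWS = 600


def check_submission(csv_text, test_image_paths):
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    rows = list(reader)
    if header != EXPECTED_COLUMNS:
        problems.append(f"unexpected columns: {header}")
    if len(rows) != EXPECTED_ROWS:
        problems.append(f"expected {EXPECTED_ROWS} rows, found {len(rows)}")
    # Assumption: index values must match the test file in the same order.
    if [row[0] for row in rows if row] != list(test_image_paths):
        problems.append("index values do not match the test file")
    return problems


# Hypothetical image names standing in for the Image_path values of test.csv.
paths = [f"img_{i}.jpg" for i in range(600)]
header_line = ",".join(EXPECTED_COLUMNS)
good = "\n".join([header_line] + [f"{p},0,0.0" for p in paths])
bad = "\n".join([header_line] + [f"{p},0,0.0" for p in paths[:599]])

print(check_submission(good, paths))
print(check_submission(bad, paths))
```

The truncated file fails twice, once on row count and once on the index comparison, which mirrors the failure reasons listed above.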

    Inspiration

    The dataset consists of parameters such as the images of damaged cars, the cost price of the cars and their insurance claim, and the like. The benefits of practicing this problem by using Machine Learning techniques are as follows:

    • This challenge encourages you to apply your Machine Learning skills to build a model that predicts the insurance claim of the cars that are provided in the dataset.
    • This challenge will help you enhance your knowledge of regression. Regression is one of the basic building blocks of Machine Learning. We challenge you to build a model that successfully predicts the insurance claim of the cars that are provided in the dataset.
  10. Automobile Insurance Company Complaint Rankings: Beginning 2009

    • catalog.data.gov
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    Updated Mar 14, 2025
    Cite
    data.ny.gov (2025). Automobile Insurance Company Complaint Rankings: Beginning 2009 [Dataset]. https://catalog.data.gov/dataset/automobile-insurance-company-complaint-rankings-beginning-2009
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    data.ny.gov
    Description

    The DFS ranks automobile insurance companies doing business in New York State based on the number of consumer complaints upheld against them as a percentage of their total business over a two-year period. Complaints typically involve issues like delays in the payment of no-fault claims and nonrenewal of policies. Insurers with the fewest upheld complaints per million dollars of premiums appear at the top of the list. Those with the highest complaint ratios are ranked at the bottom.
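
The ranking rule described above, upheld complaints per million dollars of premiums with the lowest ratio first, can be sketched as follows. The insurer names and figures are invented for illustration, and the function names are hypothetical; the DFS's actual methodology may include details not stated in this description.

```python
def complaint_ratio(upheld_complaints, premiums_usd):
    """Upheld complaints per million dollars of premiums."""
    return upheld_complaints / (premiums_usd / 1_000_000)


def rank_insurers(records):
    """Sort (name, upheld_complaints, premiums_usd) records, lowest ratio first."""
    return sorted(records, key=lambda r: complaint_ratio(r[1], r[2]))


# Hypothetical two-year totals, not taken from the dataset.
insurers = [
    ("Insurer A", 12, 400_000_000),
    ("Insurer B", 3, 50_000_000),
    ("Insurer C", 1, 100_000_000),
]

ranking = [name for name, _, _ in rank_insurers(insurers)]
print(ranking)
```

Normalizing by premiums keeps a large insurer with many complaints in absolute terms from ranking below a small insurer with a worse complaint rate.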

  11. NCID Private Motor Insurance Data- Ultimate Claims - Dataset - CBI Opendata

    • poc.staging.derilinx.com
    Updated Apr 29, 2024
    Cite
    (2024). NCID Private Motor Insurance Data- Ultimate Claims - Dataset - CBI Opendata [Dataset]. https://poc.staging.derilinx.com/dataset/ncid-part-2-claims-ultimate-claims
    Dataset updated
    Apr 29, 2024
    Description

    This file contains ultimate claims data taken from the private motor National Claims Information Database (NCID). The claims are grouped by accident year, the year in which the accident occurred. Not all claims are paid in the lifetime of the policy. Some claims, injury claims in particular, can take many years to be settled and fully paid. Insurers estimate the cost/number of claims expected for a particular accident year, and this is known as the ultimate cost/number of claims. The ultimate cost/number of claims is recalculated regularly, based on the most up-to-date information available. The more time that has passed since the accident year, the more certain the ultimate cost of claims becomes. To view the detailed NCID report, refer to the Central Bank publication link in the Landing Page section under Additional Info.

  12. NCID Private Motor Insurance Data - Premiums

    • data.gov.ie
    • opendata.centralbank.ie
    Updated Jul 10, 2024
    Cite
    data.gov.ie (2024). NCID Private Motor Insurance Data - Premiums [Dataset]. https://data.gov.ie/dataset/ncid-private-motor-data
    Dataset updated
    Jul 10, 2024
    Dataset provided by
    data.gov.ie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file contains premium data taken from the private motor National Claims Information Database (NCID). Premiums and policy numbers are presented on a “written” and “earned” policy basis and further broken down by different levels of cover - comprehensive and third party. To view the detailed NCID report refer to the Central Bank publication link under Additional Info.

  13. Premiums And Claims Of General Insurance Funds, Annual

    • data.gov.sg
    Updated Aug 13, 2025
    Cite
    Singapore Department of Statistics (2025). Premiums And Claims Of General Insurance Funds, Annual [Dataset]. https://data.gov.sg/datasets/d_abcfd12381e7f8d175280d999cdb2dea/view
    Dataset updated
    Aug 13, 2025
    Dataset authored and provided by
    Singapore Department of Statistics
    License

    https://data.gov.sg/open-data-licence

    Time period covered
    Jan 1965 - Dec 2023
    Description

    Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_abcfd12381e7f8d175280d999cdb2dea/view

  14. fremtpl2

    • huggingface.co
    Updated Nov 18, 2024
    Cite
    Matthew Bilton (2024). fremtpl2 [Dataset]. https://huggingface.co/datasets/mabilton/fremtpl2
    Dataset updated
    Nov 18, 2024
    Authors
    Matthew Bilton
    License

    https://choosealicense.com/licenses/gpl-2.0/

    Description

    freMTPL2 Dataset

    This dataset is a mirror of the freMTPL2 frequency and severity datasets, originally published by Arthur Charpentier to accompany his textbook Computational Actuarial Science with R. The freMTPL2 dataset contains data on Third-Party Liability (TPL) Motor insurance policies issued in France, along with claims filed against those policies, observed over a duration of just over a year. These observations are organized into two separate CSV files:

    freMTPL2freq.csv: a… See the full description on the dataset page: https://huggingface.co/datasets/mabilton/fremtpl2.

  15. Average Car Insurance Premium...

    • compare.com
    Updated Sep 1, 2025
    Cite
    Compare.com (2025). Average Car Insurance Premium DynamicTable.dataset.source.petBreedStateAvgPrices [Dataset]. https://www.compare.com/pet-insurance/dogs/pit-bulls
    Dataset updated
    Sep 1, 2025
    Dataset provided by
    Compare.com (https://www.compare.com/)
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This table contains values from Compare.com's proprietary database of car insurance quotes about average car insurance costs.

  16. Vehicle Insurance Claim Fraud Detection

    • kaggle.com
    Updated Dec 20, 2021
    Cite
    Shivam Bansal (2021). Vehicle Insurance Claim Fraud Detection [Dataset]. https://www.kaggle.com/datasets/shivamb/vehicle-claim-fraud-detection/data
    Dataset updated
    Dec 20, 2021
    Dataset provided by
    Kaggle
    Authors
    Shivam Bansal
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Vehicle Insurance Fraud Detection

    Vehicle insurance fraud involves conspiring to make false or exaggerated claims involving property damage or personal injuries following an accident. Common examples include staged accidents, where fraudsters deliberately “arrange” for accidents to occur; phantom passengers, where people who were not even at the scene of the accident claim to have suffered grievous injury; and false personal injury claims, where injuries are grossly exaggerated.

    About this dataset

    This dataset contains vehicle data (attributes, model, accident details, etc.) along with policy details (policy type, tenure, etc.). The target, FraudFound_P, indicates whether a claim application is fraudulent.

  17. Switzerland Non Life Insurance: Claims Paid: Liability and Motor

    • ceicdata.com
    Updated Dec 15, 2024
    Cite
    CEICdata.com (2024). Switzerland Non Life Insurance: Claims Paid: Liability and Motor [Dataset]. https://www.ceicdata.com/en/switzerland/non-life-insurance-claims-paid/non-life-insurance-claims-paid-liability-and-motor
    Dataset updated
    Dec 15, 2024
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2005 - Dec 1, 2016
    Area covered
    Switzerland
    Variables measured
    Insurance Market
    Description

    Switzerland Non Life Insurance: Claims Paid: Liability and Motor data was reported at 4,676.000 CHF mn in 2016, a decrease from 4,802.000 CHF mn in 2015. The series is updated yearly, with a median of 4,628.000 CHF mn from Dec 2000 to 2016 across 17 observations. It reached an all-time high of 4,918.000 CHF mn in 2009 and a record low of 3,844.000 CHF mn in 2000. The series remains in active status in CEIC and is reported by the Swiss Financial Market Supervisory Authority. It is categorized under Global Database’s Switzerland – Table CH.RG011: Non Life Insurance: Claims Paid.

  18. Nodamage Dataset

    • universe.roboflow.com
    zip
    Updated Aug 5, 2022
    Cite
    car damagenodamage (2022). Nodamage Dataset [Dataset]. https://universe.roboflow.com/car-damagenodamage/damage-nodamage-9fttc/dataset/3
    Available download formats: zip
    Dataset updated
    Aug 5, 2022
    Dataset authored and provided by
    car damagenodamage
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    DamageNoDamageCar Polygons
    Description

    Here are a few use cases for this project:

    1. Insurance Claim Processing: The model can be used to expedite processing of insurance claims by quickly categorizing whether a vehicle has been damaged or not from uploaded incident photos. This can help insurance agents prioritize claims and perform investigations more efficiently.

    2. Vehicle Rental Services: This model can be useful for rental agencies to automatically validate the state of their vehicles when customers return them. It allows them to spot any new damage without the need for manual inspection.

    3. Online Marketplace Quality Control: Online platforms for buying/selling used cars can provide an additional layer of quality control before listings go live. Sellers can submit photos of their vehicles which are then analyzed using this model to verify the condition of the car.

    4. Traffic Management and Law Enforcement: The model can be used by traffic authorities or law enforcement agencies to automatically identify and classify damaged vehicles from CCTV or drone footage during accidents, which can assist in accident location, investigation and traffic management.

    5. Automated Driving Systems: In autonomous vehicles, this solution can be an integral part of the system to detect and avoid damaged cars on the road, contributing to safer driving conditions.

  19. Motor Vehicle Point & Insurance Reduction Program (PIRP) Participation: Five...

    • data.ny.gov
    • datasets.ai
    application/rdfxml +5
    Updated Sep 1, 2025
    Cite
    NYS DMV — Data Services (2025). Motor Vehicle Point & Insurance Reduction Program (PIRP) Participation: Five Year Window [Dataset]. https://data.ny.gov/Transportation/Motor-Vehicle-Point-Insurance-Reduction-Program-PI/u925-8y2g
    Available download formats: application/rssxml, xml, json, tsv, csv, application/rdfxml
    Dataset updated
    Sep 1, 2025
    Dataset provided by
    New York State Department of Motor Vehicles (http://www.dmv.ny.gov/)
    Authors
    NYS DMV — Data Services
    Description

    This data set measures and describes participation in PIRP. The researcher may ascertain how many motorists have completed the course and tabulate subsets by: year and month of course completion; motorist residency, age and sex; course provider and delivery method.

  20. Number of claims in one year.

    • plos.figshare.com
    xls
    Updated Dec 31, 2024
    Cite
    Gadir Alomair (2024). Number of claims in one year. [Dataset]. http://doi.org/10.1371/journal.pone.0314975.t001
    Available download formats: xls
    Dataset updated
    Dec 31, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Gadir Alomair
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Accurate forecasting of claim frequency in automobile insurance is essential for insurers to assess risks effectively and establish appropriate pricing policies. Traditional methods typically rely on a Poisson distribution for modeling claim counts; however, this approach can be inadequate due to frequent zero-claim periods, leading to zero inflation in the data. Zero inflation occurs when more zeros are observed than expected under standard Poisson or negative binomial (NB) models. While machine learning (ML) techniques have been explored for predictive analytics in other contexts, their application to zero-inflated insurance data remains limited. This study investigates the utility of ML in improving forecast accuracy under conditions of zero-inflation, a data characteristic common in automobile insurance. The research involved a comparative evaluation of several models, including Poisson, NB, zero-inflated Poisson (ZIP), hurdle Poisson, zero-inflated negative binomial (ZINB), hurdle negative binomial, random forest (RF), support vector machine (SVM), and artificial neural network (ANN) on an insurance dataset. The performance of these models was assessed using mean absolute error. The results reveal that the SVM model outperforms others in predictive accuracy, particularly in handling zero-inflation, followed by the ZIP and ZINB models. In contrast, the traditional Poisson and NB models showed lower predictive capabilities. By addressing the challenge of zero-inflation in automobile claim data, this study offers insights into improving the accuracy of claim frequency predictions. Although this study is based on a single dataset, the findings provide valuable perspectives on enhancing prediction accuracy and improving risk management practices in the insurance industry.
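
To make the notion of zero inflation concrete, the sketch below compares the observed share of zero counts with the share a plain Poisson model with the same mean would predict, and defines the zero-inflated Poisson (ZIP) probability mass function. This is a generic textbook illustration, not the study's code: the claim counts are invented, and the function names are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi the count is a structural zero."""
    base = (1 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def zero_excess(counts):
    """Observed share of zeros minus the share a Poisson with the same mean predicts."""
    mean = sum(counts) / len(counts)
    observed = counts.count(0) / len(counts)
    return observed - math.exp(-mean)


# Hypothetical yearly claim counts for 100 policies.
counts = [0] * 90 + [1] * 4 + [2] * 3 + [3] * 3

print(zero_excess(counts))
```

A clearly positive excess is the pattern the abstract calls zero inflation; the ZIP model absorbs it through the extra mass `pi` placed on zero.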

Cite
The Devastator (2022). Car Insurance [Dataset]. https://www.kaggle.com/datasets/thedevastator/insurance-companies-secret-sauce-finally-exposed

Car Insurance

The Dataset From "Suckers List: How Allstate’s Secret Auto Insurance Algorithm"

Dataset updated
Nov 15, 2022
Dataset provided by
Kaggle
Authors
The Devastator
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Allstate’s Car Insurance

About this dataset

This dataset contains insurance rate data from across the United States, providing insight into the premiums charged by insurers, the underlying factors that affect those rates, and claims history analysis. The data is designed to help researchers understand the inner workings of the insurance industry and how rates are calculated. It includes premiums, underlying factors, current premium prices, indicated premium prices, selected premium prices, fixed expenses, and more.

How to use the dataset

This dataset can be used to research insurance rates across the United States and to understand the inner workings of the insurance industry, including how rates are determined. The data includes information on premiums, underlying factors, claims history analysis, and more.

Research Ideas

  • Understand the inner workings of the insurance industry and how rates are calculated
  • Help insurance companies better understand their own pricing models
  • Help policyholders understand how their premiums are calculated

Acknowledgements

I would like to acknowledge The Markup for providing the data for this dataset.

License

License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No copyright: you can copy, modify, distribute, and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: cgr-definitions-table.csv

| Column name | Description                       |
|:------------|:----------------------------------|
| cgr         | Combined grade rating. (Numeric)  |
| aa          | Average annual premium. (Numeric) |
| bb          | Base premium. (Numeric)           |
| cc          | Cost of capital. (Numeric)        |
| va          | Value of assets. (Numeric)        |
| dd          | Direct written premium. (Numeric) |
| hh          | Homeownership. (Categorical)      |

File: cgr-premiums-table.csv

| Column name              | Description                                       |
|:-------------------------|:--------------------------------------------------|
| territory                | The territory in which the person lives. (String) |
| gender                   | The person's gender. (String)                     |
| birthdate                | The person's birthdate. (Date)                    |
| ypc                      | The person's years of prior coverage. (Integer)   |
| current_premium          | The person's current premium. (Float)             |
| indicated_premium        | The person's indicated premium. (Float)           |
| selected_premium         | The person's selected premium. (Float)            |
| underlying_premium       | The person's underlying premium. (Float)          |
| fixed_expenses           | The person's fixed expenses. (Float)              |
| underlying_total_premium | The person's underlying total premium. (Float)    |
| cgr_factor               | The person's CGR factor. (Float)                  |

File: territory-definitions-table.csv

| Column name | Description                                                        |
|:------------|:-------------------------------------------------------------------|
| territory   | The territory in which the person lives. (String)                  |
| county      | The county in which the person lives. (String)                     |
| county_code | The county code for the county in which the person lives. (String) |
| zipcode     | The zip code for the county in which the person lives. (String)    |
| town        | The town in which the person lives. (String)                       |
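Since cgr-premiums-table.csv and territory-definitions-table.csv share the `territory` column, the two files can presumably be joined on it. The following is a minimal sketch using only the Python standard library. The sample rows, the territory codes, and the derived `selected_change_pct` field are hypothetical; only the column names come from the listing above, and the real files may be laid out differently.

```python
import csv
import io

# Hypothetical sample rows standing in for the real CSV files;
# only the column names are taken from the column listing.
premiums_csv = io.StringIO(
    "territory,current_premium,indicated_premium,selected_premium\n"
    "T01,1000.0,1200.0,1050.0\n"
    "T02,800.0,700.0,796.0\n"
)
territories_csv = io.StringIO(
    "territory,county,town\n"
    "T01,County A,Town A\n"
    "T02,County B,Town B\n"
)

# Map each territory code to its county. If a territory appears on several
# rows (for example, one per town), this keeps the last county seen.
county_by_territory = {
    row["territory"]: row["county"] for row in csv.DictReader(territories_csv)
}

joined = []
for row in csv.DictReader(premiums_csv):
    current = float(row["current_premium"])
    selected = float(row["selected_premium"])
    joined.append(
        {
            "territory": row["territory"],
            "county": county_by_territory.get(row["territory"]),
            # Percentage change from the current to the selected premium.
            "selected_change_pct": round(100 * (selected - current) / current, 2),
        }
    )
```

To run this against the actual files, the `io.StringIO` objects would be replaced with files opened via `open(path, newline="")`, assuming the headers match the column names listed above.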

