82 datasets found

Predict accident risk score for unique postcode
kaggle.com
zip
Updated Mar 13, 2022
Cite
Gaurav Dutta (2022). Predict accident risk score for unique postcode [Dataset]. https://www.kaggle.com/datasets/gauravduttakiit/predict-accident-risk-score-for-unique-postcode
Explore at:
Available download formats: zip (21360724 bytes)
Dataset updated
Mar 13, 2022
Authors
Gaurav Dutta
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
According to IBEF, “Domestic automobiles production increased at 2.36% CAGR between FY16-20 with 26.36 million vehicles being manufactured in the country in FY20. Overall, domestic automobiles sales increased at 1.29% CAGR between FY16-FY20 with 21.55 million vehicles being sold in FY20”. The rise in vehicles on the road will also lead to multiple challenges, and roads will become more vulnerable to accidents. Increased accident rates also lead to more insurance claims, so payouts rise for insurance companies.

To plan pre-emptively for losses, insurance firms leverage accident data to understand risk across geographical units, e.g. postal code or district.

In this challenge, we are providing you the dataset to predict the “Accident_Risk_Index” against the postcodes. Accident_Risk_Index (mean casualties at a postcode) = sum(Number_of_Casualties) / count(Accident_ID)

Participants are required to predict the 'Accident_risk_index' for each postcode in test.csv.

Then submit your 'my_submission_file.csv' on the submission tab of the hackathon page.

Pro-tip: Participants should first roll up the train data to postcode level through feature engineering, create an “accident_risk_index” column, and optimize the model at postcode level.

A few hypotheses to help you think: "More accidents happen in the later part of the day as those are office hours causing congestion"

"Postal codes with more single carriage roads have more accidents"

(From the hypotheses above, features such as office_hours_flag and a count of single carriage roads can be formed.)
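As a rough illustration of the two hypothesised features, the sketch below derives an office-hours flag from a time string and counts single-carriageway accidents per postcode. The 09:00-18:00 window, the 'HH:MM' time format, the 'Single carriageway' label for Road_Type, and the sample rows are all assumptions made for illustration; the challenge text does not specify them.

```python
from collections import Counter
from datetime import time

# Assumed office-hours window; the challenge text does not define one.
OFFICE_START = time(9, 0)
OFFICE_END = time(18, 0)
# Assumed label for single-carriageway roads in the Road_Type column.
SINGLE_CARRIAGEWAY = "Single carriageway"


def office_hours_flag(time_str):
    """Return 1 if an 'HH:MM' time falls inside the assumed window, else 0."""
    hour, minute = (int(part) for part in time_str.split(":"))
    return int(OFFICE_START <= time(hour, minute) < OFFICE_END)


def single_carriageway_counts(rows):
    """Count accidents on single-carriageway roads per postcode.

    `rows` is an iterable of dicts with 'postcode' and 'Road_Type' keys.
    """
    counts = Counter()
    for row in rows:
        if row["Road_Type"] == SINGLE_CARRIAGEWAY:
            counts[row["postcode"]] += 1
    return counts


# Made-up rows, for illustration only.
sample = [
    {"postcode": "AL1 1JJ", "Time": "08:30", "Road_Type": "Single carriageway"},
    {"postcode": "AL1 3PS", "Time": "17:45", "Road_Type": "Dual carriageway"},
    {"postcode": "AL1 3PS", "Time": "12:00", "Road_Type": "Single carriageway"},
]
flags = [office_hours_flag(row["Time"]) for row in sample]
counts = single_carriageway_counts(sample)
```

Both features would then be aggregated to postcode level alongside the target before modelling.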

Additionally, we are providing you with road network data (contains info on the nearest road to a postcode and its characteristics) and population data (contains info about population at area level). This information is for feature augmentation and is not mandatory to use.

The provided dataset contains the following files:

Train: 4,84,042 rows x 27 columns

Test: 1,15,958 rows x 27 columns

train.csv & test.csv:

'Accident_ID', 'Police_Force', 'Number_of_Vehicles', 'Number_of_Casualties', 'Date', 'Day_of_Week', 'Time', 'Local_Authority_(District)', 'Local_Authority_(Highway)', '1st_Road_Class', '1st_Road_Number', 'Road_Type', 'Speed_limit', '2nd_Road_Class', '2nd_Road_Number', 'Pedestrian_Crossing-Human_Control', 'Pedestrian_Crossing-Physical_Facilities', 'Light_Conditions', 'Weather_Conditions', 'Road_Surface_Conditions', 'Special_Conditions_at_Site', 'Carriageway_Hazards', 'Urban_or_Rural_Area', 'Did_Police_Officer_Attend_Scene_of_Accident', 'state', 'postcode', 'country'

Population: 8,035 rows x 10 columns

population.csv:

'postcode', 'Rural Urban', 'Variable: All usual residents; measures: Value', 'Variable: Males; measures: Value', 'Variable: Females; measures: Value', 'Variable: Lives in a household; measures: Value', 'Variable: Lives in a communal establishment; measures: Value', 'Variable: Schoolchild or full-time student aged 4 and over at their non term-time address; measures: Value', 'Variable: Area (Hectares); measures: Value', 'Variable: Density (number of persons per hectare); measures: Value'

Road Network: 91,566 rows x 8 columns

roads_network.csv:

'WKT', 'roadClassi', 'roadFuncti', 'formOfWay', 'length', 'primaryRou', 'distance to the nearest point on rd', 'postcode'
UK Postcode-Level Flood Risk Data
kaggle.com
zip
Updated Jan 10, 2023
Cite
The Devastator (2023). UK Postcode-Level Flood Risk Data [Dataset]. https://www.kaggle.com/datasets/thedevastator/uk-postcode-level-flood-risk-data
Explore at:
Available download formats: zip (27328679 bytes)
Dataset updated
Jan 10, 2023
Authors
The Devastator
Area covered
United Kingdom
Description
UK Postcode-Level Flood Risk Data

Risk Level, Suitability, and Geographic Coordinates

By GetTheData [source]

About this dataset

The source material has been compiled from open datasets, including 'Risk of Flooding from Rivers and Sea' and 'Open Postcode Geo', all held under licence in agreement with Crown copyright & Database right (2017) and Royal Mail copyright & Database right (2017). The methodology combines these datasets by first identifying each risk area and then mapping the postcode points that fall within it, so that each postcode carries a risk level alongside its longitude, latitude, easting and northing. This shows which individual postcodes lie in high- or low-risk flooding areas, and the publication date shows when the risk information was last issued. The dataset is useful for prevention and planning, helping citizens estimate potential risks from extreme weather events such as floods or major storms before disaster strikes.

How to use the dataset

In order to effectively use this dataset there are several key pieces of terminology that you should be familiar with:

FID – a unique identifier number associated with each record in the database.

Postcode – an alphanumeric code used to identify a specific geographic region within the country; consists of two parts: an outward code (e.g., RG4) and an inward code (e.g., 8DN).

PROB_4BAND – The flood risk level based on four categories represented in this field assigned by location - High, Medium, Low or Very Low; or None if outside of a high-risk area

SUITABILITY – The suitability rating determined by location; either suitable or not suitable for development based on constraints for building in a floodplain

Publication Date (PUB_DATE) – The date that this information was made publicly available

Risk For Insurance SOP (RISK_FOR_INSURANCE_SOP) – A ranking from 1-5, intended as guidance only when advising on insurance cover for property located in areas of very extreme risk
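The outward/inward split described for the Postcode field can be sketched in a few lines. This is a minimal illustration, not part of the dataset's tooling; the function name `split_postcode` is hypothetical, and it relies only on the fact that the inward code of a UK postcode is always its last three characters.

```python
def split_postcode(postcode):
    """Split a UK postcode into (outward, inward) codes.

    The inward code is always the last three characters, so the split
    works whether or not the input contains the separating space.
    """
    compact = "".join(postcode.split()).upper()
    # Without the space, UK postcodes are 5 to 7 characters long.
    if not 5 <= len(compact) <= 7:
        raise ValueError(f"unexpected postcode length: {postcode!r}")
    return compact[:-3], compact[-3:]


outward, inward = split_postcode("RG4 8DN")
```

Normalising postcodes this way before joining files avoids mismatches caused by inconsistent spacing or letter case.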

You will also need to be aware of some mathematical values associated with each postcode:

Easting – The eastward coordinate of the postcode's point location in the survey grid system

Northing – The northward coordinate of the postcode's point location in the same grid system

Finally, here is how you can get started working with this amazing dataset:

Download it onto your computer from Kaggle's website (www.kaggle/datasets/UK Postecode Level Flood Risk Data).

Research Ideas

Creating a custom application that provides users with real-time flood risk and safety advice based on postcode.

Developing a map-based interface that integrates flood risk levels directly into Google Maps to assist people in planning trips and relocating in safer areas.

Developing an app that tracks the geographically accurate position of every property within each postcode, allowing for better risk assessment for businesses and insurers

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

See the dataset description for more information.

Columns

File: open_flood_risk_by_postcode.csv

| Column name | Description |
|:--------------|:--------------------------------|
| TR23 0PR | Postcode area (String) |
| \N | FID (Integer) |
| None | PROB_4BAND (String) |
| \N.1 | SUITABILITY (String) |
| \N.2 | PUB_DATE (Date) |
| \N.3 | RISK_FOR_INSURANCE_SOP (String) |
| 87897 | Easting (Integer) |
| 15021 | Northing (Integer) |
| 49.953605 | Latitude (Float) |
| -6.352647 | Longitude (Float) |

Acknowledgements

If you use this dataset in your research, please credit GetTheData.
Flood risk: Postcode search tool data - Dataset - data.gov.uk
ckan.publishing.service.gov.uk
Updated Nov 7, 2025
+ more versions
Cite
ckan.publishing.service.gov.uk (2025). Flood risk: Postcode search tool data - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/flood-risk-postcode-search-tool-data
Explore at:
Dataset updated
Nov 7, 2025
Dataset provided by
CKAN (https://ckan.org/)
Description
The ‘Flood risk: Postcode search tool data’ is primarily intended for use by an interactive tool (the ‘Flood risk: Postcode search tool’), but it is also published as open data. The tool is designed to be embedded in third party online media such as news websites, within relevant contexts (relating to flooding and climate change for example). The tool shows information about the flood risk within a postcode and invites users to find out more by clicking on a link which will take them to the GOV.UK service ‘Check your long-term flood risk’: https://www.gov.uk/check-long-term-flood-risk. The data is for long term risk of flooding, not floods that are expected to happen now or in the next 5 days. The risk is for the area around an address, not the address itself. Attribution statement: © Environment Agency copyright and/or database right 2025. All rights reserved.
Open Flood Risk by Postcode
data.wu.ac.at
kaggle.com
zip:csv
Updated Mar 22, 2017
Cite
GetTheData (2017). Open Flood Risk by Postcode [Dataset]. https://data.wu.ac.at/schema/datahub_io/NmNhNGI1MzMtN2Y1My00OWQxLTk4MzAtZTQ0ZGMwNDNjZDVi
Explore at:
Available download formats: zip:csv
Dataset updated
Mar 22, 2017
Dataset provided by
GetTheData
License
http://reference.data.gov.uk/id/open-government-licence
Description
Open Flood Risk by Postcode

English postcodes and the Environment Agency's risk level for the area that postcode falls within.

Fields

postcode

FID

PROB_4BAND

SUITABILITY

PUB_DATE

RISK_FOR_INSURANCE_SOP

easting

northing

latitude

longitude

Risk levels

Possible values for PROB_4BAND

High

Medium

Low

Very Low

None

Methodology

The position of the postcode is determined using a point location from Open Postcode Geo. This point location places the postcode in the corresponding area within Risk of Flooding from Rivers and Sea and the risk level (PROB_4BAND) is linked to the postcode. This allows flood risk level to be looked up by postcode.

Note that properties within the postcode may have a slightly different point location and thus be located in a different area with a different risk level. Generally speaking the point location of the postcode is determined by the point location of the centremost property in that postcode.
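The lookup described in the methodology amounts to a point-in-polygon test. The sketch below uses a standard ray-casting test; it is an illustration under stated assumptions rather than the publisher's implementation. The function names, the representation of risk areas as lists of (easting, northing) vertices, and the sample square are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as a vertex list?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def risk_for_point(easting, northing, risk_areas):
    """Return the band of the first risk area containing the point.

    `risk_areas` is a list of (polygon, band) pairs; "None" is returned
    when the point lies outside every area.
    """
    for polygon, band in risk_areas:
        if point_in_polygon(easting, northing, polygon):
            return band
    return "None"


# Hypothetical square risk area, for illustration only.
areas = [([(0, 0), (10, 0), (10, 10), (0, 10)], "High")]
inside_band = risk_for_point(5, 5, areas)
outside_band = risk_for_point(15, 5, areas)
```

Because only the postcode's single point location is tested, a property elsewhere in the same postcode can fall in a different area, as the note above points out.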

Documentation

Full documentation can be found on the Open Flood Risk by Postcode homepage.

Example application: Flood Maps by Postcode

Example postcode: SW18 4BX Flood Map

Acknowledgements

Derived from the Environment Agency's Risk of Flooding from Rivers and Sea open dataset.

Postcode data: Open Postcode Geo

Licensed under the OGL (attribution required).

Attribution statements

Contains OS data © Crown copyright and database right (2017)

Contains Royal Mail data © Royal Mail copyright and Database right (2017)

Contains National Statistics data © Crown copyright and database right (2017)

Contains Environment Agency data licensed under the Open Government Licence v3.0
Risk of Flooding from Rivers and Sea - Postcodes in Areas at Risk - Dataset...
ckan.publishing.service.gov.uk
Updated Jan 28, 2025
+ more versions
Cite
ckan.publishing.service.gov.uk (2025). Risk of Flooding from Rivers and Sea - Postcodes in Areas at Risk - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/risk-of-flooding-from-rivers-and-sea-postcodes-in-areas-at-risk2
Explore at:
Dataset updated
Jan 28, 2025
Dataset provided by
CKAN (https://ckan.org/)
Description
This dataset contains a summary of properties at risk of flooding from rivers and the sea at a postcode scale, including a breakdown by flood risk likelihood category and property type. Attribution statement: © Environment Agency copyright and/or database right 2025. All rights reserved. Some features of this map are based on digital spatial data from the Centre for Ecology & Hydrology, © NERC (CEH). © Crown Copyright and Database Rights 2025 OS AC0000807064. This product is produced in part from PAF® and Multiple Residence Data, the copyright in which is owned by Royal Mail Group Limited and/or Royal Mail Group plc. All rights reserved. Licence number AC0000807064
newGeoSure Insurance Product version 7 2015.1
data-search.nerc.ac.uk
metadata.bgs.ac.uk
+1 more
Updated Jun 15, 2015
+ more versions
Cite
(2015). newGeoSure Insurance Product version 7 2015.1 [Dataset]. https://data-search.nerc.ac.uk/geonetwork/srv/search?format=Postcode%20Database
Explore at:
Dataset updated
Jun 15, 2015
Description
This dataset has been superseded. The newGeoSure Insurance Product (newGIP) provides the potential insurance risk due to natural ground movement. It incorporates the combined effects of the 6 GeoSure hazards on (low-rise) buildings. This data is available as vector data, 25m gridded data or alternatively linked to a postcode database – the Derived Postcode Database. A series of GIS (Geographical Information System) maps show the most significant hazard areas. The ground movement, or subsidence, hazards included are landslides, shrink-swell clays, soluble rocks, running sands, compressible ground and collapsible deposits. The newGeoSure Insurance Product uses the individual GeoSure data layers and evaluates them using a series of processes including statistical analyses and expert elicitation techniques to create a derived product that can be used for insurance purposes such as identifying and estimating risk and susceptibility. The Derived Postcode Database (DPD) contains generalised information at a postcode level. The DPD is designed to provide a ‘summary’ value representing the combined effects of the GeoSure dataset across a postcode sector area. It is available as a GIS point dataset or a text (.txt) file format. The DPD contains a normalised hazard rating for each of the 6 GeoSure theme hazards (i.e. each GeoSure theme has been balanced against the others) and a combined unified hazard rating for each postcode in Great Britain. The combined hazard rating for each postcode is available as a standalone product. The Derived Postcode Database is available in a point data format or text file format. It is available in a range of GIS formats including ArcGIS (.shp), ArcInfo Coverages and MapInfo (.tab). More specialised formats may be available but may incur additional processing costs. The newGeoSure Insurance Product dataset has been created as vector data but is also available as a raster grid.
This data is available in a range of GIS formats, including ArcGIS (.shp), ArcInfo coverages and MapInfo (.tab). More specialised formats may be available but may incur additional processing costs. Data for the newGIP is provided for national coverage across Great Britain. The newGeoSure Insurance Product dataset is produced for use at 1:50 000 scale providing 50 m ground resolution. This dataset has been specifically developed for the insurance of low-rise buildings. The GeoSure datasets have been developed to identify the potential hazard for low-rise buildings and those with shallow foundations of less than 2 m deep. The identification of ground instability and other geological hazards can assist regional planners by rapidly identifying areas with potential problems, and can aid local government offices in making development plans by helping to define land suited to different uses. Other users of these data may include developers, homeowners, solicitors, loss adjusters, the insurance industry, architects and surveyors. Version 7 released June 2015.
UK Postcode-level Flood Risk Data (Rivers and Sea)
kaggle.com
zip
Updated Jan 10, 2023
Cite
The Devastator (2023). UK Postcode-level Flood Risk Data (Rivers and Sea) [Dataset]. https://www.kaggle.com/datasets/thedevastator/uk-postcode-level-flood-risk-data-rivers-and-sea/data
Explore at:
Available download formats: zip (27328679 bytes)
Dataset updated
Jan 10, 2023
Authors
The Devastator
Area covered
United Kingdom
Description
UK Postcode-level Flood Risk Data (Rivers and Sea)

Identifying Risk Areas for Flood Protection and Planning

By IBM Watson AI XPRIZE - Environment [source]

About this dataset

Welcome to the UK Postcode-level Flood Risk Dataset. This open source dataset contains detailed information on flood risk levels by postcode in the UK, allowing you to map out potential problems and plan accordingly. With this dataset, you can assess each postcode's growing risk of floods due to human land use change and climate change-related weather patterns, as well as historical occurrences specific to each area.

We pull data from organizations including Risk of Flooding from Rivers & Sea, Open Postcode Geo, Royal Mail copyright & database right (2017), National Statistics data Crown copyright & database right (2017), and Environment Agency data licensed under the Open Government Licence v3.0. The associated columns in this dataset are detailed below:

Postcode - unique identifier for the postal code district where flood risk area is located

FID - Unique ID for each location point

PROB_4BAND - Flood risk level for a given postcode determined according to a four tier grade system (High, Medium, Low or Very Low)

SUITABILITY - Suitability of location based on environment factors assessed according to OFRA criteria

PUB_DATE - Date when data was published or last updated

RISK_FOR_INSURANCE_SOP - Standard Operating Procedure assigned according to the Probability 4 band Risk rating

Easting/Northing/Latitude/Longitude – Coordinates associated with a given postcode location

This data can be used by local authorities and agencies conducting flood mapping projects; insurers assessing assets at specified locations using an agreed methodology; advisors assessing locations for development purposes; forecasters aiding contingency planning; and homeowners or commercial businesses seeking insurance cover for claims arising from flooding events. Ultimately we hope citizens around the world use this dataset as a tool to predict areas exposed to potential flooding risks so that preventive measures may be taken beforehand!

How to use the dataset

This Kaggle dataset provides postcode-level flood risk data for the UK, including the flood risk level, coordinates, and other related information. This dataset is derived from Risk of Flooding from Rivers and Sea (provided by the British government) and Open Postcode Geo. It is licensed under the OGL 3.0 open government license.

In this dataset you will find, for each postcode, a unique identifier (FID), an overall four-band flood risk level (PROB_4BAND), whether a specific location or building is suitable or not (SUITABILITY), the publication date (PUB_DATE), easting and northing, which roughly measure distance eastwards and northwards in meters (EASTING / NORTHING), LATITUDE and LONGITUDE, which point to a precise location on a map, and finally RISK_FOR_INSURANCE_SOP, which distinguishes sites that should generate warnings with regard to various kinds of insurance policies. Companies building hazard-mapping solutions can use these fields to show the flood-damage risk at given locations with GIS systems or location intelligence tools, and organizations can apply data science techniques such as predictive analytics in decision making, for example when municipalities sign off disaster management plans.

You can use this dataset for research purposes and share your findings through charts and graphs to build an understanding of the hazards associated with inhabited areas around the UK, particularly when storm systems are concentrated over specific regions and a major flood event is most likely. People living there can look up their postcode on the Flood Map by Postcode page.

When writing reports, please acknowledge the source material, including: Contains OS data © Crown copyright and database right 2017; Contains Royal Mail data © Royal Mail copyright and Database right 2017; Contains National Statistics ...
Code-Point with polygons - Dataset - data.gov.uk
ckan.publishing.service.gov.uk
Updated Feb 11, 2019
+ more versions
Cite
ckan.publishing.service.gov.uk (2019). Code-Point with polygons - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/code-point-with-polygons2
Explore at:
Dataset updated
Feb 11, 2019
Dataset provided by
CKAN (https://ckan.org/)
Description
Code-Point® with polygons shows the notional shape of every postcode unit in Great Britain, and includes major buildings with multiple postcodes. For compelling visuals, Code-Point with polygons lets you apply shading to individual postcodes on a map. This means you can analyse location data at the most granular level and bring your results vividly to life. We give you every single postcode in Great Britain and Northern Ireland – including those for different floors of high-rise buildings. For accuracy, we give every postcode a positional quality rating and map out the boundaries of only the postcodes we can locate most precisely. Code-Point® with polygons contains postcode boundaries for Great Britain. These show the extent of each postcode unit, enabling you to analyse information by postcode. Ideal for activities such as sales targeting or market profiling, as well as any statistical work. Includes notional polygons; vertical streets data; postcode units; eastings and northings; NHS® health authority codes; administrative codes; PO box indicator; and types of delivery points.
🌊 Open Flood Risk by Postcode
kaggle.com
zip
Updated Oct 4, 2023
+ more versions
Cite
mexwell (2023). 🌊 Open Flood Risk by Postcode [Dataset]. https://www.kaggle.com/datasets/mexwell/open-flood-risk-by-postcode
Explore at:
Available download formats: zip (23950716 bytes)
Dataset updated
Oct 4, 2023
Authors
mexwell
Description
Open Flood Risk by Postcode is derived from the Environment Agency's Risk of Flooding from Rivers and Sea which allocates a risk level to areas in England, UK. Using postcode data from Open Postcode Geo, each English postcode is placed in its risk area, allowing a flood risk level to be allocated to a postcode.

Fields

postcode

FID

PROB_4BAND

SUITABILITY

PUB_DATE

RISK_FOR_INSURANCE_SOP

easting

northing

latitude

longitude

PROB_4BAND is the flood risk level, and can be one of the following:

High

Medium

Low

Very Low

None

Note that where a postcode is outside a flood risk area, some of the column values will be NULL, represented as \N in this file.
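A minimal sketch of reading the file so that the `\N` marker becomes a proper missing value. It assumes the CSV has no header row and that the columns follow the order of the Fields list above; neither is confirmed here, so check the downloaded file first. The function name `read_flood_rows` and the sample row are illustrative.

```python
import csv
import io

# Column order taken from the Fields list; assumes the file has no header row.
FIELDS = [
    "postcode", "FID", "PROB_4BAND", "SUITABILITY", "PUB_DATE",
    "RISK_FOR_INSURANCE_SOP", "easting", "northing", "latitude", "longitude",
]
NULL_TOKEN = r"\N"  # how NULL is written in the file


def read_flood_rows(handle):
    """Yield one dict per row, replacing the NULL marker with None."""
    reader = csv.DictReader(handle, fieldnames=FIELDS)
    for row in reader:
        yield {
            key: (None if value == NULL_TOKEN else value)
            for key, value in row.items()
        }


# Illustrative row for a postcode outside any flood risk area.
SAMPLE = r"TR23 0PR,\N,None,\N,\N,\N,87897,15021,49.953605,-6.352647"
rows = list(read_flood_rows(io.StringIO(SAMPLE)))
```

Note that the PROB_4BAND value `None` is a literal category string, not a missing value, so it is left untouched.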

Documentation

You can find full documentation on the Open Flood Risk by Postcode homepage.

Acknowledgements

Derived from Risk of Flooding from Rivers and Sea

Derived from Open Postcode Geo

Licensed under the OGL

Photo by Luke Moss on Unsplash
Risk of Flooding from Rivers and Sea - Postcodes in Areas at Risk
environment.data.gov.uk
Updated Jul 23, 2024
+ more versions
Cite
Environment Agency (2024). Risk of Flooding from Rivers and Sea - Postcodes in Areas at Risk [Dataset]. https://environment.data.gov.uk/dataset/8dae18e1-d465-11e4-8e78-f0def148f590
Explore at:
Dataset updated
Jul 23, 2024
Dataset authored and provided by
Environment Agency (https://www.gov.uk/ea)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
PLEASE NOTE: This record has been retired. It has been superseded by: https://environment.data.gov.uk/dataset/f81508d3-cf5a-44ed-ae7e-452be665af84 This dataset is a product of a national assessment of flood risk for England produced using local expertise. It is produced using the Risk of Flooding from Rivers and Sea data which shows the chance of flooding from rivers and/or the sea, based on cells of 50m. Each cell is allocated one of four flood risk categories, taking into account flood defences and their condition.

This dataset uses OS address data and Royal Mail postcode data to show how many properties are in each of four flood risk categories in each postcode, based simply on the category allocated to the cell that each property is in.
Civil Service by organisation, postcode, grade and leaving cause
gov.uk
Updated Oct 11, 2019
+ more versions
Cite
Cabinet Office (2019). Civil Service by organisation, postcode, grade and leaving cause [Dataset]. https://www.gov.uk/government/statistics/civil-service-by-organisation-postcode-grade-and-leaving-cause
Explore at:
Dataset updated
Oct 11, 2019
Dataset provided by
GOV.UK (http://gov.uk/)
Authors
Cabinet Office
Description
These tables show Civil Service headcounts at 31 March 2019, and Civil Service leavers between 1 April 2018 and 31 March 2019, by organisation, postcode, grade, and leaving cause.
Predict accident risk score for unique postcode
kaggle.com
zip
Updated Mar 13, 2022
Cite
Manish Tripathi (2022). Predict accident risk score for unique postcode [Dataset]. https://www.kaggle.com/manishtripathi86/predict-accident-risk-score-for-unique-postcode
Explore at:
Available download formats: zip (21360724 bytes)
Dataset updated
Mar 13, 2022
Authors
Manish Tripathi
Description
Dataset Source: https://machinehack.com/hackathon/predict_accident_risk_score_for_unique_postcode/data

Data set is for private consumption for the competition.

According to IBEF, “Domestic automobiles production increased at 2.36% CAGR between FY16-20 with 26.36 million vehicles being manufactured in the country in FY20. Overall, domestic automobiles sales increased at 1.29% CAGR between FY16-FY20 with 21.55 million vehicles being sold in FY20”. The rise in vehicles on the road will also lead to multiple challenges, and roads will become more vulnerable to accidents. Increased accident rates also lead to more insurance claims, so payouts rise for insurance companies.

To plan pre-emptively for losses, insurance firms leverage accident data to understand risk across geographical units, e.g. postal code or district.

In this challenge, we are providing you the dataset to predict the “Accident_Risk_Index” against the postcodes. Accident_Risk_Index (mean casualties at a postcode) = sum(Number_of_Casualties) / count(Accident_ID)

Working example:

Train Data (given)
| Accident_ID | Postcode | Number_of_Casualties |
|:------------|:---------|:---------------------|
| 1 | AL1 1JJ | 2 |
| 2 | AL1 1JP | 3 |
| 3 | AL1 3PS | 2 |
| 4 | AL1 3PS | 1 |
| 5 | AL1 3PS | 1 |

Modelling Train Data (Rolled up at Postcode level)

| Postcode | Derived_feature1 | Derived_feature2 | Accident_risk_Index |
|:---------|:-----------------|:-----------------|:--------------------|
| AL1 1JJ | _ | _ | 2 |
| AL1 1JP | _ | _ | 3 |
| AL1 3PS | _ | _ | 1.33 |

Participants are required to predict the 'Accident_risk_index' for each postcode in test.csv.
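The roll-up in the working example above can be sketched in plain Python. This is a minimal illustration of the stated formula (sum of casualties divided by count of accidents per postcode), not the organisers' code; the function name and the tuple layout of the rows are assumptions.

```python
from collections import defaultdict


def accident_risk_index(rows):
    """Return {postcode: mean casualties per accident}.

    `rows` is an iterable of (accident_id, postcode, casualties) tuples.
    """
    totals = defaultdict(int)  # sum(Number_of_Casualties) per postcode
    counts = defaultdict(int)  # count(Accident_ID) per postcode
    for _accident_id, postcode, casualties in rows:
        totals[postcode] += casualties
        counts[postcode] += 1
    return {postcode: totals[postcode] / counts[postcode] for postcode in totals}


# The five accidents from the working example.
train = [
    (1, "AL1 1JJ", 2),
    (2, "AL1 1JP", 3),
    (3, "AL1 3PS", 2),
    (4, "AL1 3PS", 1),
    (5, "AL1 3PS", 1),
]
risk_index = accident_risk_index(train)
```

For AL1 3PS the three accidents carry 2 + 1 + 1 = 4 casualties, giving 4 / 3, which rounds to the 1.33 shown in the rolled-up table.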

Then submit your 'my_submission_file.csv' on the submission tab of the hackathon page.

Pro-tip: Participants should first roll up the train data to postcode level through feature engineering, create an “accident_risk_index” column, and optimize the model at postcode level.

A few hypotheses to help you think: "More accidents happen in the later part of the day as those are office hours causing congestion"

"Postal codes with more single carriage roads have more accidents"

(From the hypotheses above, features such as office_hours_flag and a count of single carriage roads can be formed.)

Additionally, we are providing you with road network data (contains info on the nearest road to a postcode and its characteristics) and population data (contains info about population at area level). This information is for feature augmentation and is not mandatory to use.

The provided dataset contains the following files:

Train: 4,84,042 rows x 27 columns

Test: 1,15,958 rows x 27 columns

train.csv & test.csv:

'Accident_ID', 'Police_Force', 'Number_of_Vehicles', 'Number_of_Casualties', 'Date', 'Day_of_Week', 'Time', 'Local_Authority_(District)', 'Local_Authority_(Highway)', '1st_Road_Class', '1st_Road_Number', 'Road_Type', 'Speed_limit', '2nd_Road_Class', '2nd_Road_Number', 'Pedestrian_Crossing-Human_Control', 'Pedestrian_Crossing-Physical_Facilities', 'Light_Conditions', 'Weather_Conditions', 'Road_Surface_Conditions', 'Special_Conditions_at_Site', 'Carriageway_Hazards', 'Urban_or_Rural_Area', 'Did_Police_Officer_Attend_Scene_of_Accident', 'state', 'postcode', 'country'

Population: 8,035 rows x 10 columns

population.csv:

'postcode', 'Rural Urban', 'Variable: All usual residents; measures: Value', 'Variable: Males; measures: Value', 'Variable: Females; measures: Value', 'Variable: Lives in a household; measures: Value', 'Variable: Lives in a communal establishment; measures: Value', 'Variable: Schoolchild or full-time student aged 4 and over at their non term-time address; measures: Value', 'Variable: Area (Hectares); measures: Value', 'Variable: Density (number of persons per hectare); measures: Value'

Road Network: 91,566 rows x 8 columns

roads_network.csv:

'WKT', 'roadClassi', 'roadFuncti', 'formOfWay', 'length', 'primaryRou', 'distance to the nearest point on rd', 'postcode'

Overview

Swiss Re is one of the largest reinsurers in the world, headquartered in Zurich with offices in over 25 countries. Swiss Re’s core expertise is in underwriting in life, health, as well as the property and casualty insurance space, whereas its tech strategy focuses on developing smarter and innovative solutions for clients’ value chains by leveraging data and technology.

The company’s vision is to make the world more resilient. Swiss Re believes in applying fresh perspectives, knowledge and capital to anticipate and manage risk to create smarter solutions and help the world rebuild, renew and move forward. About 1300 professionals who work in the Swiss Re Global Business Solutions Center (BSC), Bangalore combine experience, expertise and out-of-the-box thinking to bring Swiss Re's core business to life by creating new business opportunities.
ONS Postcode Directory (February 2024) for the UK
geoportal.statistics.gov.uk
Updated Feb 28, 2024
+ more versions
Cite
Office for National Statistics (2024). ONS Postcode Directory (February 2024) for the UK [Dataset]. https://geoportal.statistics.gov.uk/datasets/e14b1475ecf74b58804cf667b6740706
Explore at:
Dataset updated
Feb 28, 2024
Dataset authored and provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
https://www.ons.gov.uk/methodology/geography/licences
Description
This is the ONS Postcode Directory (ONSPD) for the United Kingdom as at February 2024 in Comma Separated Variable (CSV) and ASCII text (TXT) formats. This file contains multiple CSVs so that individual postcode areas can be opened in MS Excel. To download the zip file, click the Download button. The ONSPD relates both current and terminated postcodes in the United Kingdom to a range of current statutory administrative, electoral, health and other area geographies. It also links postcodes to pre-2002 health areas, 1991 Census enumeration districts for England and Wales, 2001 Census Output Areas (OA) and Super Output Areas (SOA) for England and Wales, 2001 Census OAs and SOAs for Northern Ireland and 2001 Census OAs and Data Zones (DZ) for Scotland. It now contains 2021 Census OAs and SOAs for England, Wales and Northern Ireland. It helps support the production of area-based statistics from postcoded data. The ONSPD is produced by ONS Geography, who provide geographic support to the Office for National Statistics (ONS) and geographic services used by other organisations. The ONSPD is issued quarterly. (File size: 231 MB.) Please note that this product contains Royal Mail, Gridlink, LPS (Northern Ireland), Ordnance Survey and ONS Intellectual Property Rights.
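
Linking postcoded data to a lookup such as this one usually depends on both sides writing postcodes the same way. The sketch below is one possible normalisation, assuming free-text UK postcodes as input: it upper-cases the value, drops all whitespace and re-inserts a single space before the three-character inward code. The function name is hypothetical, and the length check is only a basic sanity test, not full postcode validation.

```python
def normalise_postcode(raw: str) -> str:
    """Return a UK postcode as 'OUTWARD INWARD' with a single space.

    The inward code is always the last three characters, so the space
    goes before them. Without spaces a UK postcode has 5 to 7 characters.
    """
    compact = "".join(raw.split()).upper()
    if not (5 <= len(compact) <= 7) or not compact.isalnum():
        raise ValueError(f"not a plausible UK postcode: {raw!r}")
    return f"{compact[:-3]} {compact[-3:]}"


examples = [normalise_postcode(p) for p in ("sw1a1aa", " m1  1ae ", "EH1 1YZ")]
```

Applying the same function to both the postcoded records and the lookup key before joining avoids mismatches caused only by spacing or letter case.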
Acorn Postcode-Level Directory for the United Kingdom, 2024
beta.ukdataservice.ac.uk
datacatalogue.ukdataservice.ac.uk
Updated Oct 4, 2024
Cite
CACI Limited (2024). Acorn Postcode-Level Directory for the United Kingdom, 2024 [Dataset]. http://doi.org/10.5255/UKDA-SN-9183-2
Explore at:
Unique identifier
https://doi.org/10.5255/UKDA-SN-9183-2
Dataset updated
Oct 4, 2024
Dataset provided by
UK Data Service (https://ukdataservice.ac.uk/)
Authors
CACI Limited
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Time period covered
Jan 1, 2024 - Dec 30, 2024
Area covered
United Kingdom
Description
The Acorn geodemographic classification is a long-running classification developed by CACI Limited. Acorn operates by merging geography with demographics and details about consumer characteristics and behaviours. Supported by advanced AI methods, comprehensive input data, and detailed product literature, Acorn provides precise information and enables an in-depth understanding of the different types of consumers in every part of the country.

The current classification groups the entire United Kingdom population into 7 categories, 22 groups and 65 types. The data is available at unit postcode level. Further information may be found on the CACI ACORN microsite.
Use of the data requires approval from the data owner or their nominee and is restricted to those based at a Higher Education or Further Education institution. Please see the Data Access section for further information.
For the second edition (October 2024) data and documentation files for 2024 have been added to the study.
International Zip Code Database
geopostcodes.com
csv
Updated Mar 7, 2024
Cite
GeoPostcodes (2024). International Zip Code Database [Dataset]. https://www.geopostcodes.com/international-zip-code/
Explore at:
Available download formats: csv
Dataset updated
Mar 7, 2024
Dataset authored and provided by
GeoPostcodes
Area covered
World
Description
Our International Zip Code Database offers comprehensive postal code data for spatial analysis, including postal and administrative areas for numerous countries worldwide. This global dataset contains accurate and up-to-date information on all administrative divisions, cities, and zip codes, making it an invaluable resource for various applications such as address capture and validation, map and visualization, reporting and business intelligence (BI), master data management, logistics and supply chain management, and sales and marketing. Our location data packages are available in various formats, including CSV, optimized for seamless integration with popular systems like Esri ArcGIS, Snowflake, QGIS, and more. Product features include fully and accurately geocoded data, multi-language support with address names in local and foreign languages, comprehensive city definitions, and the option to combine map data with UNLOCODE and IATA codes, time zones, and daylight saving times. Companies choose our location databases for their enterprise-grade service, reduction in integration time and cost by 30%, and weekly updates to ensure the highest quality.
newGeoSure Insurance Product version 7 2016.1
data.wu.ac.at
data.europa.eu
html
Updated Aug 18, 2018
Cite
British Geological Survey (2018). newGeoSure Insurance Product version 7 2016.1 [Dataset]. https://data.wu.ac.at/odso/data_gov_uk/ZTA4MTJmMWYtYzNmMy00NGM3LWE3NWQtZTE0MWU5ODY0NWYy
Explore at:
Available download formats: html
Dataset updated
Aug 18, 2018
Dataset provided by
British Geological Survey (https://www.bgs.ac.uk/)
Description
The newGeoSure Insurance Product (newGIP) provides the potential insurance risk due to natural ground movement. It incorporates the combined effects of the 6 GeoSure hazards on (low-rise) buildings. This data is available as vector data, 25 m gridded data or alternatively linked to a postcode database, the Derived Postcode Database. A series of GIS (Geographical Information System) maps show the most significant hazard areas. The ground movement, or subsidence, hazards included are landslides, shrink-swell clays, soluble rocks, running sands, compressible ground and collapsible deposits. The newGeoSure Insurance Product uses the individual GeoSure data layers and evaluates them using a series of processes including statistical analyses and expert elicitation techniques to create a derived product that can be used for insurance purposes such as identifying and estimating risk and susceptibility. The Derived Postcode Database (DPD) contains generalised information at a postcode level. The DPD is designed to provide a 'summary' value representing the combined effects of the GeoSure dataset across a postcode sector area. It is available as a GIS point dataset or a text (.txt) file format. The DPD contains a normalised hazard rating for each of the 6 GeoSure hazard themes (i.e. the themes have been balanced against each other) and a combined unified hazard rating for each postcode in Great Britain. The combined hazard rating for each postcode is available as a standalone product. The Derived Postcode Database is available in a point data format or text file format. It is available in a range of GIS formats including ArcGIS (.shp), ArcInfo Coverages and MapInfo (.tab). More specialised formats may be available but may incur additional processing costs. The newGeoSure Insurance Product dataset has been created as vector data but is also available as a raster grid.
This data is available in a range of GIS formats, including ArcGIS (.shp), ArcInfo Coverages and MapInfo (.tab). More specialised formats may be available but may incur additional processing costs. Data for the newGIP is provided for national coverage across Great Britain. The newGeoSure Insurance Product dataset is produced for use at 1:50 000 scale providing 50 m ground resolution. This dataset has been specifically developed for the insurance of low-rise buildings. The GeoSure datasets have been developed to identify the potential hazard for low-rise buildings and those with shallow foundations of less than 2 m deep. The identification of ground instability and other geological hazards can assist regional planners in rapidly identifying areas with potential problems, and can aid local government offices in making development plans by helping to define land suited to different uses. Other users of these data may include developers, homeowners, solicitors, loss adjusters, the insurance industry, architects and surveyors.
Energy & environmental performance of dwellings using EPC
dtechtive.com
csv
Updated Jan 1, 2024
Cite
Glasgow City Council (uSmart) (2024). Energy & environmental performance of dwellings using EPC [Dataset]. https://dtechtive.com/datasets/39479
Explore at:
Available download formats: csv (0.2123 MB)
Dataset updated
Jan 1, 2024
Dataset provided by
Glasgow City Council (uSmart)
Description
Energy Performance Certificates (EPCs) are needed whenever a property is built, sold or rented. An EPC contains information about a property's energy use and typical energy costs, and recommendations about how to reduce energy use and save money. An EPC gives a property an energy efficiency rating from A (most efficient) to G (least efficient) and is valid for 10 years. The Standard Assessment Procedure (SAP) used to create the EPC is the methodology used by the Government to assess and compare the energy and environmental performance of dwellings. It aims to provide accurate and reliable assessments of dwelling energy performance that are needed to underpin energy and environmental policy initiatives. The data come from an IBM Fuel Poverty report and provide SAP/EPC energy ratings by post code within the Glasgow Housing Association (GHA) stock register. The fields are: Post Code, Current Energy Efficiency Rating, Potential Energy Efficiency Rating, Current Environmental Impact Rating and Potential Environmental Impact Rating. Date extracted: 2011-05-19. Data supplied by Glasgow Housing Association. Licence: None.
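
To make the A to G scale concrete, the sketch below maps a numeric SAP score to a letter band. The description above only states that ratings run from A to G; the band boundaries used here are the commonly published EPC bands and are an assumption, as is the idea that the rating fields hold numeric scores. The names EPC_BANDS and epc_band are hypothetical.

```python
# Assumed band boundaries (lowest score in each band), from the commonly
# published EPC scale; they are not stated in the dataset description.
EPC_BANDS = [(92, "A"), (81, "B"), (69, "C"), (55, "D"), (39, "E"), (21, "F"), (1, "G")]


def epc_band(sap_score: int) -> str:
    """Return the letter band for a SAP score of 1 or more."""
    if sap_score < 1:
        raise ValueError("SAP score must be at least 1")
    for lowest, band in EPC_BANDS:
        if sap_score >= lowest:
            return band
    raise ValueError("unreachable for scores of 1 or more")


bands = [epc_band(score) for score in (95, 81, 68, 21, 20, 1)]
```

A comparison of the current and potential rating bands per post code would then show where improvement measures could move a dwelling up the scale.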
National Statistics Postcode Lookup (May 2022) for the UK
geoportal.statistics.gov.uk
arc-gis-hub-home-arcgishub.hub.arcgis.com
Updated May 25, 2022
Cite
Office for National Statistics (2022). National Statistics Postcode Lookup (May 2022) for the UK [Dataset]. https://geoportal.statistics.gov.uk/datasets/9ac0331178b0435e839f62f41cc61c16
Explore at:
Dataset updated
May 25, 2022
Dataset authored and provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
https://www.ons.gov.uk/methodology/geography/licences
Description
This file contains the National Statistics Postcode Lookup (NSPL) for the United Kingdom as at May 2022 in Comma Separated Variable (CSV) and ASCII text (TXT) formats. To download the zip file click the Download button. The NSPL relates both current and terminated postcodes to a range of current statutory geographies via ‘best-fit’ allocation from the 2011 Census Output Areas (national parks and Workplace Zones are exempt from ‘best-fit’ and use ‘exact-fit’ allocations). It supports the production of area based statistics from postcoded data. The NSPL is produced by ONS Geography, who provide geographic support to the Office for National Statistics (ONS) and geographic services used by other organisations. The NSPL is issued quarterly. (File size - 196 MB).
newGeoSure Insurance Product version 7 2015.1 - Dataset - data.gov.uk
ckan.publishing.service.gov.uk
Updated Sep 8, 2015
Cite
ckan.publishing.service.gov.uk (2015). newGeoSure Insurance Product version 7 2015.1 - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/newgeosure-insurance-product-version-7-2015-1
Explore at:
Dataset updated
Sep 8, 2015
Dataset provided by
CKAN (https://ckan.org/)
Description
This dataset has been superseded. The newGeoSure Insurance Product (newGIP) provides the potential insurance risk due to natural ground movement. It incorporates the combined effects of the 6 GeoSure hazards on (low-rise) buildings. This data is available as vector data, 25 m gridded data or alternatively linked to a postcode database, the Derived Postcode Database. A series of GIS (Geographical Information System) maps show the most significant hazard areas. The ground movement, or subsidence, hazards included are landslides, shrink-swell clays, soluble rocks, running sands, compressible ground and collapsible deposits. The newGeoSure Insurance Product uses the individual GeoSure data layers and evaluates them using a series of processes including statistical analyses and expert elicitation techniques to create a derived product that can be used for insurance purposes such as identifying and estimating risk and susceptibility. The Derived Postcode Database (DPD) contains generalised information at a postcode level. The DPD is designed to provide a ‘summary’ value representing the combined effects of the GeoSure dataset across a postcode sector area. It is available as a GIS point dataset or a text (.txt) file format. The DPD contains a normalised hazard rating for each of the 6 GeoSure hazard themes (i.e. the themes have been balanced against each other) and a combined unified hazard rating for each postcode in Great Britain. The combined hazard rating for each postcode is available as a standalone product. The Derived Postcode Database is available in a point data format or text file format. It is available in a range of GIS formats including ArcGIS (.shp), ArcInfo Coverages and MapInfo (.tab). More specialised formats may be available but may incur additional processing costs. The newGeoSure Insurance Product dataset has been created as vector data but is also available as a raster grid.
This data is available in a range of GIS formats, including ArcGIS (.shp), ArcInfo Coverages and MapInfo (.tab). More specialised formats may be available but may incur additional processing costs. Data for the newGIP is provided for national coverage across Great Britain. The newGeoSure Insurance Product dataset is produced for use at 1:50 000 scale providing 50 m ground resolution. This dataset has been specifically developed for the insurance of low-rise buildings. The GeoSure datasets have been developed to identify the potential hazard for low-rise buildings and those with shallow foundations of less than 2 m deep. The identification of ground instability and other geological hazards can assist regional planners in rapidly identifying areas with potential problems, and can aid local government offices in making development plans by helping to define land suited to different uses. Other users of these data may include developers, homeowners, solicitors, loss adjusters, the insurance industry, architects and surveyors. Version 7 released June 2015.
Crystal Roof | UK Crime Data API | Last updated October 2025
crystalroof.co.uk
json
Cite
CrystalRoof Ltd, Crystal Roof | UK Crime Data API | Last updated October 2025 [Dataset]. https://crystalroof.co.uk/api-docs/method/crime-rate-by-postcode
Explore at:
Available download formats: json
Dataset authored and provided by
CrystalRoof Ltd
License
https://crystalroof.co.uk/api-terms-of-use
Area covered
United Kingdom, Wales, England
Description
This method returns total crime rates, crime rates by crime types, area ratings by total crime, and area ratings by crime type for small areas (Lower Layer Super Output Areas, or LSOAs) and Local Authority Districts (LADs). The results are determined by the inclusion of the submitted postcode/coordinates/UPRN within the corresponding LSOA or LAD.

All figures are annual (for the last 12 months).

The crime rates are calculated per 1,000 resident population derived from the census 2021.

The dataset is updated on a monthly basis, with a 3-month lag between the current date and the most recent data.
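
The per-1,000 figure described above is a simple ratio, sketched below for illustration. The function name and the numbers are hypothetical; the API itself returns precomputed rates, so this only shows how such a rate is derived from an annual crime count and a resident population.

```python
def crime_rate_per_1000(annual_crimes: int, resident_population: int) -> float:
    """Annual crimes per 1,000 residents for one area (e.g. an LSOA)."""
    if resident_population <= 0:
        raise ValueError("resident population must be positive")
    # Multiply before dividing to keep the arithmetic exact for whole numbers.
    return annual_crimes * 1000 / resident_population


# Hypothetical area: 45 recorded crimes in 12 months, 1,500 residents.
rate = crime_rate_per_1000(45, 1500)
```

Expressing counts per 1,000 residents is what makes small areas with different populations comparable.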



Predict accident risk score for unique postcode

Dataset Source: https://machinehack.com/hackathon/predict_accident_risk_score_for_unique_postcode/data
