https://creativecommons.org/publicdomain/zero/1.0/
According to IBEF, “Domestic automobiles production increased at 2.36% CAGR between FY16-20 with 26.36 million vehicles being manufactured in the country in FY20. Overall, domestic automobiles sales increased at 1.29% CAGR between FY16-FY20 with 21.55 million vehicles being sold in FY20”. More vehicles on the road bring more challenges, including a higher risk of accidents. Higher accident rates in turn lead to more insurance claims and larger payouts for insurance companies.
To plan for these losses in advance, insurance firms use accident data to understand risk across geographical units such as postal codes or districts.
In this challenge, we provide a dataset for predicting the “Accident_Risk_Index” for each postcode.
Accident_Risk_Index (mean casualties at a postcode) = sum(Number_of_Casualties) / count(Accident_ID)
Participants are required to predict the 'Accident_risk_index' for each postcode in test.csv.
Then submit 'my_submission_file.csv' on the submission tab of the hackathon page.
Pro-tip: Participants should first roll up the train data to postcode level through feature engineering, create an “accident_risk_index” column, and optimize the model at postcode level.
A few hypotheses to help you think:
"More accidents happen in the later part of the day as those are office hours causing congestion"
"Postal codes with more single carriage roads have more accidents"
(From these hypotheses, features such as office_hours_flag and the number of single-carriageway roads can be derived.)
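The two hypotheses above could be turned into postcode-level features along the following lines. This is only a sketch: the sample rows are invented, the "HH:MM" format of 'Time', the 'Single carriageway' label in 'Road_Type', and the 09:00 to 18:00 office-hours window are all assumptions rather than facts about the competition data.

```python
from collections import defaultdict

# Hypothetical sample rows; the 'Time' format and the 'Road_Type'
# label are assumptions, not taken from the actual dataset.
rows = [
    {"postcode": "AL1 1JJ", "Time": "08:30", "Road_Type": "Single carriageway"},
    {"postcode": "AL1 3PS", "Time": "17:45", "Road_Type": "Dual carriageway"},
    {"postcode": "AL1 3PS", "Time": "23:10", "Road_Type": "Single carriageway"},
]


def office_hours_flag(time_text, start_hour=9, end_hour=18):
    """Return 1 if the accident hour falls in [start_hour, end_hour), else 0."""
    hour = int(time_text.split(":")[0])
    return 1 if start_hour <= hour < end_hour else 0


def postcode_features(records):
    """Count office-hours and single-carriageway accidents per postcode."""
    features = defaultdict(
        lambda: {"office_hours_accidents": 0, "single_carriageway_accidents": 0}
    )
    for record in records:
        entry = features[record["postcode"]]
        entry["office_hours_accidents"] += office_hours_flag(record["Time"])
        entry["single_carriageway_accidents"] += int(
            record["Road_Type"] == "Single carriageway"
        )
    return dict(features)


features = postcode_features(rows)
```

Counting accidents per postcode (rather than keeping a per-accident flag) matches the requirement that the model be trained at postcode level.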
Additionally, we provide road network data (information on the nearest road to each postcode and its characteristics) and population data (population information at area level). These are intended for feature augmentation; using them is optional.
The provided dataset contains the following files:
train.csv & test.csv:
'Accident_ID', 'Police_Force', 'Number_of_Vehicles', 'Number_of_Casualties', 'Date', 'Day_of_Week', 'Time', 'Local_Authority_(District)', 'Local_Authority_(Highway)', '1st_Road_Class', '1st_Road_Number', 'Road_Type', 'Speed_limit', '2nd_Road_Class', '2nd_Road_Number', 'Pedestrian_Crossing-Human_Control', 'Pedestrian_Crossing-Physical_Facilities', 'Light_Conditions', 'Weather_Conditions', 'Road_Surface_Conditions', 'Special_Conditions_at_Site', 'Carriageway_Hazards', 'Urban_or_Rural_Area', 'Did_Police_Officer_Attend_Scene_of_Accident', 'state', 'postcode', 'country'
population.csv:
'postcode', 'Rural Urban', 'Variable: All usual residents; measures: Value', 'Variable: Males; measures: Value', 'Variable: Females; measures: Value', 'Variable: Lives in a household; measures: Value', 'Variable: Lives in a communal establishment; measures: Value', 'Variable: Schoolchild or full-time student aged 4 and over at their non term-time address; measures: Value', 'Variable: Area (Hectares); measures: Value', 'Variable: Density (number of persons per hectare); measures: Value'
roads_network.csv:
'WKT', 'roadClassi', 'roadFuncti', 'formOfWay', 'length', 'primaryRou', 'distance to the nearest point on rd', 'postcode'
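Because all three files share a 'postcode' column, the optional population and road network tables can be attached to postcode-level training rows with a left join. The sketch below uses only the standard library; the helper name `left_join_on_postcode`, the `pop_` and `road_` prefixes, and every value in the sample rows are hypothetical.

```python
def left_join_on_postcode(base_rows, lookup_rows, prefix):
    """Left-join lookup_rows onto base_rows by 'postcode'.

    Unmatched postcodes get None for every lookup column. If a postcode
    appears more than once in lookup_rows, the last row wins.
    """
    columns = [name for name in lookup_rows[0] if name != "postcode"] if lookup_rows else []
    lookup = {row["postcode"]: row for row in lookup_rows}
    joined = []
    for row in base_rows:
        match = lookup.get(row["postcode"], {})
        merged = dict(row)
        for name in columns:
            merged[prefix + name] = match.get(name)
        joined.append(merged)
    return joined


# Hypothetical values; only the column names come from the file listing.
postcode_level = [
    {"postcode": "AL1 1JJ", "accident_risk_index": 2.0},
    {"postcode": "AL1 3PS", "accident_risk_index": 4 / 3},
]
population = [
    {"postcode": "AL1 1JJ", "Rural Urban": "Urban",
     "Variable: All usual residents; measures: Value": "250"},
]
roads = [{"postcode": "AL1 3PS", "roadClassi": "A Road", "length": "120"}]

augmented = left_join_on_postcode(postcode_level, population, "pop_")
augmented = left_join_on_postcode(augmented, roads, "road_")
```

A left join keeps every training postcode even when the augmentation tables have no matching row, which leaves the choice of how to fill the gaps to the modelling step.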
By GetTheData
The underlying source material has been compiled from open datasets including 'Risk of Flooding from Rivers and Sea' and 'Open Postcode Geo', held under licence with Crown copyright & Database right (2017) and Royal Mail copyright & Database right (2017). The methodology first identifies each flood risk area as a polygon, then maps postcode points into those polygons, so that each postcode's risk level can be recorded alongside its longitude, latitude, easting and northing. The result shows which individual postcodes fall within high- and low-risk flooding areas, together with the publication date on which the risk information was last issued. The dataset is useful for prevention and planning, for example by helping citizens estimate potential risk ahead of extreme weather events such as floods or major storms.
To use this dataset effectively, you should be familiar with the following terms:
- FID – a unique identifier number associated with each record in the database.
- Postcode – an alphanumeric code used to identify a specific geographic region within the country; consists of two parts: an outward code (e.g., RG4) and an inward code (e.g., 8DN).
- PROB_4BAND – the flood risk level assigned to the location, in four bands: High, Medium, Low or Very Low; or None if the location is outside a flood risk area.
- SUITABILITY – the suitability rating determined by location; either suitable or not suitable for development based on constraints for building in a floodplain.
- Publication date (PUB_DATE) – the date on which this information was made publicly available.
- Risk For Insurance SOP (RISK_FOR_INSURANCE_SOP) – a ranking from 1 to 5, offered as guidance only, for advice given before taking out insurance cover on property located in particularly high-risk areas.
You will also need to be aware of the coordinate values associated with each postcode:
- Easting – the eastward grid coordinate of the postcode's point location.
- Northing – the northward grid coordinate of the postcode's point location, forming part of a geographical survey’s grid system.
Finally, here is how you can get started working with this amazing dataset:
Download it onto your computer from Kaggle's website (www.kaggle/datasets/UK Postecode Level Flood Risk Data).
- Creating a custom application that provides users with real-time flood risk and safety advice based on postcode.
- Developing a map-based interface that integrates flood risk levels directly into Google Maps to assist people in planning trips and relocating in safer areas.
- Developing an app that tracks the geographically accurate position of every property within each postcode, allowing for better risk assessment for businesses and insurers
If you use this dataset in your research, please credit the original authors.
File: open_flood_risk_by_postcode.csv

| Column name | Description |
|:--------------|:--------------------------------|
| TR23 0PR | Postcode area (String) |
| \N | FID (Integer) |
| None | PROB_4BAND (String) |
| \N.1 | SUITABILITY (String) |
| \N.2 | PUB_DATE (Date) |
| \N.3 | RISK_FOR_INSURANCE_SOP (String) |
| 87897 | Easting (Integer) |
| 15021 | Northing (Integer) |
| 49.953605 | Latitude (Float) |
| -6.352647 | Longitude (Float) |

The column names listed appear to be the values of the file's first data row, which suggests the file has no header row.
If you use this dataset in your research, please credit GetTheData.
The ‘Flood risk: Postcode search tool data’ is primarily intended for use by an interactive tool (the ‘Flood risk: Postcode search tool’), but it is also published as open data. The tool is designed to be embedded in third party online media such as news websites, within relevant contexts (relating to flooding and climate change for example). The tool shows information about the flood risk within a postcode and invites users to find out more by clicking on a link which will take them to the GOV.UK service ‘Check your long-term flood risk’: https://www.gov.uk/check-long-term-flood-risk. The data is for long term risk of flooding, not floods that are expected to happen now or in the next 5 days. The risk is for the area around an address, not the address itself. Attribution statement: © Environment Agency copyright and/or database right 2025. All rights reserved.
http://reference.data.gov.uk/id/open-government-licence
English postcodes and the Environment Agency's risk level for the area that postcode falls within.
Possible values for PROB_4BAND
The position of the postcode is determined using a point location from Open Postcode Geo. This point location places the postcode in the corresponding area within Risk of Flooding from Rivers and Sea and the risk level (PROB_4BAND) is linked to the postcode. This allows flood risk level to be looked up by postcode.
Note that properties within the postcode may have a slightly different point location and thus be located in a different area with a different risk level. Generally speaking the point location of the postcode is determined by the point location of the centremost property in that postcode.
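The lookup described above is essentially a point-in-polygon test. The sketch below uses a standard ray-casting check to illustrate it, including how a property's own point can land in a different band from its postcode's point. The square risk areas and all coordinates are invented, and nothing here is claimed to be the method actually used to build the dataset.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's vertex ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def risk_for_point(x, y, risk_areas):
    """Return the band of the first area containing the point, else 'None'."""
    for band, polygon in risk_areas:
        if point_in_polygon(x, y, polygon):
            return band
    return "None"  # string label for "outside any risk area"


# Hypothetical risk areas: two adjacent squares with different bands.
risk_areas = [
    ("High", [(0, 0), (10, 0), (10, 10), (0, 10)]),
    ("Low", [(10, 0), (20, 0), (20, 10), (10, 10)]),
]

postcode_band = risk_for_point(8, 5, risk_areas)    # postcode point location
property_band = risk_for_point(12, 5, risk_areas)   # one property in that postcode
outside_band = risk_for_point(25, 5, risk_areas)    # point outside both areas
```

In this invented layout the postcode point falls in the High area while a property a few units east falls in the Low area, which is the situation the note above warns about.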
Full documentation can be found on the Open Flood Risk by Postcode homepage.
Derived from the Environment Agency's Risk of Flooding from Rivers and Sea open dataset.
Postcode data: Open Postcode Geo
Licensed under the OGL (attribution required).
This dataset contains a summary of properties at risk of flooding from rivers and the sea at a postcode scale, including a breakdown by flood risk likelihood category and property type. Attribution statement: © Environment Agency copyright and/or database right 2025. All rights reserved. Some features of this map are based on digital spatial data from the Centre for Ecology & Hydrology, © NERC (CEH). © Crown Copyright and Database Rights 2025 OS AC0000807064. This product is produced in part from PAF® and Multiple Residence Data, the copyright in which is owned by Royal Mail Group Limited and/or Royal Mail Group plc. All rights reserved. Licence number AC0000807064
This dataset has been superseded.
The newGeoSure Insurance Product (newGIP) provides the potential insurance risk due to natural ground movement. It incorporates the combined effects of the 6 GeoSure hazards on (low-rise) buildings. This data is available as vector data, 25 m gridded data or alternatively linked to a postcode database – the Derived Postcode Database. A series of GIS (Geographical Information System) maps show the most significant hazard areas. The ground movement, or subsidence, hazards included are landslides, shrink-swell clays, soluble rocks, running sands, compressible ground and collapsible deposits. The newGeoSure Insurance Product uses the individual GeoSure data layers and evaluates them using a series of processes including statistical analyses and expert elicitation techniques to create a derived product that can be used for insurance purposes such as identifying and estimating risk and susceptibility.
The Derived Postcode Database (DPD) contains generalised information at a postcode level. The DPD is designed to provide a ‘summary’ value representing the combined effects of the GeoSure dataset across a postcode sector area. It contains a normalised hazard rating for each of the 6 GeoSure hazard themes (i.e. each GeoSure theme has been balanced against each other) and a combined unified hazard rating for each postcode in Great Britain. The combined hazard rating for each postcode is available as a standalone product. The DPD is available as a GIS point dataset or in text (.txt) file format.
The newGeoSure Insurance Product dataset has been created as vector data but is also available as a raster grid. It is available in a range of GIS formats, including ArcGIS (.shp), ArcInfo Coverages and MapInfo (.tab). More specialised formats may be available but may incur additional processing costs. Data for the newGIP is provided for national coverage across Great Britain. The dataset is produced for use at 1:50 000 scale providing 50 m ground resolution.
This dataset has been specifically developed for the insurance of low-rise buildings. The GeoSure datasets have been developed to identify the potential hazard for low-rise buildings and those with shallow foundations of less than 2 m deep. The identification of ground instability and other geological hazards can assist regional planners in rapidly identifying areas with potential problems, and can aid local government offices in making development plans by helping to define land suited to different uses. Other users of these data may include developers, homeowners, solicitors, loss adjusters, the insurance industry, architects and surveyors. Version 7 released June 2015.
By IBM Watson AI XPRIZE - Environment
Welcome to the UK Postcode-level Flood Risk Dataset. This open source dataset contains detailed information on flood risk levels by postcode in the UK, allowing you to map out potential problems and plan accordingly. With this dataset, you can assess each postcode's growing risk of floods due to human land use change and climate change-related weather patterns, as well as historical occurrences specific to each area.
The data is drawn from sources including Risk of Flooding from Rivers and Sea and Open Postcode Geo, and incorporates Royal Mail data (copyright & database right 2017), National Statistics data (Crown copyright & database right 2017), and Environment Agency data licensed under the Open Government Licence v3.0. The columns in this dataset are detailed below:
- Postcode - unique identifier for the postal code district where the flood risk area is located
- FID - unique ID for each location point
- PROB_4BAND - flood risk level for a given postcode, determined according to a four-tier grade system (High, Medium, Low or Very Low)
- SUITABILITY - suitability of the location based on environmental factors assessed according to OFRA criteria
- PUB_DATE - date when the data was published or last updated
- RISK_FOR_INSURANCE_SOP - Standard Operating Procedure assigned according to the PROB_4BAND risk rating
- Easting/Northing/Latitude/Longitude – coordinates associated with a given postcode location
This data can be used by local authorities and agencies conducting flood mapping projects; insurers assessing assets at specified locations using an agreed methodology; advisors assessing locations for development purposes; forecasters aiding contingency planning; and homeowners or commercial businesses seeking insurance cover for claims arising from flooding events. Ultimately, we hope citizens will use this dataset to identify areas exposed to potential flooding so that preventive measures can be taken beforehand.
This Kaggle dataset provides postcode-level flood risk data for the UK, including the flood risk level, coordinates, and other related information. This dataset is derived from Risk of Flooding from Rivers and Sea (provided by the British government) and Open Postcode Geo. It is licensed under the Open Government Licence v3.0 (OGL).
The dataset has one row per postcode, with a unique identifier (FID), an overall four-band flood risk level (PROB_4BAND), a suitability rating for the location (SUITABILITY), the publication date (PUB_DATE), easting and northing coordinates that roughly measure distance eastwards and northwards in metres (EASTING, NORTHING), latitude and longitude (LATITUDE, LONGITUDE), and RISK_FOR_INSURANCE_SOP, which flags sites that should raise warnings for insurance purposes. Organisations can use these fields in hazard mapping with GIS or location intelligence tools, and in predictive models that support decisions such as a municipality's sign-off of a disaster management plan.
You can use this dataset for research and share your findings through charts and graphs to build understanding of the flood hazards facing inhabited areas of the UK, particularly when storm systems concentrate over specific regions. Residents can look up their own postcode on the Flood Map by Postcode page.
When writing reports, please acknowledge the source material: Contains OS data © Crown copyright and database right 2017; Contains Royal Mail data © Royal Mail copyright and Database right 2017; Contains National Statistics ...
Code-Point® with polygons shows the notional shape of every postcode unit in Great Britain, and includes major buildings with multiple postcodes. For compelling visuals, Code-Point with polygons lets you apply shading to individual postcodes on a map. This means you can analyse location data at the most granular level and bring your results vividly to life. We give you every single postcode in Great Britain and Northern Ireland – including those for different floors of high-rise buildings. For accuracy, we give every postcode a positional quality rating and map out the boundaries of only the postcodes we can locate most precisely. Code-Point® with polygons contains postcode boundaries for Great Britain. These show the extent of each postcode unit, enabling you to analyse information by postcode. Ideal for activities such as sales targeting or market profiling, as well as any statistical work. Includes notional polygons; vertical streets data; postcode units; eastings and northings; NHS® health authority codes; administrative codes; PO box indicator; and types of delivery points.
Open Flood Risk by Postcode is derived from the Environment Agency's Risk of Flooding from Rivers and Sea which allocates a risk level to areas in England, UK. Using postcode data from Open Postcode Geo, each English postcode is placed in its risk area, allowing a flood risk level to be allocated to a postcode.
Note that where a postcode is outside a flood risk area, some of the column values will be NULL, represented as \N in this file.
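When loading the file, the `\N` markers need to be turned into real nulls before any numeric or date parsing. The sketch below does this with the standard `csv` module. It assumes the file has no header row and that the columns appear in the order given by the listing's column table for open_flood_risk_by_postcode.csv; the `COLUMNS` names and the helper `read_flood_rows` are illustrative, and the single sample row is the first-row values shown in that table.

```python
import csv
import io

# Assumed column order; the file itself appears to carry no header row.
COLUMNS = [
    "postcode", "FID", "PROB_4BAND", "SUITABILITY", "PUB_DATE",
    "RISK_FOR_INSURANCE_SOP", "EASTING", "NORTHING", "LATITUDE", "LONGITUDE",
]

# One sample line, written as a raw string so the backslash in \N is literal.
SAMPLE = r"TR23 0PR,\N,None,\N,\N,\N,87897,15021,49.953605,-6.352647"


def read_flood_rows(handle):
    """Read headerless rows, mapping the \\N null marker to Python None."""
    rows = []
    for values in csv.reader(handle):
        row = {
            name: (None if value == r"\N" else value)
            for name, value in zip(COLUMNS, values)
        }
        rows.append(row)
    return rows


rows = read_flood_rows(io.StringIO(SAMPLE))
```

Note that the text value "None" in PROB_4BAND is left as a string: it is a risk label for a postcode outside any risk area, not a missing value.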
You can find full documentation on the Open Flood Risk by Postcode homepage.
Derived from Risk of Flooding from Rivers and Sea
Derived from Open Postcode Geo
Licensed under the OGL
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
PLEASE NOTE: This record has been retired. It has been superseded by: https://environment.data.gov.uk/dataset/f81508d3-cf5a-44ed-ae7e-452be665af84 This dataset is a product of a national assessment of flood risk for England produced using local expertise. It is produced using the Risk of Flooding from Rivers and Sea data which shows the chance of flooding from rivers and/or the sea, based on cells of 50m. Each cell is allocated one of four flood risk categories, taking into account flood defences and their condition.
This dataset uses OS address data and Royal Mail postcode data to show how many properties are in each of four flood risk categories in each postcode, based simply on the category allocated to the cell that each property is in.
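The per-postcode summary described above amounts to a grouped count. A minimal sketch follows, assuming each property has already been tagged with its postcode and with the category of the cell it sits in; the records and the category labels used here are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical property records: (postcode, risk category of the property's cell).
properties = [
    ("AB1 2CD", "High"),
    ("AB1 2CD", "High"),
    ("AB1 2CD", "Low"),
    ("EF3 4GH", "Very Low"),
]


def count_by_postcode(records):
    """Count properties in each risk category, grouped by postcode."""
    counts = defaultdict(Counter)
    for postcode, category in records:
        counts[postcode][category] += 1
    return counts


counts = count_by_postcode(properties)
```

A `Counter` returns 0 for categories that never occur in a postcode, so every postcode can be reported against all four categories without special-casing.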
These tables show Civil Service headcounts at 31 March 2019, and Civil Service leavers between 1 April 2018 and 31 March 2019, by organisation, postcode, grade, and leaving cause.
This dataset is for private use within the competition.
According to IBEF, “Domestic automobiles production increased at 2.36% CAGR between FY16-20 with 26.36 million vehicles being manufactured in the country in FY20. Overall, domestic automobiles sales increased at 1.29% CAGR between FY16-FY20 with 21.55 million vehicles being sold in FY20”. More vehicles on the road bring more challenges, including a higher risk of accidents. Higher accident rates in turn lead to more insurance claims and larger payouts for insurance companies.
To plan for these losses in advance, insurance firms use accident data to understand risk across geographical units such as postal codes or districts.
In this challenge, we provide a dataset for predicting the “Accident_Risk_Index” for each postcode.
Accident_Risk_Index (mean casualties at a postcode) = sum(Number_of_Casualties) / count(Accident_ID)
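Working example:

Train Data (given)

| Accident_ID | Postcode | Number_of_Casualties |
|:------------|:---------|:---------------------|
| 1 | AL1 1JJ | 2 |
| 2 | AL1 1JP | 3 |
| 3 | AL1 3PS | 2 |
| 4 | AL1 3PS | 1 |
| 5 | AL1 3PS | 1 |

Modelling Train Data (rolled up at postcode level)

| Postcode | Derived_feature1 | Derived_feature2 | Accident_risk_Index |
|:---------|:-----------------|:-----------------|:--------------------|
| AL1 1JJ | _ | _ | 2 |
| AL1 1JP | _ | _ | 3 |
| AL1 3PS | _ | _ | 1.33 |

For AL1 3PS, the index is (2 + 1 + 1) / 3 = 1.33.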
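The roll-up in the working example can be reproduced in a few lines. The sketch below uses the five sample accidents from the example and only the standard library; the function name `accident_risk_index` and the tuple layout of the rows are illustrative choices, not part of the competition materials.

```python
from collections import defaultdict

# The five sample accidents from the working example:
# (Accident_ID, Postcode, Number_of_Casualties)
train = [
    (1, "AL1 1JJ", 2),
    (2, "AL1 1JP", 3),
    (3, "AL1 3PS", 2),
    (4, "AL1 3PS", 1),
    (5, "AL1 3PS", 1),
]


def accident_risk_index(rows):
    """Mean casualties per accident for each postcode."""
    totals = defaultdict(lambda: [0, 0])  # postcode -> [casualty sum, accident count]
    for _accident_id, postcode, casualties in rows:
        totals[postcode][0] += casualties
        totals[postcode][1] += 1
    return {
        postcode: casualty_sum / count
        for postcode, (casualty_sum, count) in totals.items()
    }


index = accident_risk_index(train)
```

The resulting dictionary has one entry per postcode, which is the level at which the derived features and the model target need to be aligned.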
Overview: Swiss Re is one of the largest reinsurers in the world, headquartered in Zurich with offices in over 25 countries. Swiss Re’s core expertise is in underwriting in life, health, and property and casualty insurance, while its tech strategy focuses on developing smarter and innovative solutions for clients’ value chains by leveraging data and technology.
The company’s vision is to make the world more resilient. Swiss Re believes in applying fresh perspectives, knowledge and capital to anticipate and manage risk to create smarter solutions and help the world rebuild, renew and move forward. About 1300 professionals who work in the Swiss Re Global Business Solutions Center (BSC), Bangalore combine experience, expertise and out-of-the-box thinking to bring Swiss Re's core business to life by creating new business opportunities.
https://www.ons.gov.uk/methodology/geography/licences
This is the ONS Postcode Directory (ONSPD) for the United Kingdom as at February 2024 in Comma Separated Variable (CSV) and ASCII text (TXT) formats. This file contains the multi CSVs so that postcode areas can be opened in MS Excel. To download the zip file click the Download button. The ONSPD relates both current and terminated postcodes in the United Kingdom to a range of current statutory administrative, electoral, health and other area geographies. It also links postcodes to pre-2002 health areas, 1991 Census enumeration districts for England and Wales, 2001 Census Output Areas (OA) and Super Output Areas (SOA) for England and Wales, 2001 Census OAs and SOAs for Northern Ireland and 2001 Census OAs and Data Zones (DZ) for Scotland. It now contains 2021 Census OAs and SOAs for England, Wales and Northern Ireland. It helps support the production of area-based statistics from postcoded data. The ONSPD is produced by ONS Geography, who provide geographic support to the Office for National Statistics (ONS) and geographic services used by other organisations. The ONSPD is issued quarterly. (File size - 231 MB) Please note that this product contains Royal Mail, Gridlink, LPS (Northern Ireland), Ordnance Survey and ONS Intellectual Property Rights.
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
The Acorn geodemographic classification is a long-running classification developed by CACI Limited. Acorn operates by merging geography with demographics and details about consumer characteristics and behaviours. Supported by advanced AI methods, comprehensive input data, and detailed product literature, Acorn provides precise information and enables an in-depth understanding of the different types of consumers in every part of the country.
The current classification groups the entire United Kingdom population into 7 categories, 22 groups and 65 types. The data is available at unit postcode level. Further information may be found on the CACI ACORN microsite.
Use of the data requires approval from the data owner or their nominee and is restricted to those based at a Higher Education or Further Education institution. Please see the Data Access section for further information.
For the second edition (October 2024) data and documentation files for 2024 have been added to the study.
Our International Zip Code Database offers comprehensive postal code data for spatial analysis, including postal and administrative areas for numerous countries worldwide. This global dataset contains accurate and up-to-date information on all administrative divisions, cities, and zip codes, making it an invaluable resource for various applications such as address capture and validation, map and visualization, reporting and business intelligence (BI), master data management, logistics and supply chain management, and sales and marketing. Our location data packages are available in various formats, including CSV, optimized for seamless integration with popular systems like Esri ArcGIS, Snowflake, QGIS, and more. Product features include fully and accurately geocoded data, multi-language support with address names in local and foreign languages, comprehensive city definitions, and the option to combine map data with UNLOCODE and IATA codes, time zones, and daylight saving times. Companies choose our location databases for their enterprise-grade service, reduction in integration time and cost by 30%, and weekly updates to ensure the highest quality.
Energy Performance Certificates (EPCs) are needed whenever a property is built, sold or rented. An EPC contains information about a property's energy use and typical energy costs and recommendations about how to reduce energy use and save money. An EPC gives a property an energy efficiency rating from A (most efficient) to G (least efficient) and it is valid for 10 years. The Standard Assessment Procedure (SAP) used to create the EPC is the methodology used by the Government to assess and compare the energy and environmental performance of dwellings. It aims to provide accurate and reliable assessments of dwelling energy performances that are needed to underpin energy and environmental policy initiatives. The data come from an IBM Fuel Poverty report and provide SAP/EPC energy rating by post code within the Glasgow Housing Association (GHA) stock register. The fields are: Post Code, Current Energy Efficiency Rating, Potential Energy Efficiency Rating, Current Environmental Impact Rating and Potential Environmental Impact Rating. Date extracted 2011-05-19. Data supplied by Glasgow Housing Association.
Licence: None
https://www.ons.gov.uk/methodology/geography/licences
This file contains the National Statistics Postcode Lookup (NSPL) for the United Kingdom as at May 2022 in Comma Separated Variable (CSV) and ASCII text (TXT) formats. To download the zip file click the Download button. The NSPL relates both current and terminated postcodes to a range of current statutory geographies via ‘best-fit’ allocation from the 2011 Census Output Areas (national parks and Workplace Zones are exempt from ‘best-fit’ and use ‘exact-fit’ allocations). It supports the production of area based statistics from postcoded data. The NSPL is produced by ONS Geography, who provide geographic support to the Office for National Statistics (ONS) and geographic services used by other organisations. The NSPL is issued quarterly. (File size - 196 MB).
This dataset has been superseded. The newGeoSure Insurance Product (newGIP) provides the potential insurance risk due to natural ground movement. It incorporates the combined effects of the 6 GeoSure hazards on (low-rise) buildings. This data is available as vector data, 25 m gridded data or alternatively linked to a postcode database, the Derived Postcode Database. A series of GIS (Geographical Information System) maps show the most significant hazard areas. The ground movement, or subsidence, hazards included are landslides, shrink-swell clays, soluble rocks, running sands, compressible ground and collapsible deposits. The newGeoSure Insurance Product uses the individual GeoSure data layers and evaluates them using a series of processes, including statistical analyses and expert elicitation techniques, to create a derived product that can be used for insurance purposes such as identifying and estimating risk and susceptibility. The Derived Postcode Database (DPD) contains generalised information at a postcode level. The DPD is designed to provide a ‘summary’ value representing the combined effects of the GeoSure dataset across a postcode sector area. It is available as a GIS point dataset or in text (.txt) file format. The DPD contains a normalised hazard rating for each of the 6 GeoSure hazard themes (i.e. each GeoSure theme has been balanced against the others) and a combined unified hazard rating for each postcode in Great Britain. The combined hazard rating for each postcode is available as a standalone product. The DPD point dataset is available in a range of GIS formats including ArcGIS (.shp), ArcInfo coverages and MapInfo (.tab). More specialised formats may be available but may incur additional processing costs. The newGeoSure Insurance Product dataset has been created as vector data but is also available as a raster grid.
This data is available in a range of GIS formats, including ArcGIS (.shp), ArcInfo coverages and MapInfo (.tab). More specialised formats may be available but may incur additional processing costs. Data for the newGIP is provided for national coverage across Great Britain. The newGeoSure Insurance Product dataset is produced for use at 1:50 000 scale, providing 50 m ground resolution. This dataset has been specifically developed for the insurance of low-rise buildings. The GeoSure datasets have been developed to identify the potential hazard for low-rise buildings and those with shallow foundations less than 2 m deep. Identifying ground instability and other geological hazards can help regional planners rapidly identify areas with potential problems, and can aid local government offices in making development plans by helping to define land suited to different uses. Other users of these data may include developers, homeowners, solicitors, loss adjusters, the insurance industry, architects and surveyors. Version 7 released June 2015.
https://crystalroof.co.uk/api-terms-of-use
This method returns total crime rates, crime rates by crime types, area ratings by total crime, and area ratings by crime type for small areas (Lower Layer Super Output Areas, or LSOAs) and Local Authority Districts (LADs). The results are determined by the inclusion of the submitted postcode/coordinates/UPRN within the corresponding LSOA or LAD.
All figures are annual (for the last 12 months).
The crime rates are calculated per 1,000 resident population, with population figures derived from the 2021 census.
The dataset is updated on a monthly basis, with a 3-month lag between the current date and the most recent data.
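As a rough illustration of the "per 1,000 resident population" convention described above, the arithmetic can be sketched as follows. The function name and the toy figures are hypothetical and are not part of the API; how the service itself computes or rounds its rates is not stated in the description.

```python
def crime_rate_per_1000(crime_count, resident_population):
    """Return annual crimes per 1,000 residents for one area.

    Hypothetical helper: the name and the inputs are illustrative only.
    """
    if resident_population <= 0:
        raise ValueError("resident_population must be positive")
    # Multiply before dividing so whole-number results stay exact.
    return 1000 * crime_count / resident_population


# Toy example: 30 recorded crimes over 12 months in an area of 1,500 residents.
example_rate = crime_rate_per_1000(30, 1500)
print(example_rate)
```

A rate computed this way lets areas with different populations be compared on the same scale.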
https://creativecommons.org/publicdomain/zero/1.0/
According to IBEF, “Domestic automobiles production increased at 2.36% CAGR between FY16-20 with 26.36 million vehicles being manufactured in the country in FY20. Overall, domestic automobiles sales increased at 1.29% CAGR between FY16-FY20 with 21.55 million vehicles being sold in FY20”. The rise in vehicles on the road will also lead to multiple challenges, and roads will become more prone to accidents. Increased accident rates also lead to more insurance claims and higher payouts for insurance companies.
In order to pre-emptively plan for losses, insurance firms leverage accident data to understand risk across geographical units, e.g. postal code or district.
In this challenge, we are providing you with a dataset to predict the “Accident_Risk_Index” for each postcode. Accident_Risk_Index (mean casualties at a postcode) = sum(Number_of_Casualties) / count(Accident_ID)
Participants are required to predict the 'Accident_Risk_Index' for each postcode in test.csv.
Then submit your 'my_submission_file.csv' on the submission tab of the hackathon page.
Pro-tip: Participants should first roll up the train data to postcode level as a feature engineering step, create a column named “accident_risk_index”, and then optimize the model at postcode level.
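The roll-up described in the pro-tip can be sketched in plain Python as below. This is a minimal illustration, not the organisers' reference method: the function name and the toy rows are hypothetical, and it assumes each row of train.csv is one accident with a unique Accident_ID, so counting rows is equivalent to count(Accident_ID).

```python
from collections import defaultdict


def roll_up_by_postcode(rows):
    """Aggregate accident-level rows into one risk index per postcode.

    accident_risk_index = sum(Number_of_Casualties) / count(Accident_ID)
    Assumes one row per accident (hypothetical helper, for illustration).
    """
    casualties = defaultdict(int)
    accidents = defaultdict(int)
    for row in rows:
        postcode = row["postcode"]
        casualties[postcode] += int(row["Number_of_Casualties"])
        accidents[postcode] += 1
    return {pc: casualties[pc] / accidents[pc] for pc in accidents}


# Toy rows standing in for what csv.DictReader would yield from train.csv.
sample_rows = [
    {"Accident_ID": "1", "postcode": "AB1 2CD", "Number_of_Casualties": "1"},
    {"Accident_ID": "2", "postcode": "AB1 2CD", "Number_of_Casualties": "3"},
    {"Accident_ID": "3", "postcode": "EF3 4GH", "Number_of_Casualties": "1"},
]
risk_index = roll_up_by_postcode(sample_rows)
print(risk_index)
```

With real data, the rows would come from reading train.csv, and the resulting dictionary gives the postcode-level target to model against.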
A few hypotheses to help you think: "More accidents happen in the later part of the day as those are office hours causing congestion"
"Postal codes with more single carriage roads have more accidents"
(***From the above hypotheses, features such as office_hours_flag and #single_carriage_roads can be formed)
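One possible way to derive such features is sketched below. Several details are assumptions that the challenge text does not specify: the 'Time' column is taken to be an "HH:MM" string, office hours are taken to be 09:00 to 17:59, and 'Road_Type' is taken to use the label "Single carriageway". The helper names are hypothetical, and the values should be checked against the actual data before use.

```python
from collections import defaultdict

OFFICE_START_HOUR = 9    # assumption: office hours start at 09:00
OFFICE_END_HOUR = 18     # assumption: office hours end before 18:00
SINGLE_CARRIAGEWAY = "Single carriageway"  # assumed Road_Type label


def office_hours_flag(time_text):
    """Return 1 if an 'HH:MM' time falls inside the assumed office hours."""
    hour = int(time_text.split(":")[0])
    return 1 if OFFICE_START_HOUR <= hour < OFFICE_END_HOUR else 0


def postcode_features(rows):
    """Count office-hours and single-carriageway accidents per postcode."""
    features = defaultdict(
        lambda: {"office_hours_accidents": 0, "single_carriageway_accidents": 0}
    )
    for row in rows:
        counts = features[row["postcode"]]
        counts["office_hours_accidents"] += office_hours_flag(row["Time"])
        counts["single_carriageway_accidents"] += int(
            row["Road_Type"] == SINGLE_CARRIAGEWAY
        )
    return dict(features)


# Toy rows; the values are invented for illustration.
feature_rows = [
    {"postcode": "AB1 2CD", "Time": "17:30", "Road_Type": "Single carriageway"},
    {"postcode": "AB1 2CD", "Time": "23:05", "Road_Type": "Dual carriageway"},
    {"postcode": "EF3 4GH", "Time": "08:59", "Road_Type": "Single carriageway"},
]
features = postcode_features(feature_rows)
print(features)
```

Rows with a missing or differently formatted time would need handling before this function is applied.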
Additionally, we are providing you with road network data (containing information on the nearest road to a postcode and its characteristics) and population data (containing information about population at area level). This information is intended for feature augmentation and is not mandatory to use.
The provided dataset contains the following files:
train.csv & test.csv:
'Accident_ID', 'Police_Force', 'Number_of_Vehicles', 'Number_of_Casualties', 'Date', 'Day_of_Week', 'Time', 'Local_Authority_(District)', 'Local_Authority_(Highway)', '1st_Road_Class', '1st_Road_Number', 'Road_Type', 'Speed_limit', '2nd_Road_Class', '2nd_Road_Number', 'Pedestrian_Crossing-Human_Control', 'Pedestrian_Crossing-Physical_Facilities', 'Light_Conditions', 'Weather_Conditions', 'Road_Surface_Conditions', 'Special_Conditions_at_Site', 'Carriageway_Hazards', 'Urban_or_Rural_Area', 'Did_Police_Officer_Attend_Scene_of_Accident', 'state', 'postcode', 'country'
population.csv:
'postcode', 'Rural Urban', 'Variable: All usual residents; measures: Value', 'Variable: Males; measures: Value', 'Variable: Females; measures: Value', 'Variable: Lives in a household; measures: Value', 'Variable: Lives in a communal establishment; measures: Value', 'Variable: Schoolchild or full-time student aged 4 and over at their non term-time address; measures: Value', 'Variable: Area (Hectares); measures: Value', 'Variable: Density (number of persons per hectare); measures: Value'
roads_network.csv:
'WKT', 'roadClassi', 'roadFuncti', 'formOfWay', 'length', 'primaryRou', 'distance to the nearest point on rd', 'postcode'
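Since all three files share a 'postcode' column, the optional population and road attributes can be attached to a postcode-level table with a simple left join. The sketch below is one way to do this under stated assumptions: each lookup is assumed to hold at most one record per postcode, the helper name and column prefixes are hypothetical, and the toy values are invented.

```python
def attach_features(base, extra, prefix):
    """Left-join extra per-postcode columns onto a postcode-level table.

    Every postcode in `base` is kept; columns from `extra` are copied with
    `prefix` added, and postcodes missing from `extra` are left unchanged.
    """
    merged = {}
    for postcode, record in base.items():
        combined = dict(record)
        for column, value in extra.get(postcode, {}).items():
            combined[prefix + column] = value
        merged[postcode] = combined
    return merged


# Toy postcode-level table and lookups; all values are invented.
base_table = {
    "AB1 2CD": {"accident_risk_index": 2.0},
    "EF3 4GH": {"accident_risk_index": 1.0},
}
population_lookup = {
    "AB1 2CD": {
        "Rural Urban": "Urban",
        "Variable: All usual residents; measures: Value": 1500,
    },
}
roads_lookup = {
    "AB1 2CD": {"roadClassi": "A Road", "length": 320.5},
    "EF3 4GH": {"roadClassi": "B Road", "length": 110.0},
}

enriched = attach_features(base_table, population_lookup, "pop_")
enriched = attach_features(enriched, roads_lookup, "road_")
print(enriched)
```

Postcodes without a match, such as "EF3 4GH" in the population lookup here, simply lack those columns, so a downstream model needs a strategy for missing values.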