Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
New Zealand has recorded 2,282,861 coronavirus cases since the epidemic began, according to the World Health Organization (WHO). In addition, New Zealand has reported 2,792 coronavirus deaths. This dataset includes a chart with historical data for New Zealand coronavirus cases.
https://creativecommons.org/publicdomain/zero/1.0/
With the arrival of the COVID-19 virus in New Zealand, the Ministry of Health is tracking new cases and releasing daily updates on the situation on its webpages: https://www.health.govt.nz/our-work/diseases-and-conditions/covid-19-novel-coronavirus/covid-19-current-cases and https://www.health.govt.nz/our-work/diseases-and-conditions/covid-19-novel-coronavirus/covid-19-current-cases/covid-19-current-cases-details. Much of the information given in these updates is not in a machine-friendly format. The objective of this dataset is to provide NZ Ministry of Health COVID-19 data in an easy-to-use format.
All data in this dataset has been acquired from the New Zealand Ministry of Health's 'COVID19 current cases' webpage, located here: https://www.health.govt.nz/our-work/diseases-and-conditions/covid-19-novel-coronavirus/covid-19-current-cases. The Ministry of Health updates its page daily, and that will be the targeted update frequency for the Daily Count of Cases dataset. The Case Details dataset, which includes travel details on each case, will be updated weekly.
The mission of this project is to reliably convey the data that the Ministry of Health has reported, in the most digestible format. Enrichment of data is currently out of scope.
If you find any discrepancies between the Ministry of Health's data and this dataset, please provide your feedback as an issue on the git repo for this dataset: https://github.com/2kruman/COVID19-NZ-known-cases/issues.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This 6MB download is a zip file containing 5 pdf documents and 2 xlsx spreadsheets.
Presentation on COVID-19 and the potential impacts on employment
May 2020
Waka Kotahi wants to better understand the potential implications of the COVID-19 downturn on the land transport system, particularly the potential impacts on regional economies and communities.
To do this, in May 2020 Waka Kotahi commissioned Martin Jenkins and Infometrics to consider the potential impacts of COVID-19 on New Zealand’s economy and demographics, as these are two key drivers of transport demand. In addition to providing a scan of national and international COVID-19 trends, the research involved modelling the economic impacts of three of the Treasury’s COVID-19 scenarios, to a regional scale, to help us understand where the impacts might be greatest.
Waka Kotahi studied this modelling by calculating the percentage difference between the employment forecasts under each of the Treasury’s three COVID-19 scenarios and the forecasts under the business as usual scenario.
The source tables from the modelling (Tables 1-40), and the percentage difference in employment forecasts (Tables 41-43), are available as spreadsheets.
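As a rough illustration of the comparison described above, the sketch below computes a percentage difference relative to the business as usual forecast. The exact formula used for Tables 41-43 is not stated here, so the relative-to-baseline definition is an assumption, and the scenario names and employment figures are hypothetical rather than taken from the spreadsheets.

```python
def pct_difference(scenario_jobs: float, bau_jobs: float) -> float:
    """Percentage difference of a scenario forecast relative to business as usual.

    Assumed definition: (scenario - baseline) / baseline * 100.
    """
    if bau_jobs == 0:
        raise ValueError("business as usual forecast must be non-zero")
    return (scenario_jobs - bau_jobs) * 100.0 / bau_jobs


# Hypothetical employment forecasts for one region (not from the source tables).
bau = 50_000
scenarios = {"scenario_a": 47_500, "scenario_b": 45_000, "scenario_c": 49_000}

differences = {name: pct_difference(jobs, bau) for name, jobs in scenarios.items()}
for name, diff in differences.items():
    print(f"{name}: {diff:+.1f}% versus business as usual")
```

A negative value means fewer jobs than the business as usual forecast for that region and year.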
Arataki - potential impacts of COVID-19 Final Report
Employment modelling - interactive dashboard
The modelling produced employment forecasts for each region and district over three time periods – 2021, 2025 and 2031. In May 2020, the forecasts for 2021 carried greater certainty as they reflected the impacts of current events, such as border restrictions, reduction in international visitors and students etc. The 2025 and 2031 forecasts were less certain because of the potential for significant shifts in the socio-economic situation over the intervening years. While these later forecasts were useful in helping to understand the relative scale and duration of potential COVID-19 related impacts around the country, they needed to be treated with care recognising the higher levels of uncertainty.
The May 2020 research suggested that the ‘slow recovery scenario’ (Treasury’s scenario 5) was the most likely due to continuing high levels of uncertainty regarding global efforts to manage the pandemic (and the duration and scale of the resulting economic downturn).
The updates to Arataki V2 were framed around the ‘Slower Recovery Scenario’, as that scenario remained the most closely aligned with the unfolding impacts of COVID-19 in New Zealand and globally at that time.
Find out more about Arataki, our 10-year plan for the land transport system
May 2021
The May 2021 update to employment modelling used to inform Arataki Version 2 is now available.
Employment modelling dashboard - updated 2021
Arataki used the May 2020 information to compare how various regions and industries might be impacted by COVID-19. Almost a year later, it is clear that New Zealand fared better than forecast in May 2020.
Waka Kotahi therefore commissioned an update to the projections through a high-level review of:
- the original projections for 2020/21 against performance
- the implications of the most recent global (eg International Monetary Fund World Economic Outlook) and national economic forecasts (eg Treasury Half Year Economic and Fiscal Update)
The Treasury updated its scenarios in its December Half Year Economic and Fiscal Update (HYEFU) and these new scenarios have been used for the revised projections.
Considerable uncertainty remains about the potential scale and duration of the COVID-19 downturn, for example with regard to the duration of border restrictions and the uptake of immunisation programmes. The updated analysis provides us with additional information regarding which sectors and parts of the country are likely to be most impacted. We continue to monitor the situation and keep up to date with other cross-Government scenario development and COVID-19 related work.
The updated modelling has produced employment forecasts for each region and district over three time periods - 2022, 2025, 2031.
The 2022 forecasts carry greater certainty as they reflect the impacts of current events. The 2025 and 2031 forecasts are less certain because of the potential for significant shifts over that time.
Data reuse caveats: as per license.
Additionally, please read / use this data in conjunction with the Infometrics and Martin Jenkins reports, to understand the uncertainties and assumptions involved in modelling the potential impacts of COVID-19.
COVID-19’s effect on industry and regional economic outcomes for NZ Transport Agency [PDF 620 KB]
Data quality statement: while the modelling undertaken is high quality, it represents two point-in-time analyses undertaken during a period of considerable uncertainty. This uncertainty comes from several factors relating to the COVID-19 pandemic, including:
- a lack of clarity about the size of the global downturn and how quickly the international economy might recover
- differing views about the ability of the New Zealand economy to bounce back from the significant job losses that are occurring and how much of a structural change in the economy is required
- the possibility of a further wave of COVID-19 cases within New Zealand that might require a return to Alert Levels 3 or 4.
While high levels of uncertainty remain around the scale of impacts from the pandemic, particularly in coming years, the modelling is useful in indicating the direction of travel and the relative scale of impacts in different parts of the country.
Data quality caveats: as noted above, there is considerable uncertainty about the potential scale and duration of the COVID-19 downturn. Please treat the specific results of the modelling carefully, particularly in the forecasts to later years (2025, 2031), given the potential for significant shifts in New Zealand's socio-economic situation before then.
As such, please use the modelling results as a guide to the potential scale of the impacts of the downturn in different locations, rather than as a precise assessment of impacts over the coming decade.
Project Tycho datasets contain case counts for reported disease conditions for countries around the world. The Project Tycho data curation team extracts these case counts from various reputable sources, typically from national or international health authorities, such as the US Centers for Disease Control or the World Health Organization. These original data sources include both open- and restricted-access sources. For restricted-access sources, the Project Tycho team has obtained permission for redistribution from data contributors. All datasets contain case count data that are identical to counts published in the original source and no counts have been modified in any way by the Project Tycho team. The Project Tycho team has pre-processed datasets by adding new variables, such as standard disease and location identifiers, that improve data interpretability. We also formatted the data into a standard data format.
Each Project Tycho dataset contains case counts for a specific condition (e.g. measles) and for a specific country (e.g. The United States). Case counts are reported per time interval. In addition to case counts, datasets include information about these counts (attributes), such as the location, age group, subpopulation, diagnostic certainty, place of acquisition, and the source from which we extracted case counts. One dataset can include many series of case count time intervals, such as "US measles cases as reported by CDC", or "US measles cases reported by WHO", or "US measles cases that originated abroad", etc.
Depending on the intended use of a dataset, we recommend a few data processing steps before analysis:
- Analyze missing data: Project Tycho datasets do not include time intervals for which no case count was reported (for many datasets, time series of case counts are incomplete, due to incompleteness of source documents) and users will need to add time intervals for which no count value is available. Project Tycho datasets do include time intervals for which a case count value of zero was reported.
- Separate cumulative from non-cumulative time interval series: Case count time series in Project Tycho datasets can be "cumulative" or "fixed-intervals". Cumulative case count time series consist of overlapping case count intervals starting on the same date, but ending on different dates. For example, each interval in a cumulative count time series can start on January 1st, but end on January 7th, 14th, 21st, etc. It is common practice among public health agencies to report cases for cumulative time intervals. Case count series with fixed time intervals consist of mutually exclusive time intervals that all start and end on different dates and all have identical length (day, week, month, year). Given the different nature of these two types of case count data, we indicated this with an attribute for each count value, named "PartOfCumulativeCountSeries".
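The two recommended steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are hypothetical, the keys "start", "end" and "count" are placeholder names rather than the real column names, and the attribute "PartOfCumulativeCountSeries" is assumed to take the values 1 (cumulative) and 0 (fixed-interval). The gap filling also assumes a weekly fixed-interval series.

```python
from datetime import date, timedelta

# Hypothetical rows; only "PartOfCumulativeCountSeries" is named in the text above.
rows = [
    {"start": date(2020, 1, 1), "end": date(2020, 1, 7), "count": 3,
     "PartOfCumulativeCountSeries": 0},
    {"start": date(2020, 1, 15), "end": date(2020, 1, 21), "count": 0,
     "PartOfCumulativeCountSeries": 0},
    {"start": date(2020, 1, 1), "end": date(2020, 1, 14), "count": 5,
     "PartOfCumulativeCountSeries": 1},
]


def split_series(all_rows):
    """Separate cumulative rows from fixed-interval rows (assumes a 0/1 flag)."""
    cumulative = [r for r in all_rows if r["PartOfCumulativeCountSeries"] == 1]
    fixed = [r for r in all_rows if r["PartOfCumulativeCountSeries"] == 0]
    return cumulative, fixed


def fill_missing_weeks(fixed_rows):
    """Add explicit rows (count=None) for weekly intervals absent from the series."""
    if not fixed_rows:
        return []
    by_start = {r["start"]: r for r in fixed_rows}
    current, last = min(by_start), max(by_start)
    filled = []
    while current <= last:
        row = by_start.get(current)
        if row is None:
            # No count was reported for this week: keep the gap visible as None,
            # which is distinct from a reported count of zero.
            row = {"start": current, "end": current + timedelta(days=6),
                   "count": None, "PartOfCumulativeCountSeries": 0}
        filled.append(row)
        current += timedelta(days=7)
    return filled


cumulative, fixed = split_series(rows)
weekly = fill_missing_weeks(fixed)
print([r["count"] for r in weekly])
```

Keeping missing intervals as None rather than 0 preserves the distinction the text draws between "no count reported" and "a count of zero was reported".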
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides the boundary of New Zealand’s continental shelf, which is the area of seabed around a large land mass where the sea is relatively shallow compared with the open ocean. The continental shelf is the seabed and subsoil of the submarine areas that extend beyond the territorial sea of a coastal state throughout the natural prolongation of its land territory to the outer edge of the continental margin. In New Zealand’s case, the continental margin extends beyond the Exclusive Economic Zone in many places and the outer limits have been established on the basis of the recommendations of the United Nations Commission on the Limits of the Continental Shelf. Under UNCLOS, New Zealand exercises sovereign rights over the continental shelf for the purpose of exploring it and exploiting its natural resources.
Note: The boundary includes, where applicable, the delimitation of the boundaries of the continental shelf with Australia under the treaty of 25 July 2004. The delimitation of the maritime boundaries in the north with Fiji, Tonga and possibly France in respect of New Caledonia has yet to be settled by treaty.
Maritime Boundary Definitions: http://www.linz.govt.nz/hydro/nautical-info/maritime-boundaries/definitions#zones
Further References: http://www.linz.govt.nz/hydro/nautical-info/maritime-boundaries
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This Project Tycho dataset includes a CSV file with COVID-19 data reported in NEW ZEALAND: 2019-12-30 - 2021-07-31. It contains counts of cases and deaths. Data for this Project Tycho dataset comes from: "COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University", "European Centre for Disease Prevention and Control Website", "World Health Organization COVID-19 Dashboard". The data have been pre-processed into the standard Project Tycho data format v1.1.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This New Zealand Point Cloud Classification Deep Learning Package will classify point clouds into building and background classes. This model is optimized to work with New Zealand aerial LiDAR data.
The classification of point cloud datasets to identify buildings is useful in applications such as high-quality 3D basemap creation, urban planning, and planning climate change response.
Buildings can have a complex, irregular geometrical structure that is hard to capture using traditional means. Deep learning models are highly capable of learning these complex structures and giving superior results.
This model is designed to extract buildings in both urban and rural areas in New Zealand.
The training, testing, and validation datasets were all taken within New Zealand, resulting in high reliability in recognizing the patterns of common NZ building architecture.
Licensing requirements
ArcGIS Desktop - ArcGIS 3D Analyst extension for ArcGIS Pro
Using the model
The model can be used in ArcGIS Pro's Classify Point Cloud Using Trained Model tool. Before using this model, ensure that the supported deep learning framework libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.
Note: Deep learning is computationally intensive, and a powerful GPU is recommended to process large datasets.
The model is trained with classified LiDAR that follows the [...]
The model was trained using a training dataset with the full set of points. Therefore, it is important to make the full set of points available to the neural network while predicting, allowing it to better discriminate points of the 'class of interest' from background points. It is recommended to use the 'selective/target classification' and 'class preservation' functionalities during prediction to have better control over the classification and over scenarios with false positives.
The model was trained on airborne lidar datasets and is expected to perform best with similar datasets. Classification of terrestrial point cloud datasets may work but has not been validated. For such cases, this pre-trained model may be fine-tuned to save on cost, time, and compute resources while improving accuracy. Another example where fine-tuning this model can be useful is when the object of interest is tram wires, railway wires, etc. which are geometrically similar to electricity wires. When fine-tuning this model, the target training data characteristics such as class structure, maximum number of points per block and extra attributes should match those of the data originally used for training this model (see Training data section below).
Output
The model will classify the point cloud into the following classes, with their meaning as defined by the American Society for Photogrammetry and Remote Sensing (ASPRS):
0 Background
6 Building
Applicable geographies
The model is expected to work well in New Zealand. It has produced favorable results in many regions. However, results can vary for datasets that are statistically dissimilar to the training data.
Training dataset - Auckland, Christchurch, Kapiti, Wellington
Testing dataset - Auckland, Wellington
Validation/Evaluation dataset - Hutt City
Model architecture
This model uses the SemanticQueryNetwork model architecture implemented in ArcGIS Pro.
Accuracy metrics
The table below summarizes the accuracy of the predictions on the validation dataset.
Class             Precision   Recall     F1-score
Never Classified  0.984921    0.975853   0.979762
Building          0.951285    0.967563   0.9584
Training data
This model is trained on a classified dataset originally provided by Open TopoGraphy, with < 1% of manual labelling and correction.
Train-Test split percentage: {Train: 75%, Test: 25%}. This ratio was chosen based on analysis of statistics from previous epochs, which showed a decent improvement.
The training data used has the following characteristics:
X, Y, and Z linear unit: Meter
Z range: -137.74 m to 410.50 m
Number of Returns: 1 to 5
Intensity: 16 to 65520
Point spacing: 0.2 ± 0.1
Scan angle: -17 to +17
Maximum points per block: 8192
Block Size: 50 Meters
Class structure: [0, 6]
Sample results
The model was used to classify a Wellington city dataset with a density of 23 pts/m. The model's performance is directly proportional to the point density of the dataset and benefits from noise-excluded point clouds.
To learn how to use this model, see this story
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
New Zealand NZ: Tuberculosis Case Detection Rate: All Forms data was reported at 87.000 % in 2016. This stayed constant from the previous number of 87.000 % for 2015. New Zealand NZ: Tuberculosis Case Detection Rate: All Forms data is updated yearly, averaging 87.000 % (median) from Dec 2000 to 2016, with 17 observations. The data reached an all-time high of 87.000 % in 2016 and a record low of 87.000 % in 2016. New Zealand NZ: Tuberculosis Case Detection Rate: All Forms data remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s New Zealand – Table NZ.World Bank.WDI: Health Statistics. Tuberculosis case detection rate (all forms) is the number of new and relapse tuberculosis cases notified to WHO in a given year, divided by WHO's estimate of the number of incident tuberculosis cases for the same year, expressed as a percentage. Estimates for all years are recalculated as new information becomes available and techniques are refined, so they may differ from those published previously. Source: World Health Organization, Global Tuberculosis Report. Aggregation method: Weighted average.
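The definition of the case detection rate given above can be written out as a small calculation. This is a minimal sketch: the function name is illustrative, and the case numbers are hypothetical values chosen only so that the result matches the 87 % figure; they are not WHO data.

```python
def case_detection_rate(notified_cases: float, estimated_incident_cases: float) -> float:
    """Notified new and relapse cases divided by estimated incident cases, as a percentage."""
    if estimated_incident_cases <= 0:
        raise ValueError("estimated incident cases must be positive")
    return notified_cases * 100.0 / estimated_incident_cases


# Hypothetical inputs (not WHO figures): 261 notified cases out of an estimated 300.
rate = case_detection_rate(261, 300)
print(f"Case detection rate: {rate:.3f} %")
```

Because the denominator is a WHO estimate that is recalculated over time, the same notified count can yield a different rate in later revisions.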
ps-places-metadata-v1.01
This dataset comprises a pair of layers (points and polys) which attempt to better locate "populated places" in NZ. Populated places are defined here as settled areas, either urban or rural, where densities of around 20 persons per hectare exist and something is able to be seen from the air.
The only liberally licensed placename dataset is currently LINZ geographic placenames, which has the following drawbacks:
- coordinates are not place centers but the leftmost label on the 260 series map
- the attributes are outdated
This dataset necessarily involves cleaving the linz placenames set into two: those places that are populated, and those unpopulated. Work was carried out in four steps. First, placenames were shortlisted according to the following criteria:
- all places that rated at least POPL in the linz geographic places layer, ie POPL, METR or TOWN or USAT were adopted.
- Then many additional points were added from a statnz meshblock density analysis.
- Finally remaining points were added from a check against linz residential polys, and zenbu poi clusters.
Spelling is broadly as per linz placenames, but there are differences for no particular reason. Instances of LINZ all upper case have been converted to sentence case. Some places not presently in the linz dataset are included in this set, usually new places, or those otherwise unnamed. They appear with no linz id, and are not authoritative, in some cases just wild guesses.
Density was derived from the 06 meshblock boundaries (level 2, geometry fixed) by multipart conversion, merging in the 06 usually resident MB population, then using the formula pop/area*10000 (with area in square meters, this gives persons per hectare). An initial urban/rural threshold level of 0.6 persons per hectare was used.
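The density formula and threshold above can be illustrated with a short sketch. The meshblock records are hypothetical, the field names are placeholders, and treating a density at or above the threshold as urban (rather than strictly above) is an assumption, since the text does not say how ties are handled.

```python
URBAN_THRESHOLD = 0.6  # persons per hectare, the initial threshold quoted above


def persons_per_hectare(population: float, area_m2: float) -> float:
    """pop/area*10000: area in square meters, result in persons per hectare."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return population / area_m2 * 10_000


def classify(density: float) -> str:
    # Assumption: densities at or above the threshold count as urban.
    return "urban" if density >= URBAN_THRESHOLD else "rural"


# Hypothetical meshblocks (not real SNZ data).
meshblocks = [
    {"id": "MB1", "pop": 120, "area_m2": 40_000},     # 4 ha
    {"id": "MB2", "pop": 3, "area_m2": 1_000_000},    # 100 ha
]

for mb in meshblocks:
    mb["density"] = persons_per_hectare(mb["pop"], mb["area_m2"])
    mb["class"] = classify(mb["density"])
    print(mb["id"], round(mb["density"], 2), mb["class"])
```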
Step two was to trace the approx extent of each populated place. The main purpose of this step was to determine the relative area of each place, and to create an intersection with meshblocks for population. Step 3 involved determining the political center of each place, broadly defined as the commercial center.
Tracing was carried out at 1:9000 for small places, and 1:18000 for large places, using either Bing or Google satellite views. No attempt was made to relate to actual town 'boundaries'. For example, large parks or raceways on the urban fringe were not generally included. Outlying industrial areas were included somewhat erratically depending on their connection to urban areas.
Step 3 involved determining the centers of each place. Points were overlaid over the following layers by way of a base reference:
a. original linz placenames
b. OSM nz-locations points layer
c. zenbu pois, latest set as of 5/4/11
d. zenbu AllSuburbsRegions dataset (a heavily hand modified LINZ BDE extract derived dataset, courtesy Zenbu)
e. LINZ road-centerlines, sealed and highway
f. LINZ residential areas
g. LINZ building-locations and building footprints
h. Olivier and Co nz-urban-north and south
Therefore, in practice, sources c and e form the effective basis of the point coordinates in this dataset. Be aware that e, f and g are referenced to the LINZ topo data, while c and d are likely referenced to whatever roading dataset Google possesses. As such, minor discrepancies may occur when moving from one to the other.
Regardless of the above, this place centers dataset was created using the following criteria, in order of priority:
To be clear, the coordinates are manually produced by eye without any kind of computation. As such, the points are placed approximately, perhaps plus or minus 10m, but given that the roads layers are not that flash, no attempt was made to actually snap the coordinates to the road junctions themselves.
The final step involved merging in population from SNZ meshblocks (merge+sum by location of popl polys). Be aware that, due to the inconsistent way that meshblocks are defined, this will result in inaccurate populations; in particular, small places will collect population from their surrounding area. In any case the population will generally be overestimated, because meshblocks that just nicked the place poly are included. Also there are a couple of dozen cases of overlapping meshblocks between two place polys and these will double count, which I have so far made no attempt to fix.
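The double counting described above can be shown with a toy example. This sketch assumes the spatial intersection has already been computed, so each place is represented simply by the set of meshblock ids that touch its polygon; the place names, meshblock ids and populations are all hypothetical.

```python
from collections import Counter

# Hypothetical meshblock populations and place/meshblock intersections.
meshblock_pop = {"MB1": 100, "MB2": 50, "MB3": 30}
place_meshblocks = {
    "Place A": {"MB1", "MB2"},
    "Place B": {"MB2", "MB3"},  # MB2 intersects both place polys
}

# Merge + sum by location: every intersecting meshblock contributes its full population.
place_pop = {
    place: sum(meshblock_pop[mb] for mb in mbs)
    for place, mbs in place_meshblocks.items()
}

# Meshblocks that intersect more than one place poly are counted more than once.
usage = Counter(mb for mbs in place_meshblocks.values() for mb in mbs)
double_counted = sorted(mb for mb, n in usage.items() if n > 1)

print(place_pop)
print("double counted:", double_counted)
print("sum over places:", sum(place_pop.values()),
      "vs actual total:", sum(meshblock_pop.values()))
```

Listing the shared meshblocks this way is one possible starting point for a fix, for example by assigning each shared meshblock to a single place.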
Also merged in were tla and regions from SNZ shapes, a few of the original linz attributes, and lastly a grading of the size of urban areas according to SNZ 'urban areas' criteria, ie class codes:
Note that while this terminology is shared with SNZ, the actual places differ owing to different decisions being made about where one area ends and another starts, and what constitutes a suburb or satellite. I expect some discussion around this issue. For example, I have included tinwald and washdyke as part of ashburton and timaru, but not richmond or waikawa as part of nelson and picton. I'm open to discussion on these.
No attempt has or will likely ever be made to locate the entire LOC and SBRB data subsets. We will just have to wait for NZFS to release what is thought to be an authoritative set.
Shapefiles are all nztm. Orig data from SNZ and LINZ was all sourced in nztm, via koordinates, or SNZ. Satellite tracings were in spherical mercator/wgs84 and converted to nztm by QGIS. Zenbu POIs were similarly converted.
Shapefile: Points
id : integer unique to dataset
name : name of popl place, string
class : urban area size as above, integer
tcode : SNZ tla code, integer
rcode : SNZ region code, 1-16, integer
area : area of poly place features, integer in square meters
pop : 2006 usually resident population, being the sum of meshblocks that intersect the place poly features, integer
lid : linz geog places id
desc_code : linz geog places place type code
Shapefile: Polygons
gid : integer unique to dataset, shared by points and polys
name : name of popl place, string; where spelling conflicts occur, points wins
area : place poly area, m2, integer
Clarification about the minorly derived nature of LINZ and google data needs to be sought. But pending these copyright complications, the actual points data is essentially an original work, released as public domain. I retain no copyright, nor any responsibility for data accuracy, either as is, or regardless of any changes that are subsequently made to it.
Peter Scott 16/6/2011
v1.01 minor spelling and grammar edits 17/6/11
https://www.futurebeeai.com/policies/ai-data-license-agreement
This New Zealand English Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of English speech recognition, spoken language understanding, and conversational AI systems. With 30 Hours of unscripted, real-world conversations, it delivers the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.
Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.
The dataset features 30 Hours of dual-channel call center conversations between native New Zealand English speakers. These recordings cover a variety of healthcare support topics, enabling the development of speech technologies that are contextually aware and linguistically rich.
The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).
These real-world interactions help build speech models that understand healthcare domain nuances and user intent.
Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.
Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.
This dataset can be used across a range of healthcare and voice AI use cases:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This New Zealand Point Cloud Classification Deep Learning Package will classify point clouds into tree and background classes. This model is optimized to work with New Zealand aerial LiDAR data.The classification of point cloud datasets to identify Trees is useful in applications such as high-quality 3D basemap creation, urban planning, forestry workflows, and planning climate change response.Trees could have a complex irregular geometrical structure that is hard to capture using traditional means. Deep learning models are highly capable of learning these complex structures and giving superior results.This model is designed to extract Tree in both urban and rural area in New Zealand.The Training/Testing/Validation dataset are taken within New Zealand resulting of a high reliability to recognize the pattern of NZ common building architecture.Licensing requirementsArcGIS Desktop - ArcGIS 3D Analyst extension for ArcGIS ProUsing the modelThe model can be used in ArcGIS Pro's Classify Point Cloud Using Trained Model tool. Before using this model, ensure that the supported deep learning frameworks libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.Note: Deep learning is computationally intensive, and a powerful GPU is recommended to process large datasets.InputThe model is trained with classified LiDAR that follows the LINZ base specification. The input data should be similar to this specification.Note: The model is dependent on additional attributes such as Intensity, Number of Returns, etc, similar to the LINZ base specification. This model is trained to work on classified and unclassified point clouds that are in a projected coordinate system, in which the units of X, Y and Z are based on the metric system of measurement. If the dataset is in degrees or feet, it needs to be re-projected accordingly. The model was trained using a training dataset with the full set of points. 
Therefore, it is important to make the full set of points available to the neural network while predicting, allowing it to better discriminate points of the 'class of interest' from background points. It is recommended to use the 'selective/target classification' and 'class preservation' functionalities during prediction to have better control over the classification and over scenarios with false positives.
The model was trained on airborne lidar datasets and is expected to perform best with similar datasets. Classification of terrestrial point cloud datasets may work but has not been validated. For such cases, this pre-trained model may be fine-tuned to save on cost, time, and compute resources while improving accuracy. Another example where fine-tuning this model can be useful is when the object of interest is tram wires, railway wires, etc., which are geometrically similar to electricity wires. When fine-tuning this model, the target training data characteristics such as class structure, maximum number of points per block and extra attributes should match those of the data originally used for training this model (see Training data section below).
Output
The model will classify the point cloud into the following classes, with their meaning as defined by the American Society for Photogrammetry and Remote Sensing (ASPRS):
0 Background
5 Trees / High-vegetation
Applicable geographies
The model is expected to work well in New Zealand. It has been seen to produce favorable results in many regions. However, results can vary for datasets that are statistically dissimilar to the training data.
Training dataset - Wellington City
Testing dataset - Tawa City
Validation/Evaluation dataset - Christchurch City
Model architecture
This model uses the PointCNN model architecture implemented in ArcGIS API for Python.
Accuracy metrics
The table below summarizes the accuracy of the predictions on the validation dataset.
Class: Precision, Recall, F1-score
Never Classified: 0.991200, 0.975404, 0.983239
High Vegetation: 0.933569, 0.975559, 0.954102
Training data
This model is trained on a classified dataset originally provided by OpenTopography, with < 1% manual labelling and correction.
Train-Test split percentage: {Train: 80%, Test: 20%}. This ratio was chosen based on analysis of previous epoch statistics, which appeared to show a decent improvement.
The training data used has the following characteristics:
X, Y, and Z linear unit: Meter
Z range: -121.69 m to 26.84 m
Number of Returns: 1 to 5
Intensity: 16 to 65520
Point spacing: 0.2 ± 0.1
Scan angle: -15 to +15
Maximum points per block: 8192
Block Size: 20 Meters
Class structure: [0, 5]
Sample results
Model used to classify a Christchurch city dataset with a density of 5 pts/m. The model's performance is directly proportional to the dataset point density and is better on point clouds with noise excluded.
To learn how to use this model, see this story
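As a quick consistency check on the accuracy metrics above, the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch that recomputes it from the reported precision and recall values; the helper name `f1_score` is illustrative and is not part of the ArcGIS API for Python.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) as listed for the validation dataset.
reported = {
    "Never Classified": (0.991200, 0.975404, 0.983239),
    "High Vegetation": (0.933569, 0.975559, 0.954102),
}

# Recompute F1 for each class from its precision and recall.
recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}

for name, (_, _, f1_reported) in reported.items():
    print(f"{name}: recomputed {recomputed[name]:.6f}, reported {f1_reported:.6f}")
```

The recomputed values should agree with the reported F1-scores up to rounding.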
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Traffic information is crucial for managing transportation and city planning, but obtaining national-scale data is difficult due to privacy concerns. Consequently, most current traffic datasets have limitations in terms of time and location coverage, leading to a lack of comprehensive public access to national traffic data. To address this issue, a multi-source highway traffic dataset has been created, featuring 2042 sensors in New Zealand over a 9-year period with 15-minute intervals and accompanying metadata. The dataset includes data of both light-duty and heavy-duty vehicles, as well as weather information like temperature and precipitation. This dataset has diverse potential research applications such as traffic flow prediction and congestion management.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This Australian and New Zealand food category cost dataset was created to inform diet and economic modelling for low and medium socioeconomic households in Australia and New Zealand. The dataset was created according to the INFORMAS protocol, which details the methods to systematically and consistently collect and analyse information on the price of foods, meals and affordability of diets in different countries globally. Food categories were informed by the Food Standards Australia New Zealand (FSANZ) AUSNUT (AUStralian Food and NUTrient Database) 2011-13 database, with additional food categories created to account for frequently consumed and culturally important foods.
Methods
The dataset was created according to the INFORMAS protocol [1], which details the methods to collect and analyse information systematically and consistently on the price of foods, meals, and affordability of diets in different countries globally.
Cost data were collected from four supermarkets in each country (Australia and New Zealand). In Australia, data were collected from 7 to 11 December 2020 at two supermarkets (Coles Merrylands and Woolworths Auburn) in a low and two (Coles Zetland and Woolworths Burwood) in a medium socioeconomic metropolitan area in New South Wales. In New Zealand, data were collected from 16 to 18 December 2020 at two supermarkets (Countdown Hamilton Central and Pak ‘n Save Hamilton Lake) in a low and two (Countdown Rototuna North and Pak ‘n Save Rosa Birch Park) in a medium socioeconomic area in the North Island.
Locations in Australia were selected based on the Australian Bureau of Statistics Index of Relative Socio-Economic Advantage and Disadvantage (IRSAD) [2]. The index ranks areas from most disadvantaged to most advantaged using a scale of 1 to 10. IRSAD quintile 1 was chosen to represent low socio-economic status (SES) and quintile 3 to represent medium SES. Locations in New Zealand were chosen using the 2018 NZ Index of Deprivation and statistical area 2 boundaries [3]. Low socio-economic areas were defined by deciles 8-10 and medium socio-economic areas by deciles 4-6. The supermarket locations were chosen according to accessibility to researchers. Data were collected by five trained researchers with qualifications in nutrition and dietetics and/or nutrition science.
All foods were aggregated into a reduced number of food categories informed by the Food Standards Australia New Zealand (FSANZ) AUSNUT (AUStralian Food and NUTrient Database) 2011-13 database, with additional food categories created to account for frequently consumed and culturally important foods. Nutrient data for each food category can therefore be linked to the Australian Food and Nutrient (AUSNUT) 2011-13 database [4] and NZ Food Composition Database (NZFCDB) [5] using the 8-digit codes provided for Australia and New Zealand, respectively.
Data were collected for three representative foods within each food category, based on criteria used in the INFORMAS protocol: (i) the lowest non-discounted price was chosen from the most commonly available product size, (ii) the product was available nationally, and (iii) fresh produce of poor quality was omitted. One sample was collected per representative food product per store, leading to a total of 12 food price samples for each food category (three representative foods across four stores). The exception was the ‘breakfast cereal, unfortified, sugars ≤15g/100g’ food category in the NZ dataset, which included only four food price samples because only one representative product per supermarket was identified.
Variables in this dataset include: (i) food category and description, (ii) brand and name of representative food, (iii) product size, (iv) cost per product, and (v) 8-digit code to link product to nutrient composition data (AUSNUT and NZFCDB).
References
[1] Vandevijvere, S.; Mackay, S.; Waterlander, W. INFORMAS Protocol: Food Prices Module [Internet]. Available online: https://auckland.figshare.com/articles/journal_contribution/INFORMAS_Protocol_Food_Prices_Module/5627440/1 (accessed on 25 October).
[2] 2071.0 - Census of Population and Housing: Reflecting Australia - Stories from the Census, 2016. Available online: https://www.abs.gov.au/ausstats/abs@.nsf/Lookup/by Subject/2071.0~2016~Main Features~Socio-Economic Advantage and Disadvantage~123 (accessed on 10 December).
[3] Socioeconomic Deprivation Indexes: NZDep and NZiDep, Department of Public Health. Available online: https://www.otago.ac.nz/wellington/departments/publichealth/research/hirp/otago020194.html#2018 (accessed on 10 December).
[4] AUSNUT 2011-2013 food nutrient database. Available online: https://www.foodstandards.gov.au/science/monitoringnutrients/ausnut/ausnutdatafiles/Pages/foodnutrient.aspx (accessed on 15 November).
[5] NZ Food Composition Data. Available online: https://www.foodcomposition.co.nz/ (accessed on 10 December).
Usage Notes
The uploaded data includes an Excel spreadsheet with separate worksheets for the Australian food price database and the New Zealand food price database. All cost data are presented to two decimal places, and the mean and standard deviation of each food category are presented. For some representative foods in NZ, the only NZFCDB food code available was for a cooked product, whereas the product is purchased raw and cooked prior to eating, undergoing a change in weight between the raw and cooked versions. In these cases, a conversion factor was used to account for the weight difference between the raw and cooked versions, to ensure that nutrient information (on accessing from the NZFCDB) was accurate. This conversion factor was developed based on the weight differences between the cooked and raw versions, and checked for accuracy by comparing quantities of key nutrients in the cooked vs raw versions of the product.
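As an illustration of how the per-category summary statistics could be reproduced, the sketch below aggregates 12 price samples (three representative foods across four stores) into a category mean and standard deviation. The prices are invented for illustration and are not taken from the dataset, and the source does not state whether the published standard deviation is the sample or population form; the sample form is assumed here.

```python
import statistics

# Hypothetical prices (not from the dataset): three representative foods,
# one sample per store, four stores each.
category_prices = {
    "representative food A": [2.00, 2.20, 1.80, 2.00],
    "representative food B": [3.00, 3.50, 2.50, 3.00],
    "representative food C": [4.00, 4.00, 4.00, 4.00],
}

# Flatten to the 12 price samples that make up one food category.
samples = [price for prices in category_prices.values() for price in prices]

# Category-level summary, rounded to two decimal places like the cost data.
category_mean = round(statistics.mean(samples), 2)
category_sd = round(statistics.stdev(samples), 2)  # sample SD (assumption)

print(len(samples), category_mean, category_sd)
```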
Open Database License (ODbL) v1.0https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This dataset provides detailed information on road surfaces from OpenStreetMap (OSM) data, distinguishing between paved and unpaved surfaces across the region. This information is based on road surface predictions derived from a hybrid deep learning approach. For more information on Methods, refer to the paper
Roughly 0.2325 million km of roads are mapped in OSM in this region. Based on AI-mapped estimates, the lengths of paved and unpaved roads are approximately 0.0993 and 0.0612 million km, corresponding to 42.6968% and 26.3068% of the total road length in the dataset region, respectively. Road surface information is missing in OSM for 0.0721 million km (30.9965%). To help fill this gap, the Mapillary-derived road surface dataset provides an additional 0.0009 million km of information (corresponding to 1.2582% of the total missing road surface information).
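The percentages above are simple ratios of the stated lengths. The sketch below recomputes them from the rounded figures in the text; because those lengths are rounded to four decimal places, the results match the published percentages only approximately. The variable names are illustrative.

```python
# Lengths in million km, as stated in the text (rounded values).
total_km = 0.2325
paved_km = 0.0993
unpaved_km = 0.0612
missing_km = 0.0721
mapillary_km = 0.0009


def share(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


paved_pct = share(paved_km, total_km)
unpaved_pct = share(unpaved_km, total_km)
missing_pct = share(missing_km, total_km)
# Mapillary coverage is expressed relative to the missing length, not the total.
mapillary_pct = share(mapillary_km, missing_km)

print(f"paved {paved_pct:.2f}%, unpaved {unpaved_pct:.2f}%, "
      f"missing {missing_pct:.2f}%, Mapillary fill {mapillary_pct:.2f}%")
```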
It is intended for use in transportation planning, infrastructure analysis, climate emissions and geographic information system (GIS) applications.
This dataset provides comprehensive information on road and urban area features, including location, surface quality, and classification metadata. This dataset includes attributes from OpenStreetMap (OSM) data, AI predictions for road surface, and urban classifications.
AI features:
pred_class: Model-predicted class for the road surface, with values "paved" or "unpaved."
pred_label: Binary label associated with pred_class (0 = paved, 1 = unpaved).
osm_surface_class: Classification of the surface type from OSM, categorized as "paved" or "unpaved."
combined_surface_osm_priority: Surface classification combining pred_label and surface (OSM) while prioritizing the OSM surface tag, classified as "paved" or "unpaved."
combined_surface_DL_priority: Surface classification combining pred_label and surface (OSM) while prioritizing the DL prediction pred_label, classified as "paved" or "unpaved."
n_of_predictions_used: Number of predictions used for the feature length estimation.
predicted_length: Predicted length based on the DL model’s estimations, in meters.
DL_mean_timestamp: Mean timestamp of the predictions used, for comparison.
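To make the two combined attributes concrete, here is a minimal sketch of one way the priority rule could work. It assumes that the prioritized source is used when available and the other source serves as a fallback; the dataset description does not spell out the exact procedure, and the function `combine_surface` is a hypothetical helper, not part of any HeiGIT tooling.

```python
from typing import Optional

# Mapping stated in the attribute list: 0 = paved, 1 = unpaved.
LABEL_TO_CLASS = {0: "paved", 1: "unpaved"}


def combine_surface(
    osm_surface_class: Optional[str],
    pred_label: Optional[int],
    prioritize: str = "osm",
) -> Optional[str]:
    """Combine the OSM class and the DL prediction, preferring one source.

    Assumption: when the preferred source is missing, fall back to the other.
    """
    predicted = LABEL_TO_CLASS.get(pred_label)  # None when there is no prediction
    osm = osm_surface_class if osm_surface_class in ("paved", "unpaved") else None
    if prioritize == "osm":
        return osm or predicted
    if prioritize == "dl":
        return predicted or osm
    raise ValueError("prioritize must be 'osm' or 'dl'")


# OSM says paved, model says unpaved: the result depends on the priority.
print(combine_surface("paved", 1, prioritize="osm"))
print(combine_surface("paved", 1, prioritize="dl"))
```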
OSM features may have these attributes (learn what tags mean here):
name: Name of the feature, if available in OSM.
name:en: Name of the feature in English, if available in OSM.
name:* (in local language): Name of the feature in the local official language, where available.
highway: Road classification based on OSM tags (e.g., residential, motorway, footway).
surface: Description of the surface material of the road (e.g., asphalt, gravel, dirt).
smoothness: Assessment of surface smoothness (e.g., excellent, good, intermediate, bad).
width: Width of the road, where available.
lanes: Number of lanes on the road.
oneway: Indicates if the road is one-way (yes or no).
bridge: Specifies if the feature is a bridge (yes or no).
layer: Indicates the layer of the feature in cases where multiple features are stacked (e.g., bridges, tunnels).
source: Source of the data, indicating the origin or authority of specific attributes.
Urban classification features may have these attributes:
continent: The continent where the data point is located (e.g., Europe, Asia).
country_iso_a2: The ISO Alpha-2 code representing the country (e.g., "US" for the United States).
urban: Binary indicator for urban areas based on the GHSU Urban Layer 2019. (0 = rural, 1 = urban)
urban_area: Name of the urban area or city where the data point is located.
osm_id: Unique identifier assigned by OpenStreetMap (OSM) to each feature.
osm_type: Type of OSM element (e.g., node, way, relation).
The data originates from OpenStreetMap (OSM) and is augmented with model predictions using images downloaded from Mapillary in combination with the GHSU Global Human Settlement Urban Layer 2019 and AFRICAPOLIS2020 urban layer.
This dataset is one of many HeiGIT exports on HDX. See the HeiGIT website for more information.
We are looking forward to hearing about your use-case! Feel free to reach out to us and tell us about your research at communications@heigit.org – we would be happy to amplify your work.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This layer includes all Crown Land and Properties managed by LINZ which have been identified spatially and can include properties managed by LINZ on behalf of other agencies. The attributes in this dataset are derived from the National Property and Land Information System (NaPALIS), which is a centralised database for all Land Information New Zealand (LINZ) and Department of Conservation (DOC) administered land. The boundaries of many properties are linked to the applicable Landonline Primary Parcel(s), but in some cases the boundaries may have been drawn in as unsurveyed parcels to varying degrees of accuracy. As such please note that the boundaries are indicative only. The layer excludes any LINZ managed properties which do not have an identified location or extent. More information on Crown Property can be found under the Crown Property section on the LINZ Website. A subset of Crown Property can be found in the South Island Pastoral Leases layer. A table of Property associations to Primary Parcels is published in the LDS here.
APIs and web services
This dataset is available via ArcGIS Online and ArcGIS REST services, as well as our standard APIs.
LDS APIs and OGC web services
ArcGIS Online map services
The New Zealand Tsunami Database includes information about tsunamis that have reached the coastline of New Zealand since humans arrived in this land until the present day. The database is primarily the work of historical seismologist Gaye Downes, working for GNS Science, who collected reports of tsunamis around New Zealand and, in many cases, carried out research to determine parameters of the source, travel time and impact associated with each event. Reports of tsunamis that make up the core of this database come from tide gauges, newspaper articles, harbour masters, records from ships, personal diary entries and Māori oral records. This primary source material is summarised in the database and complete transcriptions are held by the database custodian.
DOI: https://doi.org/10.21420/D6W9-0G74
Cite as: GNS Science. (2020). New Zealand Tsunami Database: Historical and Modern Records [Data set]. GNS Science. https://doi.org/10.21420/D6W9-0G74
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These data were derived as part of a case study, which aimed to give a broad range of stakeholders involved in managing New Zealand's agricultural landscape a voice in setting farmland biodiversity priorities that reflect the biodiversity outcomes that matter most to them and the management practices they consider most relevant to achieving those outcomes. This priority-setting process represented the first step in the development of an evidence-based tool for biodiversity assessments on New Zealand farms. For more information see: MacLeod, C.J., Brandt, A.J., Collins, K. & Dicks, L.V. (in press) Giving stakeholders a voice in governance: biodiversity priorities for New Zealand's agriculture. People and Nature
https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides insights into customer churn patterns and behaviors for Kiwibank, a leading New Zealand-owned financial institution. It includes demographic information (such as age, gender, geography), banking metrics (credit score, balance, products), and customer activity indicators. The dataset is suitable for predictive modeling tasks (e.g., predicting customer churn using machine learning algorithms like Naive Bayes, Random Forest, and Decision Tree) and clustering analysis (e.g., K-Means clustering to identify customer segments). Analyzing this dataset can help financial analysts, data scientists, and business strategists understand factors influencing customer retention and optimize strategies to improve customer satisfaction and loyalty.
Key Features:
Customer demographics: Age, gender, geography.
Banking metrics: Credit score, balance, number of products.
Customer activity: Tenure, usage of credit cards, activity level.
Target variable: Churn (1 if the customer has churned, 0 otherwise).
Potential Use Cases:
Predictive modeling for customer churn prevention.
Segmentation analysis to target marketing campaigns.
Insights for enhancing customer retention strategies.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Authors: L.G. Garrett1, M.S. Watt2, C.W. Ross3, G. Coker2, M.R. Davis2, J. Sanderman4, R. Parfitt3, J. Dando3, R. Simcock5, D.J. Palmer2, F. Dean1, S. Patel1, J.H. Bridson1, T. Carlin2, T. Payn1, B. Richardson1, A. Dunningham1, P.W. Clinton2.
Affiliation: 1 Scion, Private Bag 3020, Rotorua 3046, New Zealand; 2 Scion, PO Box 29237, Riccarton, Christchurch 8440, New Zealand; 3 Manaaki Whenua – Landcare Research, Private Bag 11052, Palmerston North, New Zealand; 4 Woodwell Climate Research Center, 149 Woods Hole Road, Falmouth, MA 02540, USA; 5 Manaaki Whenua – Landcare Research, Private Bag 92170, Auckland, New Zealand.
A soil dataset from the FR380 trial series spanning 35 Pinus radiata forest sites in New Zealand. The dataset underpins three existing publications by Watt et al. (2005; 2008) and Ross et al. (2009), which detail the sample sites and the time zero (i.e. time of tree planting) sample collection and testing method. The publication by Garrett et al. (2022) details the soil mid-infrared spectroscopy method and the extension of soil chemistry testing using the same time zero samples.
The data is identified by an individual trial site ID and soil profile ID. Individual samples collected from each site/soil profile are then identified by an individual soil horizon number and lab letter. Soil chemistry testing was undertaken at two laboratories, Manaaki Whenua - Landcare Research and Scion, which allocated individual lab IDs. The linkage between these two sample IDs, allocated to the same sample, is shown in the file ‘FR380_chemical’. The MIR spectra files use the Scion lab sample ID.
The data includes:
· File ‘FR380_sitedescription’: FR380 trial site description by trial ID and soil profile ID, including site location and description, soil classification, land use at time of trial installation and forest rotation number.
· File ‘FR380_soilprofile’: FR380 trial site soil profile description by trial ID, soil profile ID and horizon number.
· File ‘FR380_chemical’: FR380 trial soil chemical properties by trial ID, soil profile ID, horizon number and lab letter, and individual laboratory soil chemistry sample ID from both Manaaki Whenua - Landcare Research and Scion.
· File ‘FR380_particlesize’: FR380 trial soil particle size properties by trial ID, soil profile ID, horizon number and lab letter.
· File ‘FR380_physical’: FR380 trial soil physical properties by trial ID, soil profile ID, horizon number and lab letter.
· Folder ‘FR380_MIR spectra’: FR380 trial soil Mid-Infrared spectra opus files by Scion sample ID.
· Folder ‘FR380_MIR spectra_csv’: FR380 trial soil Mid-Infrared spectra csv files by Scion sample ID.
· Folder ‘FR380_soil profile images’: FR380 trial soil profile image files by trial site ID.
Contact: Loretta Garrett (loretta.garrett@scionresearch.com)
Acknowledgments
Funding to publish the data came from the Tree-Root-Microbiome programme, which is funded by Ministry of Business, Innovation & Employment (MBIE) Endeavour Fund and in part by the New Zealand Forest Growers Levy Trust (C04X2002). Funding for the soil spectroscopy data and extension of soil chemical properties came from the Resilient Forest programme, which is funded by New Zealand Ministry of Business, Innovation & Employment (MBIE) Strategic Science Investment Fund, and in part by the New Zealand Forest Growers Levy Trust (C04X1703) and the Tree-Root-Microbiome programme (C04X2002). Funding for the sample collection and initial testing was provided from the Protecting and Enhancing the Environment through Forestry, which was funded by the New Zealand Foundation for Research, Science and Technology (C04X0304). Sites for the trial series were provided by numerous forest companies and private land owners, for which we are grateful. Individual laboratories who provided soil analyses are identified in the dataset and thanked.
References
Garrett LG, Sanderman J, Palmer DJ, Dean F, Patel S, Bridson JH, Carlin T (2022) Mid-infrared spectroscopy for planted forest soil and foliage nutrition predictions, New Zealand case study. Trees, Forests and People 8: 100280. https://doi.org/10.1016/j.tfp.2022.100280
Ross, C.W., Watt, M.S., Parfitt, R.L., Simcock, R., Dando, J., Coker, G., Clinton, P.W., Davis, M.R., 2009. Soil quality relationships with tree growth in exotic forests in New Zealand. Forest Ecology and Management 258, 2326-2334. https://doi.org/10.1016/j.foreco.2009.05.026
Watt, M.S., Coker, G., Clinton, P.W., Davis, M.R., Parfitt, R., Simcock, R., Garrett, L., Payn, T., Richardson, B., Dunningham, A., 2005. Defining sustainability of plantation forests through identification of site quality indicators influencing productivity—A national view for New Zealand. Forest Ecology and Management 216, 51-63. https://doi.org/10.1016/j.foreco.2005.05.064
Watt, M.S., Davis, M.R., Clinton, P.W., Coker, G., Ross, C., Dando, J., Parfitt, R.L., Simcock, R., 2008. Identification of key soil indicators influencing plantation productivity and sustainability across a national trial series in New Zealand. Forest Ecology and Management 256, 180-190. https://doi.org/10.1016/j.foreco.2008.04.024
Disclaimer
We make no warranties regarding the accuracy or integrity of the Data. We accept no liability for any direct, indirect, special, consequential or other losses or damages of whatsoever kind arising out of access to, or the use of the Data. We are in no way to be held responsible for the use that you put the Data to. You rely on the Data entirely at your own risk.