Objective(s): Momentum for open access to research is growing. Funding agencies and publishers are increasingly requiring researchers to make their data and research outputs open and publicly available. However, clinical researchers struggle to find real-world examples of Open Data sharing. The aim of this 1 hr virtual workshop is to provide real-world examples of Open Data sharing for both qualitative and quantitative data. Specifically, participants will learn: 1. Primary challenges and successes when sharing quantitative and qualitative clinical research data. 2. Platforms available for open data sharing. 3. Ways to troubleshoot data sharing and publish from open data. Workshop Agenda: 1. “Data sharing during the COVID-19 pandemic” - Speaker: Srinivas Murthy, Clinical Associate Professor, Department of Pediatrics, Faculty of Medicine, University of British Columbia. Investigator, BC Children's Hospital. 2. “Our experience with Open Data for the 'Integrating a neonatal healthcare package for Malawi' project.” - Speaker: Maggie Woo Kinshella, Global Health Research Coordinator, Department of Obstetrics and Gynaecology, BC Children’s and Women’s Hospital and University of British Columbia. This workshop draws on work supported by the Digital Research Alliance of Canada. Data Description: Presentation slides, Workshop Video, and Workshop Communication. Srinivas Murthy: Data sharing during the COVID-19 pandemic presentation and accompanying PowerPoint slides. Maggie Woo Kinshella: Our experience with Open Data for the 'Integrating a neonatal healthcare package for Malawi' project presentation and accompanying PowerPoint slides. This workshop was developed as part of Dr. Ansermino's Data Champions Pilot Project supported by the Digital Research Alliance of Canada. NOTE for restricted files: If you are not yet a CoLab member, please complete our membership application survey to gain access to restricted files within 2 business days.
Some files may remain restricted to CoLab members. These files are deemed more sensitive by the file owner and are meant to be shared on a case-by-case basis. Please contact the CoLab coordinator on this page under "collaborate with the pediatric sepsis colab."
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The DeGroot model has emerged as a credible alternative to the standard Bayesian model for studying learning on networks, offering a natural way to model naive learning in a complex setting. One unattractive aspect of this model is the assumption that the process starts with every node in the network having a signal. We study a natural extension of the DeGroot model that can deal with sparse initial signals. We show that an agent's social influence in this generalized DeGroot model is essentially proportional to the degree-weighted share of uninformed nodes who will hear about an event for the first time via this agent. This characterization result then allows us to relate network geometry to information aggregation. We show that information aggregation preserves "wisdom", in the sense that initial signals are weighed approximately equally, in a model of network formation that captures the sparsity, clustering, and small-worlds properties of real-world networks. We also identify an example of a network structure where essentially only the signal of a single agent is aggregated, which helps us pinpoint a condition on the network structure necessary for almost full aggregation. Simulating the modeled learning process on a set of real-world networks, we find that there is on average 22.4% information loss in these networks. We also explore how correlation in the location of seeds can exacerbate aggregation failure. Simulations with real-world network data show that with clustered seeding, information loss climbs to 34.4%. In this deposit, we include the code and data to replicate all tables and figures.
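For readers unfamiliar with the baseline, the following is a minimal sketch of the standard DeGroot update only, in which every node starts with a signal and repeatedly replaces its belief with a weighted average of its neighbours' beliefs. It is not the generalized sparse-signal model described above and is not taken from the deposit's code; the function names, the three-node weight matrix, and the initial beliefs are all illustrative assumptions.

```python
def degroot_step(weights, beliefs):
    """One round of standard DeGroot updating: each node takes a weighted
    average of current beliefs, using its own row of the weight matrix."""
    return [sum(w * b for w, b in zip(row, beliefs)) for row in weights]


def degroot_run(weights, beliefs, steps):
    """Apply the update `steps` times and return the final beliefs.
    Assumes `weights` is row-stochastic (each row sums to 1)."""
    for row in weights:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row of the weight matrix must sum to 1")
    current = list(beliefs)
    for _ in range(steps):
        current = degroot_step(weights, current)
    return current


# Hypothetical three-node line network with self-weights (illustrative only).
example_weights = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
# Only node 0 starts with belief 1; the others start at 0.
example_beliefs = [1.0, 0.0, 0.0]
final_beliefs = degroot_run(example_weights, example_beliefs, steps=50)
```

In this toy network the beliefs converge to a consensus in which node 0's initial signal carries a weight of one quarter, which illustrates how social influence depends on network position rather than on signal quality.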
To estimate the climate impact of highly absorbing black carbon (BC) aerosols, it is necessary to know their optical properties. The Lorentz-Mie theory, often used to calculate the optical properties of BC under the spherical morphological assumption, produces discrepancies when compared to measurements. In light of this, researchers are currently investigating the possibility of computing the optical properties of BC using a realistic fractal aggregate morphology. To determine the optical properties of such BC fractal aggregates, the Multiple Sphere T-Matrix method (MSTM) is used, which can take more than 24 hours for a single simulation depending on the aggregate properties. This study provides a highly accurate benchmark machine-learning (ML) algorithm that can be used to generate the optical properties of BC fractal aggregates in a fraction of a second. The ML algorithm was trained over an extensive database of physicochemical and optical properties of BC fractal aggregates. The extensive training data helped develop an ML algorithm that can accurately predict the optical properties of BC fractal aggregates with an average deviation of less than one percent from their actual values. Specifically, the ML algorithm provides the option to generate the optical properties in the visible spectrum using either kernel ridge regression (KRR) or artificial neural networks (ANN) for a BC fractal aggregate of desired physicochemical properties like size, morphology, and organic coating. The dataset of physicochemical and optical properties of BC fractal aggregates is provided here. The developed ML algorithm for predicting the optical properties of BC fractal aggregates (https://github.com/jaikrishnap/Machine-learning-for-prediction-of-BCFAs) is highly useful for real-world applications due to its wide parameter range, high accuracy, and low computational cost.
Contents
database_optical_properties_black_carbon_fractal_aggregtates.csv, data file, comma-separated values
database_header.txt, metadata, text
Citation for the database:
B., Romshoo, T., Müller, B., Patil, J., Michels, T., Kloft, M., and Pöhlker, M.: Database of physicochemical and optical properties of black carbon fractal aggregates, Dataset, https://doi.org/10.5281/zenodo.7523058, 2023.
https://datafinder.stats.govt.nz/license/attribution-4-0-international/
Statistical Area 1 2023 update
SA1 2023 is the first major update of the geography since it was first created in 2018. The update is to ensure SA1s are relevant and meet criteria before each five-yearly population and dwelling census. SA1 2023 contains 3,251 new SA1s. Updates were made to reflect real world changes including new subdivisions and motorways, improve the delineation of urban rural and other statistical areas and to ensure they meet population criteria by reducing the number of SA1s with small or large populations.
Description
This dataset is the definitive version of the annually released statistical area 1 (SA1) boundaries as at 1 January 2023, as defined by Stats NZ. This version contains 33,164 SA1s (33,148 digitised and 16 with empty or null geometries (non-digitised)).
SA1 is an output geography that allows the release of more low-level data than is available at the meshblock level. Built by joining meshblocks, SA1s have an ideal size range of 100–200 residents, and a maximum population of about 500. This is to minimise suppression of population data in multivariate statistics tables.
The SA1 should:
form a contiguous cluster of one or more meshblocks,
be either urban, rural, or water in character,
be small enough to:
allow flexibility for aggregation to other statistical geographies,
allow users to aggregate areas into their own defined communities of interest,
form a nested hierarchy with statistical output geographies and administrative boundaries. It must:
be built from meshblocks,
either define or aggregate to define SA2s, urban rural areas, territorial authorities, and regional councils.
SA1s generally have a population of 100–200 residents, with some exceptions:
SA1s with nil or nominal resident populations are created to represent remote mainland areas, unpopulated islands, inland water, inlets, or oceanic areas.
Some SA1s in remote rural areas and urban industrial or business areas have fewer than 100 residents.
Some SA1s that contain apartment blocks, retirement villages, and large non-residential facilities (prisons, boarding schools, etc) have more than 500 residents.
SA1 numbering
SA1s are not named. SA1 codes have seven digits starting with a 7 and are numbered approximately north to south. Non-digitised codes start with 79.
As new SA1s are created, they are given the next available numeric code. If the composition of an SA1 changes through splitting or amalgamating different meshblocks, the SA1 is given a new code. The previous code no longer exists within that version and future versions of the SA1 classification.
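The numbering rules above can be sketched as a small check. This is an illustration only, not a Stats NZ tool: the function name is hypothetical, and it assumes that every code beginning with 79 is non-digitised and that all other seven-digit codes beginning with 7 are digitised. It checks the format of a code, not whether the code exists in the 2023 classification.

```python
import re


def classify_sa1_code(code: str) -> str:
    """Classify an SA1 code string by the stated numbering rules.

    Returns 'non-digitised', 'digitised', or 'invalid'.
    Assumption: a leading '79' always means non-digitised.
    """
    # SA1 codes have seven digits and start with a 7.
    if not re.fullmatch(r"7[0-9]{6}", code):
        return "invalid"
    # Non-digitised codes start with 79.
    if code.startswith("79"):
        return "non-digitised"
    return "digitised"
```

For example, `classify_sa1_code("7999916")` (Ross Dependency in the list of non-digitised SA1s) falls in the non-digitised group, while a six-digit string is rejected as invalid.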
Digitised and non-digitised SA1s
The digital geographic boundaries are defined and maintained by Stats NZ.
Aggregated from meshblocks, SA1s cover the land area of New Zealand, the water area to the 12-mile limit, the Chatham Islands, Kermadec Islands, sub-Antarctic islands, off-shore oil rigs, and Ross Dependency. The following 16 SA1s are held in non-digitised form.
7999901; New Zealand Economic Zone, 7999902; Oceanic Kermadec Islands, 7999903; Kermadec Islands, 7999904; Oceanic Oil Rig Taranaki, 7999905; Oceanic Campbell Island, 7999906; Campbell Island, 7999907; Oceanic Oil Rig Southland, 7999908; Oceanic Auckland Islands, 7999909; Auckland Islands, 7999910; Oceanic Bounty Islands, 7999911; Bounty Islands, 7999912; Oceanic Snares Islands, 7999913; Snares Islands, 7999914; Oceanic Antipodes Islands, 7999915; Antipodes Islands, 7999916; Ross Dependency.
For more information please refer to the Statistical standard for geographic areas 2023.
Generalised version
This generalised version has been simplified for rapid drawing and is designed for thematic or web mapping purposes.
Digital data
Digital boundary data became freely available on 1 July 2007.
To download geographic classifications in table formats such as CSV please use Ariā
https://datafinder.stats.govt.nz/license/attribution-4-0-international/
Statistical Area 2 2023 update
SA2 2023 is the first major update of the geography since it was first created in 2018. The update is to ensure SA2s are relevant and meet criteria before each five-yearly population and dwelling census. SA2 2023 contains 135 new SA2s. Updates were made to reflect real world change of population and dwelling growth mainly in urban areas, and to make some improvements to their delineation of communities of interest.
Description
This dataset is the definitive version of the annually released statistical area 2 (SA2) boundaries as at 1 January 2023 as defined by Stats NZ. This version contains 2,395 SA2s (2,379 digitised and 16 with empty or null geometries (non-digitised)).
SA2 is an output geography that provides higher aggregations of population data than can be provided at the statistical area 1 (SA1) level. The SA2 geography aims to reflect communities that interact together socially and economically. In populated areas, SA2s generally contain similar sized populations.
The SA2 should:
form a contiguous cluster of one or more SA1s,
excluding exceptions below, allow the release of multivariate statistics with minimal data suppression,
capture a similar type of area, such as a high-density urban area, farmland, wilderness area, and water area,
be socially homogeneous and capture a community of interest. It may have, for example:
form a nested hierarchy with statistical output geographies and administrative boundaries. It must:
SA2s in city council areas generally have a population of 2,000–4,000 residents while SA2s in district council areas generally have a population of 1,000–3,000 residents.
In major urban areas, an SA2 or a group of SA2s often approximates a single suburb. In rural areas, rural settlements are included in their respective SA2 with the surrounding rural area.
SA2s in urban areas where there is significant business and industrial activity, for example ports, airports, industrial, commercial, and retail areas, often have fewer than 1,000 residents. These SA2s are useful for analysing business demographics, labour markets, and commuting patterns.
In rural areas, some SA2s have fewer than 1,000 residents because they are in conservation areas or contain sparse populations that cover a large area.
To minimise suppression of population data, small islands with zero or low populations close to the mainland, and marinas are generally included in their adjacent land-based SA2.
Zero or nominal population SA2s
To ensure that the SA2 geography covers all of New Zealand and aligns with New Zealand’s topography and local government boundaries, some SA2s have zero or nominal populations. These include:
400001; New Zealand Economic Zone, 400002; Oceanic Kermadec Islands, 400003; Kermadec Islands, 400004; Oceanic Oil Rig Taranaki, 400005; Oceanic Campbell Island, 400006; Campbell Island, 400007; Oceanic Oil Rig Southland, 400008; Oceanic Auckland Islands, 400009; Auckland Islands, 400010; Oceanic Bounty Islands, 400011; Bounty Islands, 400012; Oceanic Snares Islands, 400013; Snares Islands, 400014; Oceanic Antipodes Islands, 400015; Antipodes Islands, 400016; Ross Dependency.
SA2 numbering and naming
Each SA2 is a single geographic entity with a name and a numeric code. The name refers to a geographic feature or a recognised place name or suburb. In some instances where place names are the same or very similar, the SA2s are differentiated by their territorial authority name, for example, Gladstone (Carterton District) and Gladstone (Invercargill City).
SA2 codes have six digits. North Island SA2 codes start with a 1 or 2, South Island SA2 codes start with a 3, and non-digitised SA2 codes start with a 4. They are numbered approximately north to south within their respective territorial authorities. To ensure the north–south code pattern is maintained, the SA2 codes were given 00 for the last two digits when the geography was created in 2018. When SA2 names or boundaries change, only the last two digits of the code will change.
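The leading-digit rule can be sketched as a small lookup. This is an illustration only, not a Stats NZ tool: the function name and the returned labels are hypothetical, and the sketch checks only the format of a code, not whether the code exists in the 2023 classification.

```python
def classify_sa2_code(code: str) -> str:
    """Classify an SA2 code string by its first digit.

    Returns 'North Island', 'South Island', 'non-digitised', or 'invalid'.
    """
    # SA2 codes have six digits.
    if len(code) != 6 or not (code.isascii() and code.isdigit()):
        return "invalid"
    first = code[0]
    if first in ("1", "2"):
        return "North Island"
    if first == "3":
        return "South Island"
    if first == "4":
        return "non-digitised"
    # Any other leading digit is not covered by the stated rules.
    return "invalid"
```

For example, `classify_sa2_code("400016")` (Ross Dependency in the list above) falls in the non-digitised group.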
For more information please refer to the Statistical standard for geographic areas 2023.
Generalised version
This generalised version has been simplified for rapid drawing and is designed for thematic or web mapping purposes.
Macrons
Names are provided with and without tohutō/macrons. The column name for those without macrons is suffixed ‘ascii’.
Digital data
Digital boundary data became freely available on 1 July 2007.
To download geographic classifications in table formats such as CSV please use Ariā
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
With the proliferation of mobile crowdsensing (MCS) and crowdsourcing, new challenges are emerging every day. Although crowdsensing has become a popular sensing paradigm to aggregate sensor readings from a variety of sources, data inconsistency has arisen as a serious challenge. Truth discovery (TD) has been developed as an effective method for reducing data inconsistency and as a validity assessment for conflicting data from various sources. In addition, MCS applications and services are moving beyond a single individual participant to community groups and are influenced by group behavior. To address these challenges in this paper, we propose a novel Fog-assisted Group-based Truth Discovery Framework over MCS Data Streams, an efficient TD system for real-time applications. Specifically, we first initialized the weights for the weight update process in TD with the participants’ credibility level. Then, we developed a novel Two-layer Group-based Truth Discovery (TGTD) mechanism in which the first layer estimates the truth of the group’s members and the second layer estimates the aggregated truth for the groups. We have conducted extensive experiments over synthetic and real-world datasets to prove the effectiveness and efficiency of our framework. The results indicate that TGTD achieves superior truth discovery accuracy compared to current streaming truth discovery approaches, while maintaining a reasonable running time. The organization of the streaming process within the fog architecture simulation is identified as an area for further investigation and future work.