Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Anonymization opens up innovative ways of using secondary data without the requirements of the GDPR, as anonymized data no longer affects the privacy of data subjects. Anonymization requires data alteration, and this project aims to compare the ability of such privacy protection methods to maintain the reliability and utility of scientific data for secondary research purposes.
Methods: The French data protection authority (CNIL) defines anonymization as a processing activity that uses methods to make any identification of people, by any means, irreversibly impossible. To address the project's objective, a series of analyses were performed on a cohort and reproduced on four sets of anonymized data for comparison. Four assessment levels were used to evaluate the impact of anonymization: level 1 referred to the replication of statistical outputs, level 2 to the accuracy of statistical results, level 3 assessed data alteration (using Hellinger distances), and level 4 assessed privacy risks (using WP29 criteria).
Results: 87 items were produced on the raw cohort data and then reproduced on each of the four anonymized data sets. The overall level 1 replication score ranged from 67% to 100% depending on the anonymization solution. The most difficult analyses to replicate were regression models (sub-score ranging from 78% to 100%) and survival analyses (sub-score ranging from 0% to 100%). The overall level 2 accuracy score ranged from 22% to 79% depending on the anonymization solution. For level 3, three methods had some variables with different probability distributions (Hellinger distance = 1). For level 4, all methods reduced the privacy risk of singling out, with relative risk reductions ranging from 41% to 65%.
Conclusion: None of the anonymization methods reproduced all outputs and results. A trade-off has to be found between the context risk and the usefulness of the data to answer the research question.
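The level 3 data-alteration metric can be illustrated with a short, self-contained sketch in plain Python (the distributions here are invented, not taken from the study): the Hellinger distance between two discrete probability distributions is 0 when they are identical and 1 when their supports are disjoint.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (lists of probabilities)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

# Identical distributions: no alteration detected
print(hellinger([0.5, 0.3, 0.2], [0.5, 0.3, 0.2]))  # 0.0
# Disjoint supports: maximal alteration, as flagged for some variables at level 3
print(hellinger([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

A distance of 1 for a variable therefore signals that the anonymized variable no longer shares any probability mass with the original one.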
https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.11588/DATA/MXM0Q2
In the publication [1] we implemented anonymization and synthetization techniques for a structured data set collected during the HiGHmed Use Case Cardiology study [2]. We employed the data anonymization tool ARX [3] and the data synthetization framework ASyH [4], individually and in combination. We evaluated the utility and shortcomings of the different approaches through statistical analyses and privacy risk assessments. Data utility was assessed by computing two heart failure risk scores (Barcelona BioHF [5] and MAGGIC [6]) on the protected data sets. We observed only minimal deviations from the scores obtained on the original data set. Additionally, we performed a re-identification risk analysis and found only minor residual risks for common types of privacy threats. We could demonstrate that anonymization and synthetization methods protect privacy while retaining data utility for heart failure risk assessment. Both approaches, and a combination thereof, introduce only minimal deviations from the original data set across all features. While data synthesis techniques can produce any number of new records, data anonymization techniques offer more formal privacy guarantees. Consequently, data synthesis on anonymized data further enhances privacy protection with little impact on data utility. We hereby share all generated data sets with the scientific community through a use and access agreement. [1] Johann TI, Otte K, Prasser F, Dieterich C. Anonymize or synthesize? Privacy-preserving methods for heart failure score analytics. Eur Heart J Digit Health 2024. doi:10.1093/ehjdh/ztae083 [2] Sommer KK, Amr A, Bavendiek, Beierle F, Brunecker P, Dathe H, et al. Structured, harmonized, and interoperable integration of clinical routine data to compute heart failure risk scores. Life (Basel) 2022;12:749. [3] Prasser F, Eicher J, Spengler H, Bild R, Kuhn KA. Flexible data anonymization using ARX—current status and challenges ahead. Softw Pract Exper 2020;50:1277–1304. [4] Johann TI, Wilhelmi H.
ASyH—anonymous synthesizer for health data, GitHub, 2023. Available at: https://github.com/dieterich-lab/ASyH. [5] Lupón J, de Antonio M, Vila J, Peñafiel J, Galán A, Zamora E, et al. Development of a novel heart failure risk tool: the Barcelona bio-heart failure risk calculator (BCN Bio-HF calculator). PLoS One 2014;9:e85466. [6] Pocock SJ, Ariti CA, McMurray JJV, Maggioni A, Køber L, Squire IB, et al. Predicting survival in heart failure: a risk score based on 39 372 patients from 30 studies. Eur Heart J 2013;34:1404–1413.
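The re-identification risk analysis mentioned above typically inspects equivalence classes over quasi-identifiers. A minimal sketch in plain Python (illustrative records, not the HiGHmed data and not the ARX API) of computing the k-anonymity level of a table:

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Smallest equivalence-class size over the quasi-identifier columns.
    A record in a class of size k can be singled out with probability at most 1/k."""
    classes = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(classes.values())

records = [  # invented rows for illustration
    {"age_group": "60-70", "sex": "m", "nyha": "II"},
    {"age_group": "60-70", "sex": "m", "nyha": "III"},
    {"age_group": "70-80", "sex": "f", "nyha": "II"},
    {"age_group": "70-80", "sex": "f", "nyha": "III"},
]
print(k_anonymity(records, ["age_group", "sex"]))  # 2
```

A larger minimum class size means a lower residual risk of singling a patient out from the quasi-identifiers alone.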
https://www.wiseguyreports.com/pages/privacy-policy
BASE YEAR | 2024 |
HISTORICAL DATA | 2019 - 2024 |
REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
MARKET SIZE 2023 | 617.59 (USD Billion) |
MARKET SIZE 2024 | 706.71 (USD Billion) |
MARKET SIZE 2032 | 2077.2 (USD Billion) |
SEGMENTS COVERED | Technology, Deployment, End User, Anonymization Technique, Regional |
COUNTRIES COVERED | North America, Europe, APAC, South America, MEA |
KEY MARKET DYNAMICS | 1. Growing demand for data privacy; 2. Advancements in AI and facial recognition; 3. Increase in video surveillance; 4. Regulatory compliance; 5. Expansion of cloud-based video anonymization solutions |
MARKET FORECAST UNITS | USD Billion |
KEY COMPANIES PROFILED | Microsoft, Fourmilab, Proofpoint, LogRhythm, SAS Institute, F-Secure, Intermedia, One Identity, BeenVerified, Oracle, Image Scrubber, IBM, Splunk, Axzon, Digital Shadows |
MARKET FORECAST PERIOD | 2025 - 2032 |
KEY MARKET OPPORTUNITIES | 1. Growing adoption of video surveillance systems; 2. Increasing demand from law enforcement and security agencies; 3. Rising concerns over data privacy and security; 4. Government regulations and compliance requirements; 5. Advancements in AI and machine learning technologies |
COMPOUND ANNUAL GROWTH RATE (CAGR) | 14.43% (2025 - 2032) |
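The forecast figures in the table are internally consistent; a quick check in plain Python reproduces the stated CAGR from the 2024 and 2032 market sizes over the 8-year forecast horizon:

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end/start)^(1/years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1

rate = cagr(706.71, 2077.2, 8)  # USD billion, 2024 -> 2032
print(f"{rate:.2%}")  # ~14.43%
```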
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Recently, big data and its applications have seen sharp growth in various fields such as IoT, bioinformatics, e-commerce, and social media. The huge volume of data poses enormous challenges to the architecture, infrastructure, and computing capacity of IT systems. Therefore, the scientific and industrial community has a compelling need for large-scale, robust computing systems. Since one of the characteristics of big data is value, data should be published so that analysts can extract useful patterns from them. However, data publishing may lead to the disclosure of individuals' private information. Among modern parallel computing platforms, Apache Spark is a fast, in-memory computing framework for large-scale data processing that provides high scalability by introducing resilient distributed datasets (RDDs). In terms of performance, due to in-memory computation, it can be up to 100 times faster than Hadoop. Therefore, Apache Spark is one of the essential frameworks for implementing distributed methods for privacy-preserving big data publishing (PPBDP). This paper uses the RDD programming model of Apache Spark to propose an efficient parallel implementation of a new computing model for big data anonymization. This computing model has three phases of in-memory computation to address the runtime, scalability, and performance of large-scale data anonymization. The model supports partition-based data clustering algorithms to preserve the λ-diversity privacy model by using transformations and actions on RDDs. Accordingly, the authors investigated a Spark-based implementation for preserving the λ-diversity privacy model with two designed distance functions, City block and Pearson. The results of the paper provide a comprehensive guideline allowing researchers to apply Apache Spark in their own research.
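As a minimal, Spark-free illustration in plain Python of the two distance functions named in the abstract, assuming the usual definitions of City block (Manhattan) distance and Pearson distance (one minus the Pearson correlation coefficient):

```python
import math

def city_block(x, y):
    """Manhattan (L1) distance between two numeric records."""
    return sum(abs(a - b) for a, b in zip(x, y))

def pearson_distance(x, y):
    """1 - Pearson correlation: ~0 for perfectly correlated records."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1 - cov / (sx * sy)

print(city_block([1, 2, 3], [4, 6, 3]))        # 7
print(pearson_distance([1, 2, 3], [2, 4, 6]))  # ~0 (perfectly correlated)
```

In the paper's setting, functions like these would be applied inside RDD transformations (e.g. `map` over record pairs) to drive the partition-based clustering.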
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
For the purpose of research on data intermediaries and data anonymisation, it is necessary to test these processes with realistic video data containing personal data. For this purpose, the TreuMoDa project, funded by the German Federal Ministry of Education and Research (BMBF), has created a dataset of different traffic scenes containing identifiable persons.
This video data was collected at the Autonomous Driving Test Area Baden-Württemberg. On the one hand, it should be possible to recognise people in traffic, including their line of sight. On the other hand, it should be usable for the demonstration and evaluation of anonymisation techniques.
The legal basis for the publication of this data set is the consent given by the participants, as documented in the file Consent.pdf (all purposes), in accordance with Art. 6(1)(a) and Art. 9(2)(a) GDPR. Any further processing is subject to the GDPR.
We make this dataset available for non-commercial purposes such as teaching, research and scientific communication. Please note that this licence is limited by the provisions of the GDPR. Anyone downloading this data will become an independent controller of the data. This data has been collected with the consent of the identifiable individuals depicted.
Any consensual use must take into account the purposes mentioned in the uploaded consent forms and in the privacy terms and conditions provided to the participants (see Consent.pdf). All participants consented to all three purposes, and no consent was withdrawn at the time of publication. KIT is unable to provide you with contact details for any of the participants, as we have removed all links to personal data other than that contained in the published images.
ABSTRACT: Context: Large Language Models (LLMs) have revolutionized natural language generation and understanding. However, they raise significant data privacy concerns, especially when sensitive data is processed and stored by third parties. Goal: This paper investigates the perception of software development team members regarding data privacy when using LLMs in their professional activities. Additionally, we examine the challenges faced and the practices adopted by these practitioners. Method: We conducted a survey with 78 ICT practitioners from five regions of the country. Results: Software development team members have basic knowledge about data privacy and the LGPD, but most have never received formal training on LLMs and possess only basic knowledge about them. Their main concerns include the leakage of sensitive data and the misuse of personal data. To mitigate risks, they avoid using sensitive data and implement anonymization techniques. The primary challenges practitioners face are ensuring transparency in the use of LLMs and minimizing data collection. Software development team members consider current legislation inadequate for protecting data privacy in the context of LLM use. Conclusions: The results reveal a need to improve knowledge and practices related to data privacy in the context of LLM use. According to software development team members, organizations need to invest in training, develop new tools, and adopt more robust policies to protect user data privacy. They advocate for a multifaceted approach that combines education, technology, and regulation to ensure the safe and responsible use of LLMs.
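The practice of keeping sensitive data out of prompts can be approximated with a simple pre-submission redaction step. A hedged sketch using plain Python regexes (illustrative patterns only; real PII detection needs far more than this):

```python
import re

# Deliberately simple patterns for demonstration, not production-grade PII detection
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(prompt: str) -> str:
    """Mask obvious e-mail addresses and phone numbers before a prompt leaves the team."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    return PHONE.sub("[PHONE]", prompt)

print(redact("Contact ana@example.com or +55 11 91234-5678 about the incident."))
# Contact [EMAIL] or [PHONE] about the incident.
```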
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Immunization Agenda 2030 (IA2030) 1st Movement Full Learning Cycle (FLC 2022) – “How are you doing?” Value Creation Stories Survey (Version 1.0)
Education researchers interested in the application of the “value creation stories” (VCS) conceptual framework elaborated by Etienne Wenger et al. in the study of communities of practice and other types of digital communities.
The Geneva Learning Foundation 18 avenue Louis Casaï CH-1209 Geneva, Switzerland research@learning.foundation
Reda Sadki, The Geneva Learning Foundation (TGLF) reda@learning.foundation
Bridges to Development University of South Australia Centre for Change and Complexity in Learning (C3L)
Wellcome, Bill & Melinda Gates Foundation (BMGF)
The Geneva Learning Foundation, 2023. Value Creation Stories (VCS) weekly feedback survey, 2022 Full Learning Cycle (FLC) of the Movement for Immunization Agenda 2030 (IA2030) (Version 1.0). [Data Set]. The Geneva Learning Foundation. DOI: 10.5281/zenodo.7763922
This file is IA2030_FLC_2022_Value_Creation_Stories.README.md
IA2030-EN_FLC_2022_Value_Creation_Stories-questions_mapping.csv: List of the survey's questions, their codes, and their units, in English. (21 questions) - Version 1: Geneva Learning Foundation, 31 March 2023.
IA2030-EN_FLC_2022_Value_Creation_Stories.csv: Data set of responses from participants who replied in English. (n: 2101, obs: 5601) - Version 1: Geneva Learning Foundation, 31 March 2023.
IA2030-FR_FLC_2022_Value_Creation_Stories-questions_mapping.csv: List of the survey's questions, their codes, and their units, in French. (21 questions) - Version 1: Geneva Learning Foundation, 31 March 2023.
IA2030-FR_FLC_2022_Value_Creation_Stories-Google_translation.csv: Data set of responses from participants who replied in French, translated to English using Google Translate. (n: 1585, obs: 4493) - Version 1: Geneva Learning Foundation, 31 March 2023.
IA2030-FR_FLC_2022_Value_Creation_Stories.csv: Data set of responses from participants who replied in French. (n: 1585, obs: 4493) - Version 1: Geneva Learning Foundation, 31 March 2023.
Relationship between files: the question codes in the mapping files are the same as the column variable names in the data sets, so the mapping and response files can be joined on them.
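That join can be sketched in plain Python (the question codes and labels below are hypothetical, not the actual codes in the mapping CSVs):

```python
# Hypothetical mapping of question code -> question label
# (in practice, read from the *questions_mapping.csv files)
mapping = {"VCS01": "Immediate value", "VCS02": "Potential value"}

# Hypothetical response row keyed by the same codes used as column names in the data set
row = {"VCS01": 4, "VCS02": 5}

labelled = {mapping[code]: value for code, value in row.items()}
print(labelled)  # {'Immediate value': 4, 'Potential value': 5}
```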
This is a subset of data collected by The Geneva Learning Foundation (TGLF) during the 1st IA2030 Full Learning Cycle (FLC). The complete data set is more comprehensive, and includes: demographic information (gender, country), health system information (respondent’s health system level), respondents’ analyses of challenges and priorities.
Additional data sets for the first Full Learning Cycle (FLC) of the Movement for Immunization Agenda 2030 (IA2030) are available from The Geneva Learning Foundation (TGLF) Insights Unit insights@learning.foundation
The Geneva Learning Foundation publishes data sets in relation to its Immunization Agenda 2030 (IA2030) Movement learning programme in the Zenodo open repository community: https://zenodo.org/communities/ia2030/
This survey had two goals in the context of TGLF's IA2030 Movement Full Learning Cycle programme (2022): 1. to provide an asynchronous mechanism for support between peers (participants) and from the TGLF team; and 2. to collect and measure programme participants' value creation stories (VCS) during the programme.
Maarten de Laat's “value creation stories” (VCS) framework has been used primarily in small-scale, qualitative studies of communities of practice, online forums, and education activities.
This data set includes both quantitative (Likert) and qualitative (open text) responses to the VCS questions, collected over a period of four months (7 March – 20 June 2022) from a cohort that began with 6,185 participants on the start date.
The target population were participants of the Geneva Learning Foundation’s Movement for Immunization Agenda 2030 (IA2030) learning programme. The initial cohort admitted to the programme was 6,185 individuals from 99 countries. Only participants who were formally admitted to the programme received the invitation to complete the survey.
Programme participants were free to choose if and when to report (self-selection), and their responses were not checked against any other measures (self-reporting).
Data collection period: 7 March 2022 – 20 June 2022
Between 7 March and 20 June 2022, participants in the Geneva Learning Foundation’s “Immunization Agenda 2030” (IA2030) Movement Full Learning Cycle (FLC) were asked to respond to a questionnaire titled “How are you doing?”.
Participants received a personalized email with the request to share feedback about their experience during the week. The link to share feedback was also included in other reminder and information emails sent in response to participant needs.
The first survey request was sent on 11 March 2022 and the last on 17 March 2022, for a total of 15 requests. Participants could answer the survey at any time and as many times as they wished.
The group of 6,185 participants grew over the course of the Cycle, as additional participants were able to join the initiative throughout the four-month period.
7 March 2022 until 20 June 2022
No requests for responses were sent during TGLF’s “Term break” between 16-30 April 2022.
The survey included Likert scale questions and qualitative open-text items based on the conceptual framework for Value Creation Stories (VCS) developed by Wenger et al. (2011). There are no derived or calculated variables. Items are Likert scale, multiple choice, and open text.
All responses made before or after the FLC period (7 March – 20 June 2022) were excluded from the sample.
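That exclusion rule amounts to a simple date filter; a sketch in plain Python (illustrative timestamps; the timestamp field in the published CSVs may be named differently):

```python
from datetime import date

FLC_START, FLC_END = date(2022, 3, 7), date(2022, 6, 20)

def within_flc(ts: date) -> bool:
    """Keep only responses submitted during the FLC period (inclusive)."""
    return FLC_START <= ts <= FLC_END

responses = [date(2022, 3, 1), date(2022, 4, 15), date(2022, 7, 1)]  # illustrative
kept = [d for d in responses if within_flc(d)]
print(kept)  # only the 15 April 2022 response survives
```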
This data set is made available on Zenodo.org in the Zenodo community “Movement for Immunization Agenda 2030 (IA2030)” https://zenodo.org/communities/ia2030/
Requests for additional information should be addressed to research@learning.foundation.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
1. Home range dynamics and movement are central to a species' ecology and strongly mediate both intra- and interspecific interactions. Numerous methods have been introduced to describe animal home ranges, but most lack predictive ability and cannot capture effects of dynamic environmental patterns, such as the impacts of air and water flow on movement.
2. Here, we develop a practical, multi-stage approach for statistical inference into the behavioral mechanisms underlying how habitat and dynamic energy landscapes---in this case how airflow increases or decreases the energetic efficiency of flight---shape animal home ranges based around central places. We validated the new approach using simulations, then applied it to a sample of 12 adult golden eagles (Aquila chrysaetos) tracked with satellite telemetry.
3. The application to golden eagles revealed effects of habitat variables that align with predicted behavioral ecology. Further, we found that males and females partition their home ranges dynamically based on uplift. Specifically, changes in wind and sun angle drove differential space use between sexes, especially later in the breeding season when energetic demands of growing nestlings require both parents to forage more widely.
4. This method is easily implemented using widely available programming languages and is based on a hierarchical multistate Ornstein-Uhlenbeck space use process that incorporates habitat and energy landscapes. The underlying mathematical properties of the model allow straightforward computation of predicted utilization distributions, permitting estimation of home range size and visualization of space use patterns under varying conditions.
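As a toy illustration of the Ornstein-Uhlenbeck building block (not the authors' hierarchical multistate model), an Euler-Maruyama simulation of a single OU process attracted to a central place:

```python
import math
import random

def simulate_ou(mu, theta, sigma, x0, dt, n, rng):
    """Euler-Maruyama simulation of dX = theta*(mu - X) dt + sigma dW:
    mean-reverting motion around the attractor mu (e.g. a nest-site coordinate)."""
    x, xs = x0, [x0]
    for _ in range(n):
        x += theta * (mu - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        xs.append(x)
    return xs

rng = random.Random(0)
xs = simulate_ou(mu=5.0, theta=2.0, sigma=0.5, x0=0.0, dt=0.01, n=20_000, rng=rng)

# After burn-in, positions fluctuate around mu with stationary s.d. sigma/sqrt(2*theta) = 0.25
tail = xs[10_000:]
print(sum(tail) / len(tail))  # ~5.0
```

The stationary distribution of such a process is Gaussian around the central place, which is what makes utilization distributions straightforward to compute from the fitted parameters.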
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Within the project WEA-Acceptance¹, extensive measurement campaigns were carried out, which included the recording of acoustic, meteorological and turbine-specific data. Acoustic quantities were measured at several distances to the wind turbine and under various atmospheric and turbine conditions. In the project WEA-Acceptance-Data², the acquired measurements are stored in a structured and anonymized form and provided for research purposes. Besides the data and its documentation, first evaluations as well as reference data sets for chosen scenarios are published.
In this version of the data platform, specification 2.0, an anonymized data set, and three use cases are published. The specification contains the concept of the data platform, which is primarily based on the FAIR (Findable, Accessible, Interoperable, and Reusable) principles. The data set consists of turbine-specific, meteorological, and acoustic data recorded over one month. The data were corrected, conditioned, and anonymized, so that relevant outliers are marked and erroneous data are removed from the data set. The acoustic data include anonymized sound pressure levels and one-third octave spectra averaged over ten minutes, as well as audio data. In addition, the metadata and an overview of data availability are uploaded. As examples of the application of the data, three use cases are also published. Important information, such as the approach to data anonymization, is briefly described in the ReadMe file.
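Averaging sound pressure levels over an interval such as ten minutes is conventionally done energetically, not arithmetically. A minimal sketch in plain Python of the standard equivalent-level formula (shown as general background, not necessarily the project's exact processing chain):

```python
import math

def equivalent_level(levels_db):
    """Energetic (Leq) average of sound pressure levels in dB:
    Leq = 10 * log10(mean of 10^(L/10))."""
    return 10 * math.log10(sum(10 ** (l / 10) for l in levels_db) / len(levels_db))

print(equivalent_level([60.0, 60.0, 60.0]))  # 60.0
print(equivalent_level([60.0, 70.0]))        # ~67.4 (louder periods dominate)
```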
For further information about the measurements, please refer to: Martens, S., Bohne, T., and Rolfes, R.: An evaluation method for extensive wind turbine sound measurement data and its application, Proceedings of Meetings on Acoustics, Acoustical Society of America, 41, 040001, https://doi.org/10.1121/2.0001326, 2020.
¹The project WEA-Acceptance (FKZ 0324134A) was funded by the German Federal Ministry for Economic Affairs and Energy (BMWi).
²The project WEA-Acceptance-Data (FKZ 03EE3062) was funded by the German Federal Ministry for Economic Affairs and Energy (BMWi).
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This data pertains to a farmers' survey in the Kamadhiya catchment in Gujarat, India. In December 2021, 492 farmers distributed across 24 villages in the Kamadhiya catchment were interviewed. The study sample was selected through a multistage random sampling procedure. First, 24 of the 88 villages lying within the Kamadhiya catchment were selected using regularly distributed sampling. Thereafter, in each village, 20-22 farmers were selected for the survey using proportionate random sampling. The survey questionnaire consisted of four parts: 1) farmers' socio-economic characteristics; 2) farmers' perception of check dam (CD) impacts and socio-psychological questions regarding the maintenance of CDs; 3) farmers' cropping and irrigation practices; and 4) farmers' perception and adoption of AWM practices (drip and borewell). For more description and information on the study area, please refer to the publication below:
Alam, M.F., McClain, M.E., Sikka, A., Daniel, D., Pande, S., 2022. Benefits, equity, and sustainability of community rainwater harvesting structures: An assessment based on farm scale social survey. Front. Environ. Sci. 10. https://doi.org/10.3389/fenvs.2022.1043896
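The multistage design described above can be sketched in plain Python (village names and per-village farmer counts are invented for illustration; the actual frame had 88 villages and a target of 492 farmers):

```python
import random

rng = random.Random(42)

# Stage 1: regularly distributed selection of 24 of the 88 villages
villages = [f"village_{i:02d}" for i in range(88)]
selected = villages[::88 // 24][:24]  # every 3rd village, capped at 24

# Stage 2: proportionate random sampling within each selected village,
# allocating the 492 interviews in proportion to (invented) farmer counts
farmer_counts = {v: rng.randint(40, 120) for v in selected}
total = sum(farmer_counts.values())
quota = {v: max(1, round(492 * n / total)) for v, n in farmer_counts.items()}

print(len(selected), sum(quota.values()))  # 24 villages, ~492 farmers in total
```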
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Questionnaires, raw data, and syntax files for the three studies of the following paper: Sommet, N., Elliot, A. J., Jamieson, J. P., & Butera, F. (2018). Income inequality, perceived competitiveness, and approach-avoidance motivation. To appear in Journal of Personality.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This supplementary material provides the anonymized raw dataset of the asymmetry analysis. (XLSX)