https://www.marketresearchforecast.com/privacy-policy
The Data De-identification and Pseudonymization Software market is experiencing robust growth, projected to reach $1,941.6 million in 2025 and exhibiting a Compound Annual Growth Rate (CAGR) of 7.3%. This expansion is driven by increasing regulatory compliance needs (like GDPR and CCPA), heightened concerns regarding data privacy and security breaches, and the burgeoning adoption of cloud-based solutions. The market is segmented by deployment (cloud-based and on-premises) and application (large enterprises and SMEs). Cloud-based solutions are gaining significant traction due to their scalability, cost-effectiveness, and ease of implementation, while large enterprises dominate the application segment due to their greater need for robust data protection strategies and larger budgets. Key market players include established tech giants like IBM and Informatica, alongside specialized providers such as Very Good Security and Anonomatic, indicating a dynamic competitive landscape with both established and emerging players vying for market share. Geographic expansion is also a key driver, with North America currently holding a significant market share, followed by Europe and Asia Pacific.

The forecast period (2025-2033) anticipates continued growth fueled by advancements in artificial intelligence and machine learning for enhanced de-identification techniques, and the increasing demand for data anonymization across various sectors like healthcare, finance, and government.

The restraining factors, while present, are not expected to significantly hinder the market's overall growth trajectory. These limitations might include the complexity of implementing robust de-identification solutions, the potential for re-identification risks despite advanced techniques, and the ongoing evolution of privacy regulations necessitating continuous adaptation of software capabilities. However, ongoing innovation and technological advancements are anticipated to mitigate these challenges.
The continuous development of more sophisticated algorithms and solutions addresses re-identification vulnerabilities, while proactive industry collaboration and regulatory guidance aim to streamline implementation processes, ultimately fostering continued market expansion. The increasing adoption of data anonymization across diverse sectors, coupled with the expanding global digital landscape and related data protection needs, suggests a positive outlook for sustained market growth throughout the forecast period.
https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.11588/DATA/MXM0Q2
In the publication [1] we implemented anonymization and synthetization techniques for a structured data set collected during the HiGHmed Use Case Cardiology study [2]. We employed the data anonymization tool ARX [3] and the data synthetization framework ASyH [4], both individually and in combination. We evaluated the utility and shortcomings of the different approaches through statistical analyses and privacy risk assessments. Data utility was assessed by computing two heart failure risk scores (Barcelona BioHF [5] and MAGGIC [6]) on the protected data sets; we observed only minimal deviations from scores computed on the original data set. Additionally, we performed a re-identification risk analysis and found only minor residual risks for common types of privacy threats. We thus demonstrated that anonymization and synthetization methods protect privacy while retaining data utility for heart failure risk assessment. Both approaches, and a combination thereof, introduce only minimal deviations from the original data set across all features. While data synthesis techniques can produce any number of new records, data anonymization techniques offer more formal privacy guarantees. Consequently, data synthesis on anonymized data further enhances privacy protection with little impact on data utility. We hereby share all generated data sets with the scientific community through a use and access agreement.
[1] Johann TI, Otte K, Prasser F, Dieterich C. Anonymize or synthesize? Privacy-preserving methods for heart failure score analytics. Eur Heart J 2024. doi:10.1093/ehjdh/ztae083
[2] Sommer KK, Amr A, Bavendiek, Beierle F, Brunecker P, Dathe H, et al. Structured, harmonized, and interoperable integration of clinical routine data to compute heart failure risk scores. Life (Basel) 2022;12:749.
[3] Prasser F, Eicher J, Spengler H, Bild R, Kuhn KA. Flexible data anonymization using ARX—current status and challenges ahead. Softw Pract Exper 2020;50:1277–1304.
[4] Johann TI, Wilhelmi H. ASyH—anonymous synthesizer for health data. GitHub, 2023. Available at: https://github.com/dieterich-lab/ASyH
[5] Lupón J, de Antonio M, Vila J, Peñafiel J, Galán A, Zamora E, et al. Development of a novel heart failure risk tool: the Barcelona bio-heart failure risk calculator (BCN Bio-HF calculator). PLoS One 2014;9:e85466.
[6] Pocock SJ, Ariti CA, McMurray JJV, Maggioni A, Køber L, Squire IB, et al. Predicting survival in heart failure: a risk score based on 39 372 patients from 30 studies. Eur Heart J 2013;34:1404–1413.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Recently, big data and its applications have grown sharply in various fields such as IoT, bioinformatics, e-commerce, and social media. The huge volume of data poses enormous challenges to the architecture, infrastructure, and computing capacity of IT systems, so the scientific and industrial communities have a compelling need for large-scale, robust computing systems. Since one of the characteristics of big data is value, data should be published so that analysts can extract useful patterns from it. However, data publishing may lead to the disclosure of individuals' private information. Among modern parallel computing platforms, Apache Spark is a fast, in-memory computing framework for large-scale data processing that provides high scalability by introducing the resilient distributed dataset (RDD); thanks to in-memory computation, it can be up to 100 times faster than Hadoop. Apache Spark is therefore one of the essential frameworks for implementing distributed methods for privacy-preserving big data publishing (PPBDP). This paper uses the RDD programming model of Apache Spark to propose an efficient parallel implementation of a new computing model for big data anonymization. The computing model performs three phases of in-memory computation to address the runtime, scalability, and performance of large-scale data anonymization. It supports partition-based data clustering algorithms that preserve the λ-diversity privacy model using transformations and actions on RDDs. The authors have accordingly investigated a Spark-based implementation for preserving the λ-diversity privacy model with two purpose-designed distance functions, city block and Pearson. The results of the paper provide a comprehensive guideline allowing researchers to apply Apache Spark in their own research.
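The core building blocks of such a model (the two distance functions and a λ-diversity check over clustered partitions) can be sketched in plain Python; the distributed RDD pipeline is omitted, and all names below are illustrative rather than taken from the paper:

```python
import math

def city_block(u, v):
    """City block (Manhattan, L1) distance between two numeric records."""
    return sum(abs(a - b) for a, b in zip(u, v))

def pearson_distance(u, v):
    """Distance derived from the Pearson correlation coefficient: 1 - r,
    so perfectly correlated records are at distance 0."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return 1.0 - cov / (su * sv)

def satisfies_l_diversity(partition, sensitive_index, l):
    """A partition (cluster of records) is diverse enough if it contains
    at least l distinct values of the sensitive attribute."""
    return len({rec[sensitive_index] for rec in partition}) >= l
```

In the distributed setting, the paper's model would evaluate such functions inside Spark transformations over partitioned RDDs rather than over in-process lists.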
The Geospatial and Information Substitution and Anonymization Tool (GISA) incorporates techniques for obfuscating identifiable information in point data or documents while maintaining chosen variables to enable future use and meaningful analysis. This approach promotes collaboration and data sharing while reducing the risk of exposing sensitive information. GISA can be used in a number of ways, including anonymization of point spatial data; batch replacement or removal of user-specified terms from file names and file content; and assistance with selecting and redacting images and terms based on recommendations from natural language processing. Version 1 of the tool, published here, has updated functionality and enhanced capabilities compared to the beta version published in 2023. Please see the User Documentation for further information on capabilities, as well as a guide to downloading and using the tool. If you have any feedback on the tool, please send it to edxsupport@netl.doe.gov. Disclaimer: This project was funded by the United States Department of Energy, National Energy Technology Laboratory, in part, through a site support contract. Neither the United States Government nor any agency thereof, nor any of their employees, nor the support contractor, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof.
The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof. The Geospatial and Information Substitution and Anonymization Tool (GISA) was developed jointly through the U.S. DOE Office of Fossil Energy and Carbon Management’s EDX4CCS Project, in part, from the Bipartisan Infrastructure Law.
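The batch replacement of user-specified terms that GISA performs on file names and file contents can be illustrated with a short Python sketch. The term list, placeholder tokens, and function names below are invented for illustration and are not GISA's actual interface; see the User Documentation for the real tool.

```python
import re
from pathlib import Path

# Hypothetical redaction map: term -> placeholder.
REDACTIONS = {"Acme Well 42": "[SITE-A]", "Jane Doe": "[PERSON]"}

def redact_text(text, redactions):
    """Replace each user-specified term with its placeholder token."""
    for term, placeholder in redactions.items():
        text = re.sub(re.escape(term), placeholder, text)
    return text

def redact_tree(root, redactions):
    """Apply term replacement to file contents and file names under root."""
    for path in Path(root).rglob("*"):
        if path.is_file():
            path.write_text(redact_text(path.read_text(), redactions))
            new_name = redact_text(path.name, redactions)
            if new_name != path.name:
                path.rename(path.with_name(new_name))
```

For example, `redact_text("report by Jane Doe on Acme Well 42", REDACTIONS)` yields a string with both terms replaced by their placeholders.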
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Anonymization opens up innovative ways of using secondary data without the requirements of the GDPR, as anonymized data no longer affects the privacy of data subjects. Anonymization requires data alteration, and this project aims to compare the ability of such privacy protection methods to maintain the reliability and utility of scientific data for secondary research purposes.
Methods: The French data protection authority (CNIL) defines anonymization as a processing activity that uses methods to make any identification of people, by any means, irreversibly impossible. To meet the project's objective, a series of analyses was performed on a cohort and reproduced on four sets of anonymized data for comparison. Four assessment levels were used to evaluate the impact of anonymization: level 1 referred to the replication of statistical outputs, level 2 to the accuracy of statistical results, level 3 assessed data alteration (using Hellinger distances), and level 4 assessed privacy risks (using WP29 criteria).
Results: 87 items were produced on the raw cohort data and then reproduced on each of the four anonymized data sets. The overall level 1 replication score ranged from 67% to 100% depending on the anonymization solution. The most difficult analyses to replicate were regression models (sub-score ranging from 78% to 100%) and survival analysis (sub-score ranging from 0% to 100%). The overall level 2 accuracy score ranged from 22% to 79% depending on the anonymization solution. For level 3, three methods had some variables with different probability distributions (Hellinger distance = 1). For level 4, all methods reduced the privacy risk of singling out, with relative risk reductions ranging from 41% to 65%.
Conclusion: None of the anonymization methods reproduced all outputs and results. A trade-off has to be found between the contextual risk and the usefulness of the data to answer the research question.
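The level 3 assessment uses Hellinger distances to quantify how far a variable's distribution drifts under anonymization (a distance of 1 means disjoint distributions, as reported for some variables above). A minimal sketch, assuming categorical variables summarized as value-to-probability dictionaries:

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions,
    given as value -> probability dicts. Ranges from 0 (identical
    distributions) to 1 (disjoint supports)."""
    support = set(p) | set(q)
    s = sum((math.sqrt(p.get(v, 0.0)) - math.sqrt(q.get(v, 0.0))) ** 2
            for v in support)
    return math.sqrt(s / 2.0)
```

Comparing the per-variable distribution of the raw cohort against each anonymized version with such a function flags variables whose distributions were heavily altered.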
https://www.archivemarketresearch.com/privacy-policy
The Data Masking Software market is experiencing robust growth, driven by increasing regulations around data privacy (like GDPR and CCPA), the expanding adoption of cloud computing, and the surging need for secure data sharing across organizations. The market size in 2025 is estimated at $2.5 billion, exhibiting a Compound Annual Growth Rate (CAGR) of 15% during the forecast period (2025-2033). This significant growth is fueled by several key factors, including the rising demand for data anonymization and pseudonymization techniques across various sectors like banking, healthcare, and retail. Companies are increasingly investing in data masking solutions to protect sensitive customer information during testing, development, and collaboration, thus mitigating the risk of data breaches and regulatory penalties. The diverse application segments, including Banking, Financial Services, and Insurance (BFSI), Healthcare and Life Sciences, and Retail and Ecommerce, contribute significantly to market expansion. Furthermore, the shift towards cloud-based solutions offers scalability and cost-effectiveness, further accelerating market adoption.

The market segmentation reveals a strong preference for cloud-based solutions, driven by their inherent flexibility and ease of deployment. Within the application segments, the BFSI sector is currently leading due to stringent regulatory compliance needs and the large volume of sensitive customer data handled. However, growth in the healthcare and life sciences sector is expected to accelerate significantly as more institutions embrace digital transformation and the handling of patient data becomes increasingly regulated. Geographic growth is robust across North America and Europe, with Asia-Pacific showing significant potential for future expansion due to growing digitalization and increasing awareness of data security issues.
While the market faces certain restraints such as the complexity of implementing data masking solutions and the high initial investment costs, the long-term benefits of robust data protection and compliance outweigh these challenges, driving consistent market expansion.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
pone.0285212.t004 - A distributed computing model for big data anonymization in the networks
https://www.marketresearchforecast.com/privacy-policy
The cloud data desensitization market is experiencing robust growth, driven by increasing regulatory compliance needs (like GDPR and CCPA), the rising volume of sensitive data stored in the cloud, and the expanding adoption of cloud computing across diverse sectors. The market, estimated at $5 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching approximately $15 billion by 2033. Key growth drivers include the escalating need to protect sensitive data from breaches and unauthorized access, particularly within healthcare (medical research data), finance (financial risk assessment), and government (government statistics). The cloud-based delivery model offers scalability and cost-effectiveness, further fueling market expansion. While strong security measures are integral to the success of this technology, challenges remain regarding the balance between data usability and robust security protocols. Integration complexities with existing infrastructure and the potential for unforeseen vulnerabilities represent key restraints.

Market segmentation reveals a strong preference for cloud-based solutions, given their inherent flexibility and scalability. The application segments, medical research data, financial risk assessment, and government statistics, are currently leading the market, primarily due to the highly sensitive nature of the data involved. Leading vendors like Micro Focus, IBM, Thales, Google Cloud, and others are actively shaping the market landscape through continuous innovation and the introduction of advanced data masking and tokenization techniques.

Regional analysis indicates strong growth in North America and Europe, driven by stringent data privacy regulations and a high concentration of organizations handling sensitive data. However, increasing adoption in the Asia-Pacific region, fueled by rapid digital transformation, is expected to significantly boost market growth in the coming years.
The forecast period of 2025-2033 presents a significant opportunity for market expansion, driven by increased data security awareness and evolving technological advancements.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
French respondents express high expectations for information should other similar applications be developed, primarily regarding the anonymization of data (84 percent) and the means of control, in particular by users themselves (81 percent).
StopCovid is a project that is part of the state of health emergency linked to the coronavirus epidemic. The project consists of a smartphone application intended to limit the spread of the virus by identifying transmission chains through the collection of certain personal information from French app users. In general, French people were rather in favor of the app.
Since 2014, UNHCR has undertaken a comprehensive revision of the framework for monitoring UNHCR Livelihoods and Economic Inclusion programs. Since 2017, mobile data collection (survey) tools have been rolled out globally, including in Chad. The participating operations administered a household survey to a sample of beneficiaries of each livelihoods project implemented by UNHCR and its partner. The dataset consists of baseline (331 observations) and endline (308 observations) data from the same sample of beneficiaries, allowing a before-and-after comparison of the project implementation and thus a measurement of its impact.
Amboko Amnabak Belom Djabal Doholo Dosseye Gondje Koloma Moyo
Household
Sample survey data [ssd]
The sample size for this dataset is: baseline data: 331; endline data: 308; total: 639.
The sampling was conducted by each participating operation based on the general sampling guidance provided.
Some operations may deviate from the sampling guidance due to local constraints such as logistical and security obstacles.
Computer Assisted Personal Interview [capi]
The questionnaire used to collect the survey data consists of five sections: Partner Information; General Information on Beneficiary; Access to Agricultural Production Enabled and Enhanced; Access to Self-Employment/Business Facilitated; and Access to Wage Employment Facilitated.
The dataset presented here has undergone light checking, cleaning, harmonization of localized information, and restructuring (data may still contain errors), as well as anonymization (including removal of direct identifiers and sensitive variables, and grouping values of select variables). Empty values can occur for several reasons (e.g., beneficiaries with no agricultural interventions will have empty variables in the agricultural module). Local suppression did not lead to empty variables.
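The anonymization steps described above (dropping direct identifiers and grouping values of select variables) can be sketched in Python. The field names, band width, and helper functions below are hypothetical examples for illustration, not the actual recoding rules applied to this dataset:

```python
def band(value, width=10):
    """Generalize a numeric value into a half-open band, e.g. 37 -> '30-39'."""
    lo = (value // width) * width
    return f"{lo}-{lo + width - 1}"

def anonymize_record(record, direct_identifiers, banded_fields):
    """Drop direct identifiers and generalize selected quasi-identifiers."""
    out = {k: v for k, v in record.items() if k not in direct_identifiers}
    for field in banded_fields:
        if field in out:
            out[field] = band(out[field])
    return out
```

Applied to a record like `{"name": "A", "age": 37, "camp": "Belom"}` with `name` as a direct identifier and `age` banded, only the generalized quasi-identifiers survive.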
Information not available
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains raw and processed data used and described in:
R. Sanchez-Lopez, S. G. Nielsen, M. El-Haj-Ali, F. Bianchi, M. Fereczkowski, O. Cañete, M. Wu, T. Neher, T. Dau, and S. Santurette (under review). "Auditory tests for characterizing hearing deficits in listeners with various hearing abilities: The BEAR test battery." Submitted to Frontiers in Neuroscience.
[Preprint available in medRxiv: https://doi.org/10.1101/2020.02.17.20021949]
One aim of the Better hEAring Rehabilitation (BEAR) project is to define a new clinical profiling tool, a test-battery, for individualized hearing loss characterization. Whereas the loss of sensitivity can be efficiently assessed by pure-tone audiometry, it still remains a challenge to address supra-threshold hearing deficits using appropriate clinical diagnostic tools. In contrast to the classical attenuation-distortion model (Plomp, 1986), the proposed BEAR approach is based on the hypothesis that any listener’s hearing can be characterized along two dimensions reflecting largely independent types of perceptual distortions. Recently, a data-driven approach (Sanchez-Lopez et al., 2018) provided evidence consistent with the existence of two independent sources of distortion, and thus different auditory profiles. Eleven tests were selected for the clinical test battery, based on their feasibility, time efficiency and related evidence from the literature. The proposed tests were divided into five categories: audibility, speech perception, binaural-processing abilities, loudness perception, and spectro-temporal resolution. Seventy-five listeners with symmetric, mild-to-severe sensorineural hearing loss were selected from a clinical population of hearing-aid users. The participants completed all tests in a clinical environment and did not receive systematic training for any of the tasks. The analysis of the results focused on the ability of each test to pinpoint individual differences among the participants, relationships among the different tests, and determining their potential use in clinical settings. The results might be valuable for hearing-aid fitting and clinical auditory profiling.
Please cite this article when using the data
The Dataset BEAR3 has also been used in:
Sanchez-Lopez R, Fereczkowski M, Neher T, Santurette S, Dau T. Robust Data-Driven Auditory Profiling Towards Precision Audiology. Trends in Hearing. January 2020. doi:10.1177/2331216520973539
Sanchez-Lopez, R., Fereczkowski, M., Neher, T., Santurette, S., & Dau, T. (2020). Robust auditory profiling: Improved data-driven method and profile definitions for better hearing rehabilitation. Proceedings of the International Symposium on Auditory and Audiological Research, 7, 281-288. Retrieved from https://proceedings.isaar.eu/index.php/isaarproc/article/view/2019-32
and
Sanchez Lopez, R., Nielsen, S. G., Cañete, O., Fereczkowski, M., Wu, M., Neher, T., Dau, T., & Santurette, S. (2019). A clinical test battery for Better hEAring Rehabilitation (BEAR): Towards the prediction of individual auditory deficits and hearing-aid benefit. In Proceedings of the 23rd International Congress on Acoustics (pp. 3841-3848). Deutsche Gesellschaft für Akustik e.V.. https://doi.org/10.18154/RWTH-CONV-239177
Description of the files:
* The participant IDs in each of the files have been assigned randomly to ensure the anonymization of the data. The pseudonymized data may be shared upon request by direct correspondence with the authors.
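Random ID assignment of this kind can be sketched as follows; this is a minimal illustration assuming Python and an invented ID format, not the authors' actual procedure:

```python
import secrets

def assign_random_ids(participants):
    """Map each original participant key to a randomly permuted ID so the
    published files carry no trace of enrolment order or identity. The
    mapping itself would be kept separately, under access control."""
    ids = [f"P{n:03d}" for n in range(1, len(participants) + 1)]
    # SystemRandom draws from the OS entropy source rather than a
    # seedable PRNG, so the permutation cannot be reproduced.
    secrets.SystemRandom().shuffle(ids)
    return dict(zip(participants, ids))
```

Keeping the key-to-ID mapping in a separate, restricted file is what makes the published data pseudonymized rather than anonymized: with the mapping, re-identification remains possible.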
https://www.promarketreports.com/privacy-policy
The Data Masking Market can be segmented into various product categories, including: Type: dynamic data masking, static data masking, and tokenization. Component: software, services, and appliances. Business Function: sales and marketing, human resources, legal, finance, and operations. Recent developments include: Sept 2020: Atlantech Online announced they had lit Anthem Row with fiber. Tenants at 700 K Street, NW, and 800 K Street can now enjoy high-speed Internet bandwidth at affordable prices, and can also use Atlantech's Hosted PBX Service, adding to the company's legacy. Oct 2020: Vonage joined forces with Hacktoberfest to promote and honor contributions to the Open Source community. As part of the collaboration, Vonage will provide access to its GitHub repositories, code snippets, and demos, supporting and encouraging developers in their Open Source endeavors. Key drivers for this market: the growing use of cloud computing and big data analytics has expanded the need for secure data handling practices. Potential restraints include: the slow adoption rate of machine learning, deep learning, and neural networks, and a lack of technical expertise in complex algorithms. Notable trends: the increasing volume of data generated globally and rising concerns about data breaches, cyber threats, and privacy regulations.
https://www.cognitivemarketresearch.com/privacy-policy
According to Cognitive Market Research, the global Data Masking Market size will be USD 18.43 billion in 2024 and will expand at a compound annual growth rate (CAGR) of 18.51% from 2024 to 2031.
Market Dynamics of the Data Masking Market
Key Drivers for Data Masking Market
Increasing Data Breaches and Cybersecurity Threats- One of the main reasons for the Data Masking Market growth is the escalating frequency and sophistication of data breaches and cybersecurity threats that drive the demand for data masking solutions. By obfuscating sensitive information in non-production environments, data masking helps mitigate the risk of unauthorized access and data exposure, safeguarding organizations against potential security breaches and reputational damage.
Compliance requirements for data privacy and protection are anticipated to drive the Data Masking market's expansion in the years ahead.
Key Restraints for Data Masking Market
Compliance complexities hinder data masking implementation in regulated industries.
Challenges in maintaining data usability while ensuring effective masking also impact market growth.
Introduction of the Data Masking Market
A key driver for data masking is the increasing emphasis on data privacy and regulatory compliance. With stringent data protection regulations such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), organizations are under pressure to safeguard sensitive information from unauthorized access and disclosure. Data masking techniques enable organizations to anonymize or pseudonymize sensitive data while preserving its utility for testing, development, or analytics purposes. As the consequences of data breaches and non-compliance become more severe, businesses across industries are investing in data masking solutions to mitigate risks, maintain regulatory compliance, and protect their reputation, thus driving the growth of the data masking market.
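One common pseudonymization technique in this family is keyed, deterministic tokenization: the same input always maps to the same token, so joins and analytics still work, but reversal requires the key. A minimal Python sketch, in which the key handling and token format are illustrative assumptions rather than a reference implementation:

```python
import hashlib
import hmac

SECRET_KEY = b"rotate-me"  # hypothetical; store in a secrets manager in practice

def pseudonymize(value, key=SECRET_KEY):
    """Keyed pseudonym via HMAC-SHA256, truncated to a short token.
    Unlike a plain hash, an attacker without the key cannot run a
    dictionary attack on low-entropy fields such as email addresses."""
    return hmac.new(key, value.encode(), hashlib.sha256).hexdigest()[:16]
```

Because the mapping is deterministic per key, masked tables from different systems can still be joined on the token; rotating the key breaks that linkability.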
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
For the purpose of research on data intermediaries and data anonymisation, it is necessary to test these processes with realistic video data containing personal data. For this purpose, the Treumoda project, funded by the German Federal Ministry of Education and Research (BMBF), has created a dataset of different traffic scenes containing identifiable persons.
This video data was collected at the Autonomous Driving Test Area Baden-Württemberg. On the one hand, it should be possible to recognise people in traffic, including their line of sight. On the other hand, it should be usable for the demonstration and evaluation of anonymisation techniques.
The legal basis for the publication of this data set is the consent given by the participants, as documented in the file Consent.pdf (all purposes), in accordance with Art. 6(1)(a) and Art. 9(2)(a) GDPR. Any further processing is subject to the GDPR.
We make this dataset available for non-commercial purposes such as teaching, research and scientific communication. Please note that this licence is limited by the provisions of the GDPR. Anyone downloading this data will become an independent controller of the data. This data has been collected with the consent of the identifiable individuals depicted.
Any consensual use must take into account the purposes mentioned in the uploaded consent forms and in the privacy terms and conditions provided to the participants (see Consent.pdf). All participants consented to all three purposes, and no consent was withdrawn at the time of publication. KIT is unable to provide you with contact details for any of the participants, as we have removed all links to personal data other than that contained in the published images.
https://www.wiseguyreports.com/pages/privacy-policy
BASE YEAR | 2024 |
HISTORICAL DATA | 2019 - 2024 |
REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
MARKET SIZE 2023 | 617.59 (USD Billion) |
MARKET SIZE 2024 | 706.71 (USD Billion) |
MARKET SIZE 2032 | 2077.2 (USD Billion) |
SEGMENTS COVERED | Technology, Deployment, End User, Anonymization Technique, Regional |
COUNTRIES COVERED | North America, Europe, APAC, South America, MEA |
KEY MARKET DYNAMICS | 1. Growing demand for data privacy; 2. Advancements in AI and facial recognition; 3. Increase in video surveillance; 4. Regulatory compliance; 5. Expansion of cloud-based video anonymization solutions |
MARKET FORECAST UNITS | USD Billion |
KEY COMPANIES PROFILED | Microsoft, Fourmilab, Proofpoint, LogRhythm, SAS Institute, F-Secure, Intermedia, One Identity, BeenVerified, Oracle, Image Scrubber, IBM, Splunk, Axzon, Digital Shadows |
MARKET FORECAST PERIOD | 2025 - 2032 |
KEY MARKET OPPORTUNITIES | 1. Growing adoption of video surveillance systems; 2. Increasing demand from law enforcement and security agencies; 3. Rising concerns over data privacy and security; 4. Government regulations and compliance requirements; 5. Advancements in AI and machine learning technologies |
COMPOUND ANNUAL GROWTH RATE (CAGR) | 14.43% (2025 - 2032) |
Purpose and Features
The purpose of the model and dataset is to remove personally identifiable information (PII) from text, especially in the context of AI assistants and LLMs. The model is a fine-tuned version of DistilBERT, a smaller and faster version of BERT, adapted for token classification on the largest open-source PII-masking dataset known to us, which we are releasing simultaneously. The model has 62 million parameters. The… See the full description on the dataset page: https://huggingface.co/datasets/ai4privacy/pii-masking-43k.
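As a rough, rule-based stand-in for what such a token classifier does, PII spans can be replaced with label tokens using regular expressions. The patterns below are illustrative assumptions and cover far fewer PII types and far less free-text context than the actual model:

```python
import re

# Toy patterns: label token -> regex. A trained token classifier (as in
# the ai4privacy model) recognizes many more PII categories in context.
PATTERNS = {
    "[EMAIL]": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "[PHONE]": re.compile(r"\b\+?\d[\d\s()-]{7,}\d\b"),
}

def mask_pii(text):
    """Replace each matched PII span with its label token."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(label, text)
    return text
```

The regex approach illustrates the input/output contract (raw text in, text with label tokens out), while the model itself learns the spans instead of relying on hand-written patterns.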
This updated labeled dataset builds upon the initial systematic review by van de Schoot et al. (2018; DOI: 10.1080/00273171.2017.1412293), which included studies on post-traumatic stress symptom (PTSS) trajectories up to 2016, sourced from the Open Science Framework (OSF). As part of the FORAS project (Framework for PTSS trajectORies: Analysis and Synthesis; funded by the Dutch Research Council, grant no. 406.22.GO.048, and pre-registered at PROSPERO under ID CRD42023494027), we extended this dataset to include publications between 2016 and 2023. In total, the search identified 10,594 de-duplicated records obtained via different search methods, each published with their own search query and result:
Exact replication of the initial search: OSF.IO/QABW3
Comprehensive database search: OSF.IO/D3UV5
Snowballing: OSF.IO/M32TS
Full-text search via Dimensions data: OSF.IO/7EXC5
Semantic search via OpenAlex: OSF.IO/M32TS
Humans (BC, RN) and AI (Bron et al., 2024) screened the records, and disagreements were resolved (MvZ, BG, RvdS). Each record was screened separately for title, abstract, and full-text inclusion and per inclusion criterion. A detailed screening logbook is available at OSF.IO/B9GD3, and the entire process is described in https://doi.org/10.31234/osf.io/p4xm5. A description of all columns/variables and full methodological details is available in the accompanying codebook.
Important notes:
Duplicates: To maintain consistency and transparency, duplicates are left in the dataset and are labeled with the same classification as the original records. A filter is provided to allow users to exclude these duplicates as needed.
Anonymized data: The dataset "...._anonymous" excludes DOIs, OpenAlex IDs, titles, and abstracts to ensure data anonymization during the review process. The complete dataset, including all identifiers, is uploaded under embargo and will be publicly available on 01-10-2025.
This dataset serves as a valuable resource not only for researchers interested in systematic reviews of PTSS trajectories, facilitating reproducibility and transparency in the research process, but also for data scientists who would like to mimic the screening process using different machine learning and AI models.
This dataset is updated annually; the description below relates to the first year of online release, since updates have since taken place in 2018 (data 2008-2017) and 2019 (data 2009-2018). Paris 13 University recorded data on student registration in its information system (Apogee software) for each academic year between 2006(-2007) and 2015(-2016). These data relate to the diplomas prepared, the steps to achieve them, the scheme (whether it concerns initial training or apprenticeship), the relevant components (UFR, IUT, etc.), and the origin of students (type of baccalaureate, academy of origin, nationality). Each entry concerns the main enrollment of a student at the university for one year. The attributes of these data are as follows.
— CODE_INDIVIDU: hidden data
— ANNEE_INSCRIPTION: year of registration (2006 for 2006-2007, etc.)
— LIB_DIPLOME: diploma name
— NIVEAU_DANS_LE_DIPLOME: 1, 2, ... for master 1, licence 2, etc.
— NIVEAU_APRES_BAC: 1, 2, ... for Bac+1, Bac+2, ...
— LIBELLE_DISCIPLINE_DIPLOME: attachment of the diploma to a discipline
— CODE_SISE_DIPLOME: student tracking information system (SISE) code
— CODE_ETAPE: internal code of a stage (year, course) of a diploma
— LIBELLE_COURT_ETAPE: short name of the step
— LIBELLE_LONG_ETAPE: more intelligible name of the step
— LIBELLE_COURT_COMPOSANTE: name of the component (UFR, IUT, etc.)
— CODE_COMPOSANTE: numeric code of the component (unused)
— REGROUPEMENT_BAC: type of Bac (L, ES, S, techno STMG, techno ST2S, ...)
— LIBELLE_ACADEMIE_BAC: academy of the Bac (Creteil, Versailles, foreign, ...)
— CONTINENT: deduced from nationality, which is masked data
— LIBELLE_REGIME: initial training, continuing, pro, apprenticeship
Paris 13 University publishes part of this dataset through several resources, while respecting the anonymity of its students.
Starting from the 213,289 entries that correspond to all enrolments of the 106,088 individuals who studied at Paris 13 University during the ten academic years between 2006(-2007) and 2015(-2016), we selected several resources, each corresponding to a part of the data. To produce each resource we chose a small number of attributes, then removed a small proportion of the entries in order to satisfy a k-anonymisation constraint with k = 5, i.e. to ensure that, in each resource, every combination of attribute values appears in at least 5 identical entries (otherwise the entry is deleted). The four resources produced are materialised by the following files.
— The file ‘up13_etapes.csv’ concerns the diploma steps; it contains the attributes “CODE_ETAPE”, “LIBELLE_COURT_ETAPE”, “LIBELLE_LONG_ETAPE”, “NIVEAU_APRES_BAC”, “LIBELLE_COURT_COMPOSANTE”, “LIBELLE_DISCIPLINE_DIPLOME”, “CODE_SISE_DIPLOME”, “NIVEAU_DANS_LE_DIPLOME”, and its anonymisation causes a loss of 918 entries.
— The file ‘up13_Academie.csv’ concerns the Bac academy; it contains the attributes “LIBELLE_ACADEMIE_BAC”, “NIVEAU_APRES_BAC”, “NIVEAU_DANS_LE_DIPLOME”, “CONTINENT”, “LIBELLE_REGIME”, “LIB_DIPLOME”, “LIBELLE_COURT_COMPOSANTE”, and its anonymisation causes the loss of 7,525 entries.
— The file ‘up13_Bac.csv’ concerns the type of Bac and the level reached after the Bac; it contains the columns “REGROUPEMENT_BAC”, “NIVEAU_APRES_BAC”, “LIBELLE_REGIME”, “CONTINENT”, “LIBELLE_COURT_COMPOSANTE”, “LIB_DIPLOME”, “NIVEAU_DANS_LE_DIPLOME”, and its anonymisation causes the loss of 3,933 entries.
— The file ‘up13_annees_etapes.csv’ concerns enrolment in the diploma stages year after year; it contains the columns “ANNEE_INSCRIPTION”, “LIBELLE_COURT_COMPOSANTE”, “NIVEAU_APRES_BAC”, “LIB_DIPLOME”, “CODE_ETAPE”, and its anonymisation causes the loss of 3,532 entries.
Other tables extracted from the same initial data and constructed using the same anonymisation method can be provided on request (specify the desired columns).
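The suppression step described above can be sketched in a few lines: project each record onto the chosen attributes, count each value combination, and drop every record whose combination occurs fewer than k times. The attribute names below match the dataset's columns, but the records themselves are illustrative, not taken from the real files.

```python
from collections import Counter

def k_anonymise(records, attributes, k=5):
    """Keep only projected rows whose value combination appears at least k times."""
    # Project each record onto the selected attributes.
    projected = [tuple(r[a] for a in attributes) for r in records]
    # Count each combination and suppress rows in groups smaller than k.
    counts = Counter(projected)
    return [row for row in projected if counts[row] >= k]

# Five identical records and one rare combination (illustrative data).
rows = [{"REGROUPEMENT_BAC": "S", "NIVEAU_APRES_BAC": 1}] * 5 \
     + [{"REGROUPEMENT_BAC": "L", "NIVEAU_APRES_BAC": 3}]
kept = k_anonymise(rows, ["REGROUPEMENT_BAC", "NIVEAU_APRES_BAC"], k=5)
print(len(kept))  # prints 5: the single rare row is suppressed
```

This mirrors the trade-off described in the text: the fewer the attributes in a resource, the larger the groups, and the fewer entries are lost to the k = 5 constraint.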
A second set of resources offers the follow-up of students year after year, from degree stage to degree stage. In this dataset, we call a trace such a follow-up when the registration years have been dropped and only the sequence of steps remains, and we call a cursus the data describing this succession of steps together with the years. For anonymisation we grouped identical traces or cursus and, whenever fewer than 10 students shared one, we do not indicate their number or, what amounts to the same thing, we set this number to 1 (the information being only that at least one student left this trace or followed this cursus). This leads to forgetting a number of overly specific study paths and keeping only one as a witness. Starting from 106,088 traces or cursus, we produced the following resources.
— The file ‘up13_traces.csv’ contains the sequences of diploma step codes (the traces); anonymisation makes us forget 10,089 traces.
— The file ‘up13_traces_wt_etape.csv’ contains similar traces, but without the step code; that is, only the diploma, the level after the baccalaureate, and the component concerned remain. Anonymisation makes us forget 4,447 traces.
— The file ‘up13_traces_bac_wt_etape.csv’ contains the same data as ‘up13_traces_wt_etape.csv’, plus the Bac type. Anonymisation makes us forget 8,067 traces.
— The file ‘up13_cursus_wt_etape.csv’ contains the same data as ‘up13_traces_wt_etape.csv’, with the registration years added. Anonymisation makes us forget 8,324 cursus.
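The count-suppression rule above can be sketched as follows: identical traces are grouped and counted, and any count below the threshold (10 in the text) is replaced by 1, keeping a single witness without revealing how rare the path is. The step-code sequences below are made up for illustration.

```python
from collections import Counter

def suppress_rare_counts(traces, threshold=10):
    """Group identical traces; report counts below `threshold` as 1."""
    counts = Counter(tuple(t) for t in traces)
    # A rare trace is kept as a witness, but its true count is hidden.
    return {trace: (n if n >= threshold else 1) for trace, n in counts.items()}

# Illustrative traces: 12 students on a common path, 3 on a rare one.
traces = [["L1-INFO", "L2-INFO"]] * 12 + [["L1-MATH", "M1-STAT"]] * 3
result = suppress_rare_counts(traces)
print(result)  # common path keeps count 12; rare path is reported as 1
```

Note this differs from the k-anonymisation of the tabular resources: rare traces are not deleted outright, only their frequencies are censored.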
The Sound Masking Systems market has emerged as a critical segment within the broader acoustic solutions industry, serving to improve speech privacy and enhance comfort in various environments, particularly open office spaces, healthcare facilities, and educational institutions. Sound masking technology works by introducing a low-level, unobtrusive background sound that raises the ambient noise floor and reduces the intelligibility of nearby speech.
The continuous development of more sophisticated algorithms and solutions addresses re-identification vulnerabilities, while proactive industry collaboration and regulatory guidance aim to streamline implementation processes, ultimately fostering continued market expansion. The increasing adoption of data anonymization across diverse sectors, coupled with the expanding global digital landscape and related data protection needs, suggests a positive outlook for sustained market growth throughout the forecast period.